How can I change the status of my nodes from NotReady or Unknown status to Ready status?
My Amazon Elastic Kubernetes Service (Amazon EKS) worker nodes are in NotReady or Unknown status. I want to get my worker nodes back to Ready status.
Short description
You can't schedule pods on a node that's in NotReady or Unknown status. You can schedule pods only on a node that's in Ready status.
The following resolution addresses nodes in NotReady or Unknown status.
If your node is in the MemoryPressure, DiskPressure, or PIDPressure status, then you must manage your resources to allow additional pods to be scheduled on the node. If your node is in NetworkUnavailable status, then you must properly configure the network on the node. For more information, see Node status on the Kubernetes website.
Note: For information on managing pod evictions and resource limits, see Node-pressure eviction on the Kubernetes website.
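To review the status and conditions that are reported for your nodes, you can run commands similar to the following. The node name is an example taken from the output later in this article; replace it with the name of your node:
$ kubectl get nodes
$ kubectl describe node ip-192-168-54-115.ec2.internal
The Conditions section of the describe output shows conditions such as Ready, MemoryPressure, DiskPressure, and PIDPressure, along with a reason and message for each.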
Resolution
Check the aws-node and kube-proxy pods to see why the nodes are in NotReady status
A node in NotReady status isn't available for pods to be scheduled on.
To improve the security posture, managed node groups no longer attach the Container Network Interface (CNI) policy to the node role's Amazon Resource Name (ARN). A node that's missing the CNI policy changes to NotReady status.
1. To see if the aws-node pod is in the error state, run the following command:
$ kubectl get pods -n kube-system -o wide
To resolve this issue, follow the guidelines to set up IAM Roles for Service Accounts (IRSA) for aws-node DaemonSet.
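For example, if you use eksctl, then a command similar to the following creates an IAM role for the aws-node service account with the AmazonEKS_CNI_Policy managed policy attached. The cluster name my-cluster is a placeholder:
$ eksctl create iamserviceaccount \
    --cluster my-cluster \
    --namespace kube-system \
    --name aws-node \
    --attach-policy-arn arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy \
    --override-existing-serviceaccounts \
    --approve
After the service account is updated, restart the aws-node DaemonSet so that the pods pick up the new credentials:
$ kubectl rollout restart daemonset aws-node -n kube-system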
2. To check the status of your aws-node and kube-proxy pods, run the following command:
$ kubectl get pods -n kube-system -o wide
3. Check the status of the aws-node and kube-proxy pods by reviewing the output from step 2.
Note: The aws-node and kube-proxy pods are managed by a DaemonSet. This means that each node in the cluster must have one aws-node and one kube-proxy pod running on it. If no aws-node or kube-proxy pods are listed, then skip to step 5. For more information, see DaemonSet on the Kubernetes website.
If your node status is normal, then your aws-node and kube-proxy pods should be in Running status. For example:
$ kubectl get pods -n kube-system -o wide
NAME               READY   STATUS    RESTARTS   AGE     IP               NODE
aws-node-qvqr2     1/1     Running   0          4h31m   192.168.54.115   ip-192-168-54-115.ec2.internal
kube-proxy-292b4   1/1     Running   0          4h31m   192.168.54.115   ip-192-168-54-115.ec2.internal
If either pod is in a status other than Running, run the following command:
$ kubectl describe pod yourPodName -n kube-system
4. To get additional information from the aws-node and kube-proxy pod logs, run the following command:
$ kubectl logs yourPodName -n kube-system
The logs and the events from the describe output can show why the pods aren't in Running status. For a node to change to Ready status, both the aws-node and kube-proxy pods must be Running on that node.
Note: The pod names can differ from aws-node-qvqr2 and kube-proxy-292b4 shown in the preceding examples.
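To limit the output to the pods that run on the affected node, you can filter by node name. The node name used here is an example; replace it with the name of your node:
$ kubectl get pods -n kube-system -o wide --field-selector spec.nodeName=ip-192-168-54-115.ec2.internal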
5. If the aws-node and kube-proxy pods aren't listed after you run the command from step 2, then run the following commands:
$ kubectl describe daemonset aws-node -n kube-system
$ kubectl describe daemonset kube-proxy -n kube-system
6. Search the output of the commands in step 5 for a reason why the pods can't start.
Tip: You can search the Amazon EKS control plane logs for information on why the pods can't be scheduled.
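If you turned on control plane logging for your cluster, then the logs are in CloudWatch Logs under the /aws/eks/cluster-name/cluster log group. For example, a command similar to the following searches the scheduler log streams; the cluster name and filter pattern are examples that you can adjust:
$ aws logs filter-log-events \
    --log-group-name /aws/eks/my-cluster/cluster \
    --log-stream-name-prefix kube-scheduler \
    --filter-pattern "FailedScheduling"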
7. Confirm that the versions of aws-node and kube-proxy are compatible with the cluster version based on AWS guidelines. For example, you can run the following commands to check the pod versions:
$ kubectl describe daemonset aws-node --namespace kube-system | grep Image | cut -d "/" -f 2
$ kubectl get daemonset kube-proxy --namespace kube-system -o=jsonpath='{$.spec.template.spec.containers[:1].image}'
Note: To update the aws-node version, see Managing the Amazon VPC CNI plugin for Kubernetes add-on. To update the kube-proxy version, follow step 4 in Update the Kubernetes version for your Amazon EKS cluster.
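If your cluster uses the Amazon EKS add-ons for these components, then commands similar to the following can list compatible versions and update the add-ons. The cluster name and version values shown are examples; use versions that match your cluster version:
$ aws eks describe-addon-versions --addon-name kube-proxy --kubernetes-version 1.29
$ aws eks update-addon --cluster-name my-cluster --addon-name vpc-cni --addon-version v1.18.1-eksbuild.3
$ aws eks update-addon --cluster-name my-cluster --addon-name kube-proxy --addon-version v1.29.3-eksbuild.2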
In some scenarios, the node can be in Unknown status. This means that the kubelet on the node can't communicate the node's current status to the control plane.
To troubleshoot nodes in Unknown status, complete the steps in the following sections:
- Check the network configuration between nodes and the control plane
- Check the status of the kubelet
- Check that the Amazon EC2 API endpoint is reachable
Check the network configuration between nodes and the control plane
1. Confirm that there are no network access control list (ACL) rules on your subnets blocking traffic between the Amazon EKS control plane and your worker nodes.
2. Confirm that the security groups for your control plane and nodes comply with minimum inbound and outbound requirements.
3. (Optional) If your nodes are configured to use a proxy, confirm that the proxy is allowing traffic to the API server endpoints.
4. To verify that the node has access to the API server, run the following netcat command from inside the worker node:
$ nc -vz 9FCF4EA77D81408ED82517B9B7E60D52.yl4.eu-north-1.eks.amazonaws.com 443
Connection to 9FCF4EA77D81408ED82517B9B7E60D52.yl4.eu-north-1.eks.amazonaws.com 443 port [tcp/https] succeeded!
Important: Replace 9FCF4EA77D81408ED82517B9B7E60D52.yl4.eu-north-1.eks.amazonaws.com with your API server endpoint.
5. Check that the route tables are configured correctly to allow communication with the API server endpoint through either an internet gateway or a NAT gateway. If the cluster uses PrivateOnly networking, then verify that the VPC endpoints are configured correctly. Example AWS CLI commands to review these network settings follow this list.
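The following AWS CLI commands are one way to review these settings. The subnet, security group, and route table identifiers are placeholders; use the IDs that are associated with your cluster and worker nodes:
$ aws ec2 describe-network-acls --filters Name=association.subnet-id,Values=subnet-0123456789abcdef0
$ aws ec2 describe-security-groups --group-ids sg-0123456789abcdef0
$ aws ec2 describe-route-tables --filters Name=association.subnet-id,Values=subnet-0123456789abcdef0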
Check the status of the kubelet
1. Use SSH to connect to the affected worker node.
2. To check the kubelet logs, run the following command:
$ journalctl -u kubelet > kubelet.log
Note: The kubelet.log file contains information on kubelet operations that can help you find the root cause of the node status issue.
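For example, to narrow the log down to recent errors, you can filter the journal output. The time range and search pattern are examples that you can adjust:
$ journalctl -u kubelet --since "1 hour ago" | grep -i -E "error|failed|timeout"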
If the logs don't provide information on the source of the issue, then run the following command to check the status of the kubelet on the worker node:
$ sudo systemctl status kubelet
  kubelet.service - Kubernetes Kubelet
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-eksclt.al2.conf
   Active: inactive (dead) since Wed 2019-12-04 08:57:33 UTC; 40s ago
If the kubelet isn't in the Running status, then run the following command to restart the kubelet:
$ sudo systemctl restart kubelet
Check that the Amazon EC2 API endpoint is reachable
1. Use SSH to connect to one of the worker nodes.
2. To check if the Amazon Elastic Compute Cloud (Amazon EC2) API endpoint for your AWS Region is reachable, run the following command:
$ nc -vz ec2.us-east-1.amazonaws.com 443
Connection to ec2.us-east-1.amazonaws.com 443 port [tcp/https] succeeded!
Important: Replace us-east-1 with the AWS Region where your worker node is located.
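If your worker nodes are in private subnets without a route to the internet, then the Amazon EC2 API is typically reached through an interface VPC endpoint. A command similar to the following checks whether the endpoint exists; the VPC ID and Region are placeholders:
$ aws ec2 describe-vpc-endpoints --filters Name=vpc-id,Values=vpc-0123456789abcdef0 Name=service-name,Values=com.amazonaws.us-east-1.ec2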
Check the worker node instance profile and the ConfigMap
1. Confirm that the worker node instance profile has the recommended policies.
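For example, you can list the managed policies that are attached to the node IAM role and confirm that AmazonEKSWorkerNodePolicy and AmazonEC2ContainerRegistryReadOnly are attached, and that AmazonEKS_CNI_Policy is attached to either the node role or an IRSA role. The role name is a placeholder:
$ aws iam list-attached-role-policies --role-name myAmazonEKSNodeRole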
2. Confirm that the worker node instance role is in the aws-auth ConfigMap. To check the ConfigMap, run the following command:
$ kubectl get cm aws-auth -n kube-system -o yaml
The ConfigMap should have an entry for the worker node instance AWS Identity and Access Management (IAM) role. For example:
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: <ARN of instance role (not instance profile)>
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
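If the role mapping is missing, then one way to add it is with eksctl. The cluster name, account ID, and role name are placeholders:
$ eksctl create iamidentitymapping \
    --cluster my-cluster \
    --arn arn:aws:iam::111122223333:role/myAmazonEKSNodeRole \
    --username "system:node:{{EC2PrivateDNSName}}" \
    --group system:bootstrappers \
    --group system:nodes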