How do I troubleshoot EKS Auto Mode built-in node pools with an Unknown status?
Resolve issues with EKS Auto Mode built-in node pools showing an Unknown status after disabling and re-enabling them.
Short description
When working with Amazon Elastic Kubernetes Service (Amazon EKS) Auto Mode clusters, you might find that built-in node pools show an "Unknown" status after you disable and re-enable them. This article provides steps to troubleshoot and resolve this issue.
Prerequisites
- An Amazon EKS cluster that uses Auto Mode
- kubectl, installed
- The latest version of the AWS Command Line Interface (AWS CLI), installed and configured
- eksctl, installed
- The IAM permissions needed to manage EKS clusters and node pools
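Note: To confirm that the tools are in place before you begin, you can run quick version checks and point kubectl at your cluster. This is a minimal sketch; replace cluster-name and region with your own values:

# Confirm the required tools are installed
kubectl version --client
aws --version
eksctl version

# Update your kubeconfig so that kubectl targets your Auto Mode cluster
aws eks update-kubeconfig --name <cluster-name> --region <region>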
Resolution
When you troubleshoot EKS Auto Mode built-in node pools with an Unknown status, consider the following possible causes:
- Failed resolving NodeClass
- NodeClass is Terminating
- NodeClass/NodePool remaining in "AwaitingReconciliation" state
- EKS Auto Mode Node IAM Role missing necessary permissions
- EKS Auto Mode Cluster IAM Role missing necessary permissions
To resolve the issue, you must safely delete the built-in node pools and the default NodeClass, and then re-create them. Work through the following causes and their respective solutions:
Failed resolving NodeClass or NodeClass is Terminating:
Verify the node pool status:
- Use kubectl to check the status of your node pools:
kubectl get nodepool
- If the built-in node pools exist, describe them to view the events that cause the error by using the following commands:
kubectl describe nodepool general-purpose
kubectl describe nodepool system
- Check for any errors or misconfigurations relating to the default NodeClass.
- Check the status of the NodeClass:
kubectl describe nodeclass default
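To print only the status conditions instead of the full describe output, you can use a jsonpath query. This is an optional sketch; the fields mirror the Status block shown later in this article:

# Print each condition's type, status, and reason on its own line
kubectl get nodeclass default -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.reason}{"\n"}{end}'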
(Optional) If the default built-in node pools don't exist, manually create them to surface the events that cause the error:
- Copy and paste the following sample general-purpose node pool manifest into a file named general-purpose.yaml:
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  labels:
    app.kubernetes.io/managed-by: eks
  name: general-purpose
spec:
  disruption:
    budgets:
    - nodes: 10%
    consolidateAfter: 30s
    consolidationPolicy: WhenEmptyOrUnderutilized
  template:
    metadata: {}
    spec:
      expireAfter: 336h
      nodeClassRef:
        group: eks.amazonaws.com
        kind: NodeClass
        name: default
      requirements:
      - key: karpenter.sh/capacity-type
        operator: In
        values:
        - on-demand
      - key: eks.amazonaws.com/instance-category
        operator: In
        values:
        - c
        - m
        - r
      - key: eks.amazonaws.com/instance-generation
        operator: Gt
        values:
        - "4"
      - key: kubernetes.io/arch
        operator: In
        values:
        - amd64
      - key: kubernetes.io/os
        operator: In
        values:
        - linux
      terminationGracePeriod: 24h0m0s
- Copy and paste the following sample system node pool manifest into a file named system.yaml:
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  labels:
    app.kubernetes.io/managed-by: eks
  name: system
spec:
  disruption:
    budgets:
    - nodes: 10%
    consolidateAfter: 30s
    consolidationPolicy: WhenEmptyOrUnderutilized
  template:
    metadata: {}
    spec:
      expireAfter: 336h
      nodeClassRef:
        group: eks.amazonaws.com
        kind: NodeClass
        name: default
      requirements:
      - key: karpenter.sh/capacity-type
        operator: In
        values:
        - on-demand
      - key: eks.amazonaws.com/instance-category
        operator: In
        values:
        - c
        - m
        - r
      - key: eks.amazonaws.com/instance-generation
        operator: Gt
        values:
        - "4"
      - key: kubernetes.io/arch
        operator: In
        values:
        - amd64
        - arm64
      - key: kubernetes.io/os
        operator: In
        values:
        - linux
      taints:
      - effect: NoSchedule
        key: CriticalAddonsOnly
      terminationGracePeriod: 24h0m0s
- Create the node pools:
kubectl apply -f general-purpose.yaml
kubectl apply -f system.yaml
- Repeat the preceding steps to describe the node pools and view the events that cause the error.
- Review the Status field in the output. The output is similar to the following:
Status:
  Conditions:
    Last Transition Time:  2025-05-22T12:50:00Z
    Message:               NodeClassReady=False
    Observed Generation:   1
    Reason:                UnhealthyDependents
    Status:                False
    Type:                  Ready
...
Events:
  Type     Reason   Age                 From       Message
  ----     ------   ---                 ----       -------
  Warning           11s (x46 over 90m)  karpenter  Failed resolving NodeClass
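To see the failure history in one place, you can also list recent events across the cluster and filter for Karpenter. An optional sketch:

# Sort events by time and keep only the Karpenter entries
kubectl get events -A --sort-by='.lastTimestamp' | grep -i karpenter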
- Delete the built-in node pools:
kubectl delete nodepool general-purpose
kubectl delete nodepool system
- Disable the built-in node pools by using the following command:
aws eks update-cluster-config \
--name <cluster-name> \
--compute-config '{
"enabled": true,
"nodePools": []
}' \
--kubernetes-network-config '{
"elasticLoadBalancing":{"enabled": true}}' \
--storage-config '{
"blockStorage":{"enabled": true}
}'
Replace cluster-name with the name of your cluster.
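To confirm that the built-in node pools are disabled before you continue, you can inspect the cluster's compute configuration. This sketch assumes that the DescribeCluster response for your Auto Mode cluster includes a computeConfig block:

# After the update completes, nodePools should be empty
aws eks describe-cluster --name <cluster-name> --query 'cluster.computeConfig'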
- Delete the default NodeClass used by the built-in node pools:
kubectl delete nodeclass default
- Check the status of the NodeClass:
kubectl describe nodeclass default
- If the NodeClass is stuck in the Terminating state, force delete it by removing its finalizers:
kubectl patch nodeclass <nodeclass-name> -p '{"metadata":{"finalizers":null}}' --type=merge
Replace nodeclass-name with the name of the stuck NodeClass (default for the built-in node pools).
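Before you remove finalizers, it can help to confirm that a finalizer is what's blocking the deletion. A minimal sketch:

# A NodeClass stuck in Terminating usually still lists one or more finalizers
kubectl get nodeclass default -o jsonpath='{.metadata.finalizers}'

# A populated deletionTimestamp confirms that deletion was requested
kubectl get nodeclass default -o jsonpath='{.metadata.deletionTimestamp}'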
- Wait for the deletion to complete, and then enable the built-in node pools again:
aws eks update-cluster-config \
--name <cluster-name> \
--compute-config '{
"nodeRoleArn": "arn:aws:iam::111122223333:role/AmazonEKSAutoNodeRole",
"nodePools": ["general-purpose", "system"],
"enabled": true
}' \
--kubernetes-network-config '{
"elasticLoadBalancing":{"enabled": true}
}' \
--storage-config '{
"blockStorage":{"enabled": true}
}'
Replace the nodeRoleArn value with the ARN of your EKS Auto Mode node IAM role, and cluster-name with the name of your cluster.
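The update-cluster-config command returns an update ID. To track the configuration change until it reaches Successful status, you can poll it. A minimal sketch, assuming that you captured the update ID from the previous command's output:

# List recent updates for the cluster
aws eks list-updates --name <cluster-name>

# Check the status of a specific update
aws eks describe-update --name <cluster-name> --update-id <update-id>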
- Check the status of the built-in NodePools and NodeClass:
kubectl get nodeclass
NAME      ROLE                    READY   AGE
default   AmazonEKSAutoNodeRole   True    9m32s
kubectl get nodepool
NAME              NODECLASS   NODES   READY   AGE
general-purpose   default     1       True    44m
system            default     0       True    44m
EKS Auto Mode Node IAM Role missing necessary permissions:
Check the CloudTrail event history:
- Filter the CloudTrail event history on the eks.amazonaws.com event source, and review the events for any error messages or warnings that might indicate the cause of the Unknown status.
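You can run the same filter from the AWS CLI. A minimal sketch that returns the most recent EKS API events:

aws cloudtrail lookup-events \
    --lookup-attributes AttributeKey=EventSource,AttributeValue=eks.amazonaws.com \
    --max-results 20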
Verify the Amazon EKS Auto Mode node IAM role (AmazonEKSAutoNodeRole) and its policies:
- Ensure that the IAM role associated with your built-in node pools has the necessary permissions.
- Add any missing permissions to the role. For more information, see Amazon EKS Auto Mode node IAM role.
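To review the role's current policies from the AWS CLI, you can list its attached and inline policies. A minimal sketch; the role name assumes the AmazonEKSAutoNodeRole used elsewhere in this article:

# List the managed policies attached to the node role
aws iam list-attached-role-policies --role-name AmazonEKSAutoNodeRole

# List any inline policies on the node role
aws iam list-role-policies --role-name AmazonEKSAutoNodeRole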
EKS Auto Mode Cluster IAM Role missing necessary permissions:
Check the CloudTrail event history:
- Filter the CloudTrail event history on the eks.amazonaws.com event source, and review the events for any error messages or warnings that might indicate the cause of the Unknown status.
Check the cluster's IAM role (AmazonEKSAutoClusterRole):
- Ensure that the role has the required permissions.
- Add any missing permissions to the cluster role. For more information, see Amazon EKS Auto Mode cluster IAM role.
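As with the node role, you can audit the cluster role from the AWS CLI and attach any managed policy that it's missing. A sketch; AmazonEKSComputePolicy is used here only as an example of a policy the role might need:

# Review the managed policies attached to the cluster role
aws iam list-attached-role-policies --role-name AmazonEKSAutoClusterRole

# Example only: attach a missing managed policy
aws iam attach-role-policy \
    --role-name AmazonEKSAutoClusterRole \
    --policy-arn arn:aws:iam::aws:policy/AmazonEKSComputePolicy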