EKS Terraform apply - NodeCreationFailure

Hello, I have had a Kubernetes cluster on AWS for a few years. I upgraded it to 1.30, and now when I run "terraform apply" it destroys my nodes, which is expected. But when it creates the new ones, I get this error after 23 minutes:

│ Error: waiting for EKS Node Group (uat-eks-o*****t-cn:uat-eks-o*****t-ngn) create: unexpected state 'CREATE_FAILED', wanted target 'ACTIVE'. last error: i-0cb56d12652d69848, i-0b38365b78c3620e0, i-0a3c8df9fe6d9179f: NodeCreationFailure: Unhealthy nodes in the kubernetes cluster
│
│   with aws_eks_node_group.o*****t_ng,
│   on eks.tf line 117, in resource "aws_eks_node_group" "o*****t_ng":
│  117: resource "aws_eks_node_group" "o****t_ng" {

And my application is down.

What can I check?

Regards

dd
asked 11 days ago, 32 views
1 Answer
This error suggests that there's an issue with the creation of new nodes in your EKS cluster after updating to version 1.30. The NodeCreationFailure error indicates that the nodes are being created but are unable to join the cluster successfully. There are several things you can check to troubleshoot this issue:

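  1. IAM Role Permissions: Ensure that the IAM role associated with your node group has the necessary permissions. For managed node groups this normally means the AmazonEKSWorkerNodePolicy and AmazonEC2ContainerRegistryReadOnly managed policies, plus AmazonEKS_CNI_Policy if the VPC CNI plugin relies on the node role.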

  2. VPC and Subnet Configuration: Verify that your VPC and subnet settings are correct and that the nodes have proper network access. The nodes need to be able to communicate with the EKS control plane.

  3. Security Group Settings: Check if the security group associated with your nodes allows the required inbound and outbound traffic for EKS.

  4. Node Instance Type: Confirm that the instance type you're using for your nodes matches the architecture of the AMI type (x86_64 or arm64), is offered in the Availability Zones of your subnets, and has sufficient resources.

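  5. AMI Version: Make sure you're using an up-to-date Amazon EKS-optimized AMI that's compatible with EKS 1.30. Note that from 1.30 onward, newly created managed node groups default to Amazon Linux 2023 rather than Amazon Linux 2; if your launch template carries user data written for the Amazon Linux 2 bootstrap script, it may not work on AL2023 nodes.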

  6. CloudWatch Logs: If you have enabled CloudWatch logging for your EKS cluster, check the logs for any specific error messages that might provide more details about the failure.

  7. Outbound Internet Access: Ensure that your nodes have outbound internet access or access to the required VPC endpoints if you're running in a private subnet.

  8. Kubernetes Configuration: Verify that the Kubernetes configuration (kubeconfig) is correctly set up and that you can access the cluster using kubectl.

  9. AWS CLI and Terraform Version: Make sure you're using the latest versions of the AWS CLI and Terraform, as older versions might have compatibility issues with newer EKS versions.
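Several of the checks above (node role, subnets, AMI type, reported health issues) can be read from a single EKS DescribeNodegroup call. The following is only a minimal Python sketch, assuming boto3 is installed and AWS credentials are configured; the helper names `summarize_nodegroup` and `fetch_nodegroup` are made up for illustration and are not part of any AWS tooling.

```python
from typing import Any, Dict, List


def summarize_nodegroup(nodegroup: Dict[str, Any]) -> List[str]:
    """Turn the "nodegroup" part of a DescribeNodegroup response into readable lines."""
    subnets = ", ".join(nodegroup.get("subnets", [])) or "none listed"
    lines = [
        "status: {}".format(nodegroup.get("status", "unknown")),
        "kubernetes version: {}".format(nodegroup.get("version", "unknown")),
        "AMI type: {}".format(nodegroup.get("amiType", "unknown")),
        "node role: {}".format(nodegroup.get("nodeRole", "unknown")),
        "subnets: {}".format(subnets),
    ]
    # Each health issue carries a code, a message and the affected resource IDs.
    for issue in nodegroup.get("health", {}).get("issues", []):
        resources = ", ".join(issue.get("resourceIds", []))
        lines.append(
            "issue {}: {} [{}]".format(issue.get("code"), issue.get("message"), resources)
        )
    return lines


def fetch_nodegroup(cluster_name: str, nodegroup_name: str) -> Dict[str, Any]:
    """Call the EKS DescribeNodegroup API (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so summarize_nodegroup works without boto3

    eks = boto3.client("eks")
    response = eks.describe_nodegroup(
        clusterName=cluster_name, nodegroupName=nodegroup_name
    )
    return response["nodegroup"]
```

With your own cluster and node group names, `print("\n".join(summarize_nodegroup(fetch_nodegroup("<cluster>", "<nodegroup>"))))` would list the values to compare against the checks above. A failed node group may no longer exist once Terraform has cleaned it up, so this is best run while the node group is still in the CREATE_FAILED state.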

If you've checked these areas and are still encountering issues, you may need to review the Terraform configuration files, particularly the eks.tf file mentioned in the error message, to ensure all settings are correct for EKS 1.30. Additionally, you might want to try creating a new node group manually through the AWS console to see if the issue persists outside of Terraform.
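The Terraform error also names the EC2 instances that failed to join, which gives you something concrete to look up (for example in the EC2 console) while they still exist. As a small illustration only, the Python sketch below pulls those instance IDs out of the error text; the function name `failed_instance_ids` is hypothetical, and the pattern assumes the usual 8- or 17-character hexadecimal instance ID format.

```python
import re
from typing import List

# EC2 instance IDs: "i-" followed by 17 (current) or 8 (legacy) hex characters.
INSTANCE_ID = re.compile(r"\bi-(?:[0-9a-f]{17}|[0-9a-f]{8})\b")


def failed_instance_ids(error_text: str) -> List[str]:
    """Return the distinct instance IDs found in an error message, in order of appearance."""
    seen: List[str] = []
    for match in INSTANCE_ID.findall(error_text):
        if match not in seen:
            seen.append(match)
    return seen


error = (
    "last error: i-0cb56d12652d69848, i-0b38365b78c3620e0, "
    "i-0a3c8df9fe6d9179f: NodeCreationFailure: Unhealthy nodes in the kubernetes cluster"
)
print(failed_instance_ids(error))
```

Run against the error from the question, this prints the three instance IDs listed after "last error:".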

Remember to always test changes in a non-production environment before applying them to your production cluster to minimize downtime of your applications.

Sources
AWS EKS configuration issues | AWS re:Post
Troubleshoot problems with Amazon EKS clusters and nodes - Amazon EKS

answered 11 days ago
