Why does my EKS cluster fail to initialize with "Nodegroup... failed to stabilize"?


I launched a cluster by following the "Getting Started" guide (https://docs.aws.amazon.com/eks/latest/userguide/getting-started.html), using the command below. It fails every time at ManagedNodeGroup: CREATE_FAILED – "Nodegroup standard-workers failed to stabilize: Internal Failure".

The same thing happens when I create the cluster with the AWS SDK for Go: https://docs.aws.amazon.com/sdk-for-go/api/service/eks/#EKS.CreateCluster

How can I diagnose and overcome this?

eksctl create cluster \
    --name mycluster \
    --region us-west-2 \
    --nodegroup-name standard-workers \
    --node-type t3.medium \
    --nodes 1 \
    --nodes-min 1 \
    --nodes-max 1 \
    --ssh-access \
    --ssh-public-key joshuakey.pub \
    --managed
[ℹ]  eksctl version 0.11.1
[ℹ]  using region us-west-2
[ℹ]  setting availability zones to [us-west-2a us-west-2d us-west-2c]
[ℹ]  subnets for us-west-2a - public:192.168.0.0/19 private:192.168.96.0/19
[ℹ]  subnets for us-west-2d - public:192.168.32.0/19 private:192.168.128.0/19
[ℹ]  subnets for us-west-2c - public:192.168.64.0/19 private:192.168.160.0/19
[ℹ]  using SSH public key "joshuakey.pub" as "eksctl-mycluster-nodegroup-standard-workers-f5:cd:16:16:31:34:89:4f:10:ea:5b:74:85:e5:9b:2b"
[ℹ]  using Kubernetes version 1.14
[ℹ]  creating EKS cluster "mycluster" in "us-west-2" region with managed nodes
[ℹ]  will create 2 separate CloudFormation stacks for cluster itself and the initial managed nodegroup
[ℹ]  if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=us-west-2 --cluster=mycluster'
[ℹ]  CloudWatch logging will not be enabled for cluster "mycluster" in "us-west-2"
[ℹ]  you can enable it with 'eksctl utils update-cluster-logging --region=us-west-2 --cluster=mycluster'
[ℹ]  Kubernetes API endpoint access will use default of {publicAccess=true, privateAccess=false} for cluster "mycluster" in "us-west-2"
[ℹ]  2 sequential tasks: { create cluster control plane "mycluster", create managed nodegroup "standard-workers" }
[ℹ]  building cluster stack "eksctl-mycluster-cluster"
[ℹ]  deploying stack "eksctl-mycluster-cluster"
[ℹ]  building managed nodegroup stack "eksctl-mycluster-nodegroup-standard-workers"
[ℹ]  deploying stack "eksctl-mycluster-nodegroup-standard-workers"
[✖]  unexpected status "ROLLBACK_IN_PROGRESS" while waiting for CloudFormation stack "eksctl-mycluster-nodegroup-standard-workers"
[ℹ]  fetching stack events in attempt to troubleshoot the root cause of the failure
[✖]  AWS::EKS::Nodegroup/ManagedNodeGroup: CREATE_FAILED – "Nodegroup standard-workers failed to stabilize: Internal Failure"
[ℹ]  1 error(s) occurred and cluster hasn't been created properly, you may wish to check CloudFormation console
[ℹ]  to cleanup resources, run 'eksctl delete cluster --region=us-west-2 --name=mycluster'
[✖]  waiting for CloudFormation stack "eksctl-mycluster-nodegroup-standard-workers": ResourceNotReady: failed waiting for successful resource state
[✖]  failed to create cluster "mycluster"
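
Is there a way to get more detail than "Internal Failure"? For example, would something along these lines surface the underlying problem (just a sketch, assuming the cluster and nodegroup names from the command above, and that the nodegroup still exists by the time it is queried)?

    # ask EKS for the nodegroup's health issues, if the nodegroup was created at all
    aws eks describe-nodegroup \
        --cluster-name mycluster \
        --nodegroup-name standard-workers \
        --region us-west-2 \
        --query "nodegroup.health.issues"

    # list the CREATE_FAILED events of the nodegroup stack that eksctl deployed
    aws cloudformation describe-stack-events \
        --stack-name eksctl-mycluster-nodegroup-standard-workers \
        --region us-west-2 \
        --query "StackEvents[?ResourceStatus=='CREATE_FAILED'].ResourceStatusReason"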

Edited by: JoshuaFox on Apr 28, 2020 4:51 AM

Edited by: JoshuaFox on Apr 28, 2020 7:02 AM

asked 4 years ago · 1553 views
5 answers

Same problem here:

[ℹ]  deploying stack "eksctl-Schulungen-nodegroup-Roboters"
[✖]  unexpected status "ROLLBACK_IN_PROGRESS" while waiting for CloudFormation stack "eksctl-Schulungen-nodegroup-Roboters"
[ℹ]  fetching stack events in attempt to troubleshoot the root cause of the failure
[✖]  AWS::EKS::Nodegroup/ManagedNodeGroup: CREATE_FAILED – "Nodegroup Roboters failed to stabilize: Internal Failure"
[ℹ]  1 error(s) occurred and cluster hasn't been created properly, you may wish to check CloudFormation console
[ℹ]  to cleanup resources, run 'eksctl delete cluster --region=us-east-2 --name=Schulungen'
[✖]  waiting for CloudFormation stack "eksctl-Schulungen-nodegroup-Roboters": ResourceNotReady: failed waiting for successful resource state
Error: failed to create cluster

Any ideas?

klicki
answered 4 years ago

+1 Same problem

JCM1
answered 4 years ago

Found a hint at https://www.talkingquickly.co.uk/2020/04/nodegroup-failed-to-stabilize-internal-failure/

The problem described on that page was addressed in eksctl version 0.17.0. Today I downloaded 0.18.0 and the process worked. Problem fixed!
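
For anyone else hitting this, checking and replacing the binary is roughly the following (a sketch, assuming Linux on amd64 and the release tarball from the weaveworks/eksctl GitHub project; adjust for your platform):

    # show the currently installed version
    eksctl version

    # download the latest release and replace the old binary
    curl --silent --location "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp
    sudo mv /tmp/eksctl /usr/local/bin/

    # confirm the new version before re-running 'eksctl create cluster'
    eksctl version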

Edited by: klicki on May 4, 2020 4:57 AM

klicki
answered 4 years ago

Same issue with the EKS Quick Start template for an existing VPC:
https://raw.githubusercontent.com/aws-quickstart/quickstart-amazon-eks/master/templates/amazon-eks-master-existing-vpc.template.yaml

EKSNodegroup CREATE_FAILED Nodegroup EKSNodegroup-xxxxxxxxxx failed to stabilize: Internal Failure
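
The event carrying the failure reason can also be pulled from the nested nodegroup stack that the Quick Start creates (a sketch; the stack name placeholder is hypothetical and has to be looked up in the CloudFormation console):

    aws cloudformation describe-stack-events \
        --stack-name <nested-nodegroup-stack-name> \
        --query "StackEvents[?ResourceStatus=='CREATE_FAILED'].[LogicalResourceId,ResourceStatusReason]" \
        --output table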

khanas
answered 4 years ago

Thank you @klicki. Upgrading eksctl was all it took.

answered 4 years ago
