Namespace error creating an EKS cluster

I have the IAM role and policy set up per https://eksctl.io/usage/minimum-iam-policies/. When I create the cluster, all the CloudFormation stacks complete with no errors at all, but I am getting the output below on screen. The actual error is at the very bottom.

2023-03-30 11:51:22 [▶]  completed task: create IAM role for serviceaccount "kube-system/aws-node"
2023-03-30 11:51:22 [▶]  started task: create serviceaccount "kube-system/aws-node"
2023-03-30 11:51:22 [ℹ]  waiting for CloudFormation stack "eksctl-tmdev-us1-pipe-prod-addon-iamserviceaccount-kube-system-AS-cluster-autoscaler"
2023-03-30 11:51:52 [▶]  failed task: create serviceaccount "kube-system/aws-node" (will not run other sequential tasks)
2023-03-30 11:51:52 [▶]  failed task:
    2 sequential sub-tasks: {
        create IAM role for serviceaccount "kube-system/aws-node",
        create serviceaccount "kube-system/aws-node",
    }
 (will continue until other parallel tasks are completed)
2023-03-30 11:51:52 [▶]  failed task:
    4 parallel sub-tasks: {
        2 sequential sub-tasks: {
            create IAM role for serviceaccount "kube-system-LB/aws-lb-controller",
            create serviceaccount "kube-system-LB/aws-lb-controller",
        },
        2 sequential sub-tasks: {
            create IAM role for serviceaccount "kube-system-DNS/external-dns",
            create serviceaccount "kube-system-DNS/external-dns",
        },
        create IAM role for serviceaccount "kube-system-AS/cluster-autoscaler",
        2 sequential sub-tasks: {
            create IAM role for serviceaccount "kube-system/aws-node",
            create serviceaccount "kube-system/aws-node",
        },
    }
 (will not run other sequential tasks)
2023-03-30 11:51:52 [▶]  failed task:
    2 sequential sub-tasks: {
        4 sequential sub-tasks: {
            wait for control plane to become ready,
            associate IAM OIDC provider,
            4 parallel sub-tasks: {
                2 sequential sub-tasks: {
                    create IAM role for serviceaccount "kube-system-LB/aws-lb-controller",
                    create serviceaccount "kube-system-LB/aws-lb-controller",
                },
                2 sequential sub-tasks: {
                    create IAM role for serviceaccount "kube-system-DNS/external-dns",
                    create serviceaccount "kube-system-DNS/external-dns",
                },
                create IAM role for serviceaccount "kube-system-AS/cluster-autoscaler",
                2 sequential sub-tasks: {
                    create IAM role for serviceaccount "kube-system/aws-node",
                    create serviceaccount "kube-system/aws-node",
                },
            },
            restart daemonset "kube-system/aws-node",
        },
        create managed nodegroup "jenkins-pipeline-nodegroup",
    }
 (will not run other sequential tasks)
2023-03-30 11:51:52 [!]  1 error(s) occurred and cluster hasn't been created properly, you may wish to check CloudFormation console
2023-03-30 11:51:52 [ℹ]  to cleanup resources, run 'eksctl delete cluster --region=us-east-1 --name=tmdev-us1-pipe-prod'
2023-03-30 11:51:52 [✖]  failed to create service account kube-system/aws-node: checking whether namespace "kube-system" exists: Get "https://XXXXXXXXXXB2A140B1DB492834D6A69A.gr7.us-east-1.eks.amazonaws.com/api/v1/namespaces/kube-system": dial tcp 172.11.111.111:443: i/o timeout
Error: failed to create cluster "tmdev-us1-pipe-prod"

When I go to EKS in the AWS Console and click on this cluster, I do see a strange error at the top. Not sure if it's related or not.

Error loading GenericResourceCollection/namespaces
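
For reference, the IAM service accounts are declared in my cluster config roughly like this (simplified sketch; names taken from the log above, policy attachments and IDs omitted or placeholder):

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: tmdev-us1-pipe-prod
  region: us-east-1
iam:
  withOIDC: true
  serviceAccounts:
    - metadata:
        name: aws-node
        namespace: kube-system
      # attachPolicy / attachPolicyARNs omitted here
    # ...plus the aws-lb-controller, external-dns and cluster-autoscaler entries shown in the log
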
Asked a year ago · 1,431 views

4 Answers

As per the architecture of EKS, the user who created the cluster is the admin of the cluster and has "system:masters" permissions without having to be added to the aws-auth ConfigMap. Looking at the above error, I suspect the user "<USER_NAME>" is not the creator of this cluster, "tmdev-us1-pipe-prod".

To solve the issue, you need to add your user to the aws-auth ConfigMap so that it can access the cluster. You can refer to the doc here [1], or use the command below, which grants "admin"/"system:masters" access to the cluster. Adjust the command based on your needs.

eksctl create iamidentitymapping --cluster <cluster-name> --arn <arn of the role/user> --group system:masters --username <username> --region <region-code>
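
For reference, a successful identity mapping ends up as an entry like the following in the aws-auth ConfigMap in the kube-system namespace (the account ID and username below are placeholders):

apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapUsers: |
    - userarn: arn:aws:iam::111122223333:user/<username>
      username: <username>
      groups:
        - system:masters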

After updating the cluster, don't forget to update your kubeconfig using the command below.

aws eks update-kubeconfig --name <cluster-name> --region <region-code>

Additionally, you can check [2] for more details about Kubernetes role-based access control in the cluster.

References:

[1]. https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html
[2]. https://kubernetes.io/docs/reference/access-authn-authz/rbac/

AWS
Answered a year ago

Let's focus on the following error (the root cause):

failed to create service account kube-system/aws-node: checking whether namespace "kube-system" exists: Get "https://XXXXXXXXXXB2A140B1DB492834D6A69A.gr7.us-east-1.eks.amazonaws.com/api/v1/namespaces/kube-system": dial tcp 172.11.111.111:443: i/o timeout

This suggests the following chain of failures: while trying to create the service account (kube-system/aws-node), eksctl failed to get a response to the HTTP request "https://XXXXXXXXXXB2A140B1DB492834D6A69A.gr7.us-east-1.eks.amazonaws.com/api/v1/namespaces/kube-system" (made in order to check whether the kube-system namespace exists) because the connection to 172.11.111.111:443 timed out. (I'm assuming the IP address has been obfuscated.)

In short, eksctl could not reach the API endpoint for your cluster WHILE it was creating the cluster.

This usually happens when you define your vpc configuration with clusterEndpoints set to private access only, like in the snippet below:

vpc:
  clusterEndpoints:
    publicAccess:  false
    privateAccess: true

However, doing so will cut eksctl's access to the cluster during creation. See [1] - https://eksctl.io/usage/vpc-cluster-access/ :

EKS does allow creating a configuration that allows only private access to be enabled, but eksctl doesn't support it during cluster creation as it prevents eksctl from being able to join the worker nodes to the cluster.

Please confirm you run eksctl from a workstation having continuous access to your cluster endpoint during creation.
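
If you do need the API endpoint locked down, one option (a sketch; adjust the CIDR to the range your eksctl workstation actually uses) is to keep both endpoints enabled during creation and restrict the public one:

vpc:
  clusterEndpoints:
    publicAccess:  true
    privateAccess: true
  publicAccessCIDRs:
    - 203.0.113.0/24    # placeholder: the IP range eksctl runs from

The endpoint configuration can then be tightened after the cluster has been created; see the same doc [1].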

AWS
Support Engineer
Janko
Answered a year ago

So yes, the instance I am on and running eksctl from does have full access to the cluster. I have checked its security group and made sure there is a rule allowing inbound traffic from anything in the VPC CIDR and outbound traffic to 0.0.0.0/0. I also have a security group defined in my cluster creation YAML, and I made sure it has a rule allowing full inbound access from the VPC CIDR and full outbound access.
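
That extra security group is attached in the vpc section of my cluster YAML, roughly like this (IDs are placeholders):

vpc:
  id: vpc-0123456789abcdef0
  securityGroup: sg-0123456789abcdef0    # additional control-plane security group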

Answered a year ago

As a side note: if I am defining the VPC the cluster will be in, and that VPC is a private VPC, does it matter whether I create the cluster endpoint as private or public? It's ultimately locked down by the VPC. Is that not correct?

Answered a year ago
