Namespace error creating an EKS cluster


I have the IAM role and policy set up per https://eksctl.io/usage/minimum-iam-policies/. When I create the cluster, all the CloudFormation stacks complete with no errors at all, but I am getting the output below on screen. The actual error is at the very bottom.

2023-03-30 11:51:22 [▶]  completed task: create IAM role for serviceaccount "kube-system/aws-node"
2023-03-30 11:51:22 [▶]  started task: create serviceaccount "kube-system/aws-node"
2023-03-30 11:51:22 [ℹ]  waiting for CloudFormation stack "eksctl-tmdev-us1-pipe-prod-addon-iamserviceaccount-kube-system-AS-cluster-autoscaler"
2023-03-30 11:51:52 [▶]  failed task: create serviceaccount "kube-system/aws-node" (will not run other sequential tasks)
2023-03-30 11:51:52 [▶]  failed task:
    2 sequential sub-tasks: {
        create IAM role for serviceaccount "kube-system/aws-node",
        create serviceaccount "kube-system/aws-node",
    }
 (will continue until other parallel tasks are completed)
2023-03-30 11:51:52 [▶]  failed task:
    4 parallel sub-tasks: {
        2 sequential sub-tasks: {
            create IAM role for serviceaccount "kube-system-LB/aws-lb-controller",
            create serviceaccount "kube-system-LB/aws-lb-controller",
        },
        2 sequential sub-tasks: {
            create IAM role for serviceaccount "kube-system-DNS/external-dns",
            create serviceaccount "kube-system-DNS/external-dns",
        },
        create IAM role for serviceaccount "kube-system-AS/cluster-autoscaler",
        2 sequential sub-tasks: {
            create IAM role for serviceaccount "kube-system/aws-node",
            create serviceaccount "kube-system/aws-node",
        },
    }
 (will not run other sequential tasks)
2023-03-30 11:51:52 [▶]  failed task:
    2 sequential sub-tasks: {
        4 sequential sub-tasks: {
            wait for control plane to become ready,
            associate IAM OIDC provider,
            4 parallel sub-tasks: {
                2 sequential sub-tasks: {
                    create IAM role for serviceaccount "kube-system-LB/aws-lb-controller",
                    create serviceaccount "kube-system-LB/aws-lb-controller",
                },
                2 sequential sub-tasks: {
                    create IAM role for serviceaccount "kube-system-DNS/external-dns",
                    create serviceaccount "kube-system-DNS/external-dns",
                },
                create IAM role for serviceaccount "kube-system-AS/cluster-autoscaler",
                2 sequential sub-tasks: {
                    create IAM role for serviceaccount "kube-system/aws-node",
                    create serviceaccount "kube-system/aws-node",
                },
            },
            restart daemonset "kube-system/aws-node",
        },
        create managed nodegroup "jenkins-pipeline-nodegroup",
    }
 (will not run other sequential tasks)
2023-03-30 11:51:52 [!]  1 error(s) occurred and cluster hasn't been created properly, you may wish to check CloudFormation console
2023-03-30 11:51:52 [ℹ]  to cleanup resources, run 'eksctl delete cluster --region=us-east-1 --name=tmdev-us1-pipe-prod'
2023-03-30 11:51:52 [✖]  failed to create service account kube-system/aws-node: checking whether namespace "kube-system" exists: Get "https://XXXXXXXXXXB2A140B1DB492834D6A69A.gr7.us-east-1.eks.amazonaws.com/api/v1/namespaces/kube-system": dial tcp 172.11.111.111:443: i/o timeout
Error: failed to create cluster "tmdev-us1-pipe-prod"

When I go to EKS in the AWS Console and click on this cluster, I do see a strange error at the top. Not sure if it's related or not.

Error loading GenericResourceCollection/namespaces
4 Answers

As per the architecture of EKS, the IAM user (or role) that created the cluster is the admin of the cluster and has "system:masters" permissions without having to be added to the aws-auth ConfigMap. Looking at the above error, I suspect the user "<USER_NAME>" is not the creator of this cluster "tmdev-us1-pipe-prod".

To solve the issue, you need to add your user to the aws-auth ConfigMap so that it can access the cluster. You can refer to the doc here [1], or you can use the command below, which grants admin ("system:masters") access to the cluster. Adjust it based on your needs.

eksctl create iamidentitymapping --cluster <cluster-name> --arn <arn of the role/user> --group system:masters --username <username> --region <region-code>
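
For reference, the resulting entry in the aws-auth ConfigMap (kubectl -n kube-system edit configmap aws-auth) would look roughly like the sketch below; the account ID and user name are placeholders:

apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapUsers: |
    - userarn: arn:aws:iam::<account-id>:user/<username>
      username: <username>
      groups:
        # system:masters grants cluster-admin access
        - system:masters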

After updating the cluster, don't forget to update your kubeconfig using the command below.

aws eks update-kubeconfig --name <cluster-name> --region <region-code>
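
Once the kubeconfig is updated, a quick sanity check (assuming kubectl is installed on the same machine) is something like:

# should list the worker nodes if the mapping worked
kubectl get nodes
# should print "yes" for a user in system:masters
kubectl auth can-i '*' '*'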

Additionally, you can check [2] for more details about the cluster's role-based access control.

References:

[1] https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html
[2] https://kubernetes.io/docs/reference/access-authn-authz/rbac/

AWS

Let's focus on the following error (the root cause):

failed to create service account kube-system/aws-node: checking whether namespace "kube-system" exists: Get "https://XXXXXXXXXXB2A140B1DB492834D6A69A.gr7.us-east-1.eks.amazonaws.com/api/v1/namespaces/kube-system": dial tcp 172.11.111.111:443: i/o timeout

This suggests the following chain of events: while trying to create the service account (kube-system/aws-node), eksctl failed to get a response to the HTTP request "https://XXXXXXXXXXB2A140B1DB492834D6A69A.gr7.us-east-1.eks.amazonaws.com/api/v1/namespaces/kube-system" (made to check whether the kube-system namespace exists), because the connection to 172.11.111.111:443 timed out. (I'm assuming the IP address has been obfuscated.)

In short, eksctl could not reach your cluster's API endpoint while it was creating the cluster.

This usually happens when you define your VPC configuration with clusterEndpoints set to private access only, like in the snippet below:

vpc:
  clusterEndpoints:
    publicAccess:  false
    privateAccess: true

However, doing so will cut off eksctl's access to the cluster during creation. See [1] https://eksctl.io/usage/vpc-cluster-access/:

EKS does allow creating a configuration that allows only private access to be enabled, but eksctl doesn't support it during cluster creation as it prevents eksctl from being able to join the worker nodes to the cluster.
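
If you ultimately want a private-only endpoint, one possible approach (a sketch, adjust to your own ClusterConfig and check the eksctl docs for the exact fields and flags) is to create the cluster with both endpoints enabled, optionally restricting the public endpoint to the CIDR you run eksctl from, and then disable public access after creation:

vpc:
  clusterEndpoints:
    publicAccess:  true
    privateAccess: true
  # optional: restrict the public endpoint to your workstation's IP (placeholder below)
  publicAccessCIDRs: ["203.0.113.10/32"]

and afterwards:

eksctl utils update-cluster-endpoints --cluster=tmdev-us1-pipe-prod --region=us-east-1 --private-access=true --public-access=false --approve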

Please confirm that you are running eksctl from a workstation that has continuous access to your cluster endpoint during creation.
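
For example, from the machine running eksctl you could check both the configured endpoint access and raw reachability; the endpoint URL below is a placeholder for your cluster's API server endpoint:

# show whether public/private endpoint access is enabled and which CIDRs are allowed
aws eks describe-cluster --name tmdev-us1-pipe-prod --region us-east-1 \
  --query 'cluster.resourcesVpcConfig.{public:endpointPublicAccess,private:endpointPrivateAccess,cidrs:publicAccessCidrs}'

# any HTTP response (even 401/403) proves network reachability; an i/o timeout means the path is blocked
curl -sk https://<cluster-endpoint>/version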

AWS
SUPPORT ENGINEER
Janko

So yes, the instance I am on and running eksctl from does have full access to the cluster. I have checked its security group and made sure there is a rule allowing inbound traffic to the instance from anything in the VPC CIDR and outbound traffic to 0.0.0.0/0. I also have a security group defined in my cluster creation YAML, and I made sure it has a rule allowing full inbound access from the VPC CIDR and full outbound access.


As a side note: if I am defining the VPC the cluster will be in, and that VPC is private, does it matter whether I create the cluster endpoint as private or public? It's ultimately locked down by the VPC. Is that not correct?

