EKS cannot issue certificate to kubelet after node pool creation

0

Same as the problem here: https://stackoverflow.com/questions/72415947/why-eks-cant-issue-certificate-to-kubelet-after-nodepool-creation. Cluster nodes are running Bottlerocket. They join the cluster and show as running, but "kubectl logs" and "kubectl exec" fail with "remote error: tls: internal error". I am also assuming a role to create the cluster, and using that same role for the cluster itself as well as for the nodes. With "kubectl get csr" all CSRs show as Pending: multiple "csr-??? ?m kubernetes.io/kubelet-serving kubernetes-admin <none> Pending" entries. I have checked that EKS and EC2 are trusted, and that the "aws-auth" ConfigMap has no duplicates and looks correct. I can also get into the instance with SSM, and in a shell run "journalctl -u kubelet", which shows many entries like "http: TLS handshake error from 10.84.26.220:53588: no serving certificate available for the kubelet".

  • please accept the answer if it was helpful

asked 2 years ago · 2.4K views
2 Answers
0
Accepted Answer

Hello,

Greetings for the day!!

From your correspondence I understand that when you create an EKS cluster using an assumed role and then run "kubectl exec ..." or "kubectl logs ...", you are observing errors. You have also mentioned that you see pending CSR objects and would like insight into this. Please correct me if I have misunderstood anything.

As I do not have access to your EKS cluster, I cannot say exactly which configuration or which IAM roles and policies are in place in your case, but I believe I may have figured it out. You can start troubleshooting by looking at the control plane logs and checking the difference between the approved and the unapproved CSRs.

You will notice that the ones that get approved have the fields "responseObject.spec.username" and "user.username" set to "system:node:ip-xx-xx-xx-xx.<region-code>.compute.internal",

whereas this value is different for the ones that are not getting approved. You can use the following CloudWatch Logs Insights query on the control plane logs to filter the results:

fields @timestamp, requestObject.spec.signerName, responseObject.metadata.name, responseObject.spec.username, user.username, responseObject.spec.extra.canonicalArn.0, user.extra.sessionName.0
| filter @message like "CertificateSigningRequest"  
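If control plane logging is not enabled, you can get a similar signal directly from the cluster. This is a sketch that assumes kubectl is already configured against the affected cluster:

```shell
# Show each CSR's signer and the user that requested it.
# For kubelet-serving CSRs, the requestor should look like
# system:node:ip-xx-xx-xx-xx.<region-code>.compute.internal
kubectl get csr -o custom-columns='NAME:.metadata.name,SIGNER:.spec.signerName,REQUESTOR:.spec.username'
```

If the REQUESTOR column shows something like kubernetes-admin or an assumed-role ARN instead of a system:node:... name, you are seeing the same mismatch described below.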

By design EKS does not issue certificates for CSRs with signerName "kubernetes.io/kubelet-serving" unless the CSR was actually requested by a kubelet. EKS's custom signer validates this by checking that the requested SANs for CSRs with signerName kubernetes.io/kubelet-serving match an actual EC2 instance's IPs/DNS names.

If the CSR's identity and requested names do not match a real node, it will never be approved.
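To see which names a pending CSR is actually asking for, you can decode its request field. A sketch, where <csr-name> is one of your pending csr-??? objects and openssl is assumed to be available locally:

```shell
# Extract the base64-encoded PKCS#10 request from the CSR object and
# print its Subject plus the DNS/IP Subject Alternative Names
kubectl get csr <csr-name> -o jsonpath='{.spec.request}' \
  | base64 --decode \
  | openssl req -noout -text \
  | grep -E 'Subject:|DNS:|IP Address:'
```

These SANs are what the EKS signer compares against the EC2 instance's actual IPs and DNS names.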

Please refer to the three scenarios below that we have tested:

Scenario 1: Cluster creator, cluster role, and the role attached to the node group are all the same

The username in this case is kubernetes-admin. "kubectl logs" errors out with "remote error: tls: internal error" and the CSR fails because, as mentioned above, a kubernetes.io/kubelet-serving CSR must match an actual EC2 instance's IPs/DNS names.

Scenario 2: Cluster creator and node role are the same; the cluster role is different

The username is an IAM role, e.g. arn:aws:sts::<account-id>:assumed-role/EC2-user/i-<instance-id>. "kubectl logs" errors out with "remote error: tls: internal error" and the CSR fails, for the same reason as above.

Scenario 3: Cluster creator is different; cluster role and node role are the same

The username is the EC2 instance's IP/DNS name, e.g. system:node:ip-xx-xx-xx-xx.ec2.internal. This works because it satisfies the kubernetes.io/kubelet-serving condition.

Based on the above, it looks like you are in scenario 1 or scenario 2, and this is why you are observing the issue.

As per our documentation (https://docs.aws.amazon.com/eks/latest/userguide/create-node-role.html): "You can't use the same role that is used to create any clusters." Hence, the console will not offer the cluster role when creating a node group, but from the CLI it is possible, as in the scenarios above, to use the same role for both the cluster role and the node role.
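To avoid the trap entirely, create the node group with a dedicated node role. A sketch using the AWS CLI, where the cluster name, node group name, role ARN, and subnet IDs are all placeholders:

```shell
# Hypothetical example: create a managed node group with a dedicated node
# role, distinct from both the cluster role and the cluster-creator identity
aws eks create-nodegroup \
  --cluster-name my-cluster \
  --nodegroup-name my-nodes \
  --node-role arn:aws:iam::111122223333:role/eksNodeRole \
  --subnets subnet-0abc subnet-0def
```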

Have a fantastic day ahead!!

AWS
answered 2 years ago
EXPERT
reviewed 2 years ago
  • Scenarios 1 & 2 above were my issue. My IT department gave me all permissions except for IAM permissions, which has consumed an incredible amount of time going back and forth trying to figure out which ones are needed when learning EKS. We were trying to reuse roles, and the cluster creator role was the same as the control plane role, and maybe the same as the node role. Never would have guessed different roles were required, irrespective of whether they have the correct permissions or not. The three principals are distinct now, and everything works.

0

Approve any pending certificate signing requests (CSRs) if necessary.

kubectl certificate approve <csr-name>
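If there are many pending CSRs, this sketch (assuming kubectl is configured for the cluster) approves every CSR that has no status yet:

```shell
# Approve every CSR that has not yet been approved or denied
kubectl get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' \
  | xargs -r kubectl certificate approve
```

Note that approval alone may not be enough: as the accepted answer explains, the EKS signer will still refuse to issue a kubelet-serving certificate whose identity does not match a real node.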
EXPERT
answered 2 years ago
EXPERT
reviewed 2 years ago
  • This was not my issue. In my case I could approve the cert, but then it would never be issued, and kubectl logs/exec would still not work.
