EKS: ARM64 nodes fail to become ready due to CNI error


Hi folks -

Kubernetes version: 1.21
Platform version: eks.4

I'm trying to get an autoscaling nodegroup using a1.metal instances working. Nodes that are spawned never become ready due to the following error:

container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

The aws-node pod on the node fails with the following error:

Readiness probe failed: {"level":"info","ts":"2021-12-29T04:56:34.384Z","caller":"/usr/local/go/src/runtime/proc.go:225","msg":"timeout: failed to connect service \":50051\" within 5s"}

The kube-proxy pod on the node fails to start with the following log:

standard_init_linux.go:228: exec user process caused: exec format error

This leads me to believe that at least the kube-proxy pod is pulling an x86 image instead of an arm64 one.

I appreciate any assistance; please let me know which other information I can provide.

thanks,

/-Will

  • Port 50051 belongs to the ipamd agent (part of the aws-node daemonset). Could you post the ipamd logs from the following location on the ARM worker node?

    cat /var/log/aws-routed-eni/ipamd.log

  • How did you set up your worker nodes? Is it a Managed Nodegroup, or are you using your own unmanaged worker nodes?

    Are these new nodes attached to an older cluster, or was this a freshly-created cluster?

    How did you install the VPC CNI controller? Did you let EKS do it, or did you update/install it yourself?

    Can you please post the output of kubectl -n kube-system get daemonset aws-node -o yaml ?

asked 2 years ago, 1893 views
1 Answer

Hi Will,

Thanks for your post here.

According to the documentation for Amazon EKS optimized Arm Amazon Linux AMIs [1]:


If your cluster was deployed before August 17, 2020, you must do a one-time upgrade of critical cluster add-on manifests. This is so that Kubernetes can pull the correct image for each hardware architecture in use in your cluster. For more information about updating cluster add-ons, see "To update the Kubernetes version for your Amazon EKS cluster". If you deployed your cluster on or after August 17, 2020, then your coredns, kube-proxy, and Amazon VPC CNI Plugin for Kubernetes add-ons are already multi-architecture capable.

To validate that, please share the output of the following command:

kubectl get ds kube-proxy -n kube-system -o yaml

In the output, check that the node affinity contains the following expression:

    - key: "beta.kubernetes.io/arch"
      operator: In
      values:
        - amd64
        - arm64
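The check described above can also be scripted. The following is a minimal sketch, not an official tool: it assumes the DaemonSet was exported as JSON (for example with `kubectl get ds kube-proxy -n kube-system -o json`) and that the architecture restriction sits under the required node affinity. The function name `allowed_architectures` and the sample manifest are hypothetical, and the function simply collects every architecture named by an `In` expression without evaluating the full affinity logic.

```python
import json

# Assumption: a manifest may use either the legacy or the current architecture label.
ARCH_KEYS = {"beta.kubernetes.io/arch", "kubernetes.io/arch"}


def allowed_architectures(daemonset):
    """Return the architectures named by 'In' expressions in the required node affinity."""
    pod_spec = ((daemonset.get("spec") or {}).get("template") or {}).get("spec") or {}
    node_affinity = (pod_spec.get("affinity") or {}).get("nodeAffinity") or {}
    required = node_affinity.get("requiredDuringSchedulingIgnoredDuringExecution") or {}
    archs = set()
    for term in required.get("nodeSelectorTerms") or []:
        for expr in term.get("matchExpressions") or []:
            if expr.get("key") in ARCH_KEYS and expr.get("operator") == "In":
                archs.update(expr.get("values") or [])
    return archs


# Hypothetical, trimmed-down manifest used only to exercise the function.
SAMPLE_JSON = """
{
  "spec": {"template": {"spec": {"affinity": {"nodeAffinity": {
    "requiredDuringSchedulingIgnoredDuringExecution": {"nodeSelectorTerms": [
      {"matchExpressions": [
        {"key": "beta.kubernetes.io/arch", "operator": "In",
         "values": ["amd64", "arm64"]}
      ]}
    ]}
  }}}}}
}
"""

if __name__ == "__main__":
    archs = allowed_architectures(json.loads(SAMPLE_JSON))
    print("arm64 allowed:", "arm64" in archs)
```

If `arm64` is missing from the returned set, the manifest restricts kube-proxy to x86 nodes. An empty set means no architecture expression was found in the required node affinity at all.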


The same requirement is described in reference document [2], under "Updating the kube-proxy self-managed add-on":

(Optional) If you're using x86 and Arm nodes in the same cluster and your cluster was deployed before August 17, 2020, then edit your kube-proxy manifest to include a node selector for multiple hardware architectures with the following command. This is a one-time operation. After you've added the selector to your manifest, you don't need to add it each time you update. If your cluster was deployed on or after August 17, 2020, then kube-proxy is already multi-architecture capable.

kubectl edit -n kube-system daemonset/kube-proxy

Add the following node selector to the file in the editor and then save the file. This enables Kubernetes to pull the correct hardware image based on the node's hardware architecture.

    - key: "beta.kubernetes.io/arch"
      operator: In
      values:
        - amd64
        - arm64
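For anyone who prefers to prepare the change offline rather than in the interactive editor, here is a rough sketch of the same edit applied to an exported manifest. It is an illustration under assumptions, not the documented procedure: the helper name `with_multi_arch_selector` is hypothetical, the manifest is assumed to be a plain dict loaded from JSON with no null fields, and the expression is placed under the required node affinity. Review the result before applying it to a cluster.

```python
import copy

ARCH_KEY = "beta.kubernetes.io/arch"
WANTED_ARCHS = ["amd64", "arm64"]


def with_multi_arch_selector(daemonset):
    """Return a copy of the manifest whose required node affinity allows amd64 and arm64."""
    patched = copy.deepcopy(daemonset)  # leave the caller's manifest untouched
    pod_spec = (
        patched.setdefault("spec", {})
        .setdefault("template", {})
        .setdefault("spec", {})
    )
    required = (
        pod_spec.setdefault("affinity", {})
        .setdefault("nodeAffinity", {})
        .setdefault("requiredDuringSchedulingIgnoredDuringExecution", {})
    )
    terms = required.setdefault("nodeSelectorTerms", [])
    if not terms:
        terms.append({"matchExpressions": []})
    for term in terms:
        exprs = term.setdefault("matchExpressions", [])
        for expr in exprs:
            if expr.get("key") == ARCH_KEY and expr.get("operator") == "In":
                # Extend an existing architecture expression in place.
                values = expr.setdefault("values", [])
                for arch in WANTED_ARCHS:
                    if arch not in values:
                        values.append(arch)
                break
        else:
            # No architecture expression in this term yet, so add one.
            exprs.append(
                {"key": ARCH_KEY, "operator": "In", "values": list(WANTED_ARCHS)}
            )
    return patched
```

The function returns a new dict, so the original export can be kept for comparison or rollback.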

After reviewing the above information, if you find that kube-proxy needs to be updated, please follow the steps in reference document [2].

If you do not see any issue with the kube-proxy configuration, you can review reference document [3], which covers other possible reasons for a node not joining the cluster, along with troubleshooting steps.

Given the nature of this issue, if you need more assistance and a deeper investigation is required, you can create a case with Premium Support, where we can review the logs and suggest specific troubleshooting steps.

References:

  1. https://docs.aws.amazon.com/eks/latest/userguide/eks-optimized-ami.html#arm-ami
  2. https://docs.aws.amazon.com/eks/latest/userguide/managing-kube-proxy.html#updating-kube-proxy-add-on
  3. https://aws.amazon.com/premiumsupport/knowledge-center/eks-worker-nodes-cluster/
AWS
SUPPORT ENGINEER
Kiran_K
answered 2 years ago
