kube-proxy failing after update to 1.16+
I've recently updated one of our clusters from version 1.15 to 1.16 and then to 1.17. Before this one I updated eight other clusters with no issues whatsoever. However, for some reason, when I update kube-proxy to bring it in line with the new Kubernetes version, the pods fail. The logs are empty, and the termination reason is just "Error", which isn't informative at all.
- Server version: v1.17.17-eks-087e67
- Node version: v1.17.17-eks-ac51f2
- Working kube-proxy version: v1.15.11-eksbuild.1
- kube-proxy versions that cause the problem: v1.16.13-eksbuild.1, v1.17.9-eksbuild.1
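For context, I'm updating kube-proxy the way the EKS docs describe, by swapping the DaemonSet image; roughly this (the tag is whichever version I'm moving to):

kubectl set image daemonset.apps/kube-proxy -n kube-system \
  kube-proxy=602401143452.dkr.ecr.eu-west-1.amazonaws.com/eks/kube-proxy:v1.17.9-eksbuild.1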
At first I thought it could just be that particular version of kube-proxy, so I decided to keep updating. Now I assume it's something else.
Running kubectl logs -f podName doesn't help either; it returns nothing at all.
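In case I'm just looking in the wrong place, these are the commands I've been digging with (--previous included in case the crashed container left anything behind; the pod name matches the events below):

kubectl -n kube-system logs kube-proxy-bh7r4 --previous
kubectl -n kube-system describe pod kube-proxy-bh7r4

The describe output is the only thing that shows any detail: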
State:          Waiting
  Reason:       CrashLoopBackOff
Last State:     Terminated
  Reason:       Error
  Exit Code:    1
  Started:      Mon, 18 Oct 2021 12:32:56 +0200
  Finished:     Mon, 18 Oct 2021 12:32:56 +0200
Type Reason Age From Message
Normal Scheduled 66s default-scheduler Successfully assigned kube-system/kube-proxy-bh7r4 to ip-xxxxxx.eu-west-1.compute.internal
Normal Pulling 65s kubelet, ip-xxxxxx.eu-west-1.compute.internal Pulling image "602401143452.dkr.ecr.eu-west-1.amazonaws.com/eks/kube-proxy:v1.17.9-eksbuild.1"
Normal Pulled 63s kubelet, ip-xxxxxx.eu-west-1.compute.internal Successfully pulled image "602401143452.dkr.ecr.eu-west-1.amazonaws.com/eks/kube-proxy:v1.17.9-eksbuild.1"
Normal Created 23s (x4 over 62s) kubelet, ip-xxxxxx.eu-west-1.compute.internal Created container kube-proxy
Normal Started 23s (x4 over 62s) kubelet, ip-xxxxxx.eu-west-1.compute.internal Started container kube-proxy
Normal Pulled 23s (x3 over 62s) kubelet, ip-xxxxxx.eu-west-1.compute.internal Container image "602401143452.dkr.ecr.eu-west-1.amazonaws.com/eks/kube-proxy:v1.17.9-eksbuild.1" already present on machine
Warning BackOff 8s (x6 over 61s) kubelet, ip-xxxxxx.eu-west-1.compute.internal Back-off restarting failed container
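I also have SSH access to the nodes, so if it would help I can pull the container's stderr straight from the Docker runtime; the container ID below is a placeholder:

docker ps -a | grep kube-proxy
docker logs <container-id>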
Can you please advise? I'm quite confused here. I've compared this with what I did on our other clusters and I took exactly the same steps, so I'm not sure why it isn't working.
Thanks a lot!
EDIT (Oct 18, 2021): Upgrading the cluster to version 1.18 and switching to the EKS managed add-ons solved the issue.
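For anyone who lands here later: switching kube-proxy to the managed add-on looks roughly like the commands below. The cluster name and add-on version are placeholders; list the available versions first and pick one that matches your cluster version.

aws eks describe-addon-versions --addon-name kube-proxy --kubernetes-version 1.18
aws eks create-addon --cluster-name my-cluster --addon-name kube-proxy \
  --addon-version v1.18.8-eksbuild.1 --resolve-conflicts OVERWRITE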