According to [1], [2], and [3], each AWS EC2 instance type has a maximum number of network interfaces and a maximum number of IPv4 addresses per interface. Together these cap the total number of IP addresses, and therefore pods, that an instance of that type can have when you use the AWS VPC CNI.
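As a rough illustration of how those two limits turn into a pod limit, the sketch below applies the formula given in the header of the eni-max-pods.txt file [3]: number of ENIs × (IPv4 addresses per ENI − 1) + 2. The `ENI_LIMITS` table and the `max_pods` helper are hypothetical names for this example, and the per-type values were copied by hand, so treat them as assumptions and check them against [1] before relying on them.

```python
# (max network interfaces, IPv4 addresses per interface) per instance type.
# Assumption: hand-copied illustrative values; verify against the EC2 docs [1].
ENI_LIMITS = {
    "t3a.small": (2, 4),
    "t3.small": (3, 4),
    "t3.medium": (3, 6),
}


def max_pods(instance_type: str) -> int:
    """Pod limit under the AWS VPC CNI, per the formula in eni-max-pods.txt [3]."""
    enis, ips_per_eni = ENI_LIMITS[instance_type]
    # One address on each ENI is the interface's own primary IP, so it is not
    # available to pods. The "+ 2" comes straight from the formula in [3].
    return enis * (ips_per_eni - 1) + 2


if __name__ == "__main__":
    for name in ENI_LIMITS:
        print(f"{name}: {max_pods(name)} pods")
```

With these assumed values, the three types come out at 8, 11, and 17 pods, which is why mixing them in one node group gives nodes with different capacities.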
The third-party article [4] notes that setting --use-max-pods false in bootstrap.sh has occasionally caused Kubernetes to assign more pods to a node than it could support.
You can also use a different CNI plugin instead of the AWS VPC CNI. The AWS VPC CNI is the only one officially supported on AWS, but other plugins will work [5]. Some of the more popular alternatives are flannel [6]; cilium [7], which uses a newer kernel technology called BPF and is designed to address the scaling issues of iptables; calico [8], which is both a networking provider and a network policy engine; and weave-net [9], which the Medium article [4] suggests for its ease of installation. If you would like to run multiple CNIs to better match your workload requirements, cni-genie [10] lets Kubernetes connect to multiple CNI plugins based on each pod's configuration. Changing the CNI is not a simple process, and things can go sideways if you are not careful, so be sure of every step before you perform it on a production cluster.
The limit on pod density per node stems from fundamental limitations of the Kubernetes architecture, as analyzed by the associated SIGs (Special Interest Groups) [11].
Ways to work within such limitations include:
Sizing the node instances appropriately: make sure that nodes are not oversized. No two workloads are the same, so in your case a larger number of smaller nodes may fit better. For a portion of the total workload, you could also use Spot instances to reduce overall cost [12]. When coupled with the Kubernetes Cluster Autoscaler (to dynamically change Auto Scaling group parameters) and the Spot Instance interrupt handler (a daemonset on all nodes with the label lifecycle=EC2Spot), you can gracefully scale node groups that contain Spot instances with zero service interruption. When determining the best instance size, also make sure that nodes have adequate system reservations in addition to the compute your workloads require.
Work is ongoing to increase interface and IP address density per instance. When it is ready, there will be a new CNI supporting it, but the current limits are what we must live with for this implementation. I recommend tracking this GitHub issue for the latest status [13].
[1] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html
[2] https://github.com/aws/amazon-vpc-cni-k8s#eni-allocation
[3] https://github.com/awslabs/amazon-eks-ami/blob/master/files/eni-max-pods.txt
[4] https://medium.com/@swazza85/dealing-with-pod-density-limitations-on-eks-worker-nodes-137a12c8b218
[5] https://kubernetes.io/docs/concepts/cluster-administration/networking
[6] https://github.com/coreos/flannel#flannel
[7] https://github.com/cilium/cilium
[8] https://github.com/projectcalico/calico
[9] https://github.com/weaveworks/weave
[10] https://github.com/huawei-cloudnative/CNI-Genie
[11] https://static.sched.com/hosted_files/kccna18/92/Kubernetes%20Scalability_%20A%20multi-dimensional%20analysis.pdf
[12] https://aws.amazon.com/blogs/compute/run-your-kubernetes-workloads-on-amazon-ec2-spot-instances-with-amazon-eks
[13] https://github.com/aws/containers-roadmap/issues/398
I found the problem. I had put several instance types into a single Auto Scaling group: instance_types = ["t3a.medium", "t3.medium", "t3.small", "t3a.small"].
These types have different maximum numbers of available IP addresses per node. When I separated them into 3 different Auto Scaling groups, each group reported the correct number of available pods:

    eks_managed_node_groups = {
      one = {
        name           = "ng-8-pod"
        instance_types = ["t3a.small"]
      }
      two = {
        name           = "ng-11-pod"
        instance_types = ["t3.small", "t2.small"]
      }
      three = {
        name           = "ng-17-pod"
        instance_types = ["t3.medium", "t3a.medium", "t2.medium"]
      }
    }
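The grouping above can be derived mechanically rather than by hand. The following is a minimal sketch, assuming the ENI and per-interface IPv4 limits in the hypothetical `ENI_LIMITS` table are correct (they were copied by hand and should be verified against the EC2 documentation); `group_by_max_pods` is an illustrative helper, not part of any AWS or Terraform tooling. It buckets instance types by their pod limit, so that each node group only contains types with the same capacity.

```python
from collections import defaultdict

# (max network interfaces, IPv4 addresses per interface) per instance type.
# Assumption: hand-copied illustrative values; verify before use.
ENI_LIMITS = {
    "t3a.small": (2, 4),
    "t3.small": (3, 4),
    "t2.small": (3, 4),
    "t3.medium": (3, 6),
    "t3a.medium": (3, 6),
    "t2.medium": (3, 6),
}


def group_by_max_pods(instance_types):
    """Bucket instance types by pod limit; keys mimic the 'ng-<pods>-pod' names."""
    groups = defaultdict(list)
    for instance_type in instance_types:
        enis, ips_per_eni = ENI_LIMITS[instance_type]
        # Pod limit formula used for the AWS VPC CNI: ENIs * (IPs per ENI - 1) + 2.
        pods = enis * (ips_per_eni - 1) + 2
        groups[pods].append(instance_type)
    return {f"ng-{pods}-pod": sorted(types) for pods, types in sorted(groups.items())}


if __name__ == "__main__":
    mixed = ["t3a.medium", "t3.medium", "t3.small", "t3a.small"]
    for name, types in group_by_max_pods(mixed).items():
        print(name, types)
```

Running it on the originally mixed list shows that the four types fall into three different capacity buckets, which matches the symptom described above.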