The AWS documentation describes two ways to handle node scaling in your EKS cluster:
- Karpenter
- Cluster Autoscaler
https://docs.aws.amazon.com/eks/latest/userguide/autoscaling.html
I would recommend working with Karpenter; node scaling works much better.
Found a solution that works for us. Systems Manager has an automation document called "AWS-UpdateEKSManagedNodeGroup". You can run this, modify the NodeGroupDesiredSize field, and scale up/down during a maintenance window. No need to use Karpenter or CA.
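If this approach fits your setup, the automation can be kicked off from the AWS CLI. A minimal sketch: only `NodeGroupDesiredSize` is named above, so the other parameter names below (`ClusterName`, `NodeGroupName`) are assumptions; list the document's actual parameters first before running it.

```shell
# Check the automation document's real parameter names first
# (the names used below are assumptions, not confirmed):
aws ssm describe-document \
    --name "AWS-UpdateEKSManagedNodeGroup" \
    --query 'Document.Parameters[].Name'

# Sketch: run the automation during a maintenance window to scale down
aws ssm start-automation-execution \
    --document-name "AWS-UpdateEKSManagedNodeGroup" \
    --parameters 'ClusterName=my-cluster,NodeGroupName=my-nodegroup,NodeGroupDesiredSize=0'
```

You can then poll `aws ssm get-automation-execution` with the returned execution ID to watch progress.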
Hello,
The problem with using only Auto Scaling groups to scale nodes up and down for EKS clusters is that the Auto Scaling group is not Kubernetes-aware. For example, you could write a rule that increases the desired count of the Auto Scaling group by 1 whenever the underlying EKS nodes hit 60% of CPU capacity. But suppose you have 3 nodes, each with 4 vCPU, running at 50% CPU utilisation (2 vCPU free out of 4), and a Kubernetes deployment spins up a pod requesting 3 vCPU: that pod will remain in the Pending state forever, because the Kubernetes scheduler has no node with 3 vCPU available. Hence you need a solution that is Kubernetes-aware and scales your nodes up by looking at the "pending" pods in your cluster.

Similar issues arise when scaling down. Since the Auto Scaling group is not Kubernetes-aware, it won't respect, say, Pod Disruption Budgets, so the availability of critical applications can be negatively affected.
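The stuck-pod scenario above is easy to reproduce. A sketch, assuming a cluster of 4-vCPU nodes with roughly 2 vCPU free each (pod name and image are illustrative):

```shell
# A pod requesting more CPU than any single node has free will stay Pending
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: big-cpu-pod
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        cpu: "3"   # 3 vCPU: more than the ~2 vCPU free on each 4-vCPU node
EOF

# The scheduler explains why it cannot place the pod:
kubectl describe pod big-cpu-pod
# The Events section typically shows: FailedScheduling ... Insufficient cpu
```

A Kubernetes-aware autoscaler reacts to exactly this Pending/FailedScheduling signal, which a plain CPU-utilisation alarm on the Auto Scaling group never sees.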
Currently, the two solutions that provide compute capacity based on pending pods are:
- Karpenter
- Cluster Autoscaler (CAS)
CAS works with node groups, an abstract Kubernetes concept backed by AWS Auto Scaling groups. To scale up, it looks at pending pods and increases the desired count of the Auto Scaling group, giving the Kubernetes scheduler capacity to place the pods.
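For CAS to manage an Auto Scaling group, the group has to carry its auto-discovery tags. A minimal sketch (ASG and cluster names are placeholders):

```shell
# Tag the node group's ASG so Cluster Autoscaler can discover it
aws autoscaling create-or-update-tags --tags \
  "ResourceId=my-nodegroup-asg,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=false" \
  "ResourceId=my-nodegroup-asg,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/my-cluster,Value=owned,PropagateAtLaunch=false"

# The cluster-autoscaler deployment then finds the group via:
#   --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
```

EKS managed node groups apply these tags for you; the explicit tagging is mainly needed for self-managed node groups.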
Karpenter works directly with the AWS EC2 Fleet API to spin up nodes based on pending pods, and is more flexible and faster than CAS. With the 0.29.0 release, Karpenter supports Windows workloads as well.
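To give a feel for Karpenter's model: instead of ASGs, you declare a Provisioner describing what kinds of nodes it may launch. A minimal sketch using the `v1alpha5` API that the 0.2x releases (including 0.29) used; it assumes an `AWSNodeTemplate` named `default` already exists, and newer Karpenter releases have since renamed Provisioner to NodePool:

```shell
cat <<'EOF' | kubectl apply -f -
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    # Constrain what Karpenter may launch; it picks instance types itself
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand"]
  limits:
    resources:
      cpu: "100"            # cap total vCPU this Provisioner can create
  providerRef:
    name: default           # assumed pre-existing AWSNodeTemplate
  ttlSecondsAfterEmpty: 30  # remove empty nodes after 30s
EOF
```

When pods go Pending, Karpenter launches right-sized EC2 instances directly, and the `ttlSecondsAfterEmpty` setting handles scale-down of empty nodes.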
Please also have a look at the best practices when using CAS here - https://aws.github.io/aws-eks-best-practices/cluster-autoscaling/
and the best practices when using Karpenter - https://aws.github.io/aws-eks-best-practices/karpenter/
PS: If you are using only Fargate with Kubernetes, you don't need to think about autoscaling your cluster at all, as Fargate provides the underlying capacity (a microVM) for each pod. With Fargate you instead think about autoscaling your pods, using, say, the Horizontal Pod Autoscaler; when your pods scale out, Fargate spins up the capacity automatically.
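The pod-level autoscaling mentioned above can be sketched with a standard `autoscaling/v2` HorizontalPodAutoscaler. The Deployment name is a placeholder, and HPA on CPU requires metrics-server to be installed in the cluster:

```shell
cat <<'EOF' | kubectl apply -f -
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web            # hypothetical Deployment running on Fargate
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # add pods when average CPU exceeds 70%
EOF
```

Each new replica the HPA creates lands on its own Fargate microVM, so no node-level autoscaler is involved.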
Please let me know in case of any queries.
Thanks, Manish
Looking more for an AWS-based solution rather than a third-party one. Also, CA requires Kubernetes v1.3.0; unfortunately our environment is only at 1.23.
If your nodes are in managed node groups, then AWS's supported way of scaling them is the Cluster Autoscaler, as per this doc:
https://docs.aws.amazon.com/eks/latest/userguide/managed-node-groups.html
You can check this doc for Cluster Autoscaler-specific configuration:
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md
If you want to manage autoscaling of the worker nodes yourself, you would be managing the Auto Scaling group directly: you would need to set up dynamic scaling policies, alarms to invoke those policies, metrics for the alarms, and a way to calculate the metrics via Lambda or some other serverless service. I would not recommend this route.
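For completeness, the moving parts above can be reduced somewhat with a target-tracking policy, which creates and manages its CloudWatch alarms for you. A sketch with placeholder names; as noted, this still isn't Kubernetes-aware, so it shares all the pending-pod problems described earlier:

```shell
# Target-tracking: keep the ASG's average CPU around 60%,
# letting Auto Scaling create the alarms itself
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name my-nodegroup-asg \
  --policy-name cpu-target-tracking \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration '{
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ASGAverageCPUUtilization"
    },
    "TargetValue": 60.0
  }'
```

This removes the custom Lambda and hand-built alarms, but it still scales on raw CPU rather than on pending pods.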