EKS node capacity settings with Karpenter

1

Currently we use EKS with managed node groups. In the node group configuration you have to configure the min, max and desired capacity of the nodes.

With the use of the classic Cluster Autoscaler the desired node capacity of the autoscaling group will be updated. Now we switched to the AWS developed autoscaler Karpenter (https://karpenter.sh/). Karpenter has a different handling and does not work with the autoscaling group or node group.

So my question is, with the use of Karpenter, which is the best setting for the EKS min, max and desired capacity? Should we set all these to "0" for example?

asked 2 years ago3444 views
1 Answer
1

Karpenter doesn't actually scale out nodes within the Managed Node Group. While you can use Karpenter alongside managed node groups, Karpenter defines its own set of limits (provisioner.spec.limits.resources). Kube scheduler will schedule to existing nodes in a managed node group if they exist. But when scaling, it will spin up new nodes that are not part of any Autoscaling Group or Managed Node Group. Hopefully that helps!

AWS
Jeff_W
answered 2 years ago
  • +1

    Karpenter by design is different than cluster autoscaler. It aims at rapid scale-out for unscheduled pods and also quick scale-in (ttl based) when nodes are empty. The focus is now back on making compute available on-demand and fast.

  • Ideally, Karpenter should work in harmony with existing node groups (managed or not) by manipulating ASGs directly rather than creating unmanaged instances. Perhaps an alternate Provisioner could be developed to do this. It would be less complex than the existing Provisioner since there would be no need for subnetSelector, securityGroupSelector, InstanceProfile, etc. - all can be determined directly from the existing ASG and LaunchTemplate.

  • @kennyk It's actually not quite so easy to work with existing node groups because Karpenter would essentially need to duplicate the logic that Cluster Autoscaler already has to determine which group to scale up and all the additional logic around monitoring that group (which does essentially what you described). There are notable performance improvements in a group-less model, Justin does a better job than I of explaining it https://www.youtube.com/watch?v=3QsVRHVdOnM

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions