EKS scheduler bin-packing


Hello, we run batch jobs as pods on EKS. After a scale-up, workloads with low resource requests end up spread across a large number of nodes. Running jobs can't be migrated to another node, so the autoscaler ignores them, which prevents scale-down.

It might help to bin-pack these job pods across the nodes, similar to what is described in https://alibaba-cloud.medium.com/the-burgeoning-kubernetes-scheduling-system-part-3-binpack-scheduling-that-supports-batch-jobs-372b4704722 or https://kubernetes.io/docs/concepts/scheduling-eviction/resource-bin-packing/#enabling-bin-packing-using-requestedtocapacityratio. Is it possible to enable a bin-packing scheduler on EKS? If not, which approach would you recommend for this situation? Thanks, Martin

1 Answer

You might want to try using https://karpenter.sh/ for autoscaling and see if it improves your utilization. It can also consolidate pods onto fewer nodes, but that might not always be appropriate for running jobs.

AWS
dov
answered a year ago
  • We are already using Karpenter, and it does not solve the problem; it makes it worse. During scale-up, Karpenter creates much bigger nodes than Cluster Autoscaler does, and those nodes are far more underutilized once the load goes away.

    To fix the problem, we need to be able to change the scheduling policy of the Kubernetes scheduler so that it uses its bin-packing capability. New pods would then no longer be spread across all the nearly empty nodes; they would be packed onto just a few of them, leaving some nodes empty so that Karpenter can remove them.
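
    To illustrate the difference described above, here is a minimal sketch of the two scoring ideas, assuming a heavily simplified model: CPU requests only, no weights, no other plugins. The names `Node`, `most_allocated_score`, `least_allocated_score`, and `schedule` are made up for this example and are not the kube-scheduler API. The real scheduler's `NodeResourcesFit` plugin offers `LeastAllocated`, `MostAllocated`, and `RequestedToCapacityRatio` scoring strategies, which this only approximates.

    ```python
    from dataclasses import dataclass


    @dataclass
    class Node:
        name: str
        capacity_cpu: float   # allocatable CPU cores (hypothetical values)
        requested_cpu: float  # sum of CPU requests already placed on the node


    def most_allocated_score(node, pod_cpu):
        """Bin-packing style: fuller nodes score higher. None if the pod does not fit."""
        after = node.requested_cpu + pod_cpu
        if after > node.capacity_cpu:
            return None
        return 100 * after / node.capacity_cpu


    def least_allocated_score(node, pod_cpu):
        """Spreading style: emptier nodes score higher. None if the pod does not fit."""
        after = node.requested_cpu + pod_cpu
        if after > node.capacity_cpu:
            return None
        return 100 * (node.capacity_cpu - after) / node.capacity_cpu


    def schedule(nodes, pod_cpus, score_fn):
        """Place each pod on the highest-scoring node that can hold it."""
        for cpu in pod_cpus:
            scored = [(score_fn(n, cpu), n) for n in nodes]
            feasible = [(s, n) for s, n in scored if s is not None]
            if not feasible:
                raise RuntimeError("no node can fit the pod")
            # max() keeps the first node on ties, so the result is deterministic
            best = max(feasible, key=lambda pair: pair[0])[1]
            best.requested_cpu += cpu
        return nodes


    def fresh_nodes():
        # Four empty 4-core nodes, as might remain after a scale-up
        return [Node(f"node-{i}", 4.0, 0.0) for i in range(4)]


    if __name__ == "__main__":
        pods = [0.5] * 8  # eight small job pods
        for label, fn in [("most allocated", most_allocated_score),
                          ("least allocated", least_allocated_score)]:
            result = schedule(fresh_nodes(), pods, fn)
            empty = [n.name for n in result if n.requested_cpu == 0]
            print(label, "-> empty nodes:", empty)
    ```

    In this toy model the bin-packing score fills one node and leaves the other three empty, while the spreading score puts pods on every node, which is the situation that blocks scale-down.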
