How does the EMR on EKS deployment model support EMR auto-scaling/managed scaling?


EMR release 5.30 introduced managed scaling, which automatically scales the cluster based on the Spark/Hive/etc. workload. How does managed scaling work when EMR is deployed on EKS? Or, how is the same capability supported when an EMR virtual cluster (running on an EKS cluster) needs more compute/worker resources to process a larger workload within the same SLA?

asked 3 years ago · 855 views
1 Answer
Accepted Answer

Node scaling is handled entirely by the EKS cluster, not by EMR. EKS uses the Kubernetes Cluster Autoscaler to scale a specific node group out or in. EMR asks the Kubernetes scheduler on EKS to schedule Pods. For each job that you run, EMR on EKS creates a container. The container includes an Amazon Linux 2 base image with security updates, Apache Spark and the dependencies needed to run Spark, and your application-specific dependencies. Each job runs in a Pod. The Pod pulls this container and starts executing it. The Pod terminates after the job terminates.
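To make the division of labor concrete, here is a minimal sketch of a job request whose Spark settings determine how many executor Pods get requested. It only builds the request dictionary. The function name, the IDs, the role ARN, the S3 path, the job name, and the release label are all placeholders I made up for illustration, so swap in your own values.

```python
def build_job_run_request(virtual_cluster_id, execution_role_arn, entry_point,
                          executor_instances, executor_cores, executor_memory_gb,
                          release_label="emr-6.9.0-latest"):
    """Build a request dict shaped like the EMR on EKS StartJobRun input.

    The executor settings decide how many Pods Spark asks Kubernetes for.
    If the node group cannot fit them, the Pods stay pending and the
    autoscaler on the EKS side (not EMR) is what adds nodes.
    """
    spark_params = " ".join([
        f"--conf spark.executor.instances={executor_instances}",
        f"--conf spark.executor.cores={executor_cores}",
        f"--conf spark.executor.memory={executor_memory_gb}G",
    ])
    return {
        "name": "sample-job",  # placeholder job name
        "virtualClusterId": virtual_cluster_id,
        "executionRoleArn": execution_role_arn,
        "releaseLabel": release_label,  # assumed label, use the one you run
        "jobDriver": {
            "sparkSubmitJobDriver": {
                "entryPoint": entry_point,
                "sparkSubmitParameters": spark_params,
            }
        },
    }


# All identifiers below are hypothetical placeholders.
request = build_job_run_request(
    virtual_cluster_id="vc-placeholder",
    execution_role_arn="arn:aws:iam::111122223333:role/placeholder-role",
    entry_point="s3://placeholder-bucket/job.py",
    executor_instances=10,
    executor_cores=4,
    executor_memory_gb=16,
)
print(request["jobDriver"]["sparkSubmitJobDriver"]["sparkSubmitParameters"])
```

With boto3 installed and credentials configured, a dictionary like this could be passed as keyword arguments to the `start_job_run` call of the `emr-containers` client. Nothing in the request itself asks for nodes; it only asks for Pods.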

For processing larger workloads, EMR on EKS supports Multi-AZ jobs (which EMR on EC2 does not). A Spark job can therefore launch executor containers on nodes spanning multiple AZs. To enable the Kubernetes Cluster Autoscaler on the EKS cluster, follow the instructions in this document.
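For capacity planning, a rough back-of-the-envelope estimate of how many worker nodes a job's executor Pods would need can be sketched as follows. This is not the Cluster Autoscaler's actual algorithm, and it ignores capacity reserved for system daemons and the driver Pod. The function name and the example node sizes are assumptions for illustration.

```python
import math


def nodes_needed(executor_pods, pod_cpu, pod_memory_gib, node_cpu, node_memory_gib):
    """Estimate nodes required to fit identical executor Pods on identical nodes."""
    # A node fits as many Pods as its scarcer resource (CPU or memory) allows.
    pods_per_node = min(node_cpu // pod_cpu, node_memory_gib // pod_memory_gib)
    if pods_per_node < 1:
        raise ValueError("a single Pod does not fit on one node")
    return math.ceil(executor_pods / pods_per_node)


# 20 executors of 4 vCPU / 16 GiB on hypothetical 16 vCPU / 64 GiB nodes.
print(nodes_needed(20, 4, 16, 16, 64))  # prints 5
```

If the node group's maximum size is below this estimate, some executor Pods would stay pending no matter how the autoscaler is configured, so the node group limits are worth checking against the SLA.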

answered 3 years ago
