How to set Spark configuration parameters in PySparkProcessor() in a SageMaker Processing job?


Hi folks, I'm trying to set the Spark executor instances and memory, the driver memory, and to switch off dynamic allocation. What is the correct way to do this?

1 Answer

Hi! You can achieve this by passing a "configuration" list of dictionaries to the PySparkProcessor's run() call. Have a look at the linked example to see exactly how to do this: https://sagemaker.readthedocs.io/en/stable/amazon_sagemaker_processing.html#configuration-override
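For example, a minimal sketch (the role ARN, job name, instance types, and the preprocess.py entry point are placeholders, not values from this thread):

# Sketch: pass Spark properties via the "configuration" argument of run().
from sagemaker.spark.processing import PySparkProcessor

spark_processor = PySparkProcessor(
    base_job_name="spark-preprocessor",        # placeholder job name
    framework_version="3.1",                   # pick the Spark version you need
    role="arn:aws:iam::123456789012:role/MySageMakerRole",  # placeholder role
    instance_count=2,
    instance_type="ml.m5.xlarge",
)

configuration = [
    {
        "Classification": "spark-defaults",
        "Properties": {
            "spark.executor.memory": "45g",
            "spark.executor.instances": "45",
            "spark.executor.cores": "6",
            "spark.driver.memory": "30g",
            "spark.dynamicAllocation.enabled": "false",
        },
    }
]

spark_processor.run(
    submit_app="preprocess.py",                # placeholder entry-point script
    configuration=configuration,
)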

Happy coding!

AWS
answered 2 years ago
EXPERT
Tasio
reviewed 2 years ago
  • Hi! Thanks for the prompt response. I tried the approach above; here is how my configuration looks: configuration = [{"Classification": "spark-defaults", "Properties": {"spark.executor.memory": "45g", "spark.executor.instance": "45", "spark.executor.cores": "6", "spark.driver.memory": "30g", "spark.dynamicAllocation.enabled": "false"}}], but I couldn't update the executor instances, i.e., spark.executor.instances. Also, to confirm: is passing values via "spark-defaults" equivalent to passing "--conf" to an EMR spark-submit job?
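For reference, a formatted version of that configuration as a Python literal (a sketch only): note that the standard Spark property name is spark.executor.instances with a trailing "s", whereas the snippet in the comment uses spark.executor.instance, which would not affect the executor count.

# Sketch: same properties as in the comment, with spark.executor.instances spelled in full.
configuration = [
    {
        "Classification": "spark-defaults",
        "Properties": {
            "spark.executor.memory": "45g",
            "spark.executor.instances": "45",   # note the trailing "s"
            "spark.executor.cores": "6",
            "spark.driver.memory": "30g",
            "spark.dynamicAllocation.enabled": "false",
        },
    }
]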


