How to set Spark configuration parameters in PySparkProcessor() in a SageMaker processing job?


Hi folks, I'm trying to set the Spark executor instances and memory, the driver memory, and to switch off dynamic allocation. What is the correct way to do this?

1 Answer

Hi! You can achieve this by passing a "configuration" dictionary to the PySparkProcessor. Have a look at the example in the documentation to see exactly how to achieve this: https://sagemaker.readthedocs.io/en/stable/amazon_sagemaker_processing.html#configuration-override
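For reference, here is a minimal sketch of that pattern. The role ARN, instance sizes, Spark version, and script name below are placeholders, not values from your setup:

```python
from sagemaker.spark.processing import PySparkProcessor

# Placeholder processor settings -- adjust role, instances, and framework version.
spark_processor = PySparkProcessor(
    base_job_name="spark-preprocess",
    framework_version="3.1",
    role="arn:aws:iam::111122223333:role/SageMakerProcessingRole",  # placeholder
    instance_count=15,
    instance_type="ml.m5.4xlarge",
)

# Spark properties are supplied as EMR-style classification blocks.
configuration = [
    {
        "Classification": "spark-defaults",
        "Properties": {
            "spark.executor.memory": "45g",
            "spark.executor.instances": "45",
            "spark.executor.cores": "6",
            "spark.driver.memory": "30g",
            "spark.dynamicAllocation.enabled": "false",
        },
    }
]

# The configuration is passed to run(), not to the constructor.
spark_processor.run(
    submit_app="preprocess.py",   # your PySpark entry-point script
    configuration=configuration,
)
```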

happy coding

Tasio, AWS Expert — answered 2 years ago
  • Hi! Thanks for the prompt response. I tried the approach above and here is how my configuration looks: configuration = [{ "Classification": "spark-defaults", "Properties": {"spark.executor.memory": "45g", "spark.executor.instance": "45", "spark.executor.cores": "6", "spark.driver.memory": "30g", "spark.dynamicAllocation.enabled": "false"} }], but I couldn't update the executor instances, i.e., spark.executor.instances. Also, to confirm: is passing values via "spark-defaults" equivalent to "--conf" on an EMR spark-submit job?
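For readability, the configuration from the comment above laid out as Python, kept exactly as posted; one property key is flagged below as a possible cause of the issue:

```python
# The configuration posted in the comment above, reformatted for readability.
configuration = [
    {
        "Classification": "spark-defaults",
        "Properties": {
            "spark.executor.memory": "45g",
            # Note: the standard Spark property is "spark.executor.instances"
            # (plural). The key below, as posted, is missing the trailing "s",
            # which may be why the executor count was not applied.
            "spark.executor.instance": "45",
            "spark.executor.cores": "6",
            "spark.driver.memory": "30g",
            "spark.dynamicAllocation.enabled": "false",
        },
    }
]
```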
