How to set Spark configuration parameters in PySparkProcessor() in a SageMaker Processing job?


Hi folks, I'm trying to set the Spark executor instances and memory and the driver memory, and to switch off dynamic allocation. What is the correct way to do it?

1 Answer

Hi! You can do this by passing a "configuration" argument (a list of EMR-style classification dictionaries such as "spark-defaults") to the run() method of the PySparkProcessor. The example in the SageMaker documentation shows exactly how: https://sagemaker.readthedocs.io/en/stable/amazon_sagemaker_processing.html#configuration-override
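
A minimal sketch of what that could look like, assuming the SageMaker Python SDK is installed. The helper names (build_spark_configuration, run_spark_job), the resource values, the job name, the role placeholder, the framework version, and the script name preprocess.py are all illustrative assumptions, not values from this thread:

```python
def build_spark_configuration(executor_instances, executor_memory,
                              executor_cores, driver_memory):
    """Return an EMR-style configuration list with one spark-defaults entry."""
    return [
        {
            "Classification": "spark-defaults",
            "Properties": {
                # Property values are passed as strings.
                "spark.executor.instances": str(executor_instances),
                "spark.executor.memory": executor_memory,
                "spark.executor.cores": str(executor_cores),
                "spark.driver.memory": driver_memory,
                "spark.dynamicAllocation.enabled": "false",
            },
        }
    ]


def run_spark_job(configuration):
    """Submit a processing job with the given configuration (needs AWS access)."""
    # Imported lazily so the helper above works without the SDK installed.
    from sagemaker.spark.processing import PySparkProcessor

    processor = PySparkProcessor(
        base_job_name="spark-config-demo",           # hypothetical job name
        framework_version="3.1",                     # assumption: pick your version
        role="<your-sagemaker-execution-role-arn>",  # placeholder, fill in
        instance_count=2,                            # illustrative sizing
        instance_type="ml.m5.xlarge",                # illustrative sizing
    )
    processor.run(
        submit_app="preprocess.py",    # hypothetical PySpark script
        configuration=configuration,   # the override is passed to run()
    )


# Illustrative values only; size them to the instance type you choose.
configuration = build_spark_configuration(4, "4g", 2, "4g")
print(configuration)
```

Calling run_spark_job(configuration) would then submit the job; it is left uncalled here because it needs AWS credentials and a real execution role.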

Happy coding!

AWS
answered 2 years ago
AWS
EXPERT
Tasio
reviewed 2 years ago
  • Hi! Thanks for the prompt response. I tried the approach above, and here is how my configuration looks: configuration = [{ "Classification": "spark-defaults", "Properties": {"spark.executor.memory": "45g", "spark.executor.instance": "45", "spark.executor.cores": "6", "spark.driver.memory": "30g", "spark.dynamicAllocation.enabled": "false"} }] but I couldn't update the executor instances, i.e., spark.executor.instances. To confirm: is passing values via "spark-defaults" equivalent to "--conf" on an EMR spark-submit job?
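
One thing that may be worth checking in the configuration as posted: the key reads spark.executor.instance (singular), while the documented Spark property is spark.executor.instances. Spark generally does not reject property names it does not recognize, so a misspelled key can go unnoticed. The sketch below is one possible way to flag near-miss keys before submitting a job; find_suspect_keys and EXPECTED_KEYS are hypothetical names introduced here, not part of the SageMaker SDK:

```python
import difflib

# Property names the question sets out to configure.
EXPECTED_KEYS = [
    "spark.executor.instances",
    "spark.executor.memory",
    "spark.executor.cores",
    "spark.driver.memory",
    "spark.dynamicAllocation.enabled",
]


def find_suspect_keys(configuration, expected=EXPECTED_KEYS):
    """Map each unexpected property key to its closest expected name, if any."""
    suspects = {}
    for entry in configuration:
        for key in entry.get("Properties", {}):
            if key not in expected:
                close = difflib.get_close_matches(key, expected, n=1, cutoff=0.8)
                suspects[key] = close[0] if close else None
    return suspects


# The configuration exactly as posted in the comment above.
posted = [
    {
        "Classification": "spark-defaults",
        "Properties": {
            "spark.executor.memory": "45g",
            "spark.executor.instance": "45",
            "spark.executor.cores": "6",
            "spark.driver.memory": "30g",
            "spark.dynamicAllocation.enabled": "false",
        },
    }
]

suspects = find_suspect_keys(posted)
print(suspects)
```

If the misspelled key turns out to be the cause, renaming it to spark.executor.instances in the "Properties" dictionary would be the first thing to try.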
