How can I modify the Spark configuration in an Amazon EMR notebook?
An Amazon EMR notebook is a serverless Jupyter notebook. The notebook uses the Sparkmagic kernel as a client to work interactively with Spark on a remote EMR cluster through an Apache Livy server. You can use Sparkmagic commands to customize the Spark configuration. A custom configuration is useful when you want to do the following:
Change executor memory and executor cores for a Spark job
Change resource allocation for Spark
Modify the current session
1. In a Jupyter notebook cell, run the %%configure magic command to modify the job configuration. In the following example, the command changes the executor memory for the Spark job.
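A cell for this step might look like the following sketch. The `-f` flag tells Sparkmagic to restart the Livy session so the new settings take effect; the `4G` value is an illustrative assumption, not a recommended setting.

```
%%configure -f
{"executorMemory": "4G"}
```

Note that restarting the session discards any variables defined in earlier cells, so run %%configure before other Spark code in the notebook.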
2. For additional configurations that you usually pass with the --conf option, use a nested JSON object, as shown in the following example. Use this method instead of explicitly passing a conf object to a SparkContext or SparkSession.
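A nested JSON object for this step could look like the following sketch. The specific Spark properties shown (`spark.dynamicAllocation.enabled` and `spark.executor.instances`) are illustrative assumptions; any property you would normally pass with `--conf` goes inside the `"conf"` object.

```
%%configure -f
{
    "conf": {
        "spark.dynamicAllocation.enabled": "false",
        "spark.executor.instances": "4"
    }
}
```

Passing properties this way lets Livy apply them when it creates the Spark session, which is why it works where setting them on an already-created SparkContext or SparkSession would not.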