Hello Andrea,
In AWS Glue Notebooks, you can define custom job parameters and make them overridable when the notebook runs. The parameters are read with the getResolvedOptions function from awsglue.utils (it is a standalone function, not a method of the glueContext object). It has no built-in support for default values, so the defaults have to be handled in your own code. Here's a step-by-step guide:
-
Import AWS Glue Libraries:
At the beginning of your AWS Glue Notebook, import the necessary libraries and create a glueContext object. The context is your entry point to Glue's Spark features; the job parameters themselves are read separately with getResolvedOptions.

    import sys
    from awsglue.context import GlueContext
    from pyspark.context import SparkContext

    # Reuse the session's SparkContext if one already exists
    sc = SparkContext.getOrCreate()
    glueContext = GlueContext(sc)
-
Define Custom Job Parameters:
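Below the context setup, read your custom job parameters using getResolvedOptions. The function raises GlueArgumentError if a requested parameter is missing from sys.argv, so to support default values, request only the parameters that were actually passed and fall back to defaults for the rest. Here's an example:

    from awsglue.utils import getResolvedOptions

    # Default values for the custom parameters
    defaults = {'myParam1': 'default_value1', 'myParam2': 'default_value2'}
    # Request only the parameters present in sys.argv; missing ones raise an error
    passed = [name for name in defaults if f'--{name}' in sys.argv]
    args = {**defaults, **getResolvedOptions(sys.argv, passed)}
    myParam1 = args['myParam1']
    myParam2 = args['myParam2']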
-
Access and Use the Parameters:
Now that you have defined your custom parameters, you can access and use them in your notebook as needed. For example:
    print(f"Custom Parameter 1: {myParam1}")
    print(f"Custom Parameter 2: {myParam2}")
-
Running the Notebook:
When you run the notebook as a job, you can override these custom parameters by passing job parameters, which reach the script as command-line arguments in sys.argv. For example:
--myParam1 my_value1 --myParam2 my_value2
These values will override the default values you defined in the notebook.
-
Using %%configure Magic (Optional):
While %%configure is typically used for configuring the Spark environment in AWS Glue Notebooks, it can also be used to set custom parameters if needed. Here's an example:

    %%configure -f
    {
        "myParam1": "custom_value1",
        "myParam2": "custom_value2"
    }

This will set the parameters to the specified values for the duration of the notebook's session.
By following these steps, you can define custom job parameters in AWS Glue Notebooks, set default values, and make them overridable when running the notebook by passing command-line arguments or using the %%configure magic command.
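If you start the job programmatically, the overrides can be passed the same way. The following is only a sketch, assuming the job is started through boto3's Glue client with `start_job_run`; the helper names `to_job_arguments` and `start_job_with_overrides` are made up for illustration and are not part of any AWS library.

```python
def to_job_arguments(params):
    """Prefix each parameter name with '--', the form Glue job arguments use."""
    return {f"--{name}": str(value) for name, value in params.items()}


def start_job_with_overrides(job_name, params):
    """Hypothetical helper: start a Glue job run with overridden parameters."""
    # boto3 is imported lazily so to_job_arguments works without it installed.
    import boto3

    glue = boto3.client("glue")
    response = glue.start_job_run(
        JobName=job_name,
        Arguments=to_job_arguments(params),
    )
    return response["JobRunId"]


# Build the argument dict only; no AWS call is made here.
overrides = to_job_arguments({"myParam1": "my_value1", "myParam2": "my_value2"})
print(overrides)
```

Calling `start_job_with_overrides` needs AWS credentials and an existing job name, so it is left uncalled here.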
Please give me a thumbs up if my suggestion helps.
A question. What does -f mean after the %%configure magic?
I don't think it does anything, since the Glue %%configure magic doesn't have any flags defined. I think people use it out of habit because on other kernels it does have an effect (in sparkmagic, for example, -f forces the session to be dropped and recreated).
Calling getResolvedOptions(sys.argv, ['myParam1', 'myParam2']) in a notebook without passing those arguments raises an exception:
GlueArgumentError: the following arguments are required: --myParam1, --myParam2
I guess the issue is that the arguments must be defined somewhere.
Notice that the arguments are taken from sys.argv. When you run an interactive session, the Python engine you use is local and forwards the commands to the cluster; in a job, the Python interpreter runs inside the cluster. What I would do is check sys.argv for the arguments and, when they are absent, build the args variable some other way, so you can test in a notebook.
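One way to do that check could look like the sketch below. It uses plain `argparse` in place of `getResolvedOptions` so that it runs outside Glue as well; the function name `resolve_with_defaults` and the sample argument lists are assumptions made up for illustration.

```python
import argparse


def resolve_with_defaults(argv, defaults):
    """Return parameters found in argv, using defaults for any that are absent."""
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    for name, default in defaults.items():
        parser.add_argument(f"--{name}", default=default)
    # parse_known_args skips flags this parser does not know about,
    # such as arguments added by the kernel or by Glue itself.
    known, _unknown = parser.parse_known_args(argv[1:])
    return vars(known)


defaults = {"myParam1": "default_value1", "myParam2": "default_value2"}

# Notebook-like case: no custom parameters on the command line.
notebook_args = resolve_with_defaults(["kernel"], defaults)

# Job-like case: one parameter overridden, plus an unrelated flag.
job_args = resolve_with_defaults(
    ["script.py", "--myParam1", "my_value1", "--JOB_NAME", "demo"], defaults
)

print(notebook_args)
print(job_args)
```

In a real script you would pass `sys.argv` as the first argument; fixed lists are used here only so the two cases can be compared side by side.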