I understand you're encountering issues with setting up an AWS Glue 5.0 job that requires an additional Python package (PyPDF2). The "Command failed with exit code 10" error occurring before your script runs suggests there might be a problem with the job configuration or the way the additional Python modules are being installed.
Let's go through a few things to check and try:
1. Ensure your requirements.txt file is correctly formatted and contains only the package you need: PyPDF2==version_number
2. Double-check that the S3 path to your requirements.txt file is correct and that the Glue job has permissions to access it.
3. Instead of using the requirements.txt file, you could try specifying the package directly in the --additional-python-modules parameter:
   Key: --additional-python-modules
   Value: PyPDF2==version_number
4. Verify that you're using the correct job parameters for AWS Glue 5.0. The parameters you've provided look correct, but ensure they're exactly as follows:
   Key: --python-modules-installer-option
   Value: -r
   Key: --additional-python-modules
   Value: s3://aws-glue-config-bucket/requirements.txt
5. Check the CloudWatch logs for the Glue job. Even though your script isn't running, there might be error messages in the logs that could provide more insight into why the command is failing.
6. Ensure you've selected the correct Glue version (5.0) when setting up your job.
7. Try running a Glue job without any additional Python modules to ensure the basic configuration is working correctly.
8. If possible, test the job using the AWS CLI or AWS SDK to rule out any potential issues with the console interface.
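The two parameter sets mentioned in the list (package named directly, or a requirements file), together with the suggestion to bypass the console, could be tried from the SDK roughly as follows. This is a minimal sketch, assuming boto3 is installed and AWS credentials are configured; the helper names are made up for illustration, and the job name and package version are left as arguments because the thread does not state them.

```python
from typing import Dict


def direct_module_args(package: str, version: str) -> Dict[str, str]:
    """Name the package directly, without a requirements file."""
    return {"--additional-python-modules": f"{package}=={version}"}


def requirements_file_args(requirements_s3_uri: str) -> Dict[str, str]:
    """Requirements-file parameters, exactly as listed in the answer."""
    return {
        "--python-modules-installer-option": "-r",
        "--additional-python-modules": requirements_s3_uri,
    }


def start_job(job_name: str, arguments: Dict[str, str]) -> str:
    """Start an existing Glue job through the SDK instead of the console."""
    import boto3  # imported lazily so the helpers above work without boto3

    glue = boto3.client("glue")
    # Arguments passed here override the job's default arguments for this run.
    response = glue.start_job_run(JobName=job_name, Arguments=arguments)
    return response["JobRunId"]
```

Calling `start_job(<your job name>, direct_module_args("PyPDF2", <version>))` and then the same job with `requirements_file_args(...)` would show whether only the requirements-file variant fails, independently of the console.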
If none of these suggestions resolve the issue, you may need to contact AWS support for further assistance, as there could be an underlying problem with the Glue environment or configuration that isn't immediately apparent.
Sources
Introducing AWS Glue 5.0 for Apache Spark | AWS Big Data Blog
Using Python libraries with AWS Glue - AWS Glue
Using job parameters in AWS Glue jobs - AWS Glue
I've done all the checks from 1 to 7, and if the console is not working I will probably consider another service provider before trying check 8.

With Glue 4 I get a clearer log message, but I still can't see how to fix this, since I'm already using the "-r" flag:
Inspecting the generated commands, it seems the --user flag is injected by AWS Glue in the wrong place (i.e. right after "-r"), which means the requirements file path pip uses is "--user" instead of "/tmp/requirements.txt".
Here is the matching log output with Glue 5:
PythonModuleInstaller: pip3.11 install -r --user https://<fully-public-url>/requirements.txt
PythonModuleInstaller: ERROR: Could not open requirements file: [Errno 2] No such file or directory: '--user'
I confirm the previous statement through this test:
It also now works with:
key: --python-modules-installer-option
value: -r s3://<private-bucket>/requirements.txt
With that value, --user is no longer inserted in the wrong place. This really needs a fix, since following the documentation is literally the way to reproduce the problem (which is why I'm not posting this as an answer: it's just a workaround).
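To make the failure mode concrete, here is a small simulation of how the pip command appears to be assembled. It is only a sketch: the order (installer options, then --user, then the modules value) is an assumption inferred from the logged command above, not documented Glue behavior, and `build_pip_command` and `requirements_target` are hypothetical helpers. The thread also does not say what --additional-python-modules was set to for the workaround, so the sketch leaves it empty there.

```python
import shlex
from typing import List, Optional


def build_pip_command(installer_option: str, modules: str) -> List[str]:
    """Assemble the command in the order suggested by the Glue 5 log."""
    return (
        ["pip3.11", "install"]
        + shlex.split(installer_option)          # e.g. ["-r"] or ["-r", "<path>"]
        + ["--user"]                             # assumed to be injected here
        + [m for m in modules.split(",") if m]   # comma-separated modules value
    )


def requirements_target(command: List[str]) -> Optional[str]:
    """Return the token that directly follows -r, i.e. what pip reads as the file."""
    if "-r" in command:
        index = command.index("-r")
        if index + 1 < len(command):
            return command[index + 1]
    return None


# Documented configuration: -r alone, file location in the modules parameter.
documented = build_pip_command("-r", "https://example.com/requirements.txt")
# Workaround: file location placed directly after -r in the installer option.
workaround = build_pip_command("-r s3://example-bucket/requirements.txt", "")

print(" ".join(documented), "->", requirements_target(documented))
print(" ".join(workaround), "->", requirements_target(workaround))
```

Under that assumed ordering, the documented configuration leaves `--user` as the argument of `-r`, which matches the "Could not open requirements file: '--user'" error, while the workaround keeps the path next to `-r`. The example URLs and bucket name are placeholders.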