Installing dependencies required by a user-defined package in a Glue Spark job: "ModuleNotFoundError: No module named 'redshift_connector'"


**How do I install the Python package dependencies that a user-defined package needs in a Glue Spark job?**

For example, my custom package uses the redshift_connector package. In my Spark job, the redshift_connector import statement fails with "ModuleNotFoundError: No module named 'redshift_connector'".

Kasi
asked 9 months ago, 542 views
2 Answers

To provide custom Python modules, package them in a .zip or .whl file and add it to the job as extra Python files.
Verify that the package is correctly defined with an __init__.py file and that you can import it locally.
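As a rough sketch of how the two kinds of dependencies discussed in this thread might be declared together, the helper below builds the job arguments as a plain dictionary. `--extra-py-files` and `--additional-python-modules` are Glue job parameters that take comma-separated values; the helper name `build_glue_default_arguments` and the wheel path are hypothetical, chosen only for illustration.

```python
from typing import Dict, List


def build_glue_default_arguments(
    extra_py_files: List[str], additional_modules: List[str]
) -> Dict[str, str]:
    """Build Glue job arguments (hypothetical helper, not part of any AWS SDK)."""
    args: Dict[str, str] = {}
    if extra_py_files:
        # Custom .zip/.whl files stored in S3, joined into one comma-separated value.
        args["--extra-py-files"] = ",".join(extra_py_files)
    if additional_modules:
        # Modules that Glue installs with pip when the job starts.
        args["--additional-python-modules"] = ",".join(additional_modules)
    return args


job_args = build_glue_default_arguments(
    # Hypothetical S3 location of the custom wheel.
    ["s3://bucket/prefix/my_custom_pkg-0.1-py3-none-any.whl"],
    # Third-party dependency imported inside the custom package.
    ["redshift_connector"],
)
print(job_args)
```

A dictionary like this could be supplied as the job's default arguments, for example through the job parameters in the console or the `DefaultArguments` field when defining the job with boto3. The point of the sketch is that both arguments need to be set on the Glue job itself, since a successful install on a build server does not make the module available in the Glue runtime.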

AWS
EXPERT
answered 9 months ago
  • I followed the same approach you describe in your answer: my custom wheel package is listed only under extra Python files. I added the redshift package to --additional-python-modules, and the Jenkins pipeline reports that redshift_connector installed successfully. However, the redshift_connector import statement still fails. Note that redshift_connector is imported in one of the Python files inside the custom package. Please let me know whether any changes are needed.

  1. Create a folder named zip_folder.
  2. Place your Python file in it. Assuming the file is named r_file.py, the folder structure is now zip_folder/r_file.py.
  3. Zip the folder and upload it to an S3 bucket. The archive is now named zip_folder.zip.
  4. In the Python library path setting of the Glue job, give the S3 path of the zip file, e.g. s3://bucket/prefix/zip_folder.zip.
  5. In your Glue script, import r_file (the module name must match the name of your .py file).
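The packaging steps above can be tried locally before uploading anything. The following is a minimal sketch, assuming r_file.py should sit at the root of the archive so that `import r_file` resolves once the zip is on the Python path; the `greet` function and the temporary directory are hypothetical stand-ins for real module content.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path


def build_module_zip(work_dir: Path) -> Path:
    """Create zip_folder/r_file.py and pack it into zip_folder.zip."""
    src_dir = work_dir / "zip_folder"
    src_dir.mkdir()
    # Hypothetical module content, used only to prove the import works.
    (src_dir / "r_file.py").write_text(
        "def greet():\n    return 'hello from r_file'\n"
    )
    zip_path = work_dir / "zip_folder.zip"
    with zipfile.ZipFile(zip_path, "w") as zf:
        for py_file in src_dir.glob("*.py"):
            # arcname places the file at the archive root (assumption noted above).
            zf.write(py_file, arcname=py_file.name)
    return zip_path


work_dir = Path(tempfile.mkdtemp())
zip_path = build_module_zip(work_dir)

# Putting the zip on sys.path lets Python import modules straight from the archive.
sys.path.insert(0, str(zip_path))
r_file = importlib.import_module("r_file")
print(r_file.greet())
```

If the import succeeds locally, the same zip_folder.zip can be uploaded to S3 and referenced in the Python library path as described in step 4.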

More details: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-libraries.html#glue20-modules-provided

answered 8 months ago
