AWS Glue Jupyter Notebook additional modules

0

Hi guys,

I'm trying to create a Glue job with a Jupyter notebook, but I can't seem to import external modules. I installed an external module following the documentation at https://docs.aws.amazon.com/glue/latest/ug/notebook-getting-started.html:

%additional_python_modules s3://notebook-mod/simple_salesforce.whl

When I run the cell containing my import statement, I get the following error: [image]

Am I missing anything? Thanks!

asked 2 years ago · 4958 views
4 Answers
2

Hello,

Can you please try providing the module names directly instead of the whl file? Please try the line below and let me know the result.

%additional_python_modules simple-salesforce,pandera

AWS
answered 2 years ago
  • This worked for me.

0
Accepted Answer

Hello,

You have used the correct approach to install external Python modules in a Glue Studio notebook, which uses Glue 2.0/Glue 3.0.

To investigate, I set this up in my environment using the steps below:

  1. Create a Glue Studio notebook (navigate to the Glue console --> in the left side panel click Glue Studio --> select Jupyter Notebook)
  2. Download the simple-salesforce whl file from PyPI (https://files.pythonhosted.org/packages/60/3c/647da942ce0e1f024dc3e188ebc60ee28972ba1254e691e3512511b9062a/simple_salesforce-1.12.1-py2.py3-none-any.whl) and upload it to S3
  3. Use the code below to install simple_salesforce
%additional_python_modules s3://library/simple_salesforce-1.12.1-py2.py3-none-any.whl

from simple_salesforce import Salesforce

It executed successfully without any issue. In your case, I suspect you are using a SageMaker notebook backed by a Glue dev endpoint, which uses Glue 1.0 and does not support additional_python_modules. Can you please check and confirm whether you are using the correct notebook type?
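
One way to check whether a module actually reached the session, before relying on the import, is sketched below. This is only an illustration: `is_importable` is a hypothetical helper name, and it uses nothing beyond the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def is_importable(import_name):
    """Return True if a top-level module can be found by the current interpreter.

    Hypothetical helper for illustration. Pass the import name
    (e.g. "simple_salesforce"), not the PyPI name ("simple-salesforce").
    """
    return importlib.util.find_spec(import_name) is not None


# Run this in a notebook cell after the session has started.
for name in ("simple_salesforce", "pandera"):
    status = "found" if is_importable(name) else "NOT found"
    print(f"{name}: {status}")
```

If a module is reported as not found, the `%additional_python_modules` line was most likely not applied to the running session.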

Reference:

[1] https://docs.aws.amazon.com/glue/latest/ug/notebook-getting-started.html

AWS
answered 2 years ago
  • I'm using an interactive notebook in Glue Studio. Following your instructions works fine with one module.

    Is the process the same for more than one additional module? If I do the below, it tells me 'ModuleNotFoundError: No module named 'simple_salesforce''

    %additional_python_modules s3://modules/simple_salesforce-1.12.1-py2.py3-none-any.whl, pandera-0.11.0-py3-none-any.whl

    from simple_salesforce import Salesforce
    import pandera

0

Hello,

You can provide multiple Python modules using %additional_python_modules in a notebook. In the example above, you have not provided the absolute whl file path for the pandera module. Please provide the absolute path for each module, separated by commas.

%additional_python_modules s3://library/simple_salesforce-1.12.1-py2.py3-none-any.whl, s3://library/pandera-0.11.0-py3-none-any.whl
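
As a rough sketch of the rule above, the magic line can be assembled from a list so that a wheel without an absolute S3 path is caught early. `build_modules_magic` is a hypothetical helper, not part of Glue, and the bucket and file names are the ones from this thread. It joins entries with a bare comma, which is a conservative choice here rather than a documented requirement.

```python
def build_modules_magic(modules):
    """Return a %additional_python_modules line for the given entries.

    Hypothetical helper for illustration. Each entry is assumed to be either
    a package name (e.g. "pandera") or an absolute s3:// URI to a .whl file.
    """
    cleaned = []
    for entry in modules:
        entry = entry.strip()
        if not entry:
            raise ValueError("empty module entry")
        # A bare wheel file name (no s3:// prefix) is the mistake described above.
        if entry.endswith(".whl") and not entry.startswith("s3://"):
            raise ValueError(f"wheel needs an absolute S3 path: {entry!r}")
        cleaned.append(entry)
    return "%additional_python_modules " + ",".join(cleaned)


print(build_modules_magic([
    "s3://library/simple_salesforce-1.12.1-py2.py3-none-any.whl",
    "s3://library/pandera-0.11.0-py3-none-any.whl",
]))
```

The printed line can then be pasted into the first cell of the notebook, before the session starts.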
AWS
answered 2 years ago
0

So it works when I install just one module. I can import it without issue. I tried each module in a separate notebook and it works. [image]

When I install more than one module following the documentation, it doesn't work. [image]

Let me know what else I am missing. Thanks for your help so far.

answered 2 years ago
