
Glue PySpark vs Shell


I zipped my modules into a zip file, uploaded it to S3, and added it to both the PySpark and Python Shell jobs under the Python library path parameter.

In both jobs I am using the same import syntax. The PySpark job works, but the Python Shell job raises an error saying my module was not found:

ModuleNotFoundError: No module named 'expect_multicolumn_values_not_null'

Is there a difference between how PySpark and Python Shell jobs import modules? How can I make it work in both?

asked a year ago · 381 views

1 Answer

Hi,

Based on my understanding, for Python Shell jobs you can consider the approach described in Providing your own Python library. For your use case, you may need to package your code as an .egg or .whl file.
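As a rough sketch, packaging the module as a wheel could look like the following setup.py (the project layout, version, and build command are assumptions, not something from your job):

```python
# Hypothetical minimal setup.py, assuming the module lives in a single file
# expect_multicolumn_values_not_null.py next to this script.
from setuptools import setup

setup(
    name="expect_multicolumn_values_not_null",
    version="0.1.0",
    py_modules=["expect_multicolumn_values_not_null"],
)
```

Building with `python setup.py bdist_wheel` would produce a .whl under dist/ that you can upload to S3 and reference in the Python Shell job's library path.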

You can also refer to these posts: How do I use external Python libraries in my AWS Glue 2.0 ETL job? and External Python libraries in an AWS Glue Python shell job.
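To illustrate why the same zip behaves differently: Python itself can import modules straight from a zip archive once that archive is on sys.path, which is effectively what the PySpark job arranges for you; a Python Shell job does not do this for a plain zip, hence the wheel/egg requirement. A minimal local sketch (the module body here is just a stand-in for your real expectation module):

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a zip containing a tiny stand-in module.
tmpdir = tempfile.mkdtemp()
zip_path = os.path.join(tmpdir, "modules.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr(
        "expect_multicolumn_values_not_null.py",
        "def check(row):\n    return all(v is not None for v in row)\n",
    )

# Without the zip on sys.path the import fails -- the situation
# seen in the Python Shell job.
try:
    import expect_multicolumn_values_not_null  # noqa: F401
except ModuleNotFoundError as e:
    print(e)

# Once the archive is on sys.path, Python's built-in zipimport
# machinery resolves the module -- what PySpark does for you.
sys.path.insert(0, zip_path)
mod = importlib.import_module("expect_multicolumn_values_not_null")
print(mod.check((1, 2, 3)))  # True
```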

Thanks, Rama

AWS
EXPERT
answered a year ago
