
Glue PySpark vs Shell


I zipped my modules into a zip file, uploaded it to S3, and added it to both the PySpark and Python Shell jobs under the Python library path parameter.

Both jobs use the same import syntax. The PySpark job works, but the Python Shell job raises an error saying my module was not found:

ModuleNotFoundError: No module named 'expect_multicolumn_values_not_null'

Is there a difference in how PySpark and Python Shell jobs import modules? How can I make it work in both?

asked a year ago · 372 views
1 Answer

Hi,

Based on my understanding, for Python Shell jobs you can use the approach described in Providing your own Python library. For your use case, you may need to package your modules as an .egg or .whl file, since Python Shell jobs do not pick up plain .zip archives the way PySpark jobs do.

You can also refer to these posts: How do I use external Python libraries in my AWS Glue 2.0 ETL job? and External python libraries in a AWS Glue python shell job.
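As a rough sketch of the packaging step above (the package name, version, and S3 bucket here are hypothetical, and this assumes a minimal setup.py or pyproject.toml exists alongside your modules):

```shell
# Project layout (example):
#   my_expectations/__init__.py
#   my_expectations/expect_multicolumn_values_not_null.py
#   setup.py  (or pyproject.toml)

# Build a wheel from the project root (requires pip with the wheel package)
pip wheel . --no-deps -w dist/

# Upload the resulting wheel to S3
aws s3 cp dist/my_expectations-0.1.0-py3-none-any.whl s3://my-bucket/libs/

# Then set the Python Shell job's "Python library path" to:
#   s3://my-bucket/libs/my_expectations-0.1.0-py3-none-any.whl
```

After that, `from my_expectations import expect_multicolumn_values_not_null` (adjusted to your actual package name) should resolve in the Python Shell job as well.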

Thanks, Rama

AWS
EXPERT
answered a year ago
