Way to split up lengthy Glue job scripts?

There are many lengthy (> 1000 LOC) Glue job scripts for our customers, and they share common code blocks that are identical in all of them. The Glue jobs are not executed interactively by a person but are triggered to run at a particular point in time.

Is there a way to split these lengthy Glue job scripts into several smaller scripts, so that the common code blocks can be isolated as Python functions? Can you give an example?

Having these common code blocks in just one place, instead of in every Glue script, would make maintenance much easier.

wkl3nk
asked 2 years ago, 258 views
1 Answer
I understand that your Glue jobs contain many repeated blocks of code and that you would like to extract them into functions. For this purpose, I suggest bundling these code blocks into Python modules and importing them into your Glue jobs wherever they are needed.

  1. Create a Python module containing your generic functions.
  2. Zip the module and upload the .zip file to an S3 bucket.
  3. Add the S3 path of the .zip file to the "Python library path" field in the Job details section (the equivalent job parameter is --extra-py-files).
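As a rough illustration of steps 1 and 2, here is a minimal sketch that builds a small package, zips it, and imports from the archive. The package name glue_common, the module transforms.py, and the function normalize_column_names are hypothetical names chosen for this example and do not come from your scripts. Putting the .zip on sys.path is only a local stand-in for the "Python library path" setting, on the assumption that Glue makes the archive importable in a similar way.

```python
import sys
import tempfile
import zipfile
from pathlib import Path

# Hypothetical shared helper; in practice this would hold the code blocks
# that are currently duplicated across the Glue job scripts.
MODULE_SOURCE = '''
def normalize_column_names(names):
    """Lower-case column names and replace spaces with underscores."""
    return [n.strip().lower().replace(" ", "_") for n in names]
'''


def build_library_zip(target_dir):
    """Write a tiny package and zip it with the package folder at the archive root."""
    target_dir = Path(target_dir)
    pkg_dir = target_dir / "glue_common"  # hypothetical package name
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text('"""Shared helpers for Glue jobs."""\n')
    (pkg_dir / "transforms.py").write_text(MODULE_SOURCE)

    zip_path = target_dir / "glue_common.zip"
    with zipfile.ZipFile(zip_path, "w") as zf:
        for path in sorted(pkg_dir.rglob("*.py")):
            # Archive names must be relative, e.g. "glue_common/transforms.py"
            zf.write(path, path.relative_to(target_dir).as_posix())
    return zip_path


work_dir = tempfile.mkdtemp()
zip_path = build_library_zip(work_dir)

# Local stand-in for the "Python library path" setting (an assumption for
# this sketch): Python can import packages directly from a zip on sys.path.
sys.path.insert(0, str(zip_path))

# This import line is all that each Glue job script would need to contain.
from glue_common.transforms import normalize_column_names

cleaned = normalize_column_names(["Customer ID", " Order Date "])
print(cleaned)  # ['customer_id', 'order_date']
```

After uploading glue_common.zip to your S3 bucket and setting its path on each job, every job script can replace its copy of the shared block with the single import line.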

To learn more, please refer to the AWS Glue documentation on using Python libraries with AWS Glue.

This way, you can reduce the number of lines in your Glue jobs by moving the redundant code to an S3 bucket and importing it whenever required.

I hope this helps with your use case.

AWS
SUPPORT ENGINEER
Chaitu
answered 2 years ago
