Way to split up lengthy Glue job scripts?

Our customers have many lengthy (> 1000 LOC) Glue job scripts, and these scripts share common code blocks that are identical in all of them. The Glue jobs are not executed interactively by a person; they are triggered to run at a particular point in time.

Is there a way to split these lengthy Glue job scripts into several smaller scripts, so that the common code blocks can be isolated as Python functions? Can you give an example?

Having these common code blocks in just one place, instead of in every Glue script, would make maintenance much easier.

wkl3nk
asked 2 years ago
253 views
1 Answer
I understand that your Glue jobs contain many repeated blocks of code and that you would like to extract them and implement them as functions. For this purpose, I suggest bundling these code blocks into Python modules and importing them into your Glue jobs wherever they are needed.

  1. Create a Python module containing your generic functions.
  2. Zip the module and upload the .zip file to an S3 bucket.
  3. Enter the S3 path of the .zip file in the "Python library path" field in the job details section.
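
As a minimal sketch of step 1, a shared module could look like the following. The file name `glue_common.py`, both helper functions, and the bucket and customer names are hypothetical placeholders for illustration; your real common code blocks would go in their place.

```python
# glue_common.py (hypothetical file name): shared helpers used by several Glue jobs.
# Both functions are illustrative placeholders, not code from the original scripts.


def normalize_column_name(name):
    """Lower-case a column name and replace spaces and dashes with underscores."""
    return name.strip().lower().replace(" ", "_").replace("-", "_")


def build_output_path(bucket, customer, dataset):
    """Build the S3 prefix a job writes to (the path layout is an assumption)."""
    return f"s3://{bucket}/{customer}/{dataset}/"


# In each Glue job script, the duplicated block is then replaced by an import:
#
#   from glue_common import build_output_path, normalize_column_name
#   output_path = build_output_path("my-bucket", "customer-a", "orders")
```

Once the zipped module is listed in the "Python library path" field, each job script only needs the import line and the calls, so a fix to a shared function is made once in the module rather than in every script.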

To learn more about this, please refer to this documentation.

This way, I think you can reduce the number of lines in your Glue jobs by moving the redundant code into a module stored in an S3 bucket and loading it whenever it is required.
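
Before uploading, it can be worth checking locally that the archive is importable. The sketch below zips one or more module files with Python's standard `zipfile` module and imports from the archive by placing it on `sys.path`, which mirrors how Glue is documented to add the library path (the `--extra-py-files` job parameter) to the Python path. The function names `build_library_zip` and `import_from_zip` are made up for this example.

```python
import importlib
import sys
import zipfile
from pathlib import Path


def build_library_zip(source_files, zip_path):
    """Zip the given .py files so that each one sits at the root of the archive."""
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for source in map(Path, source_files):
            # arcname drops the directory part, so "import <file stem>" works.
            archive.write(source, arcname=source.name)
    return zip_path


def import_from_zip(zip_path, module_name):
    """Put the archive on the import path and import a module from it."""
    sys.path.insert(0, str(zip_path))
    importlib.invalidate_caches()
    return importlib.import_module(module_name)
```

For example, `build_library_zip(["glue_common.py"], "glue_common.zip")` followed by `import_from_zip("glue_common.zip", "glue_common")` should return the module if the archive is laid out correctly. Keeping the files at the archive root is a deliberate choice here: a nested directory inside the zip would change the import name.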

I hope this helps with your use case.

AWS SUPPORT ENGINEER
Chaitu
answered 2 years ago
