Sophisticated Triggering of Glue Jobs


Is there documentation on ways to trigger Glue jobs that go beyond static schedules and simple conditions? I heard that Glue jobs can be triggered from Lambda functions, but everything I can find on that is not much more sophisticated than static schedules and simple conditions.

I have a pipeline of several Glue jobs that normally run in sequence once per week. The last Glue job in this pipeline writes out a table containing a flag that marks records which need to be processed at a higher frequency. So I am looking for a mechanism that checks this output table regularly and, if the flag is set, triggers the first Glue job of the pipeline again more frequently. I need to avoid processing the entire dataset at this higher frequency. How would this be done?

1 Answer

AWS Step Functions is a good way to orchestrate multiple Glue jobs into a coherent workflow. It provides a visual interface, and workflows can also be defined programmatically using the Amazon States Language. There is also a workshop that shows an example of how to build a workflow using Step Functions.
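To make the Step Functions suggestion more concrete, here is a minimal sketch of a state machine definition written as a Python dict and serialized to Amazon States Language JSON. It checks a flag first and only starts the first Glue job when the flag is set. The Lambda ARN, the Glue job name, the `flag_set` field, and the `--only_flagged` job argument are all hypothetical placeholders, not something taken from the question; only the Glue integration resource `arn:aws:states:::glue:startJobRun.sync` is the standard one.

```python
import json

# Hypothetical resource names; replace with your own.
CHECK_FLAG_LAMBDA_ARN = "arn:aws:lambda:us-east-1:123456789012:function:check-flag"
FIRST_GLUE_JOB = "pipeline-step-1"


def build_definition(check_flag_arn, glue_job_name):
    """Return an Amazon States Language definition as a plain dict."""
    return {
        "Comment": "Run the first Glue job only when the reprocessing flag is set",
        "StartAt": "CheckFlag",
        "States": {
            # Assumed Lambda that inspects the output table and
            # returns a payload such as {"flag_set": true}.
            "CheckFlag": {
                "Type": "Task",
                "Resource": check_flag_arn,
                "ResultPath": "$.check",
                "Next": "IsFlagSet",
            },
            # Branch on the boolean returned by the Lambda.
            "IsFlagSet": {
                "Type": "Choice",
                "Choices": [
                    {
                        "Variable": "$.check.flag_set",
                        "BooleanEquals": True,
                        "Next": "RunFirstGlueJob",
                    }
                ],
                "Default": "NothingToDo",
            },
            # The .sync suffix makes Step Functions wait for the job run to finish.
            "RunFirstGlueJob": {
                "Type": "Task",
                "Resource": "arn:aws:states:::glue:startJobRun.sync",
                "Parameters": {
                    "JobName": glue_job_name,
                    # Hypothetical job argument telling the job to
                    # process only the flagged records.
                    "Arguments": {"--only_flagged": "true"},
                },
                "End": True,
            },
            "NothingToDo": {"Type": "Succeed"},
        },
    }


definition = build_definition(CHECK_FLAG_LAMBDA_ARN, FIRST_GLUE_JOB)
print(json.dumps(definition, indent=2))
```

The printed JSON could be pasted into the Step Functions console or passed to whatever deployment tooling you use. Further Glue jobs of the pipeline could be chained by replacing `"End": True` with a `"Next"` state.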
You can store the state of the different Glue jobs (as flags) in DynamoDB, so you can build a fully serverless data pipeline (Glue, Step Functions, and DynamoDB are all serverless). You can also consider event-driven orchestration of the different workflows using Amazon EventBridge.
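As a rough illustration of the DynamoDB and EventBridge idea, the sketch below shows a Lambda handler that could be invoked by an EventBridge schedule (for example hourly), reads a flag from a DynamoDB item, and starts the first Glue job only if the flag is set. The table name `pipeline-state`, the key `{"pipeline": "weekly"}`, the attribute `needs_reprocessing`, the job name, and the `--only_flagged` argument are assumptions made for this example. It also assumes the last Glue job writes the flag to DynamoDB, which differs from the setup described in the question.

```python
def flag_is_set(table, key):
    """Return True if the DynamoDB item exists and its flag attribute is truthy."""
    item = table.get_item(Key=key).get("Item", {})
    # 'needs_reprocessing' is a hypothetical attribute name.
    return bool(item.get("needs_reprocessing", False))


def maybe_start_job(table, glue, job_name, key):
    """Start the Glue job only when the flag is set; return the run id or None."""
    if not flag_is_set(table, key):
        return None
    response = glue.start_job_run(
        JobName=job_name,
        # Hypothetical job argument so the job processes only flagged records.
        Arguments={"--only_flagged": "true"},
    )
    return response["JobRunId"]


def lambda_handler(event, context):
    # Imported here so the helpers above can be exercised without AWS access.
    import boto3

    table = boto3.resource("dynamodb").Table("pipeline-state")  # assumed table name
    glue = boto3.client("glue")
    run_id = maybe_start_job(table, glue, "pipeline-step-1", {"pipeline": "weekly"})
    return {"started": run_id is not None, "job_run_id": run_id}
```

Keeping the decision logic in `maybe_start_job` and passing the clients in makes it easy to test with stand-in objects. The Glue job itself would still need to read the `--only_flagged` argument and filter its input accordingly, so that the full dataset is not reprocessed at the higher frequency.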

answered 17 days ago
reviewed 15 days ago
