Trigger AWS Glue whenever a new file shows up in an S3 bucket


Hi, first of all, apologies for this simple question. I am new to AWS.

I have a job that runs on an EC2 machine (scheduled with cron) and writes 3 output files to an S3 bucket path. The 3 files don't all arrive at the same time. As soon as the files have arrived, I need to run a Glue job that copies them to an S3 bucket path in another account (only the Glue execution role has been given access to the cross-account S3 bucket, which is why I have to use Glue to transfer the files). As soon as each file has been copied over, it should be archived to a separate folder in S3.

I was thinking of scheduling this Glue job to run every hour, so that it transfers the files and archives them once they have arrived. But once the files are archived, the job doesn't need to run again, so scheduling it every hour looks like a waste of AWS resources. How can I trigger this Glue job based only on the arrival of files in S3? I see that this can be achieved using Lambda. Can I get details on this: what additional roles do I need to set up for Lambda, and what else is involved in creating the Lambda trigger function? As of now, only the Glue role has read/write access to both S3 paths.

SMR
asked 9 months ago, 2628 views
1 Answer

Hi, you don't need Lambda for this: EventBridge can start an event-driven AWS Glue workflow, which in turn runs your Glue job, when a new file is uploaded to S3.

See https://aws.amazon.com/blogs/big-data/build-a-serverless-event-driven-workflow-with-aws-glue-and-amazon-eventbridge/

This post discusses how to configure AWS Glue workflows to run based on real-time events. 
You no longer need to set schedules or build complex solutions to trigger jobs based on events; 
AWS Glue event-driven workflows manage it all for you.

It is the simplest and most cost-efficient way to achieve your goal.
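As a rough illustration of the EventBridge side, here is a minimal Python (boto3) sketch. It assumes the source bucket has EventBridge notifications turned on, so that S3 emits "Object Created" events, and that a Glue workflow already exists. The function names, the bucket name, the prefix, the rule name, and both ARNs are placeholders for illustration, not values from the question. The role passed as `role_arn` is assumed to be one that EventBridge can assume and that is allowed to notify the Glue workflow.

```python
import json


def build_s3_object_created_pattern(bucket_name, key_prefix):
    """Build an EventBridge pattern for S3 "Object Created" events under one prefix."""
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {
            "bucket": {"name": [bucket_name]},
            "object": {"key": [{"prefix": key_prefix}]},
        },
    }


def route_s3_events_to_workflow(events_client, rule_name, pattern, workflow_arn, role_arn):
    """Create (or update) a rule and point it at a Glue workflow.

    events_client is expected to be boto3.client("events").
    """
    events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(pattern),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[
            {
                "Id": "glue-workflow",   # arbitrary target id (placeholder)
                "Arn": workflow_arn,     # ARN of the Glue workflow (placeholder)
                "RoleArn": role_arn,     # role EventBridge assumes (placeholder)
            }
        ],
    )


# Hypothetical usage (all names below are made up):
#   import boto3
#   pattern = build_s3_object_created_pattern("my-source-bucket", "output/")
#   route_s3_events_to_workflow(
#       boto3.client("events"),
#       "s3-files-to-glue",
#       pattern,
#       "arn:aws:glue:us-east-1:111122223333:workflow/copy-files",
#       "arn:aws:iam::111122223333:role/eventbridge-to-glue",
#   )
```

Restricting the pattern to the prefix the EC2 job writes to keeps the archive folder from re-triggering the workflow, provided the archive folder sits under a different prefix.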

Best,

Didier

AWS EXPERT
answered 9 months ago
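  • To add to that, have the events grouped into a batch (the event trigger of a Glue workflow can be given a batch size and a batch window), since you don't want to start a Glue job run for each individual file that arrives.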
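
A minimal Python (boto3) sketch of such batching, assuming an existing Glue workflow and job: the start trigger of the workflow is created with type `EVENT` and an `EventBatchingCondition`, so the workflow starts once 3 events have been received or once the batch window has elapsed, whichever comes first. The function names and the trigger, workflow, and job names are placeholders; the batch size of 3 simply mirrors the 3 files in the question.

```python
def build_event_trigger_request(trigger_name, workflow_name, job_name,
                                batch_size=3, batch_window_seconds=900):
    """Build the keyword arguments for glue_client.create_trigger()."""
    return {
        "Name": trigger_name,
        "WorkflowName": workflow_name,
        "Type": "EVENT",
        # Job to run when the workflow is started by the batched events.
        "Actions": [{"JobName": job_name}],
        "EventBatchingCondition": {
            "BatchSize": batch_size,              # number of events to wait for
            "BatchWindow": batch_window_seconds,  # maximum wait, in seconds
        },
    }


def create_event_trigger(glue_client, **names):
    """Create the trigger; glue_client is expected to be boto3.client("glue")."""
    request = build_event_trigger_request(**names)
    return glue_client.create_trigger(**request)


# Hypothetical usage (all names below are made up):
#   import boto3
#   create_event_trigger(
#       boto3.client("glue"),
#       trigger_name="on-three-files",
#       workflow_name="copy-files",
#       job_name="copy-and-archive",
#   )
```

Because the window can expire before all 3 files are present, the job itself should still cope with a partial set of files, for example by copying and archiving whatever it finds.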
