Trigger an AWS Glue job whenever a new file shows up in an S3 bucket


Hi, first of all, apologies for this simple question. I am new to AWS.

I have a job that runs on an EC2 machine (scheduled using cron) and writes 3 output files to an S3 bucket path. These 3 files don't all arrive at the same time. As soon as the files have arrived, I need to run a Glue job that copies them to an S3 bucket path in another account (only the Glue execution role has been given access to the cross-account S3 bucket, which is why I have to use Glue to transfer the files). As soon as each file is copied over, it should be archived to a separate folder in S3.

I was thinking of scheduling this Glue job to run every hour, transferring the files and archiving them once they have arrived. But once the files are archived, the job doesn't need to run again, so scheduling it every hour looks like a waste of AWS resources. How can I trigger this Glue job based only on the arrival of files in S3? I see that this can be achieved using Lambda. Can I get details on this: what additional roles do I need to set up for Lambda, and what else is involved in creating the Lambda trigger function? As of now, only the Glue role has read/write access to both S3 paths.

SMR
asked 9 months ago · 3016 views
1 Answer

Hi, you can trigger a Glue job from EventBridge when a new file is uploaded to S3, by way of an event-driven Glue workflow.

See https://aws.amazon.com/blogs/big-data/build-a-serverless-event-driven-workflow-with-aws-glue-and-amazon-eventbridge/

This post discusses how to configure AWS Glue workflows to run based on real-time events. 
You no longer need to set schedules or build complex solutions to trigger jobs based on events; 
AWS Glue event-driven workflows manage it all for you.

It is the simplest and most cost-efficient way to achieve your goal.
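For illustration, here is a minimal sketch of the EventBridge side using boto3. It assumes the source bucket has EventBridge notifications enabled, that the Glue job is wrapped in a Glue workflow, and that the role passed as `role_arn` allows EventBridge to notify that workflow (as far as I know, the `glue:notifyEvent` permission). The function names, rule name, bucket, prefix, and ARNs are all hypothetical placeholders, not values from the question.

```python
import json


def s3_object_created_pattern(bucket, prefix=""):
    """Build an EventBridge event pattern matching new objects in one bucket."""
    detail = {"bucket": {"name": [bucket]}}
    if prefix:
        # Only match keys under the given prefix (e.g. the job's output path).
        detail["object"] = {"key": [{"prefix": prefix}]}
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": detail,
    }


def create_rule(events_client, rule_name, bucket, prefix, workflow_arn, role_arn):
    """Create the rule and point it at a Glue workflow.

    events_client is expected to be boto3.client("events"); all names and
    ARNs are supplied by the caller and are placeholders in this sketch.
    """
    events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(s3_object_created_pattern(bucket, prefix)),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "glue-workflow", "Arn": workflow_arn, "RoleArn": role_arn}],
    )
```

You would call it with something like `create_rule(boto3.client("events"), "new-output-files", "my-source-bucket", "output/", workflow_arn, role_arn)`. With this setup no Lambda function is involved, so no extra Lambda role is needed; the Glue job keeps using its existing execution role for both S3 paths.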

Best,

Didier

AWS
EXPERT
answered 9 months ago
  • To add to that, ask EventBridge to group events; you don't want to trigger a Glue job for each file that arrives.
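
As a rough sketch of that grouping: to my knowledge the batching is configured on the Glue event trigger itself, through the `EventBatchingCondition` argument of `create_trigger`. The example below builds the arguments for a trigger that waits for 3 events (one per output file) or for the batch window to expire, whichever comes first. The helper name, trigger name, workflow name, and job name are hypothetical.

```python
def build_event_trigger_args(name, workflow_name, job_name,
                             batch_size=3, batch_window_seconds=900):
    """Build keyword arguments for glue.create_trigger with event batching.

    The limits checked here (1-100 events, 1-900 seconds) are the ranges
    I understand Glue to accept; treat them as an assumption.
    """
    if not 1 <= batch_size <= 100:
        raise ValueError("batch_size must be between 1 and 100")
    if not 1 <= batch_window_seconds <= 900:
        raise ValueError("batch_window_seconds must be between 1 and 900")
    return {
        "Name": name,
        "WorkflowName": workflow_name,
        "Type": "EVENT",
        "Actions": [{"JobName": job_name}],
        "EventBatchingCondition": {
            "BatchSize": batch_size,
            "BatchWindow": batch_window_seconds,
        },
    }


def create_event_trigger(glue_client, **kwargs):
    """glue_client is expected to be boto3.client("glue")."""
    return glue_client.create_trigger(**build_event_trigger_args(**kwargs))
```

Note that if the 3 files arrive further apart than the batch window, the workflow starts with fewer events, so the job should still check which files are actually present before copying and archiving them.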
