Glue Jobs & Multiple tables


A customer needs to ETL multiple tables from RDS into S3 and Redshift.

Let's say they need to combine data from 6 tables to load into S3.

I tried helping them set up the Glue jobs for this process, but it's not clear what the best and most efficient way is to load these tables into S3 or Redshift: when you create a Glue job, you can select only one table as a data source.

Do they need to create a Glue Job for each table or customize the generated Glue jobs to include all tables?

asked 6 years ago, 5306 views
1 Answer
Accepted Answer

Yes, they need to customize the generated Glue job to include multiple tables and join them. The Glue job creation UI just creates a simple template job with one source and one target, but in reality most jobs need multiple sources, and some need multiple targets as well.

We have Join examples here: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-samples-legislators.html
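
As a rough illustration of what such a customized job could look like, here is a minimal sketch that reads several catalog tables, joins them one after another, and writes the result to S3. The database name, table names, join keys, and output path are all hypothetical placeholders, and it assumes the RDS tables have already been crawled into the Glue Data Catalog; treat it as a starting point rather than the generated script itself.

```python
import importlib.util
from functools import reduce

# Hypothetical names: replace with the catalog database, tables, and keys
# that correspond to the crawled RDS tables.
DATABASE = "rds_catalog_db"
BASE_TABLE = "orders"
# (table to join, key in the accumulated frame, key in the joined table)
JOINS = [
    ("customers", "customer_id", "customer_pk"),
    ("products", "product_id", "product_pk"),
]
OUTPUT_PATH = "s3://example-bucket/joined/"


def join_all(base, joins, load, join):
    """Fold each table listed in `joins` into `base`, one join at a time."""
    def step(acc, spec):
        table, left_key, right_key = spec
        return join(acc, load(table), [left_key], [right_key])
    return reduce(step, joins, base)


def main():
    # Imported here because awsglue only exists inside the Glue runtime.
    from awsglue.context import GlueContext
    from awsglue.transforms import Join
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext.getOrCreate())

    def load(table):
        # One DynamicFrame per source table, read from the Data Catalog.
        return glue_context.create_dynamic_frame.from_catalog(
            database=DATABASE, table_name=table
        )

    joined = join_all(load(BASE_TABLE), JOINS, load, Join.apply)

    # Single S3 target; the output format is an assumption.
    glue_context.write_dynamic_frame.from_options(
        frame=joined,
        connection_type="s3",
        connection_options={"path": OUTPUT_PATH},
        format="parquet",
    )


# Run the job only where the Glue libraries are available.
if importlib.util.find_spec("awsglue") is not None:
    main()
```

Keeping the join order in a list means a sixth table is one more entry in `JOINS` rather than a new job. If two tables share a column name, renaming the field before the join (as the linked legislators sample does) avoids clashes in the output.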

AWS
answered 6 years ago
