Glue Jobs & Multiple tables

A customer needs to ETL multiple tables from RDS into S3 and Redshift.

Let's say they need to combine data from 6 tables to load into S3.

I tried helping them set up the Glue Jobs for this process, but it's not clear what the best and most efficient way is to load these tables into S3 or Redshift: when you create a Glue Job, you can only select one table as a data source.

Do they need to create a Glue Job for each table or customize the generated Glue jobs to include all tables?

Asked 6 years ago, viewed 5355 times
1 Answer
Accepted Answer

Yes, they need to customize the generated Glue job to read multiple tables and join them. The Glue Job creation UI only generates a simple template job with one source and one target, but in practice most jobs need multiple sources, and some need multiple targets as well.

We have Join examples here: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-samples-legislators.html
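
As a rough illustration of what such a customized job could look like, here is a minimal sketch that reads several catalog tables, joins them one after another, and writes the result to S3. The database, table, key, and bucket names are all hypothetical placeholders, and the sketch assumes the RDS tables have already been crawled into the Glue Data Catalog. It is one possible shape for the script, not the job the UI generates.

```python
# Sketch of a Glue job script that reads several catalog tables and joins them.
# Every database, table, key, and bucket name below is a hypothetical placeholder.

DATABASE = "rds_catalog_db"                  # hypothetical catalog database
OUTPUT_PATH = "s3://example-bucket/joined/"  # hypothetical S3 target

BASE_TABLE = "orders"                        # hypothetical starting table
# Each step: (table to add, key on the accumulated frame, key on the new table).
# Key names are kept distinct so joined frames do not end up with clashing columns.
JOIN_STEPS = [
    ("customers", "customer_id", "cust_id"),
    ("products", "product_id", "prod_id"),
]


def join_all(base, steps, load, join):
    """Fold the join steps over the base frame, left to right."""
    result = base
    for table_name, left_key, right_key in steps:
        result = join(result, load(table_name), left_key, right_key)
    return result


def run_job():
    # Imported here because awsglue is only available inside the Glue runtime.
    from awsglue.context import GlueContext
    from awsglue.transforms import Join
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext.getOrCreate())

    def load(table_name):
        # One DynamicFrame per source table, read from the Data Catalog.
        return glue_context.create_dynamic_frame.from_catalog(
            database=DATABASE, table_name=table_name
        )

    def join(left, right, left_key, right_key):
        # Equijoin of two DynamicFrames on the given key columns.
        return Join.apply(left, right, [left_key], [right_key])

    joined = join_all(load(BASE_TABLE), JOIN_STEPS, load, join)

    # Write the combined data to S3 as Parquet.
    glue_context.write_dynamic_frame.from_options(
        frame=joined,
        connection_type="s3",
        connection_options={"path": OUTPUT_PATH},
        format="parquet",
    )

# In the actual Glue job script, call run_job() at the end.
```

Extending the job to all six tables would mean adding more entries to `JOIN_STEPS`. If key or column names collide across tables, renaming fields before the join (as the linked legislators sample does) is likely needed. A second write step, for example via a JDBC connection to Redshift, could reuse the same `joined` frame.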

AWS
Answered 6 years ago
