Best practices for bulk data loading in AWS Redshift - Glue or COPY


What are the pros and cons of using AWS Glue rather than Redshift's built-in commands (such as COPY and INSERT) for bulk data loading, in terms of cost, time, and adaptability? Example use cases would be much appreciated.

Asked 10 months ago, 350 views
1 Answer
Accepted Answer

Hi, AWS Glue is an ETL service, and the T (transform) is the key letter. If you need to transform the source data before loading it into Redshift, Glue will be highly useful.

For example, Glue provides many built-in simple and advanced transformations that you can integrate into your Glue-based ETL pipeline: see https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-transforms.html
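
To make the "T" concrete, here is a plain-Python sketch of what a rename-and-cast mapping step does to each record. It is only an illustration and does not call Glue: in an actual Glue job you would pass tuples of the same shape, `(source_field, source_type, target_field, target_type)`, to `ApplyMapping.apply(frame=..., mappings=[...])` from `awsglue.transforms`. The helper name `apply_mapping`, the `CASTS` table, and the sample field names are hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

# (source_field, source_type, target_field, target_type), the tuple shape
# that Glue's ApplyMapping transform accepts.
Mapping = Tuple[str, str, str, str]

# Hypothetical cast table for this sketch; Glue supports many more types.
CASTS: Dict[str, Callable[[Any], Any]] = {
    "string": str,
    "int": int,
    "double": float,
}


def apply_mapping(
    records: List[Dict[str, Any]], mappings: List[Mapping]
) -> List[Dict[str, Any]]:
    """Rename and cast fields; in this sketch, unmapped fields are dropped."""
    result = []
    for record in records:
        row: Dict[str, Any] = {}
        for source, _source_type, target, target_type in mappings:
            value = record.get(source)
            # Keep missing or null values as None instead of casting them.
            row[target] = None if value is None else CASTS[target_type](value)
        result.append(row)
    return result


if __name__ == "__main__":
    raw = [{"id": "7", "amt": "12.5", "junk": "x"}]
    mappings = [
        ("id", "string", "order_id", "int"),
        ("amt", "string", "amount", "double"),
    ]
    print(apply_mapping(raw, mappings))
```

If your source files need this kind of reshaping before they match the target table, that is the work Glue takes on for you.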

You may also want to measure the quality of your data before loading it, to ensure consistent quality. In that case, AWS Glue Data Quality may be very helpful: see https://aws.amazon.com/blogs/big-data/getting-started-with-aws-glue-data-quality-from-the-aws-glue-data-catalog/
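
For comparison, when the files in S3 already match the target table and need neither transformation nor quality checks, the COPY command mentioned in the question is a single SQL statement. The sketch below only builds that statement as a string so you can see its shape. The helper `build_copy_statement`, the table name, bucket, and role ARN are hypothetical placeholders, and it assumes the inputs are trusted since they are interpolated directly into SQL.

```python
def build_copy_statement(
    table: str,
    s3_uri: str,
    iam_role_arn: str,
    data_format: str = "PARQUET",
) -> str:
    """Build a basic Redshift COPY statement for files stored in S3.

    Only two formats are handled in this sketch; COPY itself supports more
    formats and many more options.
    """
    fmt = data_format.upper()
    if fmt not in {"PARQUET", "CSV"}:
        raise ValueError(f"unsupported format for this sketch: {data_format}")
    if not s3_uri.startswith("s3://"):
        raise ValueError("s3_uri must start with s3://")
    # Values are interpolated as-is, so they must come from trusted config.
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS {fmt};"
    )


if __name__ == "__main__":
    # Placeholder table, bucket, and role; replace with your own values.
    print(
        build_copy_statement(
            "sales",
            "s3://example-bucket/sales/",
            "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
        )
    )
```

You would run the resulting statement through whatever SQL client you already use with Redshift.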

Hope it helps,

Didier

AWS Expert
Answered 10 months ago
