Best practices for bulk data loading in AWS Redshift - Glue or COPY


What are the pros and cons of using AWS Glue versus Redshift's built-in commands (such as COPY and INSERT) for bulk data loading, in terms of cost, time, and adaptability? Some example use cases would be much appreciated.

asked 10 months ago, 350 views
1 Answer
Accepted Answer

Hi, AWS Glue is an ETL service, and the T (transform) is the key letter. If you need to transform the source data before loading it into Redshift, Glue will be very useful.

For example, Glue provides many built-in transformations, both simple and advanced, that you can integrate into your Glue-based ETL pipeline: see https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-transforms.html
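To give a feel for what such a transform does, here is a minimal plain-Python sketch of a rename-and-cast step. It is not the Glue API: `apply_mapping`, `CASTS`, and the sample records are hypothetical names made up for illustration. Only the 4-tuple shape (source field, source type, target field, target type) is borrowed from the mappings argument of Glue's ApplyMapping transform.

```python
# Plain-Python imitation of a rename-and-cast mapping step.
# This is NOT the awsglue API; all names here are hypothetical.
CASTS = {"string": str, "int": int, "double": float}


def apply_mapping(records, mappings):
    """Rename and cast fields; fields not listed in mappings are dropped."""
    out = []
    for rec in records:
        row = {}
        for src, _src_type, dst, dst_type in mappings:
            value = rec.get(src)
            # Keep missing values as None instead of failing the cast.
            row[dst] = None if value is None else CASTS[dst_type](value)
        out.append(row)
    return out


# Sample input: everything arrives as strings, plus an unwanted field.
raw = [
    {"id": "1", "amt": "9.50", "junk": "x"},
    {"id": "2", "amt": None, "junk": "y"},
]
mappings = [
    ("id", "string", "order_id", "int"),
    ("amt", "string", "amount", "double"),
]
clean = apply_mapping(raw, mappings)
```

If your source files already match the target table and need no step like this, the transformation stage adds little, and a direct load is worth considering.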

You may also want to measure the quality of your data before loading it, to ensure consistent quality. AWS Glue Data Quality can be very helpful for that: see https://aws.amazon.com/blogs/big-data/getting-started-with-aws-glue-data-quality-from-the-aws-glue-data-catalog/
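The idea of checking quality before loading can be sketched roughly as follows. This is a plain-Python stand-in, not the Glue Data Quality API: it computes the completeness of one column and only produces a Redshift COPY statement when a threshold is met. The function names, table name, S3 path, and IAM role ARN are all hypothetical placeholders, and the sketch assumes the files in S3 are Parquet.

```python
def completeness(records, column):
    """Fraction of records where `column` is present and non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(column) not in (None, ""))
    return filled / len(records)


def build_copy_statement(table, s3_uri, iam_role_arn):
    """Build a Redshift COPY statement for Parquet files in S3.

    The arguments are assumed to be trusted constants, not user input,
    because they are interpolated directly into the SQL text.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS PARQUET;"
    )


def plan_load(records, column, threshold, table, s3_uri, iam_role_arn):
    """Return a COPY statement if the quality gate passes, else None."""
    if completeness(records, column) < threshold:
        return None
    return build_copy_statement(table, s3_uri, iam_role_arn)


# Hypothetical sample: 3 of 4 rows have an amount.
sample = [{"amount": 1.0}, {"amount": 2.5}, {"amount": None}, {"amount": 4.0}]
ROLE = "arn:aws:iam::123456789012:role/example-redshift-load"  # placeholder
strict = plan_load(sample, "amount", 0.9, "sales", "s3://example-bucket/sales/", ROLE)
lenient = plan_load(sample, "amount", 0.5, "sales", "s3://example-bucket/sales/", ROLE)
```

With the strict threshold the load is skipped, while the lenient one yields a statement you could run through your usual SQL client.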

Hope it helps,

Didier

AWS
EXPERT
answered 10 months ago
