Best AWS tool to move/transform data from Redshift to an API

0

Hi, I'm looking for an AWS tool to help me move data from a database in Redshift to a target database, but the target database is only accessible via an API. I need to apply some transformations to the data before moving it to the target database.

Asked 2 years ago, 286 views
2 Answers
1

If you're looking to stay with native AWS services, your options would be AWS Glue or AWS Data Pipeline.

AWS Glue is fully serverless, so you won't have to manage servers, but it runs Apache Spark on the backend, which is something to be aware of from a compatibility standpoint. AWS Data Pipeline is not restricted to Apache Spark and lets you use other engines such as Pig and Hive. This makes it a good choice for your organization if your ETL jobs do not require Apache Spark or need multiple engines.

As for specific use cases, AWS Data Pipeline transforms and moves data across AWS components. It also gives you control over the compute resources that run your code and lets you access the underlying Amazon EMR clusters or EC2 instances. AWS Glue, by contrast, is best used to transform data from its supported sources (JDBC platforms, Redshift, S3, RDS) and store it in its supported targets (JDBC platforms, S3, Redshift). Again, because AWS Glue is serverless, you won't have to manage compute resources, so you can focus on your ETL jobs specifically.

Both have different pricing options, so depending on your specific use case you can kick around the numbers in the AWS Pricing Calculator.

If you have any more questions or information, feel free to ask in the comments and I'll try to guide you to what would suit your needs best. Thanks!

AWS
AWSJoe
Answered 2 years ago
0

Thank you very much for your answer. I have used Glue a bit, and my main question is whether it allows me to have an API as the destination of the flow; I have always written to another DB. I don't know Data Pipeline; maybe it allows using an API as a target.

Answered 2 years ago
  • Understood. Unfortunately, neither service offers that ability by default. With AWS Glue, you can only select AWS Glue Data Catalog, Amazon S3, Amazon Redshift, MySQL, PostgreSQL, Oracle SQL, and Microsoft SQL Server as targets. With AWS Data Pipeline, your node options are DynamoDB, Redshift, SQL, MySQL, and S3.
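
Since neither service has a built-in API target, one possible workaround is to do the final write in custom code (for example in a Python job or a Lambda function) that transforms the rows and POSTs them to the API in batches. The following is only a minimal sketch under stated assumptions, not an official AWS pattern: it assumes the rows have already been read from Redshift as dictionaries, and the endpoint `API_URL`, the batch size, and the `transform` logic are all hypothetical placeholders.

```python
import json
import urllib.request
from typing import Callable, Iterable, Iterator

# Hypothetical endpoint and batch size; replace with the real target API's values.
API_URL = "https://api.example.com/v1/records"
BATCH_SIZE = 100


def transform(row: dict) -> dict:
    """Example transformation (an assumption): trim strings, drop None values."""
    return {
        key: value.strip() if isinstance(value, str) else value
        for key, value in row.items()
        if value is not None
    }


def batched(rows: Iterable[dict], size: int) -> Iterator[list]:
    """Yield lists of at most `size` rows."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def post_batch(batch: list, url: str = API_URL) -> int:
    """POST one batch as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status


def export_rows(
    rows: Iterable[dict],
    send: Callable[[list], object] = post_batch,
    size: int = BATCH_SIZE,
) -> int:
    """Transform rows, send them in batches, and return the number of rows sent."""
    sent = 0
    for batch in batched((transform(row) for row in rows), size):
        send(batch)
        sent += len(batch)
    return sent
```

The `send` parameter is injectable so the batching and transformation can be tried out without calling a real API, e.g. `export_rows(rows, send=print, size=2)`. A real job would also need authentication, retries, and rate limiting, which depend on the target API.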
