Difference between AWS Glue and Amazon EMR

Hello,

Please share the difference between AWS Glue and Amazon EMR. Which one should we use, and when?

Thanks,

Monica
Asked 6 months ago · 1214 views
1 Answer
Accepted Answer

Hi, AWS Glue is a serverless data integration service that makes it easy for analytics users to discover, prepare, move, and integrate data from multiple sources. Amazon EMR (previously called Amazon Elastic MapReduce) is a managed cluster platform that simplifies running big data frameworks, such as Apache Hadoop and Apache Spark, on AWS to process and analyze vast amounts of data.

The two services overlap because AWS Glue supports Apache Spark and Amazon EMR is also available in a serverless form. Which one to recommend should always depend on the user persona and the use case.

From a recommendation point of view:

  • AWS Glue is our recommended service for data integration workloads and for ETL migrated from legacy platforms such as Informatica and Talend.
  • Amazon EMR is our recommended service for Big Data workloads that are traditionally run on Hadoop.

Use Amazon EMR:

  • Hadoop migration from on-premises or other cloud providers, including Databricks migration
  • Customer has expertise beyond just Spark, for example Hive, Presto, or Trino
  • Customer is skilled in loading their own data source connector libraries for their jobs.

Use AWS Glue:

  • Customer prefers built-in capabilities: connectors, transformations, incremental load, job monitoring, orchestration.
  • Customer wants visual and code ETL development tools
  • Migration from ETL providers such as Informatica, Talend, Matillion
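
As a rough illustration only, the criteria above can be written down as a small checklist. This is a sketch, not an official AWS decision tool: the `Workload` fields, the `recommend` function, and the simple count-based scoring are all hypothetical names and assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Hypothetical flags, one per criterion listed in the answer above.
    migrating_from_hadoop: bool = False       # on-premises Hadoop or another cloud provider
    uses_hive_presto_or_trino: bool = False   # expertise beyond just Spark
    manages_own_connectors: bool = False      # team loads its own connector libraries
    wants_builtin_features: bool = False      # connectors, transformations, incremental load, monitoring, orchestration
    wants_visual_etl: bool = False            # visual and code ETL development tools
    migrating_from_etl_vendor: bool = False   # Informatica, Talend, Matillion


def recommend(w: Workload) -> str:
    """Count how many criteria point to each service (assumed equal weights)."""
    emr_score = sum([
        w.migrating_from_hadoop,
        w.uses_hive_presto_or_trino,
        w.manages_own_connectors,
    ])
    glue_score = sum([
        w.wants_builtin_features,
        w.wants_visual_etl,
        w.migrating_from_etl_vendor,
    ])
    if emr_score > glue_score:
        return "Amazon EMR"
    if glue_score > emr_score:
        return "AWS Glue"
    # A tie means the checklist cannot decide on its own.
    return "Either; decide on user persona and use case"


if __name__ == "__main__":
    print(recommend(Workload(migrating_from_hadoop=True, uses_hive_presto_or_trino=True)))
    print(recommend(Workload(wants_visual_etl=True, migrating_from_etl_vendor=True)))
```

Equal weighting is a simplification; in practice a single criterion, such as an existing Hadoop estate, may outweigh the rest.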
Arifc (AWS), answered 6 months ago
Reviewed by an AWS Support Engineer 25 days ago
Reviewed by an Expert 2 months ago
  • Thank you!!
