Difference between AWS Glue and Amazon EMR

Hello,

Could you explain the difference between AWS Glue and Amazon EMR, and when we should use each one?

Thanks,

Monica
Asked 6 months ago, viewed 1215 times
1 Answer
Accepted Answer

Hi. AWS Glue is a serverless data integration service that makes it easy for analytics users to discover, prepare, move, and integrate data from multiple sources. Amazon EMR (previously called Amazon Elastic MapReduce) is a managed cluster platform that simplifies running big data frameworks, such as Apache Hadoop and Apache Spark, on AWS to process and analyze vast amounts of data.

The two services overlap because AWS Glue supports Apache Spark and Amazon EMR is available in a serverless option (EMR Serverless). Which one to recommend should always depend on the user persona and the use case.

From a recommendation point of view:

  • AWS Glue is our recommended service for data integration workloads and for ETL migrated from legacy platforms such as Informatica and Talend.
  • Amazon EMR is our recommended service for Big Data workloads that are traditionally run on Hadoop.

Use Amazon EMR:

  • Hadoop migration from on-premises or other cloud providers, including Databricks migration.
  • Customer has expertise beyond just Spark, for example Hive, Presto, or Trino.
  • Customer is skilled in loading their own data source connector libraries for their jobs.

Use AWS Glue:

  • Customer prefers built-in capabilities: connectors, transformations, incremental load, job monitoring, and orchestration.
  • Customer wants both visual and code-based ETL development tools.
  • Migration from ETL providers such as Informatica, Talend, or Matillion.
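
One way to read the two checklists above is as a simple tally of which criteria apply to a workload. The sketch below is only an illustration of that reasoning, not an official selection tool: `Workload`, its field names, and `suggest_service` are hypothetical names invented here, and the equal weighting of criteria is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a workload; every field name is illustrative."""
    # Criteria from the "Use Amazon EMR" list
    migrating_from_hadoop: bool = False    # on-premises, other clouds, or Databricks
    needs_non_spark_engines: bool = False  # e.g. Hive, Presto, Trino
    brings_own_connectors: bool = False    # team loads its own connector libraries
    # Criteria from the "Use AWS Glue" list
    wants_builtin_features: bool = False   # connectors, transforms, incremental load, etc.
    wants_visual_etl: bool = False         # visual and code ETL development tools
    migrating_from_etl_tool: bool = False  # e.g. Informatica, Talend, Matillion


def suggest_service(w: Workload) -> str:
    """Count matching criteria per list (equal weights are an assumption)."""
    emr_score = sum([w.migrating_from_hadoop,
                     w.needs_non_spark_engines,
                     w.brings_own_connectors])
    glue_score = sum([w.wants_builtin_features,
                      w.wants_visual_etl,
                      w.migrating_from_etl_tool])
    if emr_score > glue_score:
        return "Amazon EMR"
    if glue_score > emr_score:
        return "AWS Glue"
    # A tie means the checklists do not settle it; fall back to persona and use case.
    return "Either: decide on user persona and use case"


if __name__ == "__main__":
    print(suggest_service(Workload(migrating_from_hadoop=True,
                                   needs_non_spark_engines=True)))
```

In practice the criteria will rarely carry equal weight, so treat the result as a starting point for the persona and use-case discussion rather than a verdict.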
Arifc (AWS), answered 6 months ago
Reviewed by an AWS Support Engineer 25 days ago
Reviewed by an Expert 2 months ago
  • Thank you!!
