What are the most important factors when choosing between AWS Data Pipeline, AWS Step Functions, and Amazon Managed Workflows for Apache Airflow?


What are the key points to consider when choosing one of the following?

  • AWS Data Pipeline
  • AWS Step Functions
  • Amazon Managed Workflows for Apache Airflow
1 Answer
Accepted Answer
  1. AWS Data Pipeline is a web service that helps you reliably process and move data between different AWS compute and storage services, as well as on-premises data sources, at specified intervals. With AWS Data Pipeline, you can regularly access your data where it's stored, transform and process it at scale, and efficiently transfer the results to AWS services such as Amazon S3, Amazon RDS, Amazon DynamoDB, and Amazon EMR. In short, read this as ETL: it can be used as an ETL or data processing tool. Its drawbacks include:
  • Limited transformation capabilities
  • No new development. AWS Glue is a much better alternative.
  2. This should settle the ETL or data processing question. Next come orchestrators or schedulers, which should not be confused with ETL or data processing services; they are used to connect or chain multiple ETL or data processing services. AWS Step Functions is a serverless workflow orchestrator that is very simple but has very limited capabilities. Amazon Managed Workflows for Apache Airflow (MWAA) is a managed orchestration service for Apache Airflow; it is much more robust and capable, and it allows many integrations. As per the AWS FAQ:

Q: When should I use Amazon MWAA vs. AWS Step Functions?

You should use Amazon MWAA if you prioritize open source and portability. Airflow has a large and active open source community that contributes new functionality and integrations regularly. Amazon MWAA supports existing Airflow workflows and integrations without changes to code, migration is easy, and the environment is familiar.

You should use Step Functions if you prioritize cost and performance. For example, if you were processing streaming data and transforming it through multiple steps before putting it in a DynamoDB database or S3, you should use Step Functions because it has higher performance at a lower cost.
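To make the Step Functions case from the FAQ a bit more concrete, here is a minimal sketch of how a multi-step transform followed by a DynamoDB write could be described as an Amazon States Language definition, built as a plain Python dictionary. The Lambda function names, the table name, the `id` and `payload` fields, and the `build_definition` helper are all hypothetical placeholders chosen for illustration; only the service integration resource ARNs (`lambda:invoke` and `dynamodb:putItem`) are standard Step Functions identifiers. Treat it as a starting point under those assumptions, not as a recommended design.

```python
import json


def build_definition(transform_fns, table_name):
    """Build an Amazon States Language definition as a Python dict.

    transform_fns: Lambda function names to run in order (hypothetical names).
    table_name: DynamoDB table that receives the final record (hypothetical).
    """
    names = [f"Transform{i + 1}" for i in range(len(transform_fns))]
    states = {}

    # Chain one Task state per transform; each invokes a Lambda function
    # and passes its returned payload on to the next state.
    for i, (name, fn) in enumerate(zip(names, transform_fns)):
        next_state = names[i + 1] if i + 1 < len(names) else "StoreResult"
        states[name] = {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {"FunctionName": fn, "Payload.$": "$"},
            "OutputPath": "$.Payload",
            "Next": next_state,
        }

    # Final state writes the record to DynamoDB. The "id" and "payload"
    # attributes are assumptions about the shape of the transformed record.
    states["StoreResult"] = {
        "Type": "Task",
        "Resource": "arn:aws:states:::dynamodb:putItem",
        "Parameters": {
            "TableName": table_name,
            "Item": {
                "id": {"S.$": "$.id"},
                "payload": {"S.$": "$.payload"},
            },
        },
        "End": True,
    }

    return {
        "Comment": "Transform a record in several steps, then store it",
        "StartAt": names[0] if names else "StoreResult",
        "States": states,
    }


if __name__ == "__main__":
    definition = build_definition(["clean-records", "enrich-records"], "events")
    print(json.dumps(definition, indent=2))
```

The resulting JSON string is the kind of value that could be supplied as the `definition` argument when creating a state machine, for example through the boto3 Step Functions client's `create_state_machine` call, together with a name and an IAM role ARN. An equivalent Airflow workflow would instead be written as a Python DAG file and deployed to the MWAA environment.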

Kunal_G (AWS), answered 2 years ago
Reviewed by an AWS Expert 2 years ago
