Ingesting data into AWS Data Lake using APIs


I have an AWS Data Lake that is ready to be used at the moment.

My use case for the Data Lake is to ingest data from different API connectors (from data vendors and other service providers). There are at least three services, such as CRM tools and data vendors, whose data I need to ingest into the AWS Data Lake.

All these services provide an API that can be used to connect to the data lake. The challenge, however, is that the ingest/transform/load (ETL) process must be automated and require no manual intervention: once connected, every new data point the vendors provide must be loaded into the data lake automatically.

My question is: which AWS services do I need to use to accomplish this?

(1) Building an API connector from the different data providers to the Data Lake?

(2) Running some ETL (transformation) on the data before loading it into the Data Lake. This ETL service could potentially sit between the API and the Data Lake.

Thanks,

1 Answer

Hi, depending on the services you are looking to extract data from, there may be an easier method available.

For example, Amazon AppFlow can extract data directly from some services, and it is directly integrated with AWS Glue DataBrew for data preparation and transformation.
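As a rough sketch of that route, an AppFlow flow can also be created programmatically with boto3 rather than in the console. Everything below (flow name, the `my-sfdc-profile` connector profile, the bucket, the schedule) is an illustrative assumption, not something from this thread, and the connector profile would need to exist in AppFlow first:

```python
def build_flow_definition(bucket: str, profile: str) -> dict:
    """Assemble the arguments for appflow.create_flow: a scheduled
    Salesforce-to-S3 flow that maps all fields of one object."""
    return {
        "flowName": "crm-to-datalake",  # hypothetical flow name
        "triggerConfig": {
            "triggerType": "Scheduled",
            "triggerProperties": {
                # schedule expression format per the AppFlow docs
                "Scheduled": {"scheduleExpression": "rate(1hours)"},
            },
        },
        "sourceFlowConfig": {
            "connectorType": "Salesforce",
            "connectorProfileName": profile,  # created beforehand in AppFlow
            "sourceConnectorProperties": {
                "Salesforce": {"object": "Account"},
            },
        },
        "destinationFlowConfigList": [
            {
                "connectorType": "S3",
                "destinationConnectorProperties": {
                    "S3": {"bucketName": bucket, "bucketPrefix": "raw/crm"},
                },
            }
        ],
        "tasks": [
            # Map_all copies every source field through unchanged
            {"sourceFields": [], "taskType": "Map_all", "taskProperties": {}},
        ],
    }


def create_flow(bucket: str, profile: str) -> str:
    """Register the flow; requires AWS credentials at call time."""
    import boto3

    client = boto3.client("appflow")
    resp = client.create_flow(**build_flow_definition(bucket, profile))
    return resp["flowArn"]
```

Once registered, the scheduled trigger means new records land in S3 without further manual steps, which matches the automation requirement in the question.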

Alternatively, if an SDK for the service is available in Python, you could use AWS Glue to build the connector, run the ETL, and finally write to S3. AWS Glue Studio also has a marketplace that lets you source pre-built connectors from AWS and third parties.
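A minimal sketch of that Glue route: a Python job that pulls JSON records from a vendor REST endpoint, applies a small transformation, and writes newline-delimited JSON to S3. The endpoint shape, field names, and bucket layout are assumptions for illustration, not a real vendor API:

```python
import json
import urllib.request
from datetime import datetime, timezone


def fetch_records(api_url: str, token: str) -> list:
    """Pull one page of JSON records from the (hypothetical) vendor API."""
    req = urllib.request.Request(
        api_url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["records"]


def transform_records(records: list) -> list:
    """Example transform: normalise email addresses and stamp ingestion time."""
    now = datetime.now(timezone.utc).isoformat()
    out = []
    for rec in records:
        cleaned = dict(rec)
        if cleaned.get("email"):
            cleaned["email"] = cleaned["email"].strip().lower()
        cleaned["ingested_at"] = now
        out.append(cleaned)
    return out


def write_to_s3(records: list, bucket: str, key: str) -> None:
    """Write the batch as newline-delimited JSON, a lake-friendly layout."""
    import boto3  # available in the Glue runtime

    body = "\n".join(json.dumps(r) for r in records)
    boto3.client("s3").put_object(
        Bucket=bucket, Key=key, Body=body.encode("utf-8")
    )


# In a Glue job, these steps would run on the job's schedule or trigger:
# write_to_s3(transform_records(fetch_records(URL, TOKEN)), BUCKET, KEY)
```

Scheduling this as a Glue job (or triggering it via a Glue workflow) gives the hands-off, automated ingestion the question asks about; the transform step is where vendor-specific cleanup would go.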

Hope this helps,

AWS
Expert
Answered 2 years ago
