
Ingesting data into AWS Data Lake using APIs


I have an AWS Data Lake that is ready to be used at the moment.

My use case for the Data Lake is to ingest data from different API connectors (coming from other data vendors and service providers). There are at least three services, such as CRM tools and data vendors, whose data I need to ingest into the AWS Data Lake.

All these services provide an API that can be used to connect to the data lake. The challenge, however, is to ensure that the extract/transform/load (ETL) process is automated and needs no manual intervention: once connected, every new data point a vendor provides must be loaded into the data lake automatically.

My question is: which AWS services do I need to use to accomplish this?

(1) Building an API connector from the different data providers to the Data Lake?

(2) Running some ETL (transformation) on the data before loading it into the Data Lake? This ETL service could potentially sit between the API and the Data Lake.


1 Answer

Hi, depending on the services you are looking to extract data from, there may be an easier method available.

For example, Amazon AppFlow can extract data directly from some services, and it is integrated with AWS Glue DataBrew for data preparation and transformation.

Alternatively, if an SDK for the service is available in Python, you could use AWS Glue to create the connector, run the ETL, and finally write to S3. AWS Glue Studio has a marketplace that allows you to source pre-built connectors from AWS and third parties.
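To make the Glue route concrete, here is a minimal sketch of the fetch/transform/load logic such a job could run. All specifics are assumptions for illustration: the vendor endpoint URL, the API key, the record fields (`id`, `name`), and the bucket/key names are hypothetical placeholders, not part of any real vendor's API.

```python
# Sketch of an API-to-S3 ingestion step, the kind of logic that could run
# inside an AWS Glue Python shell job. Endpoint, fields, and bucket names
# below are hypothetical.
import json
import urllib.request


def fetch_records(api_url: str, api_key: str) -> list:
    """Pull the latest records from a (hypothetical) vendor REST API."""
    req = urllib.request.Request(
        api_url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def transform(records: list) -> list:
    """Light ETL step: keep only the fields the lake schema expects
    and normalize them."""
    return [
        {"id": r["id"], "name": r["name"].strip().lower()}
        for r in records
        if "id" in r and "name" in r
    ]


def load_to_s3(records: list, bucket: str, key: str) -> None:
    """Write the transformed batch to S3 as newline-delimited JSON."""
    import boto3  # available in Glue job environments

    body = "\n".join(json.dumps(r) for r in records)
    boto3.client("s3").put_object(
        Bucket=bucket, Key=key, Body=body.encode("utf-8")
    )


if __name__ == "__main__":
    raw = fetch_records("https://api.example-vendor.com/v1/records", "YOUR_API_KEY")
    load_to_s3(transform(raw), "my-data-lake-bucket", "raw/vendor/latest.json")
```

To automate this end to end, the job could be triggered on a schedule (or by new-data notifications) so each new batch lands in S3 without manual steps.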

Hope this helps,

answered 5 months ago
