Create a data ingestion pipeline in AWS?

Hi!

I have some heavy endpoints in my backend right now that process and generate data. The flow is as follows.

I get a request from the frontend in my main microservice, take those parameters, and build a text. Then I call another microservice that holds the logic of my AI agents, which use RAG and Langchain4j, to generate a list of variations. After the response comes back to the main microservice, I take those variations and call another AI agent to generate the final text I need, once per element in the list of variations. The problem is that the final agent takes a long time to build those texts. Even though it works on my localhost (it takes 2 minutes to process everything), when I deploy to AWS the request takes too long and I get the following error: Error: socket hang up. I also don't want the user waiting in front of the screen for 2 minutes until their request is processed.

Is there a way to send this logic to another service, process it in the background, and show the user a message like "We will send you a message when your data is ready"? That way the request is processed behind the scenes and the user can continue using the app.

Thanks in advance!

2 Answers
Accepted Answer

Hello,

Yes, there is a way to handle this scenario in AWS. You can use AWS Lambda functions in combination with Amazon Simple Queue Service (SQS) or Amazon Simple Notification Service (SNS) to process the requests asynchronously in the background.

How...?

1. Frontend to Main Microservice:

  • When the frontend sends a request to your main microservice, instead of processing the request directly, you can store the request data (parameters, user information, etc.) in an SQS queue or SNS topic.
  • Return a response to the frontend immediately, indicating that the request has been received and will be processed in the background (a minimal sketch of this step follows below).
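
Here is a minimal sketch of what step 1 could look like in a Java microservice, using the AWS SDK for Java v2. The queue URL, class names, and the JSON shape are assumptions for illustration, not part of your existing code:

```java
import software.amazon.awssdk.services.sqs.SqsClient;
import software.amazon.awssdk.services.sqs.model.SendMessageRequest;

public class IngestionRequestPublisher {

    // Hypothetical queue; create it in SQS and put its URL here.
    private static final String QUEUE_URL =
            "https://sqs.us-east-1.amazonaws.com/123456789012/text-generation-jobs";

    private final SqsClient sqs = SqsClient.create();

    /**
     * Enqueue the request and return immediately instead of processing inline.
     * Returns a job id the frontend can use to check status later.
     */
    public String enqueue(String userId, String requestText) {
        String jobId = java.util.UUID.randomUUID().toString();

        // Serialize the parameters; a real service would use a JSON mapper.
        String body = String.format(
                "{\"jobId\":\"%s\",\"userId\":\"%s\",\"text\":\"%s\"}",
                jobId, userId, requestText);

        sqs.sendMessage(SendMessageRequest.builder()
                .queueUrl(QUEUE_URL)
                .messageBody(body)
                .build());

        // The HTTP layer can now respond 202 Accepted with this job id.
        return jobId;
    }
}
```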

2. AWS Lambda Function:

  • Set up an AWS Lambda function that will be triggered whenever a new message is added to the SQS queue or an SNS topic.
  • In the Lambda function, read the request data from the incoming message event.
  • Call your other microservices (AI Agents) to process the request data and generate the final text.
  • Once the processing is complete, you can store the result in a database or send a notification to the user (e.g., via SNS, email, or push notification). A sketch of such a worker follows below.
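
A hedged sketch of the step 2 worker, using the aws-lambda-java-events library. The internal host name of the AI-agent service is an assumption; since the Lambda runs in the same VPC, a plain HTTP call works:

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.SQSEvent;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class TextGenerationWorker implements RequestHandler<SQSEvent, Void> {

    // Internal endpoint of the AI-agent microservice, reachable because the
    // Lambda runs in the same VPC. The host name is a placeholder.
    private static final String AGENT_ENDPOINT = "http://ai-agents.internal:8080/generate";

    private final HttpClient http = HttpClient.newHttpClient();

    @Override
    public Void handleRequest(SQSEvent event, Context context) {
        for (SQSEvent.SQSMessage message : event.getRecords()) {
            String payload = message.getBody(); // the JSON enqueued in step 1

            try {
                // Call the microservice that runs the RAG / Langchain4j agents.
                // The 2-minute processing happens here, where nothing is
                // holding a user-facing connection open.
                HttpRequest request = HttpRequest.newBuilder()
                        .uri(URI.create(AGENT_ENDPOINT))
                        .header("Content-Type", "application/json")
                        .POST(HttpRequest.BodyPublishers.ofString(payload))
                        .build();

                HttpResponse<String> response =
                        http.send(request, HttpResponse.BodyHandlers.ofString());

                // Persist response.body() and/or notify the user (step 3).
                context.getLogger().log("Job finished with status " + response.statusCode());
            } catch (Exception e) {
                // Throwing makes SQS retry the message or route it to a DLQ.
                throw new RuntimeException("Processing failed, message will be retried", e);
            }
        }
        return null;
    }
}
```

Keep in mind that a Lambda function can run for at most 15 minutes, so a 2-minute job fits comfortably; raise the function timeout above the 3-second default and set the queue's visibility timeout higher than the function timeout so messages aren't redelivered mid-run.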

3. Notification to the User:

  • If you choose to send a notification to the user, you can use AWS SNS to send an email or push notification, informing them that their request has been processed and the result is available (a minimal publish sketch follows after this step).
  • Alternatively, you can implement a mechanism in your frontend application to periodically check the status of the request and retrieve the result when it's ready.
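
For the SNS option, publishing the "your data is ready" message is a small piece of code with the AWS SDK for Java v2; the topic ARN below is a placeholder:

```java
import software.amazon.awssdk.services.sns.SnsClient;
import software.amazon.awssdk.services.sns.model.PublishRequest;

public class CompletionNotifier {

    // Topic the user (or your email/push fan-out) is subscribed to; hypothetical ARN.
    private static final String TOPIC_ARN =
            "arn:aws:sns:us-east-1:123456789012:text-generation-complete";

    private final SnsClient sns = SnsClient.create();

    /** Tell the user their result is ready, e.g. at the end of the Lambda in step 2. */
    public void notifyReady(String jobId) {
        sns.publish(PublishRequest.builder()
                .topicArn(TOPIC_ARN)
                .subject("Your data is ready")
                .message("Job " + jobId + " has finished. Open the app to see the result.")
                .build());
    }
}
```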

By using this approach, your main microservice can quickly return a response to the frontend, and the actual processing will be offloaded to an AWS Lambda function running in the background. This way, the user doesn't have to wait for the lengthy processing to complete, and your application remains responsive.

Additionally, you can configure AWS Lambda functions to automatically scale based on the number of incoming requests, ensuring that your application can handle high loads without performance degradation.
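
If you want to tune or cap that scaling, the SQS event source mapping is where it is configured. A sketch with the AWS SDK for Java v2 (the function name, queue ARN, and concurrency number are placeholders; `maximumConcurrency` requires a reasonably recent SDK version):

```java
import software.amazon.awssdk.services.lambda.LambdaClient;
import software.amazon.awssdk.services.lambda.model.CreateEventSourceMappingRequest;
import software.amazon.awssdk.services.lambda.model.ScalingConfig;

public class EventSourceSetup {

    public static void main(String[] args) {
        try (LambdaClient lambda = LambdaClient.create()) {
            lambda.createEventSourceMapping(CreateEventSourceMappingRequest.builder()
                    .functionName("text-generation-worker")   // placeholder name
                    .eventSourceArn("arn:aws:sqs:us-east-1:123456789012:text-generation-jobs")
                    .batchSize(1)                 // one long job per invocation keeps retries simple
                    .scalingConfig(ScalingConfig.builder()
                            .maximumConcurrency(5) // don't overwhelm the AI-agent service
                            .build())
                    .build());
        }
    }
}
```

A batch size of 1 keeps retries simple for long jobs, and capping concurrency prevents the Lambda fleet from flooding the AI-agent microservice with parallel requests.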

Note that this is a high-level overview, and you'll need to implement the necessary components (SQS, SNS, Lambda functions, and integration with your microservices) according to your specific requirements and architecture.

Go through: https://aws.amazon.com/blogs/big-data/build-a-rag-data-ingestion-pipeline-for-large-scale-ml-workloads/

EXPERT
answered 4 months ago
  • So then my Lambda function should call the microservice that manages the agents. Since my microservice is in the same VPC as my Lambda function, I should be able to call the endpoint of my backend from the Lambda function, right?

  • Yes, you can use AWS Lambda to process long-running tasks asynchronously.

    Your main microservice can send a message to an SQS queue with the request parameters. Configure a Lambda function to be triggered by new messages in the SQS queue. Within the Lambda function, you can call the endpoint of your microservice managing the AI agents (since they're in the same VPC). After processing, the Lambda can send a notification to the user. This way, the user gets an immediate response, and the heavy processing happens in the background.

  • Thanks man, this is really helpful!

  • You're welcome

Yes, there is a way to process your data in the background and notify the user when it's ready, which can significantly improve the user experience and avoid timeout issues. You can implement this using AWS services to create a data ingestion pipeline. Here's an approach you could consider:

  1. Use AWS API Gateway to receive the initial request from your frontend.

  2. Instead of processing the request immediately, send a message to an AWS SQS (Simple Queue Service) queue.

  3. Set up an AWS Lambda function triggered by messages in the SQS queue. This Lambda function will initiate your processing workflow.

  4. Use AWS Step Functions to orchestrate your complex workflow, including calling your AI agents and managing the various steps of your process.

  5. As the final step in your Step Functions workflow, store the results in Amazon S3 or another suitable storage service.

  6. Use Amazon SNS (Simple Notification Service) to send a notification to the user when their data is ready. This could be an email, SMS, or a push notification to your app.

  7. Optionally, you could use Amazon DynamoDB to keep track of job status and results (a combined sketch of steps 4 and 7 follows after this list).
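
A hedged sketch of how the queue-triggered Lambda (step 3) could record job status in DynamoDB (step 7) and hand off to Step Functions (step 4), using the AWS SDK for Java v2. The table name, its schema, and the state machine ARN are assumptions:

```java
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.PutItemRequest;
import software.amazon.awssdk.services.sfn.SfnClient;
import software.amazon.awssdk.services.sfn.model.StartExecutionRequest;

import java.util.Map;

public class WorkflowStarter {

    private static final String TABLE = "text-generation-jobs";   // hypothetical table
    private static final String STATE_MACHINE_ARN =               // hypothetical ARN
            "arn:aws:states:us-east-1:123456789012:stateMachine:text-generation";

    private final DynamoDbClient dynamo = DynamoDbClient.create();
    private final SfnClient sfn = SfnClient.create();

    /** Record the job as PENDING, then kick off the Step Functions workflow. */
    public void start(String jobId, String inputJson) {
        dynamo.putItem(PutItemRequest.builder()
                .tableName(TABLE)
                .item(Map.of(
                        "jobId", AttributeValue.builder().s(jobId).build(),
                        "status", AttributeValue.builder().s("PENDING").build()))
                .build());

        sfn.startExecution(StartExecutionRequest.builder()
                .stateMachineArn(STATE_MACHINE_ARN)
                .name(jobId)          // execution names must be unique per state machine
                .input(inputJson)     // parameters for the AI-agent steps
                .build());
    }
}
```

The workflow's final state can then update the same DynamoDB item to COMPLETED and trigger the SNS notification, which is also what a frontend polling mechanism would read.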

This approach allows you to decouple the request reception from the actual processing. The user gets an immediate response acknowledging their request, and the heavy processing happens asynchronously in the background.

For implementing the RAG (Retrieval Augmented Generation) part of your pipeline, you could use Amazon OpenSearch Service or Amazon RDS with the pgvector extension as vector data stores. You can use a Ray cluster to convert raw text data into embeddings and ingest them into these services.

By implementing this pipeline, you can handle large-scale machine learning workloads efficiently, avoid timeout issues, and provide a better user experience by not making users wait for long-running processes to complete.

Sources
  • Build a RAG data ingestion pipeline for large-scale ML workloads | AWS Big Data Blog
  • Welcome - AWS Data Pipeline

answered 4 months ago
