Should I use Lambda InvokeAsync or an Amazon SQS event source for extracting data from Amazon S3?


I want to know the best way to extract data from thousands of Amazon S3 files. My plan is to use an Amazon S3 Put trigger to invoke an AWS Lambda function that calls Amazon Textract to extract data from each file. The process doesn't need to be synchronous because the documents are uploaded to S3 only once a month. Because the Lambda concurrency limit is 1,000 (in some Regions) and the process can be asynchronous, I'm considering decoupling it with an Amazon SQS queue that feeds the Lambda function. I'm also aware that Lambda can handle asynchronous invocations on its own. Under what conditions should I use Amazon SQS instead of Lambda InvokeAsync?

AWS
Vincent
asked 3 years ago, 276 views
1 Answer
Accepted Answer

For your use case, you might choose the Amazon SQS queue because it gives you better control over retries and concurrency. [Amazon Textract has relatively low API limits](https://docs.aws.amazon.com/general/latest/gr/textract.html), so if the Amazon S3 Put trigger invokes the function directly, a burst of uploads might lead to throttling. With Amazon SQS in the middle, you get better control and visibility: you can see how many messages are still waiting to be processed, and you can set how many times a failed message is retried.
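To make the SQS option more concrete, here is a minimal sketch of a Lambda function that consumes the queue. It rests on several assumptions that are not part of the question: the S3 bucket sends its event notifications to the SQS queue, the queue is configured as the function's event source with `ReportBatchItemFailures` enabled, and the asynchronous `StartDocumentTextDetection` Textract API is the one you want. The names `s3_objects` and `process_batch` are hypothetical helpers, not anything AWS provides.

```python
import json
from urllib.parse import unquote_plus


def s3_objects(sqs_record):
    """Yield (bucket, key) pairs from one SQS record whose body is an S3 event notification."""
    body = json.loads(sqs_record["body"])
    # The s3:TestEvent message that S3 sends when notifications are set up has no "Records" key.
    for s3_record in body.get("Records", []):
        bucket = s3_record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(s3_record["s3"]["object"]["key"])
        yield bucket, key


def process_batch(event, textract):
    """Start one Textract job per S3 object and report the SQS messages that failed."""
    failures = []
    for record in event["Records"]:
        try:
            for bucket, key in s3_objects(record):
                textract.start_document_text_detection(
                    DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}}
                )
        except Exception:
            # Broad catch for the sketch: any error (throttling included) marks the
            # message as failed so that only this message returns to the queue.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


def handler(event, context):
    # boto3 is imported here so the helpers above can be exercised without AWS access.
    import boto3

    return process_batch(event, boto3.client("textract"))
```

With this shape, a throttled call only sends the affected message back to the queue. A redrive policy on the queue (`maxReceiveCount` plus a dead-letter queue) then caps how many times a message is retried, and the queue depth shows how much work is still pending.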

AWS
EXPERT
Adam_W
answered 3 years ago
