Lambda to DynamoDB throughput question


I have a customer (IHAC) who sent me the following email:

I'm working to use Lambda as our primary computation environment. So far, that amounts to funneling data ingested via API Gateway to various endpoints (often similar in effect to the AWS IoT rules engine) and using DynamoDB to store configuration data.

The obstacle I'm currently grappling with is the throughput limits on DynamoDB. In standard operation, we have a slow, steady stream of requests that doesn't begin to approach our limits. However, on rare occasions I'll need to load a large data set. As things are set up, that translates into a large number of near-simultaneous requests to DynamoDB. However, we don't have a latency requirement: within reason, I don't care when this operation completes, just that it does. If I could space these requests out to stay below our limits, the problem would be solved.

In essence, I want our burst response to distribute the load over time as opposed to scaling up our systems.
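
To make the spacing idea concrete, here is a rough sketch of what I mean; the table name and the 25 WCU figure below are made-up examples, not our real configuration:

```python
import time
import boto3

table = boto3.resource("dynamodb").Table("ConfigData")  # placeholder table name

PROVISIONED_WCU = 25  # example provisioned write capacity, not our real number
ITEM_WCU = 1          # an item under 1 KB consumes 1 write capacity unit

def paced_write(items):
    """Write items one at a time, sleeping between writes so the
    steady rate stays at or below the provisioned write capacity."""
    delay = ITEM_WCU / PROVISIONED_WCU  # seconds between writes
    for item in items:
        table.put_item(Item=item)
        time.sleep(delay)
```

Of course, inside a single Lambda invocation this only works for as long as the function timeout allows, which is part of why I went looking for a scheduler.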

Initially, I was trying to set up a scheduler: a function I could call to simply say "try this Lambda function again in X.Y minutes" using CloudWatch Events. However, I ran into a different limitation there: only being able to make 5 CloudWatch API requests per second. I didn't solve the throughput issue so much as move it to a different service.
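
For reference, that attempt looked roughly like the sketch below (the rule and target names are placeholders). Each deferred retry costs a PutRule and a PutTargets call, which is exactly where the API rate limit bites:

```python
import json
import boto3

events = boto3.client("events")

def schedule_retry(function_arn, payload, minutes):
    """Ask CloudWatch Events to re-invoke a Lambda after a delay.
    Every retry scheduled this way consumes PutRule + PutTargets
    API quota, so a burst of retries just moves the throttling here."""
    unit = "minute" if minutes == 1 else "minutes"  # rate() needs an integer
    events.put_rule(
        Name="retry-rule",  # placeholder; real rules would need unique names
        ScheduleExpression=f"rate({minutes} {unit})",
        State="ENABLED",
    )
    events.put_targets(
        Rule="retry-rule",
        Targets=[{
            "Id": "retry-target",
            "Arn": function_arn,           # the Lambda to re-invoke
            "Input": json.dumps(payload),  # passed back as the event
        }],
    )
    # The target function also needs a resource policy allowing
    # events.amazonaws.com to invoke it (lambda add-permission).
```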

I have a couple of different ways of solving this specific problem, but the overall scheduling design pattern is one I'm really interested in.

My initial thought is to introduce SQS between the API Gateway-fronted Lambda and DynamoDB. That Lambda would write the payload to SQS, then a CloudWatch metric alarm would kick off an additional Lambda to process messages from the queue whenever the queue depth is greater than zero. If there is an issue writing to DynamoDB, the message simply isn't removed from the queue and can be processed later.
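
Roughly, the front half of that design would look like the following; the queue name and response shape are illustrative, not settled choices:

```python
import json
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = sqs.get_queue_url(QueueName="ingest-queue")["QueueUrl"]  # placeholder queue

def handler(event, context):
    """API Gateway-fronted Lambda: buffer the payload in SQS instead
    of writing straight to DynamoDB, then return immediately."""
    # With proxy integration, the request body arrives as a string.
    body = event.get("body") or json.dumps(event)
    sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=body)
    return {"statusCode": 202, "body": "queued"}
```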

Does that make sense, or is there a better suggestion for the customer?

1 Answer
Accepted Answer

I would suggest that you first send the data to SQS, and then poll the newly ingested messages from SQS and send them to DynamoDB. With this system you can queue spikes of messages in SQS and then later upload them to DynamoDB at a more steady throughput.
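
A minimal sketch of that draining pattern, assuming queue and table names matching the question's setup (both placeholders): the worker polls SQS, writes at a paced rate, and deletes each message only after its write succeeds, so a failed write simply reappears after the visibility timeout.

```python
import json
import time
import boto3

sqs = boto3.client("sqs")
table = boto3.resource("dynamodb").Table("ConfigData")               # placeholder table
QUEUE_URL = sqs.get_queue_url(QueueName="ingest-queue")["QueueUrl"]  # placeholder queue

WRITES_PER_SECOND = 20  # example budget, kept below provisioned capacity

def handler(event, context):
    """Worker Lambda (e.g. triggered by a queue-depth alarm): drain
    SQS into DynamoDB at a steady pace instead of all at once."""
    while True:
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=5
        )
        messages = resp.get("Messages", [])
        if not messages:
            break  # queue is empty for now
        for msg in messages:
            table.put_item(Item=json.loads(msg["Body"]))  # assumes a JSON object body
            # Delete only after a successful write; on failure the
            # message becomes visible again and is retried later.
            sqs.delete_message(
                QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"]
            )
            time.sleep(1.0 / WRITES_PER_SECOND)
```

Keeping WRITES_PER_SECOND below the table's provisioned throughput is what turns the burst into a steady trickle.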

AWS
EXPERT
answered 7 years ago
