No Lambda concurrency in DynamoDB stream trigger Lambda

I have a DynamoDB table with a Lambda function configured as a DynamoDB Streams trigger, so that I can implement the "fan-out" pattern described in this article:

https://aws.amazon.com/blogs/database/how-to-perform-ordered-data-replication-between-applications-by-using-amazon-dynamodb-streams/

My DynamoDB table has only a partition key (no sort key), and the partition key values are unique GUIDs, so items are spread as evenly as possible across partitions (as recommended to prevent hot partitions). When I update records in this table, though, my trigger Lambda does not run concurrently, even though I have set its concurrency to 5. No matter how I configure batching, I always get a single Lambda running at a time with a stream batch size of 1.

What's worse, if I update many records at once (which is my main use case), not only does this single-threaded behavior persist, but after a few minutes of processing records one by one, the stream stops triggering entirely and no more Lambdas are invoked, leaving updated records unprocessed.

Any thoughts on what might be going on, and how I might fix this to accomplish the fan-out pattern (i.e. many concurrent Lambdas)? I have no requirement to maintain ordering within the shards, so maximum asynchronicity is the goal. Thanks in advance for any insight you can provide!

asked a year ago, 1084 views
2 Answers
Accepted Answer

Concurrency is determined by two factors:

  1. The number of partitions in your DynamoDB table
  2. The parallelization factor of your Lambda function

I'm guessing your DynamoDB table has a single partition, which results in a single shard in the stream, and thus a single Lambda invocation. To increase concurrency you can do the following:

  1. Increase the provisioned write capacity on your DynamoDB table to 40,000 WCU and, once the table is active again, reduce it to your working capacity needs. This increases the number of partitions and, with it, the number of stream shards that can be processed in parallel.
  2. Increase the Parallelization Factor, up to a maximum of 10. That is, for every shard in the stream, up to 10 Lambda invocations can run concurrently.
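
As a rough illustration of option 2, the Parallelization Factor can be changed on the existing event source mapping with the boto3 Lambda client's `update_event_source_mapping` call. This is only a sketch: the helper name `set_parallelization_factor` and the mapping UUID in the usage comment are placeholders, not values from the question.

```python
def set_parallelization_factor(lambda_client, mapping_uuid, factor=10):
    """Set how many batches per shard Lambda may process concurrently.

    lambda_client is expected to be a boto3 Lambda client; mapping_uuid is
    the UUID of the stream's event source mapping (a placeholder here).
    """
    # Lambda accepts values from 1 to 10 for stream event sources.
    if not 1 <= factor <= 10:
        raise ValueError("ParallelizationFactor must be between 1 and 10")
    response = lambda_client.update_event_source_mapping(
        UUID=mapping_uuid,
        ParallelizationFactor=factor,
    )
    return response.get("ParallelizationFactor")


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   set_parallelization_factor(boto3.client("lambda"), "<your-mapping-uuid>", 10)
```

The mapping UUID can be looked up with `list_event_source_mappings` for the function in question.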
answered a year ago
  • Thank you very much, Leeroy. I think you are right that my assumption about having multiple partitions (due to the sparse partition key) was incorrect. I tried setting the Parallelization Factor to 10 and saw concurrency happening (which I could have sworn I tried before and it didn't work). I wish there were an easy way to know in real time how many partitions my table has, especially since we are using on-demand capacity. Thanks again for your answer!

  • You can find out how many partitions you have by reading your DynamoDB stream's shards. List the shards in your stream, loop over them, and count the ones that are still open (those whose sequence number range has no EndingSequenceNumber). That count is the number of partitions in your table.
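
A minimal sketch of that shard count, assuming a boto3 `dynamodbstreams` client and treating a shard as open when its `SequenceNumberRange` has no `EndingSequenceNumber`. The helper name `count_open_shards` is made up for illustration, and the result is only as good as the shard-to-partition assumption in the comment above.

```python
def count_open_shards(streams_client, stream_arn):
    """Count shards in a DynamoDB stream that have not been closed yet."""
    open_shards = 0
    start_shard_id = None
    while True:
        kwargs = {"StreamArn": stream_arn}
        if start_shard_id:
            # Continue from where the previous page of shards ended.
            kwargs["ExclusiveStartShardId"] = start_shard_id
        description = streams_client.describe_stream(**kwargs)["StreamDescription"]
        for shard in description.get("Shards", []):
            # A closed shard carries an EndingSequenceNumber; an open one does not.
            if "EndingSequenceNumber" not in shard.get("SequenceNumberRange", {}):
                open_shards += 1
        start_shard_id = description.get("LastEvaluatedShardId")
        if not start_shard_id:
            return open_shards


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   count_open_shards(boto3.client("dynamodbstreams"), "<your-stream-arn>")
```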

You can use the AWS SDK for Node.js or Python to handle the stream events and invoke other Lambda functions asynchronously. Also, have you checked that your function is not hitting any memory or timeout limits? And I assume you're already using Step Functions.
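
One way to read the first suggestion, sketched in Python: the stream-triggered function forwards each record to a second function using an asynchronous invocation (`InvocationType="Event"`), so the workers run in parallel regardless of shard count. The names `fan_out` and the worker function name are hypothetical, and this is an illustration of the idea rather than a tested design.

```python
import json


def fan_out(event, lambda_client, worker_function_name):
    """Forward each stream record to a worker Lambda asynchronously.

    Returns the number of records forwarded. worker_function_name is a
    hypothetical second function that does the real processing.
    """
    invoked = 0
    for record in event.get("Records", []):
        lambda_client.invoke(
            FunctionName=worker_function_name,
            InvocationType="Event",  # asynchronous: do not wait for the worker
            Payload=json.dumps(record).encode("utf-8"),
        )
        invoked += 1
    return invoked


# Usage inside the stream-triggered handler (assumes boto3 is available,
# as it is in the standard Lambda Python runtimes):
#   import boto3
#   def handler(event, context):
#       return fan_out(event, boto3.client("lambda"), "<your-worker-function>")
```

Note that asynchronous invocation gives up the per-shard ordering that the stream trigger provides, which matches the question's stated lack of an ordering requirement.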

answered a year ago
