Hoping to add more context about this, in addition to jzhunter's answer.
Say you have a DynamoDB table with 3 partitions, and the 3 items you wrote simultaneously land on 3 different partitions. DynamoDB Streams creates shards based on the number of partitions in the table, so there will be 3 active shards, one for each partition. Stream records are created in the shard that corresponds to the partition holding the item.
Lambda pollers are per shard for DynamoDB streams, thus in this case when there are 3 items written simultaneously but on different partitions, there will be 3 separate Lambda invocations as they have separate pollers.
There would have been only 1 Lambda invocation if:
- All the items landed on the same DynamoDB partition
- Stream records were created "within" 4 seconds as per the batching window
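To make the two conditions above concrete, here is a minimal sketch of a simplified model, not the actual poller logic. It assumes each shard batches independently and that a batch closes once the window has elapsed since its first record; it ignores batch size limits. The function name `count_invocations` and the shard IDs are hypothetical.

```python
from collections import defaultdict


def count_invocations(records, window_seconds=4.0):
    """Estimate invocations for (shard_id, arrival_time_seconds) pairs.

    Simplified model (an assumption, not AWS's implementation): each shard
    has its own poller, and records on one shard share a batch while they
    arrive within `window_seconds` of the first record in that batch.
    """
    by_shard = defaultdict(list)
    for shard_id, arrival in records:
        by_shard[shard_id].append(arrival)

    invocations = 0
    for arrivals in by_shard.values():
        arrivals.sort()
        window_start = None
        for t in arrivals:
            # Start a new batch when there is none yet or the window has passed.
            if window_start is None or t - window_start > window_seconds:
                invocations += 1
                window_start = t
    return invocations


# Three simultaneous writes that land on three different shards.
different_shards = [("shard-a", 0.0), ("shard-b", 0.0), ("shard-c", 0.0)]
# Three writes on one shard, all inside the 4 second window.
same_shard = [("shard-a", 0.0), ("shard-a", 1.0), ("shard-a", 2.0)]

print(count_invocations(different_shards))
print(count_invocations(same_shard))
```

Under this model the first case yields one invocation per shard, while the second collapses into a single invocation.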
To understand this behaviour you first have to understand the mapping of partitions to Lambda instances. Each partition in DynamoDB maps 1:1:1 with a shard in the stream and a Lambda instance.
If the 3 items you write to DynamoDB have the same partition key, then you might* see the behaviour you expect. However, if they have different partition keys, they will most likely end up in different partitions, then in different shards, and ultimately invoke one or more different Lambda instances.
DynamoDB streams only guarantee that the same item will end up in the same Lambda instance, meaning if you update the same item 3 times, you are guaranteed to see the behaviour you expect, otherwise there are no guarantees.
*DynamoDB does not guarantee that items with the same partition key will be in the same physical partition. It only offers guarantees at the item level (the full primary key, not the partition key alone).
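One way to observe this yourself is to log what each invocation receives. The sketch below is a minimal handler that reads the `Records` list and the `dynamodb.Keys` field from the stream event and reports how many records arrived together. The sample event and its `id` key attribute are made up for illustration.

```python
import json


def handler(event, context):
    """Log how many stream records this invocation received, and their keys."""
    records = event.get("Records", [])
    # Each stream record carries the item's key attributes under dynamodb.Keys.
    keys = [json.dumps(r["dynamodb"]["Keys"], sort_keys=True) for r in records]
    print(f"invocation received {len(records)} record(s): {keys}")
    return {"batch_size": len(records), "keys": keys}


# Hypothetical event with two records for the same item (same key).
sample_event = {
    "Records": [
        {"eventName": "INSERT", "dynamodb": {"Keys": {"id": {"S": "rec-1"}}}},
        {"eventName": "MODIFY", "dynamodb": {"Keys": {"id": {"S": "rec-1"}}}},
    ]
}
result = handler(sample_event, None)
```

If three writes to items with different keys show up as three log lines with one record each, the items most likely went through different shards.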
Thanks for the image that describes the behavior of DynamoDB Streams and Lambda. It matches my question exactly. I use the record ID as the partition key, which means the items will be placed on different partitions, leading to different shards.
You're welcome. Pictures say a thousand words.
DynamoDB sends stream records out using shards (in order to scale), and the settings you're configuring are per shard not per table. If you insert three items and they're landing in three separate shards, then you'd get the behavior you're describing.
Search for "shard" in the docs: https://docs.aws.amazon.com/lambda/latest/dg/with-ddb.html
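For reference, here is a rough sketch of the settings in question, written as the parameters you might pass to boto3's `create_event_source_mapping`. The helper name `build_mapping_params`, the ARN, the function name, and the default values are placeholders; the point is only that `BatchSize` and `MaximumBatchingWindowInSeconds` are applied by each shard's poller separately.

```python
def build_mapping_params(stream_arn, function_name, batch_size=10, window_seconds=4):
    """Build event source mapping parameters (hypothetical helper).

    The batch size and batching window apply per shard, so records in
    different shards are never combined into one batch.
    """
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "StartingPosition": "LATEST",
        "BatchSize": batch_size,
        "MaximumBatchingWindowInSeconds": window_seconds,
    }


# Placeholder values, not real resources.
params = build_mapping_params(
    "arn:aws:dynamodb:us-east-1:123456789012:table/example/stream/2024-01-01T00:00:00.000",
    "example-function",
)
print(params)

# With AWS credentials configured, this could then be applied with:
#   import boto3
#   boto3.client("lambda").create_event_source_mapping(**params)
```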
Thanks for your response. Does this mean that only one Lambda invocation happens if I update the same partition key? My table uses the record ID as the partition key, and it also acts as the primary key. Since a primary key cannot be duplicated, that is why my function is invoked separately for each item.