Can there be duplicate events if we increase the parallelization factor for processing DynamoDB Streams events with Lambda?


AWS Lambda now supports a parallelization factor, which lets multiple concurrent invocations process a single DynamoDB Streams shard.


The official documentation says:

To increase concurrency, you can also process multiple batches from each shard in parallel. Lambda can process up to 10 batches in each shard simultaneously. If you increase the number of concurrent batches per shard, Lambda still ensures in-order processing at the partition-key level.

Can there be duplicate events?

2 Answers

DynamoDB Streams guarantees that each record appears exactly once in the stream. Lambda, however, does not offer that guarantee: it provides at-least-once processing, so it can retry the same DynamoDB Streams batch more than once, which makes it look as if the stream contained duplicates.

Using a parallelisation factor greater than one does not change any of these semantics. Each invocation works on a different batch from the stream, but retries can still happen.

If duplicates are a problem, make your Lambda function's processing idempotent so that retries do not affect your downstream consumers. Lambda Powertools has some useful idempotency features that you can look into.
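To illustrate what idempotent processing could look like, here is a minimal sketch in Python. It assumes each stream record's `eventID` can serve as the deduplication key. The in-memory set, the `apply_side_effect` helper, and the `sink` parameter are hypothetical and exist only for illustration: separate Lambda execution environments do not share memory, so a real function would need a durable shared store (for example, a table written with a conditional put). This is not the Powertools implementation.

```python
from typing import Any, Dict, List, Optional, Set

# Hypothetical in-memory record of already-processed event IDs.
# Illustration only: a real function needs a durable, shared store.
_processed_ids: Set[str] = set()


def apply_side_effect(record: Dict[str, Any], sink: List[str]) -> None:
    """Hypothetical downstream action; here it just records the event ID."""
    sink.append(record["eventID"])


def handler(
    event: Dict[str, Any],
    context: Any = None,
    sink: Optional[List[str]] = None,
) -> Dict[str, int]:
    """Process a stream batch, skipping records that were already handled."""
    sink = [] if sink is None else sink
    processed = 0
    skipped = 0
    for record in event.get("Records", []):
        event_id = record["eventID"]
        if event_id in _processed_ids:
            # Seen before, most likely because the batch was retried.
            skipped += 1
            continue
        apply_side_effect(record, sink)
        # Mark as done only after the side effect succeeded.
        _processed_ids.add(event_id)
        processed += 1
    return {"processed": processed, "skipped": skipped}
```

If the same batch is delivered twice, the second invocation skips every record, so the downstream side effect runs once per record. Note that marking a record as done after the side effect still leaves a small window in which a crash causes one repeat, which is why the side effect itself should ideally be safe to repeat.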

AWS
answered 5 months ago

Lambda guarantees at-least-once delivery of events, so even without a parallelization factor you can get the same event twice. That said, the parallelization factor does not introduce additional duplicates.

AWS
answered 5 months ago
