Can there be duplicate events if we increase the parallelization factor for Lambda processing of DynamoDB Streams events?

AWS Lambda now supports a parallelization factor, which lets multiple concurrent invocations process a single DynamoDB Streams shard.

https://aws.amazon.com/blogs/compute/new-aws-lambda-scaling-controls-for-kinesis-and-dynamodb-event-sources/

The official documentation says:

To increase concurrency, you can also process multiple batches from each shard in parallel. Lambda can process up to 10 batches in each shard simultaneously. If you increase the number of concurrent batches per shard, Lambda still ensures in-order processing at the partition-key level.
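For context, a minimal sketch of how the setting might be applied with boto3, assuming the event source mapping is created from Python. `ParallelizationFactor` is a parameter of `create_event_source_mapping`; the helper names, the stream ARN, and the function name below are hypothetical placeholders.

```python
def build_mapping_params(stream_arn, function_name,
                         parallelization_factor=1, batch_size=100):
    """Build keyword arguments for boto3's create_event_source_mapping."""
    # The quoted documentation allows at most 10 concurrent batches per shard.
    if not 1 <= parallelization_factor <= 10:
        raise ValueError("ParallelizationFactor must be between 1 and 10")
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "StartingPosition": "LATEST",
        "BatchSize": batch_size,
        "ParallelizationFactor": parallelization_factor,
    }


def create_mapping(params):
    """Create the mapping; needs boto3 and AWS credentials at call time."""
    import boto3  # imported lazily so the builder works without boto3
    return boto3.client("lambda").create_event_source_mapping(**params)


# Hypothetical ARN and function name, for illustration only.
params = build_mapping_params(
    "arn:aws:dynamodb:us-east-1:123456789012:table/Example/stream/2024-01-01T00:00:00.000",
    "example-stream-consumer",
    parallelization_factor=10,
)
```

Only `build_mapping_params` runs here; `create_mapping(params)` would make the real API call.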

Can there be duplicate events?

Asked 5 months ago, viewed 439 times
2 answers

DynamoDB Streams writes each event to the stream exactly once. Lambda gives no such guarantee: it provides at-least-once processing, so it can retry the same DynamoDB Streams batch more than once, which makes it look as if the stream contained duplicates.

Using a parallelization factor greater than one does not change any of these semantics. Each concurrent invocation works on a different batch from the stream, but retries can still happen.

If duplicates are a problem, make your Lambda function's processing idempotent so that your downstream consumers are not affected by retries. Lambda Powertools has some useful idempotency features that you can look into: https://aws.amazon.com/blogs/compute/handling-lambda-functions-idempotency-with-aws-lambda-powertools/
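As a rough illustration of the idea (not the Powertools implementation), here is a minimal sketch of a handler that skips records it has already handled, keyed on the `eventID` field of each stream record. The in-memory set and the `apply_side_effect` helper are assumptions made for this example. A real function would need a durable store with an atomic conditional write, since memory is not shared across execution environments.

```python
# Stand-in for a durable store (assumption for this sketch only).
processed_event_ids = set()
side_effects = []


def apply_side_effect(record):
    """Hypothetical downstream action; here it just records the item key."""
    side_effects.append(record["dynamodb"]["Keys"])


def handler(event, context=None):
    """Process a DynamoDB Streams batch, skipping already-seen records."""
    newly_processed = 0
    for record in event["Records"]:
        event_id = record["eventID"]
        if event_id in processed_event_ids:
            continue  # duplicate delivery, for example a retried batch
        apply_side_effect(record)
        processed_event_ids.add(event_id)
        newly_processed += 1
    return newly_processed


batch = {"Records": [
    {"eventID": "1", "eventName": "INSERT",
     "dynamodb": {"Keys": {"pk": {"S": "user#1"}}}},
    {"eventID": "2", "eventName": "MODIFY",
     "dynamodb": {"Keys": {"pk": {"S": "user#2"}}}},
]}

first = handler(batch)   # initial delivery
second = handler(batch)  # simulated retry of the same batch
```

Delivering the same batch twice applies each side effect only once: `first` is 2, `second` is 0, and `side_effects` holds two entries.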

AWS Expert, answered 5 months ago

Lambda guarantees at-least-once delivery of events, so even without a parallelization factor you can get the same event twice. That said, the parallelization factor does not introduce additional duplicates.

Uri, AWS Expert, answered 5 months ago
