How do I troubleshoot DynamoDB Streams in my Lambda functions?

I want to use or troubleshoot Amazon DynamoDB Streams with my AWS Lambda functions.

Resolution

The following are common questions about using DynamoDB Streams with Lambda functions:

Why won't my Lambda function scale when the DynamoDB stream is a trigger?

When you turn on a DynamoDB stream on a DynamoDB table, Amazon DynamoDB associates one shard for each partition. For example, if your DynamoDB table has 10 partitions and you turn on DynamoDB Streams on this table, then you have 10 shards.

If the number of partitions in your table increases, then the number of shards in the stream also increases.

Each partition on a DynamoDB table can handle up to 3,000 read capacity units (RCUs), 1,000 write capacity units (WCUs), and 10 GB of data. Exceeding any of these limits results in the following:

  • The addition of a new partition to the table.
  • The creation of a new shard in the DynamoDB stream.
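As a rough illustration of the limits above, the following sketch estimates a lower bound on the number of partitions (and therefore shards) for a table. The function name estimate_min_shards is hypothetical, and the calculation only applies the per-partition limits quoted in this article. DynamoDB manages partitioning internally, so the real count can be higher.

```python
import math

# Per-partition limits quoted in this article.
MAX_RCU_PER_PARTITION = 3000
MAX_WCU_PER_PARTITION = 1000
MAX_GB_PER_PARTITION = 10


def estimate_min_shards(rcu: int, wcu: int, size_gb: float) -> int:
    """Heuristic lower bound on partitions/shards (assumption, not an AWS API).

    Each limit on its own requires at least this many partitions,
    so the largest of the three is a lower bound.
    """
    by_read = math.ceil(rcu / MAX_RCU_PER_PARTITION)
    by_write = math.ceil(wcu / MAX_WCU_PER_PARTITION)
    by_size = math.ceil(size_gb / MAX_GB_PER_PARTITION)
    return max(1, by_read, by_write, by_size)


if __name__ == "__main__":
    # Example input values are made up for illustration.
    print(estimate_min_shards(rcu=6000, wcu=2500, size_gb=45))
```

Because one shard maps to one partition, this number also approximates the minimum number of concurrent Lambda invocations when Parallelization Factor is left at 1.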

How can I control data processing from the DynamoDB stream?

Batch size and batch window help control data processing from the stream.

Batch window: Sets the maximum amount of time to wait for records before invoking the function. Note that this behavior depends on the availability of data in the stream.

Batch size: Sets the maximum number of records in a batch.

The Lambda function isn't invoked until one of the following conditions is met:

  • The payload size reaches 6 MB (the synchronous invoke limit).
  • The batch window expires (for example, after 60 seconds).
  • The number of records reaches the batch size.
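The conditions above can be sketched as a small decision function. This is a simplified model for reasoning about batching, not Lambda's actual implementation; the function name should_invoke and its parameters are hypothetical.

```python
MAX_PAYLOAD_BYTES = 6 * 1024 * 1024  # 6 MB synchronous invoke limit


def should_invoke(record_count, payload_bytes, elapsed_seconds,
                  batch_size, batch_window_seconds):
    """Return the condition that would trigger an invocation, or None.

    Simplified model (assumption): an empty batch never triggers an invoke.
    """
    if record_count == 0:
        return None
    if payload_bytes >= MAX_PAYLOAD_BYTES:
        return "payload_limit"
    if record_count >= batch_size:
        return "batch_size"
    if elapsed_seconds >= batch_window_seconds:
        return "batch_window"
    return None


if __name__ == "__main__":
    # 10 small records gathered over 60 seconds with a 60-second window.
    print(should_invoke(10, 1_000, 60, batch_size=100, batch_window_seconds=60))
```

A small batch size with a short batch window favors low latency, while larger values favor fewer, fuller invocations.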

How is Parallelization Factor used to speed up data processing?

Parallelization Factor lets Lambda process large volumes of records quickly by allowing more concurrent executions. You can set Parallelization Factor from 1 (the default) up to 10 to increase the number of batches that are processed concurrently from each shard. When you turn on Parallelization Factor, make sure to use random or unique partition keys to achieve the highest throughput.

Calculation: Parallelization Factor (concurrent batches per shard) * Shards = Concurrent executions
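The calculation can be expressed as a short helper. The function name max_concurrent_executions is hypothetical; the 1 to 10 range comes from the limits stated above.

```python
def max_concurrent_executions(shards: int, parallelization_factor: int = 1) -> int:
    """Concurrent executions = concurrent batches per shard * shards."""
    if not 1 <= parallelization_factor <= 10:
        raise ValueError("Parallelization Factor must be between 1 and 10")
    if shards < 0:
        raise ValueError("shards must not be negative")
    return shards * parallelization_factor


if __name__ == "__main__":
    # A stream with 10 shards and a Parallelization Factor of 5.
    print(max_concurrent_executions(shards=10, parallelization_factor=5))
```

For example, a stream with 10 shards and a Parallelization Factor of 5 allows up to 50 concurrent executions, subject to the concurrency available to the function.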

What is the BisectBatchOnFunctionError setting?

If the Lambda function fails and the BisectBatchOnFunctionError option is set to true, then the failed batch is split in two. The split batches are then retried until the problem record is found. Retries are processed based on the maximum retry attempts and maximum record age settings.

If the Retry attempts option is set to 0, then retries aren't attempted for the failed record. In this case, failed records are discarded or sent to the dead-letter queue (DLQ), if one is configured.
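As a minimal sketch, the following helper builds the parameters for an event source mapping with batch bisection turned on and retries set to 0. The parameter names are those of the Lambda CreateEventSourceMapping API (note the API spelling BisectBatchOnFunctionError); the helper name build_stream_mapping_params and all ARNs and function names are placeholders, not values from this article.

```python
def build_stream_mapping_params(stream_arn, function_name, dlq_arn=None):
    """Build keyword arguments for Lambda's CreateEventSourceMapping call."""
    params = {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "StartingPosition": "LATEST",
        "BisectBatchOnFunctionError": True,  # split failed batches in two
        "MaximumRetryAttempts": 0,           # don't retry the failed record
    }
    if dlq_arn:
        # Send failed records to an SQS queue or SNS topic instead of discarding them.
        params["DestinationConfig"] = {"OnFailure": {"Destination": dlq_arn}}
    return params


if __name__ == "__main__":
    # Placeholder ARNs for illustration only.
    params = build_stream_mapping_params(
        "arn:aws:dynamodb:us-east-1:123456789012:table/ExampleTable/stream/2024-01-01T00:00:00.000",
        "example-function",
        dlq_arn="arn:aws:sqs:us-east-1:123456789012:example-dlq",
    )
    print(params)
    # To apply (requires boto3 and AWS credentials):
    #   import boto3
    #   boto3.client("lambda").create_event_source_mapping(**params)
```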

Example 1

In the following example, 'p' represents the problem record, and Retry attempts is set to 0.

Batch record: [1,2,3p,4,5p]

Split 1: [1,2] is processed. [3p,4,5p] fails and is split again.

Split 2: [3p] [4,5p]. [3p] is identified as the problem record, so it's discarded or sent to the DLQ, if configured.

Split 3: [4] is processed. [5p] is discarded or sent to the DLQ, if configured.
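The splitting in Example 1 can be reproduced with a short simulation. This is a simplified model under stated assumptions, not Lambda's implementation: the split point (first half gets the smaller share) is chosen to match the example, retries are fixed at 0, and the names bisect_process and is_bad are hypothetical.

```python
def bisect_process(batch, is_bad):
    """Simulate bisect-on-error with Retry attempts set to 0.

    Returns (processed, failed). Failed records are the ones that would be
    discarded or sent to the DLQ.
    """
    processed, failed = [], []

    def attempt(records):
        if not any(is_bad(r) for r in records):
            processed.extend(records)  # whole batch succeeds
            return
        if len(records) == 1:
            failed.extend(records)     # problem record isolated
            return
        mid = len(records) // 2        # assumed split point
        attempt(records[:mid])
        attempt(records[mid:])

    attempt(list(batch))
    return processed, failed


if __name__ == "__main__":
    batch = ["1", "2", "3p", "4", "5p"]
    processed, failed = bisect_process(batch, lambda r: r.endswith("p"))
    print("processed:", processed)
    print("failed:", failed)
```

For the batch in Example 1, the simulation processes records 1, 2, and 4, and isolates 3p and 5p as failed records.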

Example 2

In the following example, 'x' represents the problem record. Retry attempts is set to -1, which means that Lambda retries the failed batch until the records expire.

Batch records inserted: [1,2,3x,4,5x]

[3x,4,5x]

[3x,4,5x]

[3x,4,5x]

[3x,4,5x]

Why does IteratorAge in Lambda increase for my DynamoDB stream?

The following are common reasons for the IteratorAge in Lambda to increase:

  • There's a bad record in the DynamoDB stream.
  • There's a high volume of write operations (such as PutItem or BatchWriteItem) to the table, and therefore to the stream. The Lambda function might not be able to keep up with a high write volume. If this occurs, then increase the provisioned capacity of the DynamoDB table. Because each partition handles up to 1,000 WCUs, increasing the provisioned capacity increases the partition count, which increases the number of concurrent Lambda executions. For more information, see the previous section Why won't my Lambda function scale when the DynamoDB stream is a trigger?
  • There was a drop in the number of DynamoDB partitions, such as migrating to a new account or to a new table.
  • There are throttling or function errors in the Lambda function. AWS Lambda retries records until the entire batch is processed successfully or the records expire. Also, the DynamoDB Streams retention period is 24 hours. To avoid data loss, it's a best practice to set up a DLQ. If a DLQ is configured, then AWS Lambda sends failed record batches to the DLQ after retries are completed or the record age expires.
    To resolve Lambda function errors, check the Amazon CloudWatch Logs for details on the error.
  • There's an increase in the Lambda function duration.
  • Error handling and Parallelization Factor aren't optimized for the workload.

For detailed information, see Why is my Lambda IteratorAge metric increasing, and how do I decrease it?
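To help spot a growing backlog from inside the function, the following sketch logs the age of the oldest record in each batch. It assumes that each record in the Lambda event carries dynamodb.ApproximateCreationDateTime as epoch seconds; the helper name oldest_record_age_seconds is hypothetical, and the value it logs is only an approximation, not the IteratorAge metric itself.

```python
import time


def oldest_record_age_seconds(event, now=None):
    """Return the age in seconds of the oldest record in the batch, or None."""
    now = time.time() if now is None else now
    created = [
        record["dynamodb"]["ApproximateCreationDateTime"]
        for record in event.get("Records", [])
        if "ApproximateCreationDateTime" in record.get("dynamodb", {})
    ]
    if not created:
        return None
    return now - min(created)


def lambda_handler(event, context):
    # Log batch size and lag so that slow processing shows up in CloudWatch Logs.
    age = oldest_record_age_seconds(event)
    print(f"records={len(event.get('Records', []))} oldest_age_seconds={age}")
    # ... process the records here ...
```

If the logged age keeps growing across invocations, the function is falling behind the stream, which matches the causes listed above.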

AWS OFFICIAL
Updated a year ago