How can I troubleshoot high latency on an Amazon DynamoDB table?


I see an increase in the response time for Amazon DynamoDB requests, but I don't know why the increase happens.

Short description

End-to-end latency has two parts: the service-side latency within DynamoDB and the client-side latency of your application and network.

The SuccessfulRequestLatency Amazon CloudWatch metric measures only the amount of time it takes for DynamoDB to fully process the API request. DynamoDB doesn't measure the time the application takes to connect to the DynamoDB endpoint or download the results from the endpoint.
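Because the metric excludes client-side time, one way to see the end-to-end figure is to time the call in the application and compare the result with SuccessfulRequestLatency. The following is a minimal Python sketch, not an official tool. The helper name timed_call is hypothetical, and the commented usage assumes the boto3 SDK and a table named ExampleTable.

```python
import time


def timed_call(operation, *args, **kwargs):
    """Run any callable and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = operation(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


# Hypothetical usage with boto3 (table name and key are assumptions):
#
#   import boto3
#   client = boto3.client("dynamodb")
#   item, ms = timed_call(
#       client.get_item,
#       TableName="ExampleTable",
#       Key={"pk": {"S": "example-id"}},
#   )
#   print(f"end-to-end latency: {ms:.1f} ms")
```

If the client-side number is much larger than the CloudWatch metric for the same period, then the extra time is likely spent outside of DynamoDB, such as in connection setup or on the network.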


When you analyze latency, it's a best practice to check the average latency and not the max latency values. Occasional spikes in latency are normal. If the average latency is high, then there might be an underlying issue.
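To compare the average with the maximum, you can request both statistics for SuccessfulRequestLatency from CloudWatch. The following Python sketch assumes the boto3 SDK. The function names, the 3-hour window, and the 5-minute period are illustrative choices, not recommendations from this article.

```python
from datetime import datetime, timedelta, timezone


def latency_query(table_name, operation, hours=3, period_seconds=300):
    """Build the arguments for the CloudWatch GetMetricStatistics call."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/DynamoDB",
        "MetricName": "SuccessfulRequestLatency",
        "Dimensions": [
            {"Name": "TableName", "Value": table_name},
            {"Name": "Operation", "Value": operation},
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": ["Average", "Maximum"],
    }


def fetch_latency(table_name, operation):
    """Return the datapoints for one table and operation, oldest first."""
    # Imported here so that latency_query works without the SDK installed.
    import boto3

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **latency_query(table_name, operation)
    )
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])


# Hypothetical usage (table name is an assumption):
#
#   for point in fetch_latency("ExampleTable", "GetItem"):
#       print(point["Timestamp"], point["Average"], point["Maximum"])
```

A high Maximum next to a low Average suggests occasional spikes. A high Average suggests a sustained issue.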

For most atomic operations, such as GetItem and PutItem, the average latency is in single-digit milliseconds. Latency for non-atomic operations, such as Query, Scan, BatchGetItem, and BatchWriteItem, depends on many factors. These factors include the size of the result set, the number of inserted records, and the complexity of query conditions and filters.

Common reasons for high latency

Infrequent access

To avoid latency, all frontend hosts in DynamoDB maintain local caches. When the request rate is low, hosts in the frontend fleet might not receive requests for some time, and their cache entries reach their time to live (TTL) and expire. If a request arrives on a host after its cache expires, then the host has to get data from internal DynamoDB components. The host populates the cache, and then the host can respond. The time that it takes to get and populate data can cause latency to increase.

If a partition splits or leadership changes, then the cache can become stale and can cause elevated latency for the first few calls. DynamoDB references multiple caches to retrieve partition information, signature validation, and other information about each request. When the request rate is low, the caches don't stay warm and latency is usually higher for the first few requests.

If request rates are high and incoming traffic is consistent, then each request consistently reaches the frontend fleet and doesn't cause any latency spikes. To avoid issues from infrequent access, have the client send dummy traffic to the DynamoDB table.
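One way to send dummy traffic is a small loop that issues an inexpensive read at a fixed interval. The following Python sketch assumes the boto3 SDK. The function names, the 30-second interval, the table name, and the key are all hypothetical. This article doesn't state how long the caches stay warm, so choose the interval based on your own measurements.

```python
import time


def keep_warm(send_request, interval_seconds=30.0, iterations=10, sleep=time.sleep):
    """Call send_request repeatedly, pausing between calls. Returns the call count."""
    sent = 0
    for i in range(iterations):
        send_request()
        sent += 1
        if i < iterations - 1:  # no pause after the final request
            sleep(interval_seconds)
    return sent


def make_dummy_read(table_name, key):
    """Return a function that performs one GetItem call against the table."""
    # Imported here so that keep_warm works without the SDK installed.
    import boto3

    client = boto3.client("dynamodb")

    def send():
        client.get_item(TableName=table_name, Key=key)

    return send


# Hypothetical usage (the key doesn't need to exist in the table):
#
#   keep_warm(make_dummy_read("ExampleTable", {"pk": {"S": "warmup"}}))
```

Dummy reads consume read capacity, so account for them in your capacity planning and cost estimates.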

Strongly consistent reads

Read operations, such as GetItem, Query, and Scan, provide an optional ConsistentRead parameter. If you set ConsistentRead to true, then DynamoDB returns a response with the most up-to-date data that reflects the updates from all prior successful write operations. However, strongly consistent reads can have higher latency.

Each DynamoDB partition has one leader node and two replica nodes. DynamoDB uses only the leader node to serve strongly consistent read requests and doesn't use the replica nodes. Because DynamoDB must locate the leader node and then route the request to it, you can experience some additional latency.

If your application doesn't require strongly consistent reads, then use eventually consistent reads.
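The choice between the two read modes comes down to a single request parameter. The following Python sketch builds the arguments for a GetItem call and assumes the boto3 SDK for the commented usage. The function name, table name, and key are hypothetical.

```python
def get_item_request(table_name, key, strongly_consistent=False):
    """Build GetItem arguments; eventually consistent unless asked otherwise."""
    return {
        "TableName": table_name,
        "Key": key,
        # False requests an eventually consistent read.
        "ConsistentRead": strongly_consistent,
    }


# Hypothetical usage with boto3:
#
#   import boto3
#   client = boto3.client("dynamodb")
#   request = get_item_request("ExampleTable", {"pk": {"S": "example-id"}})
#   item = client.get_item(**request).get("Item")
```

Passing strongly_consistent=True only for the reads that need the latest write keeps the rest of the read traffic eventually consistent.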

Ways to reduce latency

Reduce the request timeout settings

Tune the requestTimeOut and clientExecutionTimeout client SDK parameters to time out and fail much faster, such as after 50 milliseconds. This faster timeout causes the client to abandon high latency requests after a specified time period. Then, the client sends a second request that usually completes much faster than the first. For more information about timeout settings, see Tuning AWS Java SDK HTTP request settings for latency-aware Amazon DynamoDB applications.
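The parameters named above belong to the AWS SDK for Java. As a rough equivalent in Python, the following sketch uses the connect_timeout, read_timeout, and retries settings of botocore's Config class, which aren't the same parameters. The function names and the retry count are assumptions. The 50-millisecond value mirrors the example above and might be too aggressive for some networks.

```python
def timeout_settings(connect_ms=50, read_ms=50, max_retries=3):
    """Build botocore Config keyword arguments with short timeouts."""
    return {
        # botocore expects timeouts in seconds.
        "connect_timeout": connect_ms / 1000.0,
        "read_timeout": read_ms / 1000.0,
        # max_attempts counts retries after the initial attempt.
        "retries": {"max_attempts": max_retries, "mode": "standard"},
    }


def make_client(**overrides):
    """Create a DynamoDB client that fails fast and retries."""
    # Imported here so that timeout_settings works without the SDK installed.
    import boto3
    from botocore.config import Config

    return boto3.client("dynamodb", config=Config(**timeout_settings(**overrides)))


# Hypothetical usage:
#
#   client = make_client(read_ms=100)
```

With these settings, a request that exceeds the timeout fails quickly, and the SDK's retry logic sends it again.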

Reduce the distance between the client and DynamoDB endpoint

If you have globally dispersed users, then use global tables to specify the AWS Regions where you want the table to be available. You can also use DynamoDB gateway endpoints to avoid traffic across the internet.
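As an illustration of the gateway endpoint option, the following Python sketch builds the arguments for the Amazon EC2 CreateVpcEndpoint call and assumes the boto3 SDK for the commented usage. The function name and the VPC, Region, and route table values are placeholders.

```python
def gateway_endpoint_request(vpc_id, region, route_table_ids):
    """Build CreateVpcEndpoint arguments for a DynamoDB gateway endpoint."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.dynamodb",
        # Routes to DynamoDB are added to these route tables.
        "RouteTableIds": list(route_table_ids),
    }


# Hypothetical usage with boto3 (all IDs are placeholders):
#
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   ec2.create_vpc_endpoint(
#       **gateway_endpoint_request("vpc-0example", "us-east-1", ["rtb-0example"])
#   )
```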

Use caching

If your traffic is read heavy, then use a caching service to lower latency, such as Amazon DynamoDB Accelerator (DAX).

Connection reuse

When you establish a new connection, you must authenticate and validate the connection. Also, before it can process the request, DynamoDB must get the table's metadata from the internal systems. The DynamoDB frontend fleet maintains a cache that's used to store this information. If the connections are reused, then the cache is used to service the requests.

Because DynamoDB requests for AWS Key Management Service (AWS KMS) users need an additional hop to get authenticated, you can experience an increase in latency. The AWS KMS keys are refreshed every 5 minutes. If a client connection isn't reused, then a new TCP connection has to be established for every request. The connections for these requests include processes, such as the TCP handshake and acknowledgement, and contribute to the client-side latency.

Because every new connection must be authenticated, the caches aren't used, and server-side latency increases. To reuse connections, use the TCP keep-alive parameter. Also, use a design pattern that guarantees a single instance of the connection object. For more information, see AWS client not reused in a Lambda function on the Amazon CodeGuru website.
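The single-instance pattern can be sketched in Python as a module-level client that is created one time and then reused. The example assumes the boto3 SDK and botocore's tcp_keepalive setting. The function names, table name, and event shape are hypothetical.

```python
_client = None  # module-level cache, shared by every call in this process


def _default_factory():
    # Imported here so that the caching logic works without the SDK installed.
    import boto3
    from botocore.config import Config

    # tcp_keepalive asks botocore to enable the TCP keep-alive socket option.
    return boto3.client("dynamodb", config=Config(tcp_keepalive=True))


def get_client(factory=_default_factory):
    """Create the client on first use, then return the same instance."""
    global _client
    if _client is None:
        _client = factory()
    return _client


def handler(event, context):
    """Lambda-style entry point that reuses the client across invocations."""
    response = get_client().get_item(
        TableName="ExampleTable",
        Key={"pk": {"S": event["id"]}},
    )
    return response.get("Item")
```

Because the client lives outside of the handler, warm invocations in the same process reuse the existing connection instead of opening a new one for every request.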

Related information


Understanding Amazon DynamoDB latency

Best practices for querying and scanning data

AWS OFFICIAL
Updated 2 months ago