You mention things got unreliable after reaching the limit of 40,000 and that writes have become unreliable, but you have not stated how so.
The throughput limits are just guard rails and can be increased using Service Quotas to any value you deem necessary. There is not additional cost to increase limits, however, you are enabling your application to consume more capacity, which can increase costs based on usage.
When you have your limits increased, you may need to consider pre-warming the table further, to avoid any throttling while you try to achieve more throughput that your tables have done so in the past.
A key decision for DDB Autoscaling to work best is the choice of the table partition key. The right one ( i.e. with high cardinality) will allow I/Os to be spread across many servers and then you'll get smooth scaling and good performances. If all I/Os happen in same partition, you'll start get throttling and other performance bottlenecks.
You should check this post re. proper choice of that key (and maybe modify your table design accordingly): https://aws.amazon.com/blogs/database/choosing-the-right-dynamodb-partition-key/
Then, I'd suggest to read this post: https://aws.amazon.com/blogs/database/amazon-dynamodb-auto-scaling-performance-and-cost-optimization-at-any-scale/ to see how to they achieve 1 million requests per second with good performances. The traffic pattern that was chosen for the benchmark seems close to yours.
A tip for achieving best throughput on bulk writes to DynamoDB: randomize the order of the input records. I don't think this is your issue, but it can be a good practice if you're not familiar with the nuances of DynamoDB's partition behavior.
- Accepted Answerasked a year ago
- Why didn’t the capacity change on my FSx for ONTAP volume after I changed the volume tiering policy to ALL?AWS OFFICIALUpdated 6 months ago
- AWS OFFICIALUpdated 2 months ago
- AWS OFFICIALUpdated 2 years ago
- EXPERTpublished 8 months ago