Is the hot partition hard limit of 3000 RCU and 1000 WCU per second for each partition relevant after adaptive capacity?


https://aws.amazon.com/blogs/database/how-amazon-dynamodb-adaptive-capacity-accommodates-uneven-data-access-patterns-or-why-what-you-know-about-dynamodb-might-be-outdated/

As per this article, I understand that adaptive capacity helps handle an uneven distribution of traffic across partitions by:

  1. automatically increasing throughput capacity for partitions that receive more traffic.
  2. automatically resharding the partitions and increasing partitions if needed

So if my partition can be resharded, does this 1000 WCU limit apply to the partitions I started with or to the partitions created by resharding?

I have a GSI that shows a high number of throttled events, but in the metrics the consumed capacity looks lower than the provisioned capacity. I understand that these metrics show minute-level values rather than per-second values, but with adaptive capacity, shouldn't my partition be resharded automatically? With adaptive capacity, is it still important to ensure high cardinality of partition keys? If yes, what does automatic resharding mean for us in practice?

WCU usage

Throttled events

Megha
asked a month ago · 38 views
2 answers

Hi

Adaptive capacity essentially helps your workload use up the capacity provisioned on the table even when access is distributed unevenly across the table's partitions.

Adaptive capacity does enable partitions to go beyond the simple division of table level throughput by the number of partitions. However, this is still capped at a maximum of 3,000 RCUs and 1,000 WCUs for a partition.

**Throttling will occur if a single partition consumes more than 3,000 RCUs or more than 1,000 WCUs per second.**

Adaptive capacity also "isolates" frequently accessed items on separate partitions, i.e. it re-shards. If your application drives consistently high traffic to a single item, adaptive capacity might rebalance your data so that a partition contains only that single, frequently accessed item. However, that partition's throughput is still capped at a maximum of 3,000 RCUs and 1,000 WCUs, so a single item can drive at most that much throughput.
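
To make the cap concrete, here is a minimal sketch that estimates the write units a single key consumes per second and compares the result with the 1,000 WCU figure above. It assumes standard (non-transactional) writes, where one WCU covers one write per second of an item up to 1 KB; the function names and the sample numbers are hypothetical.

```python
import math

PARTITION_MAX_WCU = 1000  # per-partition write cap quoted in this answer


def write_units_per_second(writes_per_second: float, item_size_bytes: int) -> float:
    """Estimate WCUs per second for standard writes (item size rounded up to 1 KB steps)."""
    units_per_write = max(1, math.ceil(item_size_bytes / 1024))
    return writes_per_second * units_per_write


def exceeds_partition_write_cap(writes_per_second: float, item_size_bytes: int) -> bool:
    """True if traffic aimed at one partition key would go past the per-partition cap."""
    return write_units_per_second(writes_per_second, item_size_bytes) > PARTITION_MAX_WCU


# Hypothetical workload: 400 writes per second of 2,500-byte items to one key.
print(write_units_per_second(400, 2500), exceeds_partition_write_cap(400, 2500))
```

If the check returns True for one key, isolating that key on its own partition would not be enough by itself, because the cap applies to that partition as well.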

Read more about item isolation: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/burst-adaptive-capacity.html#isolate-frequent-access-items

You can try using CloudWatch Contributor Insights for DynamoDB to identify the most accessed and most throttled items in a table or GSI.
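
As a rough sketch of how that could be switched on from code, the helper below calls `update_contributor_insights` on a DynamoDB client that you pass in (for example one created with `boto3.client("dynamodb")`). The helper name, table name and index name are hypothetical; check the parameters against your SDK version before relying on it.

```python
def enable_contributor_insights(client, table_name, index_name=None):
    """Enable Contributor Insights on a table, or on one of its GSIs if index_name is given."""
    params = {
        "TableName": table_name,
        "ContributorInsightsAction": "ENABLE",
    }
    if index_name is not None:
        # Scope the rule set to the global secondary index instead of the base table.
        params["IndexName"] = index_name
    return client.update_contributor_insights(**params)


# Example call (assumed names, requires AWS credentials):
#   import boto3
#   enable_contributor_insights(boto3.client("dynamodb"), "orders", "status-index")
```

Passing the client in keeps the helper easy to exercise with a stub before pointing it at a real account.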

With that said, as you rightly mentioned, the ConsumedCapacity CloudWatch metrics are only as granular as 1 minute, so this might be a case of second-level throttling. A sudden burst of requests could have pushed consumption above the provisioned capacity of the table for only a few seconds of a minute, causing requests to be throttled, while consumption stayed below the provisioned capacity for the rest of that minute.

So over the whole minute it might "appear" that the consumed capacity was less than the provisioned capacity, when in fact there were some seconds within that minute when it was greater.
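
The effect can be illustrated with made-up numbers. The sketch below takes 60 hypothetical per-second consumption values, computes the per-second average that a minute-level graph would effectively show, and lists the seconds that went over the provisioned rate. It deliberately ignores burst capacity, which can absorb some short spikes, so treat it as an illustration only.

```python
def summarize_minute(per_second_units, provisioned_per_second):
    """Return (average units per second, indexes of seconds above the provisioned rate)."""
    if len(per_second_units) != 60:
        raise ValueError("expected exactly 60 per-second samples")
    average = sum(per_second_units) / 60
    hot_seconds = [
        second
        for second, units in enumerate(per_second_units)
        if units > provisioned_per_second
    ]
    return average, hot_seconds


# Hypothetical minute: a 5-second burst at 400 WCU, then 55 quiet seconds at 20 WCU,
# against a provisioned rate of 100 WCU per second.
traffic = [400] * 5 + [20] * 55
average, hot_seconds = summarize_minute(traffic, provisioned_per_second=100)
print(f"average per second: {average:.1f}, seconds over provisioned: {hot_seconds}")
```

Here the minute-level average sits well below 100 even though five individual seconds were far above it, which is the pattern described above.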

AWS EXPERT
answered a month ago · reviewed by an EXPERT a month ago

Adaptive capacity tries to split hot partitions based on your throughput and access patterns. However, if you create an index with a low-cardinality key such as status, then all of the writes to the base table that share the same GSI partition key will be focused on a single partition of your GSI.

Adaptive capacity should not be used as a workaround for a poorly designed data model, as it will not split your partitions in every scenario. Consider a low-cardinality key as your index partition key and a monotonically increasing sort key such as time.now(). In this case adaptive capacity will not split your partitions, as it would make no sense: after a split, one partition would become cold while the other would continue being hot, no matter how many times it splits (as time increases monotonically, every new write lands at the end of the key range).
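
One common way to raise the cardinality of such an index key is write sharding, which this answer does not describe; it is shown here only as a sketch under assumptions. The idea is to append a shard suffix derived from a high-cardinality attribute (a hypothetical `item_id` here) to the low-cardinality value, so writes for one status spread over several GSI partition key values. The `#` separator and the shard count are arbitrary choices.

```python
import hashlib


def sharded_gsi_key(base_value: str, item_id: str, shard_count: int) -> str:
    """Build a GSI partition key such as 'PENDING#3' from a stable hash of item_id."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:8], "big") % shard_count
    return f"{base_value}#{shard}"


def all_shard_keys(base_value: str, shard_count: int) -> list:
    """Every key value a reader must query to see all items with base_value."""
    return [f"{base_value}#{shard}" for shard in range(shard_count)]


# Hypothetical usage: store this value in the attribute used as the GSI partition key.
print(sharded_gsi_key("PENDING", "order-123", 10))
```

The trade-off is on the read side: fetching everything with a given status means querying each value from `all_shard_keys` and merging the results.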

AWS EXPERT
answered a month ago · reviewed by an EXPERT a month ago
