DynamoDB: how does partitioning work once a sort key is involved?


My question about DynamoDB is about this: https://aws.amazon.com/blogs/database/choosing-the-right-dynamodb-partition-key/

Where it says: “DynamoDB splits partitions by sort key if the collection size grows bigger than 10 GB.”

Does that mean the sort key is used for partitioning, that is, PK+SK is used to calculate the hash? Or does it only mean that items are sorted by SK, so once a partition fills up, the next partition picks up where the previous one left off, while the PK alone is still used to calculate the hash?

I’m wondering whether the cardinality of my partition key is determined by PK or by PK+SK. If it’s the latter, then an operation like “Delete all items where PK=X” would be less of a hot key / hot partition concern, because the SK should distribute the operation across many partitions. But if all items with the same PK are bundled together in one partition (or a couple), then it will definitely be a problem.

AWS
Asked 2 years ago, viewed 885 times
1 Answer
Accepted Answer

The partition key is hashed to determine which partition the item goes into. Within that partition, the items in the item collection for each partition key are held in sort key order. If the item collection grows large enough, it might be split across multiple partitions. In that case a split point is chosen in the sort key range: the items with an SK before that point go into one partition and the items with an SK after it go into another.
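The behavior described above can be sketched as a toy model. This is only an illustration under stated assumptions, not DynamoDB's actual implementation: `NUM_PARTITIONS`, `hash_partition`, and `split_item_collection` are hypothetical names, and SHA-256 is a stand-in for DynamoDB's internal hash function. The point it shows is that only the PK feeds the hash, while a split of a large item collection happens at a boundary in sort key order.

```python
import hashlib
from bisect import bisect_left

NUM_PARTITIONS = 8  # hypothetical; the real partition count is managed by DynamoDB


def hash_partition(pk: str) -> int:
    """Toy placement: only the partition key is hashed; the sort key plays no part."""
    digest = hashlib.sha256(pk.encode("utf-8")).digest()  # stand-in hash function
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def split_item_collection(sort_keys, split_point):
    """Split one item collection at a sort key boundary.

    Sort keys before split_point stay in the first partition,
    the rest move to the second one.
    """
    ordered = sorted(sort_keys)
    cut = bisect_left(ordered, split_point)
    return ordered[:cut], ordered[cut:]


# One large item collection (a single PK) split at the sort key "N".
low, high = split_item_collection(
    ["Adams", "Miller", "Nguyen", "Zhang", "Baker"], "N"
)
print(low)   # ['Adams', 'Baker', 'Miller']
print(high)  # ['Nguyen', 'Zhang']
```

In this model, every item with the same PK starts in the same partition no matter what its SK is, which is why the SK does not add to the cardinality used for hashing.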

Analogy time. Think of DynamoDB like a set of phone books. The PK is like a city name, and its value determines which phone book to use. The SK would be like the names in each book, held in linear order. You want good dispersion of the PKs so you can have lots of phone books, and lots of parallel processing possible. Some cities, like NYC, are big enough that you split the names across a few books, like A-M and N-Z.

To be a bit more detailed, the hash of the city name determines the shelf on which the phone book for that city can be found. So hash the PK, find the shelf, find the right book on the shelf, use the sorted values within the book. It's all very efficient. Some cities can take up a whole shelf or more than one shelf.
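The lookup path in the analogy (find the shelf, find the right book, use the sorted values within the book) could look roughly like the sketch below. Everything here is made up for illustration: `books_for_city`, `find_book`, and `lookup` are hypothetical names, the data is invented, and the Python dict lookup merely stands in for "hash the PK, find the shelf".

```python
from bisect import bisect_left, bisect_right

# Hypothetical layout: city (PK) -> list of books, each book being
# (first sort key covered by the book, sorted names in the book).
books_for_city = {
    "NYC": [("A", ["Adams", "Baker", "Miller"]), ("N", ["Nguyen", "Zhang"])],
    "Boise": [("A", ["Clark", "Young"])],
}


def find_book(city: str, name: str) -> int:
    """Pick the book whose sort key range covers the name."""
    starts = [start for start, _ in books_for_city[city]]
    # The last book whose starting key is <= name covers it.
    return max(bisect_right(starts, name) - 1, 0)


def lookup(city: str, name: str):
    """Return (book index, whether the name is present in that book)."""
    book_idx = find_book(city, name)          # the dict access hashes the city (PK)
    names = books_for_city[city][book_idx][1]
    pos = bisect_left(names, name)            # binary search in sort key order
    return book_idx, pos < len(names) and names[pos] == name


print(lookup("NYC", "Nguyen"))   # (1, True)
print(lookup("NYC", "Baker"))    # (0, True)
print(lookup("Boise", "Smith"))  # (0, False)
```

Note that the city alone decides where to start looking; the name only matters once the right set of books has been found.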

Some individual names can be so commonly read/written that to spread visitor traffic they get their own shelf just for their page.

AWS
Answered 2 years ago
Expert
Reviewed 1 month ago
