DynamoDB: how does partitioning work once the sort key is involved?


My question is about this DynamoDB blog post: https://aws.amazon.com/blogs/database/choosing-the-right-dynamodb-partition-key/

Where it says: “DynamoDB splits partitions by sort key if the collection size grows bigger than 10 GB.”

Does that mean it will use the sort key for partitioning, as in PK+SK will be used to calculate the hash? Or just that items are sorted by SK, so once a partition fills, the next partition picks up where the other one left off, and it's still only the PK that is used for calculating the hash?

I’m wondering whether the cardinality of my partition key is determined by PK or by PK+SK. If it’s the latter, then an operation like “Delete all items where PK=X” would not be as big a hot key / hot partition concern, because the SK should distribute the operation across many partitions. But if all items with the same PK are bundled together in one partition (or a couple), then it will definitely be a problem.

AWS
Asked 2 years ago, 885 views
1 Answer
Accepted Answer

The partition key is hashed to determine which partition the item goes into; the sort key is not part of the hash. Within that partition, the items in each partition key's item collection are held in sort key order. If the item collection grows large enough, it might be split across multiple partitions. In that case a split point is chosen in the sort key range: the items with an SK before that point go into one partition and the items with an SK after it go into another one.
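As a rough illustration of the behaviour described above, here is a toy Python model. It is only a sketch: the MD5 hash, `NUM_PARTITIONS`, the item-count threshold `MAX_ITEMS_PER_RANGE` (standing in for the 10 GB limit), the `ItemCollection` class and the median split point are all assumptions made for the example, not how DynamoDB is implemented internally.

```python
import hashlib
from bisect import insort

NUM_PARTITIONS = 4         # toy value; DynamoDB manages real partition counts itself
MAX_ITEMS_PER_RANGE = 4    # toy stand-in for the 10 GB item collection threshold


def hash_partition(pk: str) -> int:
    """Map a partition key to a partition index. Only the PK is hashed."""
    digest = hashlib.md5(pk.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


class ItemCollection:
    """All items sharing one PK, kept in SK order, possibly split into ranges."""

    def __init__(self, pk: str):
        self.pk = pk
        self.home = hash_partition(pk)   # chosen from the PK alone
        self.ranges = [[]]               # each inner list is one sorted SK range

    def put(self, sk: str) -> None:
        # Pick the last range whose first key is <= sk (range 0 if none is).
        idx = 0
        for i, run in enumerate(self.ranges):
            if run and run[0] <= sk:
                idx = i
        insort(self.ranges[idx], sk)     # keep the range in sort key order
        if len(self.ranges[idx]) > MAX_ITEMS_PER_RANGE:
            self._split(idx)

    def _split(self, idx: int) -> None:
        # Assumed policy: split at the median sort key of the oversized range.
        run = self.ranges[idx]
        mid = len(run) // 2
        self.ranges[idx:idx + 1] = [run[:mid], run[mid:]]


collection = ItemCollection("NYC")
for sk in "abcdefghij":
    collection.put(sk)
print(collection.home, collection.ranges)
```

In this model the hash never sees the SK. The sort key only decides which contiguous range of the item collection an item lands in after a split, so neighbouring sort keys stay together rather than being scattered.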

Analogy time. Think of DynamoDB like a set of phone books. The PK is like a city name and its value determines which phone book to use. The SK would be like the names in each book, held in linear order. You want good dispersion of the PKs so you can have lots of phone books, and lots of parallel processing possible. Some cities, like NYC, are big enough that you split the names across a few books, like A-M and N-Z.

To be a bit more detailed, the hash of the city name determines the shelf on which the phone book for that city can be found. So hash the PK, find the shelf, find the right book on the shelf, use the sorted values within the book. It's all very efficient. Some cities can take up a whole shelf or more than one shelf.
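The lookup path in the analogy (find the shelf, find the right book, use the sorted names) can be sketched in Python as follows. The `shelves` data, the city and surname values, and the `find_book` and `lookup` helpers are all hypothetical. The dict lookup stands in for hashing the PK, since a Python dict is itself a hash table.

```python
from bisect import bisect_left, bisect_right

# Hypothetical data: each city (PK) maps to its split points and its books.
shelves = {
    "NYC": {
        "split_points": ["N"],            # book 0: SK < "N", book 1: SK >= "N"
        "books": [
            ["Adams", "Baker", "Lopez"],  # the A-M book
            ["Nguyen", "Smith", "Young"], # the N-Z book
        ],
    },
    "Boise": {
        "split_points": [],               # small city: a single book
        "books": [["Clark", "Reed"]],
    },
}


def find_book(city: str, name: str) -> int:
    """Return the index of the book whose sort key range covers `name`."""
    return bisect_right(shelves[city]["split_points"], name)


def lookup(city: str, name: str) -> bool:
    """Find the shelf by city, pick the book, then binary-search the names."""
    book = shelves[city]["books"][find_book(city, name)]
    i = bisect_left(book, name)
    return i < len(book) and book[i] == name


print(find_book("NYC", "Lopez"), find_book("NYC", "Smith"))
print(lookup("NYC", "Smith"), lookup("Boise", "Smith"))
```

Only the city name selects the shelf. The surname is compared against split points and then searched in sorted order, which is why a range of names can be read sequentially from one book.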

Some individual names can be so commonly read/written that to spread visitor traffic they get their own shelf just for their page.

AWS
Answered 2 years ago
Expert
Reviewed a month ago
