Does a paginated scan return pages in a deterministic or non-deterministic order?

0

Say my table looks like this:

                               |                attributes
 partition key: username       |        gender       |        age

I would like a user to be able to go through all rows in this table a little at a time, that is, in a paginated way. A paginated scan should do the job here. My worry is that the first item in the table would then be accessed far more often than the remaining items, which is bad.

Is this really the case? Will a paginated scan always return items in a deterministic order, from the start of the table to the end? Or will it go through the table in a random order?

Asked 2 years ago · Viewed 704 times
2 Answers
0

Items have an internal order that's used during scans. You can't predict from the outside what the order will be because it's based on things like the hashes of PK values.

You want a semi-random item from the table? You can use the parallel scan functionality and specify up to a million segments and pick a random segment number to start your scan from. That would give you a million starting points (evenly distributed among the internal order of items).
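As a rough illustration of the idea above, here is a minimal sketch. It assumes `client` is a boto3 DynamoDB client (for example `boto3.client("dynamodb")`); the function name `scan_random_segment` and the table name are hypothetical. `Segment`, `TotalSegments`, and `Limit` are the Scan parameters being relied on. Note that on a small table many of the segments are likely to be empty, so a caller may need to retry with another segment or use a smaller `total_segments`.

```python
import random


def scan_random_segment(client, table_name, total_segments=1_000_000,
                        limit=25, rng=random):
    """Read up to `limit` items from one randomly chosen scan segment.

    `client` is assumed to expose a boto3-style `scan(**kwargs)` method.
    Returns the chosen segment number and the items that came back.
    """
    # Segment numbers are zero-based: 0 .. total_segments - 1.
    segment = rng.randrange(total_segments)
    response = client.scan(
        TableName=table_name,
        Segment=segment,
        TotalSegments=total_segments,
        Limit=limit,
    )
    # An empty segment simply yields no items.
    return segment, response.get("Items", [])
```

Usage would look something like `segment, items = scan_random_segment(boto3.client("dynamodb"), "Users")`, where "Users" stands in for your own table name.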

AWS
answered 2 years ago
  • I don't know why you mention parallel scans all of a sudden. Is it not possible to pick a random segment number to start from with a regular scan? Also, just to be clear, are you saying that a regular scan starting from a random segment number is what you would advise in my situation, and that it would be a good solution?

  • To be clear, I didn't say to do a parallel scan. I said to use the parallel scan functionality which lets you split the scan into segments and you can pick a random one to read from. That was assuming you wanted a random item. I can't really give advice for your situation since you only asked about scan behaviors and didn't give your requirements. Are your requirements to pick a single random item from a table?

-1

DynamoDB paginates the results from Scan operations. With pagination, the Scan results are divided into "pages" of data that are 1 MB in size (or less). An application can process the first page of results, then the second page, and so on. A single Scan only returns a result set that fits within the 1 MB size limit.
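To make the paging mechanics concrete, here is a minimal sketch of reading a table one page at a time. It assumes `client` is a boto3 DynamoDB client; the helper names `scan_page` and `iter_pages` are hypothetical. It relies on the Scan response's `LastEvaluatedKey`, which is passed back as `ExclusiveStartKey` to request the next page and is absent once the scan has reached the end.

```python
def scan_page(client, table_name, page_size=25, start_key=None):
    """Fetch one page of a scan.

    Returns (items, next_key). `next_key` is None when there are no
    more pages; otherwise pass it as `start_key` for the next call.
    """
    kwargs = {"TableName": table_name, "Limit": page_size}
    if start_key is not None:
        # Resume right after the last item of the previous page.
        kwargs["ExclusiveStartKey"] = start_key
    response = client.scan(**kwargs)
    return response.get("Items", []), response.get("LastEvaluatedKey")


def iter_pages(client, table_name, page_size=25):
    """Yield successive pages of items until the scan is exhausted."""
    next_key = None
    while True:
        items, next_key = scan_page(client, table_name, page_size, next_key)
        yield items
        if next_key is None:
            break
```

For a user-facing "next page" button, an application would typically hand `next_key` to the client and pass it back on the following request, rather than looping over the whole table in one go.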

See the links below for more information on Scan pagination.

[1] https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Scan.html

[2] https://aws.amazon.com/blogs/developer/understanding-auto-paginated-scan-with-dynamodbmapper/

AWS
Support Engineer
answered 2 years ago
