Does a paginated scan return pages in a deterministic or non-deterministic order?

0

Say my table looks like this:

                               |                attributes
 partition key: username       |        gender       |        age

I would like a user to be able to go through all rows in this table a little at a time, that is, in a paginated way. A paginated scan should obviously do the job here. My worry is that this will result in the first item in the table being accessed far more often than the remaining items, which is bad.

Is this really the case? Will a paginated scan always return items in a deterministic order, from the start of the table to the end? Or will it go through the whole table in a random order?

Asked 2 years ago, 578 views
2 Answers
0

Items have an internal order that's used during scans. You can't predict from the outside what the order will be because it's based on things like the hashes of PK values.

You want a semi-random item from the table? You can use the parallel scan functionality and specify up to a million segments and pick a random segment number to start your scan from. That would give you a million starting points (evenly distributed among the internal order of items).
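The random-segment idea above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the function name `scan_from_random_segment`, the table name, and the default page size are hypothetical, and `client` is assumed to be a low-level DynamoDB client (for example `boto3.client("dynamodb")`) whose `scan` call accepts `Segment`, `TotalSegments`, and `Limit`.

```python
import random

# Assumption: the "up to a million segments" mentioned above is the cap on TotalSegments.
MAX_SEGMENTS = 1_000_000


def scan_from_random_segment(client, table_name, total_segments=MAX_SEGMENTS,
                             limit=25, rng=random):
    """Read one page of items from a randomly chosen scan segment.

    Returns the chosen segment number and the items found on that page.
    """
    # Segment numbers are zero-based: 0 .. total_segments - 1.
    segment = rng.randrange(total_segments)
    response = client.scan(
        TableName=table_name,
        Segment=segment,
        TotalSegments=total_segments,
        Limit=limit,
    )
    return segment, response.get("Items", [])
```

With a small table and a large `total_segments`, many segments will contain no items, so a caller would probably want to retry with another segment when the returned list is empty.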

AWS
Answered 2 years ago
  • I don't know why you mention parallel scans all of a sudden. Is it not possible to pick a random segment number to start from with a regular scan? Also, just to be clear, are you saying that a regular scan starting from a random segment number is what you would advise in my situation, and that it would be a good solution?

  • To be clear, I didn't say to do a parallel scan. I said to use the parallel scan functionality which lets you split the scan into segments and you can pick a random one to read from. That was assuming you wanted a random item. I can't really give advice for your situation since you only asked about scan behaviors and didn't give your requirements. Are your requirements to pick a single random item from a table?

-1

DynamoDB paginates the results from Scan operations. With pagination, the Scan results are divided into "pages" of data that are 1 MB in size (or less). An application can process the first page of results, then the second page, and so on. A single Scan only returns a result set that fits within the 1 MB size limit.
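The page-by-page processing described above could look roughly like the sketch below. The helper name `iter_scan_pages` and its parameters are hypothetical; `client` is again assumed to be a low-level DynamoDB client such as `boto3.client("dynamodb")`. Each response carries a `LastEvaluatedKey` while more data remains, and passing it back as `ExclusiveStartKey` continues the scan from that point.

```python
def iter_scan_pages(client, table_name, page_size=None):
    """Yield the items of each Scan page until the table is exhausted."""
    kwargs = {"TableName": table_name}
    if page_size is not None:
        # Limit caps the number of items evaluated per request.
        kwargs["Limit"] = page_size
    while True:
        response = client.scan(**kwargs)
        yield response.get("Items", [])
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            # No LastEvaluatedKey means there are no more pages.
            break
        # Resume the next request right after the last evaluated item.
        kwargs["ExclusiveStartKey"] = last_key
```

For a user-facing pager, the `LastEvaluatedKey` of a page could instead be handed back to the user as a cursor and supplied on their next request, rather than looping through the whole table in one go.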

Please read through the links below for more information.

[1] https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Scan.html

[2] https://aws.amazon.com/blogs/developer/understanding-auto-paginated-scan-with-dynamodbmapper/

AWS
Support Engineer
Answered 2 years ago
