DynamoDB Hierarchical, Sorted Queries


I would like to be able to query data hierarchically and return the results ordered by another attribute. What is the most efficient way to store and query sorted hierarchical data?

For example, suppose I have a table with four attributes: customer_id, country, location, and last_updated, where location contains hierarchical information in the form state:county:city. A few records may look like:

------------|--------|-------------------|-------------|
customer_id |country |location           |last_updated |
------------|--------|-------------------|-------------|
123456      |USA     |WA:King:Seattle    |2022-03-18   |
789012      |USA     |WA:King:Kent       |2022-03-15   |
098765      |USA     |NY:Bronx:NYC       |2022-02-28   |
432109      |USA     |WA:Spokane:Spokane |2022-03-20   |

The PK of the table is customer_id because most queries will pull information by customer_id, but there are other use cases that will want to (a) find all customers within a given location (e.g. state or county), and (b) return the results sorted (descending) by last_updated. To accomplish (a), I have a GSI with country as the PK and location as the SK, and I query it with begins_with on location.
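For reference, here is a minimal sketch of the query in (a). It only builds the request parameters in the shape the low-level DynamoDB Query API expects; the table name `customers` and index name `country-location-index` are placeholders I made up for illustration, not real names from my setup.

```python
def build_location_query(country, location_prefix,
                         index_name="country-location-index"):
    """Build Query parameters for the GSI with PK=country, SK=location."""
    return {
        "TableName": "customers",  # hypothetical table name
        "IndexName": index_name,   # hypothetical index name
        "KeyConditionExpression": "#c = :country AND begins_with(#loc, :prefix)",
        # Placeholders avoid any clash with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#c": "country", "#loc": "location"},
        "ExpressionAttributeValues": {
            ":country": {"S": country},
            ":prefix": {"S": location_prefix},
        },
        # Results come back ordered by the index sort key (location),
        # not by last_updated, which is the problem described here.
        "ScanIndexForward": True,
    }


# The trailing ":" keeps "WA:King:" from also matching e.g. "WA:Kingston:...".
params = build_location_query("USA", "WA:King:")
# With boto3 installed: boto3.client("dynamodb").query(**params)
```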

But I can't figure out how to accomplish (b). My understanding is that result order is controlled with ScanIndexForward, which orders by the index's sort key, but I'm already using the GSI's sort key (location) for the hierarchical query. Is there a way to do both (a) and (b)?

Thanks!

Asked 2 years ago, 519 views
1 Answer
Accepted Answer

Create a GSI where the PK is country and location concatenated and the SK is last_updated. Then, for any given country+location value, you can retrieve the items ordered by last_updated (with the corresponding customer IDs in the projection). Note that you can set up 20 GSIs per table by default.
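A minimal sketch of that idea, under some assumptions: the concatenated attribute is called `country_location`, the separator is `#`, and the table and index names are placeholders. None of these names come from the question. The code only builds the Query parameters.

```python
def location_key(country, location):
    """Concatenate country and location into one partition key value."""
    return f"{country}#{location}"  # "#" separator is an assumption


def build_latest_query(country, location, limit=10):
    """Query parameters for a GSI with PK=country_location, SK=last_updated."""
    return {
        "TableName": "customers",                       # hypothetical
        "IndexName": "country_location-last_updated",   # hypothetical
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": "country_location"},
        "ExpressionAttributeValues": {
            ":pk": {"S": location_key(country, location)},
        },
        "ScanIndexForward": False,  # descending by last_updated
        "Limit": limit,             # only the latest `limit` items
    }


latest = build_latest_query("USA", "WA:King:Seattle", limit=5)
# With boto3 installed: boto3.client("dynamodb").query(**latest)
```

This works for an exact location value only; it does not cover a prefix such as WA:King.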

Or are you hoping to do a begins_with against the location (for example, the wider value WA:King) while also getting only the latest X items? In that case, you can't do it with a single index. You can, however, insert additional items for each hierarchical level (WA, WA:King, and so on) so that each level's value can be used as the PK. Basically, pre-materialize at each level. You get only one sort key per table or index, and you can't use it for both the begins_with match and the last_updated ordering.
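As a rough illustration of pre-materializing, the sketch below expands one customer record into one item per hierarchy level. The attribute names `location_level` and `sort_key`, and the `#` separator, are made up for the example. The customer_id is appended to the sort key on the assumption that these items live in a table where PK plus SK must be unique.

```python
def hierarchy_levels(country, location, sep=":"):
    """Return every prefix level, from the country down to the full location."""
    parts = location.split(sep)
    levels = [country]
    for i in range(1, len(parts) + 1):
        levels.append(f"{country}#{sep.join(parts[:i])}")
    return levels


def materialize(customer):
    """Build one item per hierarchy level for a single customer record."""
    return [
        {
            "location_level": level,  # partition key of the per-level items
            # ISO dates sort lexicographically; the suffix keeps keys unique.
            "sort_key": f"{customer['last_updated']}#{customer['customer_id']}",
            "customer_id": customer["customer_id"],
        }
        for level in hierarchy_levels(customer["country"], customer["location"])
    ]


items = materialize({
    "customer_id": "123456",
    "country": "USA",
    "location": "WA:King:Seattle",
    "last_updated": "2022-03-18",
})
```

Querying with the PK set to USA#WA:King and ScanIndexForward set to false would then return that county's customers newest first. The trade-off is extra writes: each update to last_updated has to rewrite the item at every level.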

Of course, if you intend to get ALL the matching items (not just the latest X many) and just want them sorted, you can sort on the client.
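For example, a client-side sort over the sample rows from the question could look like this. The rows are hard-coded here in place of a real query result; the startswith filter stands in for the begins_with condition.

```python
rows = [
    {"customer_id": "123456", "location": "WA:King:Seattle", "last_updated": "2022-03-18"},
    {"customer_id": "789012", "location": "WA:King:Kent", "last_updated": "2022-03-15"},
    {"customer_id": "098765", "location": "NY:Bronx:NYC", "last_updated": "2022-02-28"},
    {"customer_id": "432109", "location": "WA:Spokane:Spokane", "last_updated": "2022-03-20"},
]

# Stand-in for the begins_with("WA:") query against the GSI.
wa_rows = [r for r in rows if r["location"].startswith("WA:")]

# ISO 8601 dates (YYYY-MM-DD) sort correctly as plain strings.
newest_first = sorted(wa_rows, key=lambda r: r["last_updated"], reverse=True)

for r in newest_first:
    print(r["customer_id"], r["location"], r["last_updated"])
```

Keep in mind that a Query returns results in pages, so all pages have to be fetched before the sort is meaningful.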

AWS
Answered 2 years ago
  • Thanks! I was indeed hoping to do something like a begins_with, so it looks like I'm a little out of luck. I think what I'm going to do is create a different GSI for each potential level of hierarchy (e.g. one for Country, one for State, etc.) with the location as the PK and last_updated as the SK, and query the appropriate GSI based on the level of hierarchy needed.
