DynamoDB Hierarchical, Sorted Queries

0

I would like to be able to query data hierarchically and return the results ordered by another attribute. What is the most efficient way to store and query sorted hierarchical data?

For example, if I have a table with four attributes: customer_id, country, location, and last_updated_date, where location contains hierarchical information such as state:county:city, so a few records may look like:

------------|--------|-------------------|-------------|
customer_id |country |location           |last_updated |
------------|--------|-------------------|-------------|
123456      |USA     |WA:King:Seattle    |2022-03-18   |
789012      |USA     |WA:King:Kent       |2022-03-15   |
098765      |USA     |NY:Bronx:NYC       |2022-02-28   |
432109      |USA     |WA:Spokane:Spokane |2022-03-20   |

The PK of the table is the customer_id because most queries will pull information by customer_id, but there are other use cases that will want to (a) find all customers within a given location (e.g. state or county), and (b) return the results sorted (descending) by last_updated. To accomplish (a), I have a GSI, with country as the PK and location as the SK, querying the GSI using location.begins_with.

But I can't figure out how to accomplish (b). My understanding is that ordering operations are usually performed with scanIndexForward, but I'm already using the GSI for the hierarchical query. Is there a way to do both (a) and (b)?

Thanks!

asked 3 years ago621 views
1 Answer
1
Accepted Answer

Create a GSI where the PK is Country+Location concatenated and the SK is the last_updated. Then for any given Country+Location value you can find the last updated values (and corresponding customer ids in the projection) for that location. Note you can setup 20 GSIs.

Or are you hoping to do a begins_with against the location (like for the wider value of WA:King) while also getting only the latest X many? In which case, well, you can't. You can however insert new items for each hierarchical level (like WA:King, etc) so that that value can be used as the PK. Basically pre-materialize at each level. You only get one sort key per table/index and you can't use the sort key functionality twice.

Of course, if you intend to get ALL the matching items (not just the latest X many) and just want them sorted, you can sort on the client.

AWS
answered 3 years ago
profile picture
EXPERT
reviewed 4 months ago
  • Thanks! I was indeed hoping to do something like a begins_with, so it looks like I'm a little out of luck. I think what I'm going to do is create a different GSI for each potential level of hierarchy (e.g. one for Country, one for State, etc.) with the location as the PK and last_updated as the SK, and query the appropriate GSI based on the level of hierarchy needed.

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions