DynamoDB Hierarchical, Sorted Queries
I would like to be able to query data hierarchically and return the results ordered by another attribute. What is the most efficient way to store and query sorted hierarchical data?
For example, if I have a table with four attributes: customer_id
, country
, location
, and last_updated_date
, where location
contains hierarchical information such as state:county:city
, so a few records may look like:
------------|--------|-------------------|-------------|
customer_id |country |location |last_updated |
------------|--------|-------------------|-------------|
123456 |USA |WA:King:Seattle |2022-03-18 |
789012 |USA |WA:King:Kent |2022-03-15 |
098765 |USA |NY:Bronx:NYC |2022-02-28 |
432109 |USA |WA:Spokane:Spokane |2022-03-20 |
The PK
of the table is the customer_id
because most queries will pull information by customer_id
, but there are other use cases that will want to (a) find all customers within a given location (e.g. state
or county
), and (b) return the results sorted (descending) by last_updated
. To accomplish (a), I have a GSI
, with country
as the PK
and location
as the SK
, query
ing the GSI
using location.begins_with
.
But I can't figure out how to accomplish (b). My understanding is that ordering operations are usually performed with scanIndexForward
, but I'm already using the GSI
for the hierarchical query. Is there a way to do both (a) and (b)?
Thanks!
Create a GSI where the PK is Country+Location concatenated and the SK is the last_updated. Then for any given Country+Location value you can find the last updated values (and corresponding customer ids in the projection) for that location. Note you can setup 20 GSIs.
Or are you hoping to do a begins_with against the location (like for the wider value of WA:King) while also getting only the latest X many? In which case, well, you can't. You can however insert new items for each hierarchical level (like WA:King, etc) so that that value can be used as the PK. Basically pre-materialize at each level. You only get one sort key per table/index and you can't use the sort key functionality twice.
Of course, if you intend to get ALL the matching items (not just the latest X many) and just want them sorted, you can sort on the client.
Relevant questions
Now DynamoDB can return the throughput capacity consumed by PartiQL API calls to help you optimize your queries and throughput costs
asked 4 months agoAthena - Execute multiple query and capture output in a file on e2
asked 2 months agoSuggestion : Update multiple items based on a common query
Accepted Answerasked 3 years agoDynamoDB Update API response filter
asked 6 days agoRowCount for DynamoDB
Accepted Answerasked 3 years agoDelete whole partition or Delete using "begins_with" on the Partition Key?
asked 3 years agoObjects used in queries
Accepted Answerasked 3 years agoRetrieving Athena Query History data
asked 3 years agoHow to store the Athena Query Results in DynamoDB table?
asked 3 years agoDynamoDB Hierarchical, Sorted Queries
Accepted Answerasked 2 months ago
Thanks! I was indeed hoping to do something like a
begins_with
, so it looks like I'm a little out of luck. I think what I'm going to do is create a different GSI for each potential level of hierarchy (e.g. one for Country, one for State, etc.) with the location as the PK andlast_updated
as the SK, and query the appropriate GSI based on the level of hierarchy needed.