Hierarchical data plus condition - Table Design

0

Hi all,

I'm having problems designing my DynamoDB table to support a fairly simple access pattern. I hope you can help me a little bit :)

I have 4 different products types (A, B, C and D) that have a price and a location (country#state#city).

The access patterns are:

(1) Filter by product type
(2) Filter by product type and location
(3) Filter by product type, location and price

The problem is that a product located in USA#NY#NY must be also available in USA#NY and USA.

To be able to filter by location and price I came up with this solution, however, a lot of data gets duplicated and I am sure there must be a much better solution:

PK |     SK & GSI PK     | GSI SK | Other product details (duplicated data)
-----------------------------------------------------------------------------------
ID | TYPE                | PRICE | Image, name, etc
ID | TYPE#USA            | PRICE | Image, name, etc
ID | TYPE#USA#NY         | PRICE | Image, name, etc
ID | TYPE#USA#NY#NY      | PRICE | Image, name, etc

This solves every access pattern:

(1) GSI PK = TYPE

(2) GSI PK = TYPE#USA
GSI PK = TYPE#USA#NY
GSI PK = TYPE#USA#NY#NY

(3) GSI PK = TYPE#USA & GSI SK > 150
GSI PK = TYPE#USA#NY & GSI SK > 150

Reads are efficient but a lot of data gets duplicated (price and product details) and updating an item requires multiple writes.

Is it possible to achieve this without duplicating all the product details?

Edit: I don't want to use filter expressions on the price as it is the most common filter pattern and I'd like to make it efficient and cheap

Edited by: jexposito on Aug 1, 2019 9:53 AM

asked 5 years ago203 views
3 Answers
0
Accepted Answer

I'll show you two alternate options (below). A few things before we dive into them.
First, looking at your schema, each item has about 5 attributes, so I expect your items are average 40B-50B each? With that assumption, we know 1RCU can read 4K data (or 8K if you are okay with eventual consistency). At 1RCU per 4K, you can read, say, 100 items with 1 RCU. If on an average, each of your queries will return 100 items, then you can go with your current schema, and filter out the items you don't need.
Second, instead of duplicating the items by implementing that logic in your application code, you can leverage GSIs. See option#2 below.
With that, please review the following. Your business use case and shape of your data will drive which option works best for you.
==Alternate option#1==
Table-PK | Table-SK | GSI-PK | GSI-SK | Price
abc | xyz | TYPE | USA#NY#NY#PRICE | $x

(1) Filter by product type
GSI-PK = TYPE

(2) Filter by product type and location
GSI-PK = TYPE, GSI-SK starts with USA
GSI-PK = TYPE, GSI-SK starts with USA#NY
GSI-PK = TYPE, GSI-SK starts with USA#NY#NY

(3) Filter by product type, location and price
GSI-PK = TYPE, GSI-SK starts with USA, filter (price)
GSI-PK = TYPE, GSI-SK starts with USA#NY, filter (price)
GSI-PK = TYPE, GSI-SK > USA#NY#NY#PRICE

==Alternate option#2==
Table-PK | Table-SK | GSI-PK | GSI1-SK | GSI2-SK | GSI3-SK
abc | xyz | TYPE | USA#PRICE | USA#NY#PRICE | USA#NY#NY#PRICE

(1) Filter by product type
GSI-PK = TYPE

(2) Filter by product type and location
GSI-PK = TYPE, GSI3-SK starts with USA
GSI-PK = TYPE, GSI3-SK starts with USA#NY
GSI-PK = TYPE, GSI3-SK starts with USA#NY#NY

(3) Filter by product type, location and price
GSI-PK = TYPE, GSI1-SK > USA#PRICE
GSI-PK = TYPE, GSI2-SK > USA#NY#PRICE
GSI-PK = TYPE, GSI3-SK > USA#NY#NY#PRICE

AWS
answered 5 years ago
0

Thank you so much for your answer @padma-aws it was very helpful.

You are right about the cost of filtering and I'll definitely consider it for other use cases in my app.

However, I find the second solution much better as it also allows me to sort by price (expensive/cheap first) and the first one doesn't.

answered 5 years ago
0

Glad it is helpful. Have a nice day.

AWS
answered 5 years ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions