DynamoDB table design

0

A specific question about DynamoDB table design:

OPTION1:

TAGS_TABLE | tag_id (PK) | sched_id (SK) | shed_info ... | content_info ...|

OPTION2:

TAGS_TABLE | tag_id (PK) | sched_id (SK) | shed_info ... | content_id |

CONTENT_TABLE | content_id | content_info ...|

Taking into account that content_info will often be the same in many entries from TAGS_TABLE does it make sense to split table in two? However, content_info itself will not be requested separately from tag.

AWS
asked 5 years ago300 views
1 Answer
0
Accepted Answer

Generally speaking, DDB best practice is to use a single table for a given use case / app / microservice.

The two-table approach requires extra work at the application layer to query across the two tables, aggregate results, etc. You're trying to apply relational patterns to a non-relational database. If you're using multiple tables in DDB, then typically you can either combine them into one by rethinking your key selection or, perhaps, a relational database may be the better choice.

One would need a thorough understanding of your customer's specific read and write patterns and volume before being comfortable giving advice on key selection.

Customers new to DDB often fall into relational habits of giving their primary and sort key the name of a specific attribute, such tag_id or sched_id. While this is totally acceptable and fits many use cases, there are ways to think outside the box here.

For example, you could name your primary key primary_key and your sort key sort_key. By doing this, it becomes apparent than a single DDB table can hold more than 1 conceptual table from a relational world. e.g. your one table could hold customer data where the primary_key attribute has a format of cust_123 and, in the same table, hold order data where the primary_key = order_xyz.

Below is an example of a DDB table that holds both customer master data and order history. In a relational world, this would need to be (at least) two tables, but in DDB it could be one:

primary_key             (other attributes)
------------------      ------------------------------------------
cust_123                name=matt, state=AZ
cust_124                name=sam , state=NY
order_1                 customer=cust123   order_date=12/31/2015, order_items=[......]

Recommend watching Advanced Design Patterns for DynamoDB: https://www.youtube.com/watch?v=HaEPXoXVf2k

Taking into account that content_info will often be the same in many entries from TAGS_TABLE does it make sense to split table in two? However, content_info itself will not be requested separately from tag.

If content_info is stored with each item:

  • read logic is easier, just get a single item
  • consumes more storage for the duplicate content
  • requires fewer GetItem() calls to retrieve data = faster response, potentially less RCU
  • if content_info changes, more WCU to update across all affected items

if content_info stored as separate item:

  • read logic more complicated in app tier
  • consumes less storage
  • requires two GetItem() API calls to for each logical read
  • if content_info changes, only requires one write

Other questions:

  • how often is content_info changing?
  • how many tag_id items might have the same content_info?
answered 5 years ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions