Best partition key and sort key for dynamoDB

0

Hi! I need to store in dynamoDB information related to users. That information will be queried (I need milliseconds response time) by month and year (separately), so I was thinking of different approaches here. First of all, I was thinking of:

  • Partition key: the unique identifier for the customer (for example: 3322)
  • SortKey: the year (numeric, for example: 2023)
  • Create an index using the partition key and the month (numeric, for example: 01)

But I was also doubting about using just a single partitionkey with the syntax: customerid+_Mmonth+_Yyear. For example, if the customerid is 3322, january 2023 would be stored as: partitionkey = 3322_M01_Y2023 In this case no sort key would be needed. My main concern here is whether it will be efficient to query by "partition key ends with Y2023" when searching by year or "partitionkey contains M01" when searching by month. Which approach should I follow?

Thank you for your help!

Pascual
asked 2 years ago3138 views
2 Answers
2
Accepted Answer

But I was also doubting about using just a single partitionkey with the syntax: customerid+_Mmonth+_Yyear. For example, if the customerid is 3322, january 2023 would be stored as: partitionkey = 3322_M01_Y2023 In this case no sort key would be needed. My main concern here is whether it will be efficient to query by "partition key ends with Y2023" when searching by year or "partitionkey contains M01" when searching by month. Which approach should I follow?

You cannot achieve either of those with a partition key. In DynamoDB you must provide the full value of the partition key. The sort key can be used for begins_with or between but not for ends_with or contains. If that sound confusing to you, read up on B+ tree sorting as thats what it resembles.

profile pictureAWS
EXPERT
answered 2 years ago
profile picture
EXPERT
reviewed 4 months ago
profile picture
EXPERT
reviewed 6 months ago
  • Ok, so once that option is discarded, taking into account the first approach, I could also do:

    • partition key: customer id
    • sort key: yyyy-mm-dd This way, I could search by year, by month (as I will also know the year) and by day (as I will know the year and the month)
  • Yes, in that case you would also need the customer id. If your access pattern is Give me all the data for customer 123 between date 1 and date 2 then this will work well.

  • You mean to also include the customerid in the sortkey? Something like 3322-2023-02-10 instead of just 2023-02-10?

  • No. Easier to ask what is your primary look-up pattern?

  • I'll need the following type of queries:

    • Give me all entries for the customerid=X where month=Y and year=Z (and we'll draw a graph showing daily data)
    • Give me all entries for the customerid=X where year=Z (and we'll draw a graph showing monthly data)

    Having this information, I guess the partitionkey being the customerid is a must. But then, as we don't need to fetch specific days, we could only store the sortkey as year+month.

0

Another approach is storing the date as a Epoch time number as described here: Using Sort Keys to Organize Data in Amazon DynamoDB.

profile pictureAWS
EXPERT
kentrad
answered 2 years ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions