Best partition key and sort key for DynamoDB


Hi! I need to store user-related information in DynamoDB. The data will be queried by month and by year (separately), and I need millisecond response times, so I have been considering different approaches. My first idea was:

  • Partition key: the unique identifier for the customer (for example: 3322)
  • Sort key: the year (numeric, for example: 2023)
  • Create an index using the partition key and the month (numeric, for example: 01)

But I was also considering a single partition key with the syntax customerid+_Mmonth+_Yyear. For example, if the customerid is 3322, January 2023 would be stored as partitionkey = 3322_M01_Y2023. In this case no sort key would be needed. My main concern is whether it would be efficient to query by "partition key ends with Y2023" when searching by year, or "partition key contains M01" when searching by month. Which approach should I follow?

Thank you for your help!

Pascual
Asked 1 year ago, viewed 2365 times
2 Answers
Accepted Answer

But I was also considering a single partition key with the syntax customerid+_Mmonth+_Yyear. For example, if the customerid is 3322, January 2023 would be stored as partitionkey = 3322_M01_Y2023. In this case no sort key would be needed. My main concern is whether it would be efficient to query by "partition key ends with Y2023" when searching by year, or "partition key contains M01" when searching by month. Which approach should I follow?

You cannot achieve either of those with a partition key. In DynamoDB a query must provide the full value of the partition key. The sort key can be used with begins_with or between, but not with ends_with or contains. If that sounds confusing, read up on B+ tree sorting, as that's what it resembles.
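To illustrate why prefix and range conditions are cheap while suffix and substring conditions are not, here is a minimal sketch in plain Python. It is only an analogy using a sorted list, not a description of DynamoDB's internals, and the sample key values and helper names are made up for the example.

```python
from bisect import bisect_left, bisect_right

# Hypothetical sort-key values inside one partition, kept in sorted order
# (loosely similar to how the leaf level of a B+ tree presents keys).
sort_keys = sorted([
    "2022-12-31", "2023-01-05", "2023-01-20",
    "2023-02-10", "2023-11-01", "2024-01-05",
])

def begins_with(keys, prefix):
    """Seek to the first candidate, then read forward while the prefix matches."""
    start = bisect_left(keys, prefix)
    end = start
    while end < len(keys) and keys[end].startswith(prefix):
        end += 1
    return keys[start:end]

def between(keys, low, high):
    """Inclusive range: two seeks give one contiguous slice."""
    return keys[bisect_left(keys, low):bisect_right(keys, high)]

def contains(keys, fragment):
    """Ordering does not help here, so every key has to be inspected."""
    return [k for k in keys if fragment in k]

year_2023 = begins_with(sort_keys, "2023")
jan_2023 = between(sort_keys, "2023-01-01", "2023-01-31")
any_january = contains(sort_keys, "-01-")
```

The matches for begins_with and between sit next to each other in the sorted order, so they can be read as one slice. The matches for contains are scattered across the list, which is why that kind of condition cannot be served by a key lookup.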

AWS Expert, answered 1 year ago
Expert, reviewed 1 month ago
  • OK, so with that option discarded and going back to the first approach, I could also do:

    • partition key: customer id
    • sort key: yyyy-mm-dd

    This way I could search by year, by month (as I will also know the year), and by day (as I will know the year and the month).

  • Yes, in that case you would also need the customer id. If your access pattern is "Give me all the data for customer 123 between date 1 and date 2", then this will work well.

  • Do you mean I should also include the customer ID in the sort key? Something like 3322-2023-02-10 instead of just 2023-02-10?

  • No. It's easier to ask: what is your primary look-up pattern?

  • I'll need the following types of queries:

    • Give me all entries for the customerid=X where month=Y and year=Z (and we'll draw a graph showing daily data)
    • Give me all entries for the customerid=X where year=Z (and we'll draw a graph showing monthly data)

    Given this, I guess using the customerid as the partition key is a must. But then, as we don't need to fetch specific days, we could store only year+month in the sort key.
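The two look-up patterns discussed in these comments could be expressed roughly as follows, assuming the customer ID is the partition key and the sort key is a yyyy-mm-dd string. This is a sketch only: the table name CustomerEntries and the attribute names customer_id and entry_date are hypothetical. The dictionaries follow the shape of the low-level DynamoDB Query request, so with boto3 they could be passed as client.query(**params).

```python
TABLE_NAME = "CustomerEntries"  # hypothetical table name

def _query_params(customer_id: int, date_prefix: str) -> dict:
    """Build Query parameters: exact partition key plus a sort-key prefix."""
    return {
        "TableName": TABLE_NAME,
        # customer_id and entry_date are assumed attribute names.
        "KeyConditionExpression":
            "customer_id = :cid AND begins_with(entry_date, :prefix)",
        "ExpressionAttributeValues": {
            ":cid": {"N": str(customer_id)},
            ":prefix": {"S": date_prefix},
        },
    }

def year_query(customer_id: int, year: int) -> dict:
    """All entries for one customer in one year (for the monthly graph)."""
    return _query_params(customer_id, f"{year:04d}-")

def month_query(customer_id: int, year: int, month: int) -> dict:
    """All entries for one customer in one month (for the daily graph)."""
    return _query_params(customer_id, f"{year:04d}-{month:02d}-")

params = month_query(3322, 2023, 1)
# With boto3 this might be used as:
#   boto3.client("dynamodb").query(**params)
```

Both queries hit a single partition and read one contiguous range of sort keys. A single Query call returns at most 1 MB of data, so a full year of entries may need pagination via LastEvaluatedKey.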


Another approach is to store the date as an epoch time number, as described in "Using Sort Keys to Organize Data in Amazon DynamoDB".
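With a numeric epoch sort key, a month or year query becomes a BETWEEN on two timestamps. A minimal sketch of computing those bounds, assuming timestamps are stored as whole seconds in UTC; the attribute names customer_id and entry_ts are hypothetical.

```python
from datetime import datetime, timezone

def _epoch(year: int, month: int) -> int:
    """Epoch seconds for midnight UTC on the first day of the given month."""
    return int(datetime(year, month, 1, tzinfo=timezone.utc).timestamp())

def month_bounds(year: int, month: int) -> tuple[int, int]:
    """Inclusive (first_second, last_second) of a calendar month in UTC."""
    next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
    return _epoch(year, month), _epoch(next_year, next_month) - 1

def year_bounds(year: int) -> tuple[int, int]:
    """Inclusive (first_second, last_second) of a calendar year in UTC."""
    return _epoch(year, 1), _epoch(year + 1, 1) - 1

def range_query(customer_id: int, low: int, high: int) -> dict:
    """Key condition for a low-level Query request; BETWEEN is inclusive."""
    return {
        "KeyConditionExpression":
            "customer_id = :cid AND entry_ts BETWEEN :lo AND :hi",
        "ExpressionAttributeValues": {
            ":cid": {"N": str(customer_id)},
            ":lo": {"N": str(low)},
            ":hi": {"N": str(high)},
        },
    }

january = range_query(3322, *month_bounds(2023, 1))
```

Compared with a yyyy-mm-dd string, the epoch form needs this conversion step on every query, but it keeps full time-of-day precision in the key.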

kentrad (AWS Expert), answered 1 year ago
