Best partition key and sort key for dynamoDB

0

Hi! I need to store in dynamoDB information related to users. That information will be queried (I need milliseconds response time) by month and year (separately), so I was thinking of different approaches here. First of all, I was thinking of:

  • Partition key: the unique identifier for the customer (for example: 3322)
  • SortKey: the year (numeric, for example: 2023)
  • Create an index using the partition key and the month (numeric, for example: 01)

But I was also doubting about using just a single partitionkey with the syntax: customerid+_Mmonth+_Yyear. For example, if the customerid is 3322, january 2023 would be stored as: partitionkey = 3322_M01_Y2023 In this case no sort key would be needed. My main concern here is whether it will be efficient to query by "partition key ends with Y2023" when searching by year or "partitionkey contains M01" when searching by month. Which approach should I follow?

Thank you for your help!

Pascual
已提问 1 年前2367 查看次数
2 回答
1
已接受的回答

But I was also doubting about using just a single partitionkey with the syntax: customerid+_Mmonth+_Yyear. For example, if the customerid is 3322, january 2023 would be stored as: partitionkey = 3322_M01_Y2023 In this case no sort key would be needed. My main concern here is whether it will be efficient to query by "partition key ends with Y2023" when searching by year or "partitionkey contains M01" when searching by month. Which approach should I follow?

You cannot achieve either of those with a partition key. In DynamoDB you must provide the full value of the partition key. The sort key can be used for begins_with or between but not for ends_with or contains. If that sound confusing to you, read up on B+ tree sorting as thats what it resembles.

profile pictureAWS
专家
已回答 1 年前
profile picture
专家
已审核 1 个月前
  • Ok, so once that option is discarded, taking into account the first approach, I could also do:

    • partition key: customer id
    • sort key: yyyy-mm-dd This way, I could search by year, by month (as I will also know the year) and by day (as I will know the year and the month)
  • Yes, in that case you would also need the customer id. If your access pattern is Give me all the data for customer 123 between date 1 and date 2 then this will work well.

  • You mean to also include the customerid in the sortkey? Something like 3322-2023-02-10 instead of just 2023-02-10?

  • No. Easier to ask what is your primary look-up pattern?

  • I'll need the following type of queries:

    • Give me all entries for the customerid=X where month=Y and year=Z (and we'll draw a graph showing daily data)
    • Give me all entries for the customerid=X where year=Z (and we'll draw a graph showing monthly data)

    Having this information, I guess the partitionkey being the customerid is a must. But then, as we don't need to fetch specific days, we could only store the sortkey as year+month.

0

Another approach is storing the date as a Epoch time number as described here: Using Sort Keys to Organize Data in Amazon DynamoDB.

profile pictureAWS
专家
kentrad
已回答 1 年前

您未登录。 登录 发布回答。

一个好的回答可以清楚地解答问题和提供建设性反馈,并能促进提问者的职业发展。

回答问题的准则