Best partition key and sort key for dynamoDB

0

Hi! I need to store in dynamoDB information related to users. That information will be queried (I need milliseconds response time) by month and year (separately), so I was thinking of different approaches here. First of all, I was thinking of:

  • Partition key: the unique identifier for the customer (for example: 3322)
  • SortKey: the year (numeric, for example: 2023)
  • Create an index using the partition key and the month (numeric, for example: 01)

But I was also doubting about using just a single partitionkey with the syntax: customerid+_Mmonth+_Yyear. For example, if the customerid is 3322, january 2023 would be stored as: partitionkey = 3322_M01_Y2023 In this case no sort key would be needed. My main concern here is whether it will be efficient to query by "partition key ends with Y2023" when searching by year or "partitionkey contains M01" when searching by month. Which approach should I follow?

Thank you for your help!

Pascual
質問済み 1年前2369ビュー
2回答
1
承認された回答

But I was also doubting about using just a single partitionkey with the syntax: customerid+_Mmonth+_Yyear. For example, if the customerid is 3322, january 2023 would be stored as: partitionkey = 3322_M01_Y2023 In this case no sort key would be needed. My main concern here is whether it will be efficient to query by "partition key ends with Y2023" when searching by year or "partitionkey contains M01" when searching by month. Which approach should I follow?

You cannot achieve either of those with a partition key. In DynamoDB you must provide the full value of the partition key. The sort key can be used for begins_with or between but not for ends_with or contains. If that sound confusing to you, read up on B+ tree sorting as thats what it resembles.

profile pictureAWS
エキスパート
回答済み 1年前
profile picture
エキスパート
レビュー済み 1ヶ月前
  • Ok, so once that option is discarded, taking into account the first approach, I could also do:

    • partition key: customer id
    • sort key: yyyy-mm-dd This way, I could search by year, by month (as I will also know the year) and by day (as I will know the year and the month)
  • Yes, in that case you would also need the customer id. If your access pattern is Give me all the data for customer 123 between date 1 and date 2 then this will work well.

  • You mean to also include the customerid in the sortkey? Something like 3322-2023-02-10 instead of just 2023-02-10?

  • No. Easier to ask what is your primary look-up pattern?

  • I'll need the following type of queries:

    • Give me all entries for the customerid=X where month=Y and year=Z (and we'll draw a graph showing daily data)
    • Give me all entries for the customerid=X where year=Z (and we'll draw a graph showing monthly data)

    Having this information, I guess the partitionkey being the customerid is a must. But then, as we don't need to fetch specific days, we could only store the sortkey as year+month.

0

Another approach is storing the date as a Epoch time number as described here: Using Sort Keys to Organize Data in Amazon DynamoDB.

profile pictureAWS
エキスパート
kentrad
回答済み 1年前

ログインしていません。 ログイン 回答を投稿する。

優れた回答とは、質問に明確に答え、建設的なフィードバックを提供し、質問者の専門分野におけるスキルの向上を促すものです。

質問に答えるためのガイドライン

関連するコンテンツ