Best partition key and sort key for dynamoDB

0

Hi! I need to store in dynamoDB information related to users. That information will be queried (I need milliseconds response time) by month and year (separately), so I was thinking of different approaches here. First of all, I was thinking of:

  • Partition key: the unique identifier for the customer (for example: 3322)
  • SortKey: the year (numeric, for example: 2023)
  • Create an index using the partition key and the month (numeric, for example: 01)

But I was also doubting about using just a single partitionkey with the syntax: customerid+_Mmonth+_Yyear. For example, if the customerid is 3322, january 2023 would be stored as: partitionkey = 3322_M01_Y2023 In this case no sort key would be needed. My main concern here is whether it will be efficient to query by "partition key ends with Y2023" when searching by year or "partitionkey contains M01" when searching by month. Which approach should I follow?

Thank you for your help!

Pascual
질문됨 일 년 전2366회 조회
2개 답변
1
수락된 답변

But I was also doubting about using just a single partitionkey with the syntax: customerid+_Mmonth+_Yyear. For example, if the customerid is 3322, january 2023 would be stored as: partitionkey = 3322_M01_Y2023 In this case no sort key would be needed. My main concern here is whether it will be efficient to query by "partition key ends with Y2023" when searching by year or "partitionkey contains M01" when searching by month. Which approach should I follow?

You cannot achieve either of those with a partition key. In DynamoDB you must provide the full value of the partition key. The sort key can be used for begins_with or between but not for ends_with or contains. If that sound confusing to you, read up on B+ tree sorting as thats what it resembles.

profile pictureAWS
전문가
답변함 일 년 전
profile picture
전문가
검토됨 한 달 전
  • Ok, so once that option is discarded, taking into account the first approach, I could also do:

    • partition key: customer id
    • sort key: yyyy-mm-dd This way, I could search by year, by month (as I will also know the year) and by day (as I will know the year and the month)
  • Yes, in that case you would also need the customer id. If your access pattern is Give me all the data for customer 123 between date 1 and date 2 then this will work well.

  • You mean to also include the customerid in the sortkey? Something like 3322-2023-02-10 instead of just 2023-02-10?

  • No. Easier to ask what is your primary look-up pattern?

  • I'll need the following type of queries:

    • Give me all entries for the customerid=X where month=Y and year=Z (and we'll draw a graph showing daily data)
    • Give me all entries for the customerid=X where year=Z (and we'll draw a graph showing monthly data)

    Having this information, I guess the partitionkey being the customerid is a must. But then, as we don't need to fetch specific days, we could only store the sortkey as year+month.

0

Another approach is storing the date as a Epoch time number as described here: Using Sort Keys to Organize Data in Amazon DynamoDB.

profile pictureAWS
전문가
kentrad
답변함 일 년 전

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인

관련 콘텐츠