Secondary Index in DynamoDB

0

Hi,

I have a use case where I am fetching data on certain items (unique itemID) multiple times a day (identified by day_BatchTime) and storing them on a dyanmoDB. My composite primary key consists of itemID & day_BatchTime. I have set itemID as parition key and day_BatchTime as sort key.

But I need to report data for each day on a daily basis. So I tried setting up a global secondary index as feedDate. But running the query on global secondary index did not return any result (screenshot: https://ibb.co/kJ3p5mZ ). Please let me know if I am missing something and if there is a better way to achieve my objective of day wise items.

I am using AWS console to fetch this data for now.

Regards, Dbeings

질문됨 3년 전901회 조회
3개 답변
0

Hi,

have you verified that feedDate has the same data type in both the table and the Secondary Index?

could you please share the table and index details?

thank you

AWS
전문가
답변함 3년 전
  • Hi Fabrizio,

    The feedDate query is now working in AWS console but cant get it working through Python code in Lambda. I followed the docs and below is the snippet. My dataframe is not getting populated. Without the filter expression, it gets populated so I think there is something off in filter expression. It would be great if you could provide some pointers.

    done = False start_key = None

    scan_kwargs = {'FilterExpression': Key('feedDate').eq('15-03-2022'), 'ProjectionExpression':'feedDate, itemID, val' }

    while not done:
    if start_key: scan_kwargs['ExclusiveStartKey'] = start_key response = table.scan(**scan_kwargs) items = response.get('Items', []) page_no = page_no+1 item_i = 0 if len(items) == 0: break while item_i<len(items): for key in items[item_i]: #mystr = mystr + ":" + key + str(items[item_i][key]) df_list.loc[row_num,key] = str(items[item_i][key]) item_i = item_i+ 1 row_num = row_num+1 start_key = response.get('LastEvaluatedKey', None) done = start_key is None

    dfLen = len(df_list)
    

    Regards, Dbeings

0

Hi Fabrizio,

The feedDate is of type String in both cases. Now I am able to run scan and get the results but I am still not able to run a query. The query requires a partition key and my partition key are different item ids for each day.

Below are key elements of table and details on global secondary index: Partition key: itemID (String) Sort key: Batch (String) feedDate(String)

Global secondary index details Name feedDate-index Status  Active Partition key feedDate (String) Sort key - Read capacity Range: 1 - 10; Auto scaling at 70%; Current provisioned units: 1 Write capacity Range: 1 - 10; Auto scaling at 70%; Current provisioned units: 1 Size 2.4 megabytes Item count 8527

Regards, Dbeings

답변함 3년 전
0

DynamoDB is designed for quick LOOKUP using specific keys. It is not designed to be an analytical tool, even if you have options such as SCAN and QUERY. If you know that you need to provide aggregations (or any other type of analytical functions), it is better to either create aggregation tables in DDB or export the data through DynamoDB streams to an analytical tool such as Athena.

For example, every time that you put another record in your main table above, you can also increase a counter to a day-count table. See an example for such logic here: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GettingStarted.Python.03.html#GettingStarted.Python.03.04

You don't need to limit your LOOKUP to a single record, as you have the sort key to get a few records. However, if you see that you need to read hundreds or even dozens of records to generate a specific statistic such as count, max, mean, etc. you are using DynamoDB inefficiently.

답변함 3년 전

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인

관련 콘텐츠