Ultimately it depends on what determines a group. If it is a groupId or some other common attribute, then you can create a Global Secondary Index with that attribute as the partition key and the timestamp as the sort key.
If the group is determined by the user knowing each of the MachineIds, then send an individual query for each machine in parallel, combine the results on the client side, and sort the combined array by timestamp.
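A minimal sketch of that parallel-query-and-merge approach, under stated assumptions: `query_latest` is a hypothetical callable (not part of any SDK) that returns one machine's newest items, already sorted newest first, and each item is assumed to be a dict with a `timestamp` key. Because each per-machine list is already sorted, a k-way merge can stand in for a full sort.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def latest_across_machines(machine_ids, query_latest, limit=10, max_workers=16):
    """Query each machine in parallel and return the newest `limit` items overall.

    `query_latest(machine_id, limit)` is a hypothetical helper that must return
    that machine's items sorted newest first, each with a "timestamp" key.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # One query per machine (one partition key per query).
        per_machine = list(pool.map(lambda m: query_latest(m, limit), machine_ids))

    # Every input list is sorted descending, so merge them lazily in that order.
    merged = heapq.merge(
        *per_machine, key=lambda item: item["timestamp"], reverse=True
    )
    # Keep only the newest `limit` items across all machines.
    return list(islice(merged, limit))
```

With boto3, `query_latest` could plausibly wrap a `Table.query` call that sets `ScanIndexForward=False` and `Limit`, so each machine returns only its newest few items.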
To address your requirement of listing the latest records for multiple machines sorted by date, you can indeed use a Global Secondary Index (GSI) in DynamoDB. Here's a suggested approach:
- Keep your current table structure with machine ID as the partition key and date as the sort key.
- Create a GSI with the following structure:
  - Partition key: A new attribute, let's call it "UserGroup"
  - Sort key: Date (timestamp)
  - Project other necessary attributes
- When saving task logs, add the "UserGroup" attribute to each item. This attribute should represent the group of machines a user can view.
- To query the latest records for multiple machines:
  - Use the GSI
  - Specify the "UserGroup" as the partition key
  - Use the Date as the sort key with a condition for the desired time range
  - Sort in descending order
This setup allows you to efficiently query across multiple machines (partition keys) that belong to a specific user group, sorted by date.
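As a rough illustration of the GSI described above, the sketch below builds the parameters for a DynamoDB `UpdateTable` request that adds such an index. The table name `TaskLogs`, the index name `UserGroup-Date-index`, and the projected `Status` attribute are hypothetical, and it assumes `Date` is stored as a string and that the table uses on-demand capacity (so no `ProvisionedThroughput` is given).

```python
def build_gsi_update(table_name="TaskLogs", index_name="UserGroup-Date-index"):
    """Build UpdateTable parameters that add a GSI keyed on UserGroup + Date.

    Table name, index name, and the projected "Status" attribute are
    hypothetical placeholders; adjust them to the real schema.
    """
    return {
        "TableName": table_name,
        # Every attribute used in the new index's key schema must be declared.
        "AttributeDefinitions": [
            {"AttributeName": "UserGroup", "AttributeType": "S"},
            {"AttributeName": "Date", "AttributeType": "S"},  # assumed string
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": index_name,
                    "KeySchema": [
                        {"AttributeName": "UserGroup", "KeyType": "HASH"},
                        {"AttributeName": "Date", "KeyType": "RANGE"},
                    ],
                    # Project only the non-key attributes the dashboard needs.
                    "Projection": {
                        "ProjectionType": "INCLUDE",
                        "NonKeyAttributes": ["Status"],
                    },
                }
            }
        ],
    }
```

These parameters could then be passed to the low-level boto3 client, for example `boto3.client("dynamodb").update_table(**build_gsi_update())`.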
For example, to get the latest records from the last hour for a specific user group:
- Use the GSI
- Set the partition key to the user's group
- Set a condition on the sort key (Date) to be greater than one hour ago
- Limit the results as needed and sort in descending order
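The last-hour example could look roughly like the sketch below, which builds the parameters for a low-level DynamoDB `Query` request. It assumes `Date` is stored as an ISO 8601 UTC string (so string comparison matches time order); the table name `TaskLogs` and index name `UserGroup-Date-index` are hypothetical. `Date` is referenced through an expression attribute name because it is a DynamoDB reserved word.

```python
from datetime import datetime, timedelta, timezone


def build_last_hour_query(user_group, now=None, limit=50,
                          table_name="TaskLogs",
                          index_name="UserGroup-Date-index"):
    """Build Query parameters for one user group's records from the last hour.

    Table and index names are hypothetical; "Date" is assumed to hold
    ISO 8601 UTC strings in a single consistent format.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(hours=1)).isoformat()
    return {
        "TableName": table_name,
        "IndexName": index_name,
        # Partition key equality plus a range condition on the sort key.
        "KeyConditionExpression": "#g = :g AND #d > :cutoff",
        "ExpressionAttributeNames": {"#g": "UserGroup", "#d": "Date"},
        "ExpressionAttributeValues": {
            ":g": {"S": user_group},
            ":cutoff": {"S": cutoff},
        },
        "ScanIndexForward": False,  # newest first
        "Limit": limit,
    }
```

The result could be passed to `boto3.client("dynamodb").query(**params)`. A single response is capped at 1 MB, so larger result sets would need pagination via `LastEvaluatedKey`.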
This approach provides a flexible and efficient way to retrieve sorted data across multiple partition keys in DynamoDB, addressing your dashboard requirements without the need for expensive scan operations.
Remember to carefully consider the additional write capacity and storage costs associated with GSIs, and ensure that the "UserGroup" attribute provides a good distribution of data to avoid hot partitions.
Hi, I appreciate the answers; they are very close to what we have been doing.
As Leeroy mentions, groups are associated with users having access to certain machines. Grouping in a "user to machine" way greatly restricts the way in which the schema should be generated.
I understand that making parallel queries for each machine that a particular user should see is a very simple way, not even needing to use a GSI, but it can be very inefficient (let's say we want to get the last 10 records for 10,000 machines).
That's the main problem, generating a schema that allows us to quickly obtain the results we need without making parallel queries and then filtering on the client side, which can consume a lot of memory, depending on the number of records and machines involved.
I thought about using PartiQL, but it seems to have the same limitation (single partition key).