AWS Timestream get most recent entry

0

I have a timestream table that occasionally receives large bursts of entries, but otherwise can go for days/weeks without getting any new rows. I want to construct a query to efficiently return the most recent row logged to this table.

Since this table can go long periods of time without receiving any new entries, it doesn't make sense to have a WHERE time BETWEEN ... clause in my query. So my query looks like:

SELECT MAX(time), MAX_BY(other_field, time) 
FROM my_database.my_table

However, the query data usage metrics show that I have to scan through all rows to execute this query. If an analogous query was run on a standard SQL database, this would be a near-instantaneous query because time is indexed. Is there an approach I'm missing to make this query more efficient? I also tried something like:

SELECT time, other_field
FROM my_database.my_table
ORDER BY time DESC
LIMIT 1

This had similar data usage as the previous query. From what I tested, it doesn't seem like timestream queries take advantage of the indexing/partitioning on time or any other partitioned fields unless they are a part of a WHERE clause.

Is there an idiomatic way to retrieve the most recent row from a timestream table without scanning the entire table?

And as a followup question, is there a way to extend the above solution to a query like:

SELECT MAX(time), MAX_BY(other_field, time), my_partitioned_field
FROM my_database.my_table
GROUP BY my_partitioned_field

assuming my_partitioned_field is a partition key on the table?

asked 9 months ago647 views
1 Answer
-1

To get the most recent entry from an Amazon Timestream table, you can use a query that includes an ORDER BY clause and LIMIT 1 to fetch only the top (most recent) result.

see this as an example , The following gets the 10 most recently added data points for a table. (https://docs.aws.amazon.com/timestream/latest/developerguide/sample-queries.basic-scenarios.html)

SELECT * FROM <database_name>.<table_name>
ORDER BY time DESC
LIMIT 10

Yes , you can try to extend the solution to include a query with aggregation and grouping in Amazon Timestream but may be you have to extract the results and process the results as needed.

profile pictureAWS
EXPERT
answered 9 months ago
  • This approach, as I mentioned in my original post (though there was a typo, corrected now), still needs to scan the entire table to retrieve that result. Since there is an index on time, I'd expect there to be a way to query the latest row without scanning the entire table.

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions