AWS Timestream get most recent entry

0

I have a timestream table that occasionally receives large bursts of entries, but otherwise can go for days/weeks without getting any new rows. I want to construct a query to efficiently return the most recent row logged to this table.

Since this table can go long periods of time without receiving any new entries, it doesn't make sense to have a WHERE time BETWEEN ... clause in my query. So my query looks like:

SELECT MAX(time), MAX_BY(other_field, time) 
FROM my_database.my_table

However, the query data usage metrics show that I have to scan through all rows to execute this query. If an analogous query was run on a standard SQL database, this would be a near-instantaneous query because time is indexed. Is there an approach I'm missing to make this query more efficient? I also tried something like:

SELECT time, other_field
FROM my_database.my_table
ORDER BY time DESC
LIMIT 1

This had similar data usage as the previous query. From what I tested, it doesn't seem like timestream queries take advantage of the indexing/partitioning on time or any other partitioned fields unless they are a part of a WHERE clause.

Is there an idiomatic way to retrieve the most recent row from a timestream table without scanning the entire table?

And as a followup question, is there a way to extend the above solution to a query like:

SELECT MAX(time), MAX_BY(other_field, time), my_partitioned_field
FROM my_database.my_table
GROUP BY my_partitioned_field

assuming my_partitioned_field is a partition key on the table?

질문됨 10달 전685회 조회
1개 답변
-1

To get the most recent entry from an Amazon Timestream table, you can use a query that includes an ORDER BY clause and LIMIT 1 to fetch only the top (most recent) result.

see this as an example , The following gets the 10 most recently added data points for a table. (https://docs.aws.amazon.com/timestream/latest/developerguide/sample-queries.basic-scenarios.html)

SELECT * FROM <database_name>.<table_name>
ORDER BY time DESC
LIMIT 10

Yes , you can try to extend the solution to include a query with aggregation and grouping in Amazon Timestream but may be you have to extract the results and process the results as needed.

profile pictureAWS
전문가
답변함 10달 전
  • This approach, as I mentioned in my original post (though there was a typo, corrected now), still needs to scan the entire table to retrieve that result. Since there is an index on time, I'd expect there to be a way to query the latest row without scanning the entire table.

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인

관련 콘텐츠