Duplicate timestamps are not allowed in a timeseries error when using Timestream INTERPOLATE_LOCF function

0

Hi team,

We have been using the INTERPOLATE_LOCF to be able to sample a data between a time range so we don't pull too many data points. It automatically selects N records based on a time range. For our case we get the difference between two selected dates and device that by 30. So when we run the query we always get max of 30 data points returned no matter what the date range is. We have not had any issues for a while. However recently we added some new IoT sensors that sometimes lose connection briefly and it actually ends up sending two data points to timestream with exact time stamp. So now when I try to run the INTERPOLATE_LOCF it returns the error "Duplicate timestamps are not allowed in a timeseries." . I have looked at the records in timestream and yes sometimes two of the records have the exact time stamp such as "2023-01-10 19:34:41.668000000" and "2023-01-10 19:34:41.668000000". Is there anyway we can use the INTERPOLATE_LOCF function with having duplicated timestamp in timestream table?

Below is a sample query for splitting the date range into 30 data points that are 1.7 seconds apart

SELECT INTERPOLATE_LOCF(CREATE_TIME_SERIES(time, measure_value::varchar),SEQUENCE(min(time), max(time), 1.7s)) AS interpolated_temp FROM TimeStreamTable

Thank you for your time

asked a year ago461 views
1 Answer
0
Accepted Answer

For anyone having similar issue. After some investigation I found that the error was being thrown by the CREATE_TIME_SERIES function. It does not allows duplicated time stamps. So I fixed the issue by writing a nested query to first filter out the duplicated timestamps by using group by, Distinct on the time and selecting only of the values for the measure_value. You can use MAX, AVG or MIN function to achieve this. Below is my final query :

WITH DistinctTimeDatePoints AS ( SELECT distinct time, max(measure_value::varchar) as measure_value FROM MITimeStreamDB.Table1 WHERE time BETWEEN '2022-12-1 19:31:20.000000000' AND '2023-01-10 19:31:50.000000000' GROUP BY time ) SELECT INTERPOLATE_LOCF(CREATE_TIME_SERIES(DISTINCT time, measure_value),SEQUENCE(min(time), max(time), 115201s)) AS InterpolatedSeries FROM DistinctTimeDatePoints

answered a year ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions