Skip to content

Multiple timestamp columns in AWS Timestream

0

I have a use-case where I need 2 timestamp columns in my Timestream table. My use-case is about storing some forecast values for some assets. A forecast data point is uniquely identified by its "generation_time" (the timestamp when the forecast was generated), "value_time" (the timestamp for which the forecast value applies) and "asset_id". So "generation_time" will be my main time column, i.e. Timestamp, and "asset_id" will be a Dimension. "value" (forecast value) will be a Measure. But TIMESTAMP type is not supported as a Dimension (only VARCHAR is supported), so "value_time" would have to be a Measure instead.

My problem is that then it is impossible to take into account "value_time" for deduplicating, as we can only deduplicate over Timestamp and Dimension columns, but not Measures.

Would there be a way to implement such use-case?

asked 2 years ago347 views
1 Answer
1
Accepted Answer

In AWS Timestream, you can indeed have multiple attributes of TIMESTAMP data type in a multi-measure record. However, as you’ve noted, these TIMESTAMP attributes behave like the time field except that Timestream does not index on the values of type TIMESTAMP in a multi-measure record. This means that you cannot use them for deduplication as Timestream only allows deduplication over Timestamp and Dimension columns, not Measures.

One potential workaround could be to convert your “value_time” to a string and store it as a Dimension. This way, you can use it for deduplication. However, this approach has its own drawbacks as you would lose the ability to perform time-based operations on “value_time”.

EXPERT
answered 2 years ago
EXPERT
reviewed 2 years ago
EXPERT
reviewed 2 years ago
EXPERT
reviewed 2 years ago
  • Thanks. I had already thought about your proposed workaround, but I did not retain it, as I was thinking that it can lead to a very high cardinality on such Dimension column (with timestamp converted as string). I assume that it would lead to performance issues.

    I will try it out instead with the following adapted workaround:

    • a Dimension column with the time delta between "value_time" and "generation_time" converted as string (so low cardinality)
    • keep "value_time" TIMESTAMP as Measure, which allow me to perform time-based operations over it, as you mentioned above

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.