Updating old data in Timestream


You can't delete records from Timestream, but sometimes mistakes get in there. For what I'm doing, leaving those records in place is probably the right thing to do anyway. Since I can't delete them, what I would like to do is go back to those old records and selectively add another measure to my existing multi-measure records. This field would be named 'ignored' and would be a boolean. Then all my report generators would just add a clause like WHERE ignored != true (or equivalent syntax).
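One detail worth checking with the filter clause: under standard SQL NULL handling, a comparison such as ignored != true does not match rows where the flag was never written. The sketch below uses SQLite as a stand-in (not Timestream) to show the effect; the table and column names are hypothetical, and it assumes Timestream's query engine follows the same NULL semantics.

```python
import sqlite3

# SQLite stands in for Timestream purely to illustrate SQL NULL handling.
# Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER, ignored INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, None), (2, 1), (3, 0)],  # None = the flag was never written
)


def ids(where):
    """Return the ids of rows matching the given WHERE clause."""
    sql = f"SELECT id FROM readings WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


naive = ids("ignored != 1")                  # drops the row with a NULL flag
null_safe = ids("COALESCE(ignored, 0) = 0")  # keeps the row with a NULL flag
print(naive, null_safe)
```

If most old records never get the flag, a COALESCE-style (or IS NULL OR ...) clause is probably what the report generators want.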

To do that, my thought was this: first, do a SELECT to find the records needing updating, then do a batch write to upsert 'ignored' into those records. Since there is no UPDATE syntax, this is the best I can come up with. I haven't actually tried this yet.
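As a rough sketch of the read side of that plan: the helpers below turn one page of a Timestream Query response into plain dicts and split the resulting work into batches for WriteRecords, which accepts at most 100 records per call. The function names and the sample page are illustrative assumptions, and only scalar columns are handled.

```python
from typing import Any, Dict, Iterator, List

MAX_RECORDS_PER_WRITE = 100  # WriteRecords limit per request


def rows_from_page(page: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Convert one Query response page into dicts keyed by column name."""
    names = [col["Name"] for col in page["ColumnInfo"]]
    rows = []
    for row in page["Rows"]:
        # NULL cells carry {"NullValue": True} instead of "ScalarValue".
        values = [datum.get("ScalarValue") for datum in row["Data"]]
        rows.append(dict(zip(names, values)))
    return rows


def batches(items: List[Any], size: int = MAX_RECORDS_PER_WRITE) -> Iterator[List[Any]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical page shaped like a Timestream Query response.
sample_page = {
    "ColumnInfo": [
        {"Name": "device", "Type": {"ScalarType": "VARCHAR"}},
        {"Name": "temperature", "Type": {"ScalarType": "DOUBLE"}},
    ],
    "Rows": [
        {"Data": [{"ScalarValue": "dev-1"}, {"ScalarValue": "21.5"}]},
        {"Data": [{"ScalarValue": "dev-2"}, {"NullValue": True}]},
    ],
}
print(rows_from_page(sample_page))
```

Each dict would then be turned into a write record that repeats the original dimensions, time, and measure name, so the write targets the same record.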

I don't yet fully understand what happens when you do an upsert. This document says:

Upsert is an operation that inserts a record in to the system when the record does not exist or updates the record, when one exists.

That still leaves me wondering what the expected behavior is, especially if we are talking about multi-measure records. The record will be identified by measureName, Time, and the dimensions. If the record's content is MULTI, do the individual fields get merged, or do I need to supply the entire original MULTI record, which will then be completely replaced?

Is this a good strategy? It seems less intrusive than any of the other options I can think of. Dropping the table and re-adding it makes a mess in CDK. I could UNLOAD the entire table, set the retentions to 1 day, wait until tomorrow, and then put all that data back. That makes data unavailable for a time, though, and the timing will be tricky - and customers will see an outage.

Really, the ideal thing would be a way to DELETE records. Most TSDBs do this by marking records 'deleted' (so you're still paying for the storage); then a background garbage collection process reclaims that space some time later by rewriting blocks without the deleted records. It really seems like this is something Timestream should have as an option - in fact, both UPDATE ... WHERE and DELETE ... WHERE. Can I request this as a new feature?

wz2b
asked 8 months ago, 635 views
1 Answer
Accepted Answer

Thank you for reaching out on re:Post.

Timestream updates existing records in an idempotent manner: if a record is sent again with the same value and version, Timestream deduplicates it and accepts the write.

You can use the Version parameter in a WriteRecords[1] request to update data points. Timestream tracks a version number with each record. Version defaults to 1 when it's not specified for the record in the request. Timestream updates an existing record’s measure value along with its Version when it receives a write request with a higher Version number for that record. When it receives an update request where the measure value is the same as that of the existing record, Timestream still updates Version, if it is greater than the existing value of Version. You can update a data point as many times as desired, as long as the value of Version continuously increases.
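A minimal sketch of what such a request could look like with boto3, assuming a multi-measure record that gains a BOOLEAN measure named "ignored". The helper names and the example values are hypothetical. The default Version here is the current epoch time in milliseconds, which only works as an update if it is higher than the version already stored. The sketch resends the full list of existing measures as the conservative choice, since it is not established here whether measures omitted from the request are preserved.

```python
import time
from typing import Any, Dict, List, Optional


def ignored_record(
    dimensions: Dict[str, str],
    measure_name: str,
    time_ms: int,
    measures: List[Dict[str, str]],
    version: Optional[int] = None,
) -> Dict[str, Any]:
    """Build one MULTI record that repeats `measures` and adds ignored=true."""
    if version is None:
        # Assumption: epoch milliseconds exceed any version written earlier.
        version = int(time.time() * 1000)
    flag = {"Name": "ignored", "Value": "true", "Type": "BOOLEAN"}
    return {
        "Dimensions": [{"Name": k, "Value": v} for k, v in dimensions.items()],
        "MeasureName": measure_name,
        "MeasureValueType": "MULTI",
        "MeasureValues": list(measures) + [flag],
        "Time": str(time_ms),
        "TimeUnit": "MILLISECONDS",
        "Version": version,
    }


def write_batch(client: Any, database: str, table: str,
                records: List[Dict[str, Any]]) -> Any:
    """Send up to 100 records; `client` is a boto3 'timestream-write' client."""
    return client.write_records(
        DatabaseName=database, TableName=table, Records=records
    )


# Hypothetical usage (requires boto3 and AWS credentials):
#   client = boto3.client("timestream-write")
#   rec = ignored_record({"device": "dev-1"}, "metrics", 1700000000000,
#                        [{"Name": "temperature", "Value": "21.5", "Type": "DOUBLE"}])
#   write_batch(client, "my_db", "my_table", [rec])
```

Passing the client in as a parameter keeps the record-building logic testable without AWS access.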

[1] For more about upserts, see: https://docs.aws.amazon.com/timestream/latest/developerguide/API_WriteRecords.html

That said, the same mechanism applies to multi-measure records: you can write a record with the same dimensions, timestamp, and measure name as an existing one, as long as the value of Version continuously increases.

Additionally, I would like to mention that there is already a feature request in place for SQL delete functionality in Timestream. However, we currently do not have an ETA for it. We appreciate the feedback and have noted your request.

AWS
SUPPORT ENGINEER
answered 8 months ago
  • When you upsert a multi-measure record, do you have to write the entire record over again, or can you upsert just a single measure within the multi-measure record and have it leave the rest?
