Hi wz2b. If your devices are connecting directly to IoT Core (as opposed to a SiteWise gateway ingesting directly to SiteWise), then I think it makes sense to use IoT rules to route data to SiteWise and to your preferred cold storage option (if you don't want to use SiteWise cold tier).
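As a rough sketch of that fan-out, a single IoT rule can carry two actions: one writing to SiteWise and one writing the raw message to S3. The function below only builds the rule payload as a plain dict. The function name, topic filter, property alias, bucket name, and the `value` field in the message are all assumptions for illustration, not anything from your setup.

```python
def build_rule_payload(topic_filter, role_arn, property_alias, bucket):
    """Build a topic rule payload that fans one message out to two targets.

    Assumes each device message has a numeric field called `value`
    (hypothetical; adjust the substitution template to your payload).
    """
    return {
        "sql": f"SELECT * FROM '{topic_filter}'",
        "awsIotSqlVersion": "2016-03-23",
        "ruleDisabled": False,
        "actions": [
            {
                # Action 1: send the measurement to SiteWise by property alias.
                "iotSiteWise": {
                    "roleArn": role_arn,
                    "putAssetPropertyValueEntries": [
                        {
                            "propertyAlias": property_alias,
                            "propertyValues": [
                                {
                                    "value": {"doubleValue": "${value}"},
                                    "timestamp": {
                                        "timeInSeconds": "${floor(timestamp() / 1E3)}"
                                    },
                                }
                            ],
                        }
                    ],
                }
            },
            {
                # Action 2: keep a raw copy in S3 as the cold store of choice.
                "s3": {
                    "roleArn": role_arn,
                    "bucketName": bucket,
                    "key": "${topic()}/${timestamp()}.json",
                }
            },
        ],
    }
```

The resulting dict is shaped to be passed as `topicRulePayload` to boto3's `iot.create_topic_rule(ruleName=..., topicRulePayload=...)`. The second action could be swapped for a different target if S3 is not where you want the cold copy to land.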
What I gain is that the cold data is in a real time-series database, rather than something that's mainly useable only through something like Athena (or a pile of custom code).
Fair enough, but please be aware there are lots of different ways to consume the data once it's in S3. A few examples:
- https://aws.amazon.com/blogs/iot/create-insights-by-contextualizing-industrial-equipment-data-using-aws-iot-sitewise-part-2/
- https://github.com/aws-samples/aws-iot-sitewise-cold-tier-repartitioning
- https://aws.amazon.com/blogs/iot/collecting-organizing-monitoring-and-analyzing-industrial-data-at-scale-using-aws-iot-sitewise-part-3/
Sometimes ETL is used to land the data somewhere like Redshift.
I really like the multi-tiered approach to SiteWise. There are just two things I would improve. One is that I would like a way to query data across tiers without knowing which tier it is in. That's really the whole reason I started thinking about this. The other is that I would like other cold storage options, maybe even custom ones: when data is being expired, SiteWise would just invoke a Lambda function, and you could use custom logic to figure out where to put the data. Accessing cold storage files with Athena is great, but it's a little slow, a little pricey, and a little less flexible. It's a bit odd to me that SiteWise doesn't just integrate with Timestream out of the box.
I can't speak to roadmap here, but we hear you, and your patience will likely be rewarded.
Hey, thanks for the response, that's all great info. You're right, I wasn't really considering what other things someone might want to do with the data, like ETL or feeding it into other AWS services. I kind of think I can do all those things by gluing to Timestream, too, but I think I'd lose the connection back to the structure (models, etc.), which is a good reason to do it this way.
I think the main thing is figuring out how to unify the querying. If you're looking at SiteWise data using Grafana, for example, with a hot tier retention of 30 days, it works great, but when you zoom out past the retention period there's nothing there.
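To make the unified-query idea concrete, here is a minimal sketch of the first step such a layer would need: splitting a requested time window at the hot tier retention boundary, so that each part can be sent to the right backend. The function name and the 30-day default are my own assumptions, and the sketch treats the retention cutoff as exact, which the real tiering may not be.

```python
from datetime import datetime, timedelta, timezone


def split_by_tier(start, end, now, hot_retention_days=30):
    """Split [start, end) into a cold-tier part and a hot-tier part.

    Returns (cold, hot), where each is a (start, end) tuple or None.
    Assumes start < end and that all datetimes share a timezone.
    """
    cutoff = now - timedelta(days=hot_retention_days)
    # Anything older than the cutoff has to come from cold storage.
    cold = (start, min(end, cutoff)) if start < cutoff else None
    # Anything newer than the cutoff is still in the hot tier.
    hot = (max(start, cutoff), end) if end > cutoff else None
    return cold, hot


now = datetime(2024, 3, 31, tzinfo=timezone.utc)
cold, hot = split_by_tier(datetime(2024, 1, 1, tzinfo=timezone.utc), now, now)
```

In that example the window from January 1 to March 31 is split at March 1: the older part would go to whatever reads the cold tier (Athena, for instance), the newer part to the SiteWise API, and the two result sets would be concatenated before being handed to the dashboard.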