Kinesis Data Stream ingesting from IoT Core and API Gateway and activating Lambda function

0

Hello Community,

I am architecting a solution and I would like some advice in best practices and AWS services limitations.

My final goal is to have a PLC that will send its readings to the cloud and get in response the control signals for the components it is controlling. The cloud scheme I have so far would work like this:

  1. PLC sends sensor data via MQTT to IoT Core
  2. An IoT Rule forwards the data to a Kinesis Data Stream
  3. An API using API Gateway makes a call to weather data and passes to the same Kinesis Data Stream
  4. The Stream is passed to a Firehose for raw data storage in S3
  5. The Stream is passed to Kinesis Analytics for some feature engineering
  6. The stream then triggers a lambda function that hold an ML model 6.1 The lambda function also needs static data stored in a S3 bucket
  7. The lambda function uses as input the sensor data from the PLC, the weather data from the API, and the features engineered by Analytics, and outputs the control signals for the PLC
  8. Finally the IoT Core publishes the lambda output back to the PLC

Main questions:

  • Can the same stream hold data from 2 different sources, the IoT Core and API Gateway?
  • How can I ensure the data is coupled in the same timestamp, meaning the sensor data at 12:00:00 is coupled with the weather data at 12:00:00?
  • Is there any service that I should be using or that I'm using in excess?

Cloud scheme

1 Answer
0

I can help answer the questions related to AWS IoT Core and the Kinesis Data Stream (KDS) action. As the synthesis of data is downstream of IoT Core and API Gateway, I assume you are looking for correlated timestamps there. So to answer the IoT side of this question:

How can I ensure the data is coupled in the same timestamp, meaning the sensor data at 12:00:00 is coupled with the weather data at 12:00:00?

Depending upon the resolution needed, if the sending devices have good a time source such as NTP (~1ms from upstream stratums), you could inject the timestamp as part of the MQTT application message (payload). That can then parsed by Kinesis Data Anayltics (KDA) or applications reading from Kinesis Data Firehouse for correlation.

If the devices do not have a precise time source, you can use the IoT Rules Engine to add a timestamp key value pair. Note: there will be short duration of 10s to 100s of milliseconds added time due to the publish from the device to AWS IoT Core, and then delivery to the Rules Engine. Something to take into consideration if correlation <1sec is needed.

Adding the timestap is easy if the payload is JSON. You can use something like SELECT *, timestamp() as my_timestamp as part of the rule., If it isn't JSON and some form of binary, you can still create a new payload format in JSON where the binary payload is base64 encoded. In might look something like this:

{
  "timestamp": 1481825251155,
  "payload":  "eyAidGVtcGVyYXR1cmUiOiAzMyB9Cg=="
}

I'm not a Kinesis expert, so will defer to others on best practices for 1 v 2 streams and what a KDA application might look like.

Hope this helps, please let me know if you have follow up questions!

AWS
Gavin_A
answered 2 years ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions