[IoT Core][Kinesis] At what ingestion rate should we choose Kinesis over IoT Core?

0

I need to decide between Kinesis Firehose and IoT Core as the preferred choice for streaming data ingestion.

From my observation:

  • IoT Core is more suitable for infrequent, bi-directional, and network-limited IoT data ingestion from many IoT devices.
  • Kinesis Firehose is more suitable for ingesting a constant and fairly large stream of data into the AWS cloud.

So my question is: how big is "big"? For example, if a customer's IoT end device produces 1 MB/s or 5 MB/s (or an arbitrary X MB/s) of read-only, constant sensor data that needs to be ingested into the AWS Cloud, should they consider IoT Core or Kinesis Firehose/Data Streams? Let's assume the data is well-formatted JSON and the AWS Cloud will save it to S3 directly.

What is the threshold value of X MB/s for deciding between IoT Core and Kinesis?

Thanks!

AWS
Asked 3 years ago · 1,298 views
1 Answer
1
Accepted Answer

For volumes > 512 KB/s, I am assuming you are going to compare the HTTPS API of IoT Core against Firehose: MQTT/TLS cannot handle more than 512 KB/s per connection, so going above that means the additional complexity of managing multiple connections.

For volumes < 512 KB/s, MQTT/TLS has the advantage of lower overhead compared to HTTPS, which can be a significant factor if connectivity is expensive (a customer told me that the difference in cost is about 25% just due to the overhead).
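To make the MQTT/TLS path concrete, here is a minimal sketch using the AWS IoT Device SDK v2 for Python (awsiotsdk); the endpoint, certificate paths, client ID and topic are placeholders you would replace with your own:

```python
import json

from awscrt import mqtt
from awsiot import mqtt_connection_builder

# Mutual-TLS MQTT connection to the account's IoT data endpoint (placeholder values).
connection = mqtt_connection_builder.mtls_from_path(
    endpoint="xxxxxxxxxxxxxx-ats.iot.eu-west-1.amazonaws.com",
    cert_filepath="device.pem.crt",
    pri_key_filepath="private.pem.key",
    ca_filepath="AmazonRootCA1.pem",
    client_id="sensor-001",
    clean_session=False,
    keep_alive_secs=30,
)
connection.connect().result()

# One small JSON reading; staying below 512 KB/s keeps the device within the
# per-connection throughput limit mentioned above.
payload = json.dumps({"device": "sensor-001", "temperature": 21.5})
publish_future, _ = connection.publish(
    topic="sensors/sensor-001/data",
    payload=payload,
    qos=mqtt.QoS.AT_LEAST_ONCE,
)
publish_future.result()
connection.disconnect().result()
```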

Assuming that data cost is not an issue, that we are using HTTPS APIs in both scenarios, that the data is to be ingested into S3, and that you are using the IoT credentials provider to obtain the STS tokens needed to call Firehose.
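As a rough sketch of that credentials-provider flow (device calling Firehose directly, as in the first setup below): the device exchanges its X.509 certificate for temporary credentials at the credentials provider endpoint and then uses them with the Firehose API. The endpoint, role alias, thing name and delivery stream name below are placeholders:

```python
import json

import boto3
import requests

CREDS_ENDPOINT = "https://xxxxxxxxxxxxxx.credentials.iot.eu-west-1.amazonaws.com"
ROLE_ALIAS = "firehose-writer"   # IoT role alias pointing at the IAM role to assume
THING_NAME = "sensor-001"

# Mutual-TLS call authenticated with the device certificate; returns STS credentials.
resp = requests.get(
    f"{CREDS_ENDPOINT}/role-aliases/{ROLE_ALIAS}/credentials",
    cert=("device.pem.crt", "private.pem.key"),
    headers={"x-amzn-iot-thingname": THING_NAME},
)
creds = resp.json()["credentials"]

firehose = boto3.client(
    "firehose",
    region_name="eu-west-1",
    aws_access_key_id=creds["accessKeyId"],
    aws_secret_access_key=creds["secretAccessKey"],
    aws_session_token=creds["sessionToken"],
)

# One JSON record, newline-delimited so the objects landing in S3 stay parseable.
record = json.dumps({"device": THING_NAME, "temperature": 21.5}) + "\n"
firehose.put_record(
    DeliveryStreamName="sensor-data",
    Record={"Data": record.encode("utf-8")},
)
```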

These are the two setups:

device -> Firehose -> S3

or

device -> IoT Core (Basic Ingest) -> Rule -> Firehose Action -> Firehose -> S3
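For the second setup, the rule and its Firehose action are created once on the cloud side; here is a minimal sketch with boto3, where the rule name, role ARN and stream name are placeholders of mine:

```python
import boto3

iot = boto3.client("iot", region_name="eu-west-1")

iot.create_topic_rule(
    ruleName="SensorToFirehose",
    topicRulePayload={
        # Topic filter used when devices publish through the regular message broker.
        "sql": "SELECT * FROM 'sensors/+/data'",
        "awsIotSqlVersion": "2016-03-23",
        "ruleDisabled": False,
        "actions": [
            {
                "firehose": {
                    "roleArn": "arn:aws:iam::123456789012:role/iot-to-firehose",
                    "deliveryStreamName": "sensor-data",
                    "separator": "\n",
                }
            }
        ],
    },
)

# With Basic Ingest the device addresses the rule by name and skips the
# message-broker charge, e.g. by publishing to:
#   $aws/rules/SensorToFirehose/sensors/sensor-001/data
```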

From a cost perspective, the first option can save at least 0.30 USD per million events (roughly the rules engine and action charges of the second setup), but there are other factors to consider:

  • How large are the records you are ingesting? IoT Core has a limit of 128 KB per publish; Firehose allows 1,000 KiB per record and 4 MiB per PutRecordBatch call.
  • Do you need to filter the data before sending it to Firehose?
  • Do you need to enrich the data with contextual values obtained from the connection? (A rule SQL sketch for these last two points follows this list.)
  • Do you need strong authentication and authorization for the devices sending you the data?
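For the filtering and enrichment points, the rule SQL itself can do quite a lot before the Firehose action runs; a hypothetical variant of the statement used in the rule above:

```python
# Drop implausible readings and add connection context (client ID and the
# server-side timestamp) to every record before it reaches Firehose.
enriched_sql = (
    "SELECT *, clientid() AS device_id, timestamp() AS ingest_ts "
    "FROM 'sensors/+/data' "
    "WHERE temperature > -40 AND temperature < 125"
)
```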

The last point is probably the critical factor: using the Firehose API directly means you will use STS tokens for authentication and IAM policies for authorization. The drawback of this approach is that you might lose the ability to identify your sources. You can create one role alias per device, but there is a limit of 100 role aliases. Or you can create a dedicated delivery stream per device (but there is a default limit of 50 delivery streams per region) and use the credentials provider policy variables to control access. In both cases this is not as straightforward as using AWS IoT, and it only scales to fewer than a few hundred devices.
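If you do go with per-device delivery streams, the credentials provider policy variables look roughly like this: the IAM role behind the role alias only allows writes to the stream named after the connecting thing. Role, policy and stream naming here are assumptions of mine:

```python
import json

import boto3

iam = boto3.client("iam")

policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["firehose:PutRecord", "firehose:PutRecordBatch"],
            # ${credentials-iot:ThingName} resolves to the thing name the device
            # presented when it called the credentials provider endpoint.
            "Resource": "arn:aws:firehose:eu-west-1:123456789012:deliverystream/${credentials-iot:ThingName}",
        }
    ],
}

iam.put_role_policy(
    RoleName="iot-device-firehose-writer",   # the role referenced by the IoT role alias
    PolicyName="per-thing-firehose-access",
    PolicyDocument=json.dumps(policy_document),
)
```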

If you have data volumes > 512 KB/s from a single device, you can probably also run Greengrass with Stream Manager + IoT Analytics instead, which gives you a fully managed solution.
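A rough sketch of that Greengrass route, assuming Greengrass v2 with the Stream Manager component and the Stream Manager SDK for Python installed on the device (stream and target names are placeholders; exporting to IoT Analytics instead of Kinesis uses an analogous export configuration):

```python
import json

from stream_manager import (
    ExportDefinition,
    KinesisConfig,
    MessageStreamDefinition,
    StrategyOnFull,
    StreamManagerClient,
)

# Talks to the Stream Manager component running locally on the Greengrass core.
client = StreamManagerClient()

# Readings are buffered on the device; Stream Manager handles batching,
# retries and the export to the cloud.
client.create_message_stream(
    MessageStreamDefinition(
        name="SensorStream",
        strategy_on_full=StrategyOnFull.OverwriteOldestData,
        export_definition=ExportDefinition(
            kinesis=[
                KinesisConfig(
                    identifier="KinesisExport",
                    kinesis_stream_name="sensor-data",
                )
            ]
        ),
    )
)

client.append_message("SensorStream", json.dumps({"temperature": 21.5}).encode("utf-8"))
```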

AWS
Expert
Answered 3 years ago
