Kafka for single packet IoT cluster


I have some IoT devices that send raw data. I plan to store the data in Kafka, process it with a consumer, and save the results to a database.


The IoT devices send sensor data as packets (1 packet per second), and 10 raw packets are needed to decode one value. I have a C binary that takes 10 packets. I could make each device combine 10 packets into a single one, but I want the devices to deal with single packets only, so that in the future I can change the algorithm in the cloud to work with 20 or 30 packets. If I use the same topic for all devices, a consumer has to wait until it has received 10 packets from one device before it can start processing, which I think can cause problems: single packets from different devices get mixed up in the one topic, so they have to be filtered per device.


Initially, I thought of creating a single topic with one partition per IoT device, but Kafka doesn't support dynamic partitions. (IoT devices can be added or removed at runtime.)

Another solution can be to create a topic for each IoT device with 1 partition.

Also, the above approach of 1 topic per IoT device may force me to run one consumer for each device added (can this be improved?).

I am looking for a better and more efficient implementation if possible for this problem.

The number of IoT devices is dynamic and can go up to 100+.

1 Answer

Hi there,

Sorry, I only know a little about IoT. I think that if you add a device identifier (i.e. deviceId) to each message, one topic will suit them all. First, you can send the sensor data to AWS IoT Core, and then route it to Kafka based on rules configured on the IoT Core side.

For the analysis part, you could partition the data based on the deviceId, rackId, etc.
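
To make the single-topic idea a bit more concrete, here is a minimal sketch of what keying by deviceId buys you: every record with the same key maps to the same partition, so one device's packets stay together and in order. Everything named here is an assumption for illustration only (`NUM_PARTITIONS`, `partition_for`, `route`, the device IDs), and the CRC32 hash is a stand-in, not the hash Kafka's own partitioner uses.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count for the shared topic


def partition_for(device_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a record key to a partition deterministically.

    Stand-in hash (CRC32) for illustration; a real Kafka client
    applies its own hash to the key bytes.
    """
    return zlib.crc32(device_id.encode("utf-8")) % num_partitions


def route(records, num_partitions: int = NUM_PARTITIONS):
    """Group (device_id, payload) records by partition, keeping arrival order."""
    partitions = defaultdict(list)
    for device_id, payload in records:
        partition = partition_for(device_id, num_partitions)
        partitions[partition].append((device_id, payload))
    return partitions


if __name__ == "__main__":
    # Interleaved packets from three hypothetical devices.
    incoming = [("iot-1", 0), ("iot-2", 0), ("iot-1", 1),
                ("iot-3", 0), ("iot-2", 1), ("iot-1", 2)]
    for partition, recs in sorted(route(incoming).items()):
        print(partition, recs)
```

Several devices can share a partition, so the number of partitions does not have to match the number of devices. The consumer still sees packets from different devices interleaved within a partition and has to group them by deviceId itself.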


answered 2 years ago
  • Thanks for the reply. Yes, you are correct, I can use IoT Core, but what I am most confused about is the consumers in Kafka. Confusion 1: If I use a single partition and each IoT device sends single packets, the packets will be mixed up in that partition (this has to be handled by the consumers). Confusion 2: Say a consumer reads 20 packets from the partition and gets only 5 packets each from IoT1, IoT2, IoT3, and IoT4. I can't run my algorithm here, as I need 10 packets from one device. How can I ensure that a consumer reads 10 packets from one device, and so on?

  • I think the problem is that each packet can't be handled independently (you need 10 packets available). How do I build a solution for such a problem?
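
One possible way to handle the "10 packets from one device" requirement, sketched under assumptions rather than as a tested design: the consumer keeps a small in-memory buffer per deviceId and only hands a batch to the decoder once that device has enough packets. `PacketBatcher`, `batch_size`, and the device IDs below are hypothetical names; in a real consumer, `add()` would be called for each record coming out of the poll loop, with the deviceId taken from the record key or payload.

```python
from collections import defaultdict


class PacketBatcher:
    """Collect packets per device and release them in fixed-size batches."""

    def __init__(self, batch_size: int = 10):
        self.batch_size = batch_size       # 10 today, could become 20 or 30
        self._buffers = defaultdict(list)  # device_id -> packets seen so far

    def add(self, device_id, packet):
        """Buffer one packet; return a full batch for that device, or None."""
        buf = self._buffers[device_id]
        buf.append(packet)
        if len(buf) < self.batch_size:
            return None
        batch = buf[:self.batch_size]
        del buf[:self.batch_size]
        return batch

    def pending(self, device_id) -> int:
        """Number of packets still waiting for this device."""
        return len(self._buffers.get(device_id, []))


if __name__ == "__main__":
    batcher = PacketBatcher(batch_size=10)
    devices = ["IoT1", "IoT2", "IoT3", "IoT4"]
    # Packets arrive interleaved, as in Confusion 2: one per device per second.
    for seq in range(10):
        for device_id in devices:
            batch = batcher.add(device_id, seq)
            if batch is not None:
                # This is where the 10 packets would go to the decoder.
                print(device_id, "batch ready:", batch)
```

With this approach, a poll that returns 5 packets from each of four devices simply leaves four partial buffers behind, and the batches complete on a later poll. Because the buffers live in memory, offset commits need some care: committing an offset before the batch containing that packet has been processed could lose packets if the consumer restarts.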
