Different instances of same application reading data using KCL 1.X


Hi: All, I am using a one shard kinesis data stream, and bit confused/curious about the behavior of AWS KCL.

I use KPL version 0.13.1 to write to the stream and KCL version 1.11.2 to read from the stream. According to Kinesis doc https://docs.aws.amazon.com/streams/latest/dev/kinesis-record-processor-implementation-app-java.html "All workers associated with this application name are assumed to be working together on the same stream."

I got it, but what I have noticed that when you fire 2 instances of consumer app using KCL with same name working on the same stream with one shard, only first instance received the data from the stream, and the other instance sits idle. The second instance only receives data, if I kill or stop the first instance. Is this the expected behavior? If yes is there any way of telling the KCL to load balance the data to different instances automatically? If it is not possible for one shard, is it even possible for multiple shards?

asked 4 years ago812 views
2 Answers

This is expected behavior. If you had more than one shard, you will notice that the KCL will load balance across those two instances. Kinesis shards are usually meant to be consumed in a serial fashion by a single thread per shard. This helps with maintaining ordering guarantees within a shard, and makes things like the checkpointing logic simple and co-ordination free.

answered 4 years ago

Thanks Rohit for confirming, this is the exact behavior observed by me. So once I have increased the shard size from 1 to 2, both of my instances that belong to same application starts receiving the messages from each shard. I still think documentation should be more clear for these real world edge cases.

answered 4 years ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions