Different instances of same application reading data using KCL 1.X


Hi all, I am using a Kinesis data stream with a single shard, and I am a bit confused about the behavior of the AWS KCL.

I use KPL version 0.13.1 to write to the stream and KCL version 1.11.2 to read from it. According to the Kinesis documentation (https://docs.aws.amazon.com/streams/latest/dev/kinesis-record-processor-implementation-app-java.html), "All workers associated with this application name are assumed to be working together on the same stream."

I understand that, but I have noticed that when I start two instances of a KCL consumer application with the same application name against the same single-shard stream, only the first instance receives data from the stream, and the second instance sits idle. The second instance receives data only if I kill or stop the first one. Is this the expected behavior? If so, is there any way to tell the KCL to load balance the data across instances automatically? If that is not possible with one shard, is it possible with multiple shards?

Asked 4 years ago · Viewed 812 times
2 Answers

This is expected behavior. If you had more than one shard, you would see the KCL balance the shards across those two instances. A Kinesis shard is usually meant to be consumed serially, by a single thread per shard. This preserves the ordering guarantees within a shard and keeps things like the checkpointing logic simple and coordination-free.
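To illustrate why a second worker stays idle on a one-shard stream, here is a minimal sketch in plain Java. It does not use the KCL API: `LeaseSketch` and `assign` are hypothetical names, and the round-robin assignment is an assumed simplification of the KCL's lease-based balancing. The only point it models is that each shard is owned by exactly one worker at a time.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LeaseSketch {
    // Hypothetical helper, not part of the KCL: gives each shard to exactly
    // one worker, cycling through the workers in order (a simplification).
    public static Map<String, List<String>> assign(int shardCount, List<String> workers) {
        Map<String, List<String>> leases = new LinkedHashMap<>();
        for (String worker : workers) {
            leases.put(worker, new ArrayList<>());
        }
        for (int i = 0; i < shardCount; i++) {
            String owner = workers.get(i % workers.size());
            leases.get(owner).add("shardId-" + i);
        }
        return leases;
    }

    public static void main(String[] args) {
        List<String> workers = List.of("worker-1", "worker-2");
        // One shard: only one worker can own it, so the other has nothing to read.
        System.out.println(assign(1, workers)); // {worker-1=[shardId-0], worker-2=[]}
        // Two shards: each worker owns one shard.
        System.out.println(assign(2, workers)); // {worker-1=[shardId-0], worker-2=[shardId-1]}
    }
}
```

In this model the idle worker acts only as a standby: it picks up the shard when the current owner goes away, which matches the failover behavior described in the question.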

AWS
Answered 4 years ago

Thanks, Rohit, for confirming; this is exactly the behavior I observed. Once I increased the shard count from 1 to 2, both instances of the application started receiving messages, one shard each. I still think the documentation should be clearer about real-world edge cases like this.

Answered 4 years ago
