How does data flow as you scale up Kinesis Data Streams and add KPUs to Kinesis Data Analytics?

I have an event processing architecture that is basically KPL Producer -> Kinesis Data Streams (Raw) -> KDA (Flink aggregation) -> Kinesis Data Streams (Agg) -> Lambda -> Timestream. With a basic setup of 10 Raw shards, 2 KPUs on the Flink app, 5 Agg shards (On-Demand), and roughly 100 requests per second, my stack processes the data with a delay of about one minute through the whole workflow to Timestream. As I pre-scale for a much higher workload (which won't arrive for several days), the delay before data arrives in Timestream grows from 1 minute to 30 minutes. My stack is now at 200 Raw shards, 25 KPUs, and 15 Agg shards (On-Demand).

What I need to understand is where in my workflow the latency is being added. The delay is most likely in Flink: the incoming record count on my Agg KDS shows bursts of records being written out every 30 minutes. But why? I know it will clear up once my requests per second go through the roof, but I would like to know what is happening under the hood in Flink to add this latency.

Asked 2 years ago, 259 views
1 Answer
Accepted Answer

Do you have any idle Kinesis shards? An idle shard stalls the Apache Flink application's watermark, because the overall watermark cannot advance past a shard that is not delivering records. You can avoid this by setting SHARD_IDLE_INTERVAL_MILLIS in the Kinesis consumer config.
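To illustrate why a single idle shard can hold back output, here is a minimal sketch in plain Java. It does not use the Flink API; the class `WatermarkStallDemo`, its `Channel` type, and the numbers in `scenario()` are all hypothetical. It only models the general rule that the combined watermark is the minimum over all inputs that are not marked idle, which is a simplification of what Flink does internally.

```java
// Simplified model of watermark merging. NOT Flink code; all names are hypothetical.
class WatermarkStallDemo {

    /** One input (for example a shard) as seen by the operator that merges watermarks. */
    static final class Channel {
        long watermark = Long.MIN_VALUE;   // highest event time seen so far
        long lastRecordWallClockMs;        // wall-clock time of the last record

        Channel(long nowMs) {
            lastRecordWallClockMs = nowMs;
        }

        void onRecord(long eventTimeMs, long nowMs) {
            watermark = Math.max(watermark, eventTimeMs);
            lastRecordWallClockMs = nowMs;
        }

        boolean isIdle(long nowMs, long idleTimeoutMs) {
            return nowMs - lastRecordWallClockMs >= idleTimeoutMs;
        }
    }

    /**
     * Combined watermark = minimum over all channels that are not idle.
     * idleTimeoutMs <= 0 disables idleness detection, so every channel counts.
     * Returning Long.MIN_VALUE when every channel is idle is a simplification.
     */
    static long combinedWatermark(Channel[] channels, long nowMs, long idleTimeoutMs) {
        long min = Long.MAX_VALUE;
        boolean anyActive = false;
        for (Channel c : channels) {
            if (idleTimeoutMs > 0 && c.isIdle(nowMs, idleTimeoutMs)) {
                continue; // idle channels no longer hold the watermark back
            }
            anyActive = true;
            min = Math.min(min, c.watermark);
        }
        return anyActive ? min : Long.MIN_VALUE;
    }

    /** Two busy shards and one shard that went quiet after a single early record. */
    static Channel[] scenario() {
        Channel[] shards = { new Channel(0L), new Channel(0L), new Channel(0L) };
        shards[2].onRecord(1_000L, 1_000L);          // quiet shard: one record, then nothing
        for (long t = 1_000L; t <= 120_000L; t += 1_000L) {
            shards[0].onRecord(t, t);                // busy shards keep advancing
            shards[1].onRecord(t, t);
        }
        return shards;
    }

    public static void main(String[] args) {
        Channel[] shards = scenario();
        long now = 120_000L;
        long windowEnd = 60_000L;                    // an event-time window ending at 60 s

        long stalled = combinedWatermark(shards, now, 0L);
        long advanced = combinedWatermark(shards, now, 30_000L);

        System.out.println("no idleness:   watermark=" + stalled
                + " window fires=" + (stalled >= windowEnd));
        System.out.println("with idleness: watermark=" + advanced
                + " window fires=" + (advanced >= windowEnd));
    }
}
```

In this model the quiet shard pins the combined watermark at its last event time, so an event-time window cannot close until that shard receives another record. Spreading roughly the same traffic over 200 shards instead of 10 makes such quiet shards far more likely, which would match output arriving in bursts. An idle timeout such as SHARD_IDLE_INTERVAL_MILLIS plays the role of `idleTimeoutMs` here.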

flomair (AWS)
Answered 2 years ago
  • I'm sure I do. I see that this setting has been deprecated in favor of ".withIdleness" on the watermark strategy. Thank you for pointing me in the right direction.
