How does data flow as you scale up Kinesis Data Streams and add KPUs to Kinesis Data Analytics?

I have an event processing architecture that is basically KPL Producer -> Kinesis Data Streams (Raw) -> KDA (Flink aggregation) -> Kinesis Data Streams (Agg) -> Lambda -> Timestream. With a basic setup of 10 Raw shards, 2 KPUs on the Flink app, 5 Agg shards (On-Demand), and roughly 100 requests per second, data takes about a minute to travel through the whole workflow to Timestream. When I pre-scale for a much higher workload (one that won't arrive for several days), the delay before data arrives in Timestream grows from about 1 minute to about 30 minutes. My stack is now at 200 Raw shards, 25 KPUs, and 15 Agg shards (On-Demand). What I need to understand is where in the workflow the latency is being added. The delay is most likely in Flink, because the incoming record count on my Agg stream shows bursts of records being written out every 30 minutes, but why? I know it will clear up once my requests per second go through the roof, but I would like to know what is happening under the hood in Flink to add this latency.

Asked 2 years ago, viewed 259 times
1 Answer
Accepted Answer

Do you have any idle Kinesis shards? An idle shard makes the Apache Flink application's watermark stall, because the watermark can't progress while that shard delivers no records. This can be avoided by setting SHARD_IDLE_INTERVAL_MILLIS in the Kinesis consumer config.
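To illustrate why a single idle shard can hold back output, here is a minimal sketch in plain Python. It is a simplified model, not Flink's actual implementation: it assumes the combined watermark is the minimum of the per-shard watermarks, and that a shard with no arrivals for longer than an idle timeout is left out of that minimum. The names combined_watermark, window_fires, idle_timeout, and the sample timestamps are all hypothetical and chosen only for this example.

```python
def combined_watermark(shards, now, idle_timeout=None):
    """Return the minimum watermark over the shards treated as active.

    With idle_timeout=None every shard counts, so one silent shard
    pins the result. With a timeout, shards that have been silent for
    at least that long are skipped. Returns None if no shard is active.
    """
    active = [
        s for s in shards
        if idle_timeout is None or now - s["last_arrival"] < idle_timeout
    ]
    if not active:
        return None
    return min(s["watermark"] for s in active)


def window_fires(watermark, window_end):
    """An event-time window can emit once the watermark reaches its end."""
    return watermark is not None and watermark >= window_end


# Times are in seconds on an arbitrary clock (assumed sample data).
shards = [
    {"watermark": 1000, "last_arrival": 1000},  # busy shard
    {"watermark": 995, "last_arrival": 998},    # busy shard
    {"watermark": 400, "last_arrival": 400},    # idle shard, silent since t=400
]
now = 1000
window_end = 960

# Without idleness handling the idle shard holds the watermark at 400.
stalled = combined_watermark(shards, now)
# With a 60 second idle timeout the idle shard is ignored.
advanced = combined_watermark(shards, now, idle_timeout=60)

print(stalled, window_fires(stalled, window_end))
print(advanced, window_fires(advanced, window_end))
```

In this model the window ending at t=960 stays open until the idle shard finally receives a record, which is one plausible way to end up with results arriving in occasional bursts when many shards share a small amount of traffic.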

flomair (AWS)
Answered 2 years ago
  • I'm sure I do. I see that this setting has been deprecated in favor of ".withIdleness" on the watermark strategy. Thank you for pointing me in the right direction.
