How does data flow as you scale up Kinesis Data Streams and add KPUs to Kinesis Data Analytics?


I have an event processing architecture that's basically KPL Producer -> Kinesis Data Streams (Raw) -> KDA (Flink Aggregation) -> Kinesis Data Streams (Agg) -> Lambda -> Timestream. With a basic setup of 10 Raw shards, 2 KPUs on the Flink app, 5 Agg shards (On Demand), and maybe 100 requests per second, my stack processes the data with a delay of about a minute through the whole workflow to Timestream. As I pre-scale things for a much higher workload (that won't arrive for several days), I notice that the delay before data arrives in Timestream goes from 1 minute to 30 minutes. What I need to understand is where in my workflow the latency is being added. My current stack is now at 200 Raw shards, 25 KPUs, and 15 Agg shards (On Demand). I know the delay is most likely in Flink, because the incoming record count of my Agg KDS shows bursts of records being written out every 30 minutes, but my question is why. I know it will clear up when my requests per second go through the roof, but I'd like to know what is happening under the hood in Flink to add this latency.

asked 2 years ago, 238 views
1 answer
Accepted answer

Do you have any idle Kinesis shards? An idle shard emits no records, so its watermark never advances, and that stalls the watermark of the whole Apache Flink application. This can be avoided by setting SHARD_IDLE_INTERVAL_MILLIS in the Kinesis consumer config.

flomair (AWS)
answered 2 years ago
  • I'm sure I do. I see that this setting has been deprecated in favor of ".withIdleness" on the watermark strategy. Thank you for pointing me in the right direction.
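
To make the mechanism in the answer concrete, here is a minimal toy model in Python of how a combined watermark behaves when one shard goes quiet. It is not Flink code and does not use the Flink API: `Shard`, `combined_watermark`, `window_fires`, and all the numbers are hypothetical names and values chosen for illustration. It assumes the combined watermark is the minimum over the per-shard watermarks, and that an idleness timeout simply excludes shards that have been silent too long, which is the idea behind SHARD_IDLE_INTERVAL_MILLIS and ".withIdleness".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Shard:
    """Hypothetical stand-in for one Kinesis shard as seen by the consumer."""
    watermark: int       # latest event-time watermark from this shard (ms)
    last_record_at: int  # processing time of the shard's last record (ms)


def combined_watermark(
    shards: List[Shard], now: int, idle_timeout_ms: Optional[int] = None
) -> Optional[int]:
    """Return the minimum watermark over the shards that count as active.

    With no idle timeout, every shard counts, so a single silent shard
    pins the result at its old watermark. With a timeout, shards silent
    for at least idle_timeout_ms are ignored. Returns None if no shard
    is active (the watermark does not advance in that case).
    """
    active = [
        s for s in shards
        if idle_timeout_ms is None or now - s.last_record_at < idle_timeout_ms
    ]
    if not active:
        return None
    return min(s.watermark for s in active)


def window_fires(window_end: int, watermark: Optional[int]) -> bool:
    """An event-time window can emit once the watermark reaches its end."""
    return watermark is not None and watermark >= window_end


# Assumed scenario: two busy shards and one shard that never got a record.
now = 600_000
shards = [
    Shard(watermark=600_000, last_record_at=599_000),
    Shard(watermark=590_000, last_record_at=598_000),
    Shard(watermark=0, last_record_at=0),  # idle shard
]

stalled = combined_watermark(shards, now)
advancing = combined_watermark(shards, now, idle_timeout_ms=60_000)

print(stalled, window_fires(60_000, stalled))
print(advancing, window_fires(60_000, advancing))
```

In this sketch the one-minute window ending at 60_000 ms cannot fire without the timeout, because the idle shard holds the combined watermark at 0, and it fires once the idle shard is excluded. Spreading roughly 100 requests per second over 200 shards makes quiet shards more likely, which would be consistent with output arriving only in occasional bursts.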
