Why do I experience high latency issues with Kinesis Data Streams?
I want to know why my Amazon Kinesis Data Stream has high latency while getting data records.
GetRecords.Latency can increase when the record count or record size for each GetRecords request increases. If you restart your application while the producer is ingesting data into the stream, then records can accumulate without being consumed. This increase in the record count or in the amount of data to fetch increases the value for GetRecords.Latency. If an application can't keep up with the ingestion rate, then IteratorAge also increases.
Note: Turning on server-side encryption on your Kinesis data stream can increase your latency.
Monitor the Kinesis Data Streams service with Amazon CloudWatch. Check CloudWatch metrics, such as GetRecords.Latency, to verify whether the latency increase is continuous. If it is, then check whether the IncomingRecords, IncomingBytes, GetRecords.Records, and GetRecords.Bytes metrics also increased. A rise in these metrics shows that data volume has grown. Higher data volume leads to higher latency because GetRecords fetches more records when more records are available in the Kinesis data stream.
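The check described above can be sketched in Python. This is a minimal sketch, not a definitive implementation: the stream name, the helper names, the 3-hour window, the 5-minute period, and the rule used to decide that an increase is "continuous" (every recent point above 1.5 times the baseline average) are all assumptions made for illustration. The request fields follow the CloudWatch GetMetricStatistics API, and the AWS SDK for Python (boto3) is imported only in the function that calls AWS.

```python
from datetime import datetime, timedelta, timezone

# Metrics named in this article; Kinesis publishes them in the AWS/Kinesis namespace.
METRICS = [
    "GetRecords.Latency",
    "IncomingRecords",
    "IncomingBytes",
    "GetRecords.Records",
    "GetRecords.Bytes",
]


def build_metric_request(stream_name, metric_name, hours=3,
                         period_seconds=300, statistic="Average"):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Kinesis",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "StreamName", "Value": stream_name}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": [statistic],
    }


def is_sustained_increase(values, baseline_points=6, factor=1.5):
    """Return True if every value after the baseline window exceeds
    factor times the baseline average (an assumed definition of 'continuous')."""
    if len(values) <= baseline_points:
        return False
    baseline = sum(values[:baseline_points]) / baseline_points
    return all(v > factor * baseline for v in values[baseline_points:])


def fetch_values(stream_name, metric_name, statistic="Average"):
    """Fetch one metric from CloudWatch, oldest datapoint first."""
    import boto3  # imported here so the helpers above work without the SDK

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **build_metric_request(stream_name, metric_name, statistic=statistic)
    )
    points = sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
    return [p[statistic] for p in points]
```

For example, calling fetch_values("my-stream", name) for each name in METRICS and passing each result to is_sustained_increase shows whether latency and data volume rose together. Here "my-stream" is a placeholder stream name.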
If your IteratorAge also increased, then it's likely that more bytes are being put into the stream. Check the IncomingBytes metric in CloudWatch to verify whether the number of bytes increased. Also check whether fewer GetRecords calls are made to the stream. More incoming bytes with the same or fewer calls means that each GetRecords call retrieves more data, which increases the value for GetRecords.Latency.
If you continue to observe high latency even though there's no increase in IncomingBytes or IncomingRecords, then the consumer might already be behind on a backlog of data. If the consumer application can't keep up with the incoming data, then the data continues to accumulate in the Kinesis data stream. Even if you restart the application, each GetRecords call still fetches more records until the backlog is cleared. The increase in records or fetched data for each GetRecords call then increases the value for GetRecords.Latency.
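The relationship between the backlog and the consumer's ability to catch up can be illustrated with simple arithmetic. The following is a rough sketch under stated assumptions: the function names are hypothetical, the ingestion rate is assumed to be steady, and the backlog is approximated as the ingestion rate multiplied by the iterator age. Real streams with bursty traffic will deviate from this estimate.

```python
def backlog_bytes_from_iterator_age(iterator_age_ms, ingest_bytes_per_sec):
    """Approximate the unread backlog, assuming a steady ingestion rate
    over the period covered by the iterator age (in milliseconds)."""
    return (iterator_age_ms / 1000) * ingest_bytes_per_sec


def catch_up_seconds(backlog_bytes, consume_bytes_per_sec, ingest_bytes_per_sec):
    """Return the seconds needed to drain the backlog, or None if the
    consumer reads no faster than the producer writes."""
    drain_rate = consume_bytes_per_sec - ingest_bytes_per_sec
    if drain_rate <= 0:
        return None  # the backlog never shrinks
    return backlog_bytes / drain_rate


# Example: the consumer is 30 minutes behind on a stream receiving 1 MB/s.
backlog = backlog_bytes_from_iterator_age(30 * 60 * 1000, 1_000_000)
print(catch_up_seconds(backlog, 2_000_000, 1_000_000))  # consumer reads 2 MB/s
print(catch_up_seconds(backlog, 1_000_000, 1_000_000))  # consumer only keeps pace
```

A result of None means that the consumer never catches up at the current rates, which is the situation in which IteratorAge and GetRecords.Latency stay high even when the incoming volume is flat.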
To resolve this issue, complete the following tasks:
Check your application to see whether it makes enough GetRecords calls to process the volume of incoming data. If you use a Kinesis Client Library (KCL) application or AWS Lambda as a consumer, then increase the number of shards in your stream. An increase in shard count increases the consumption rate from the data stream and decreases the values for IteratorAge and GetRecords.Latency.
Increase the retention period of the Kinesis data stream to avoid data loss. A longer retention period gives your application more time to catch up with the data backlog before records expire.
If you have a consumer application, then check the processing logic and record processing time.
Check the CPU and memory utilization of the system that runs your consumer to see whether you need to free up resources.
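The shard count and retention tasks above can be sketched as follows. This is an illustrative sketch, not a sizing tool: the helper names, the 25% headroom, and the per-shard figures used here (about 1 MB/s or 1,000 records/s for writes and about 2 MB/s for shared reads in provisioned mode) are assumptions to verify against the current Kinesis Data Streams quotas. The request fields follow the Kinesis UpdateShardCount and IncreaseStreamRetentionPeriod APIs, and "my-stream" is a placeholder name.

```python
import math

# Assumed per-shard limits for provisioned mode; confirm against current quotas.
WRITE_BYTES_PER_SHARD = 1_000_000
WRITE_RECORDS_PER_SHARD = 1_000
READ_BYTES_PER_SHARD = 2_000_000  # shared across consumers without enhanced fan-out


def required_shards(ingest_bytes_per_sec, ingest_records_per_sec,
                    consumers=1, headroom=1.25):
    """Estimate the shard count needed for the given ingestion rate."""
    by_write_bytes = ingest_bytes_per_sec * headroom / WRITE_BYTES_PER_SHARD
    by_write_records = ingest_records_per_sec * headroom / WRITE_RECORDS_PER_SHARD
    # Each consumer reads the full stream, so read demand scales with consumers.
    by_read_bytes = ingest_bytes_per_sec * headroom * consumers / READ_BYTES_PER_SHARD
    return max(1, math.ceil(max(by_write_bytes, by_write_records, by_read_bytes)))


def build_scaling_requests(stream_name, target_shards, retention_hours):
    """Build keyword arguments for the two Kinesis API calls."""
    return {
        "update_shard_count": {
            "StreamName": stream_name,
            "TargetShardCount": target_shards,
            "ScalingType": "UNIFORM_SCALING",
        },
        "increase_stream_retention_period": {
            "StreamName": stream_name,
            "RetentionPeriodHours": retention_hours,
        },
    }


def apply_scaling(stream_name, target_shards, retention_hours):
    """Send both requests to Kinesis; this changes the stream and its cost."""
    import boto3  # imported here so the helpers above work without the SDK

    kinesis = boto3.client("kinesis")
    requests = build_scaling_requests(stream_name, target_shards, retention_hours)
    kinesis.update_shard_count(**requests["update_shard_count"])
    kinesis.increase_stream_retention_period(
        **requests["increase_stream_retention_period"]
    )
```

For example, required_shards(3_000_000, 2_000) estimates the shard count for a stream that receives 3 MB/s and 2,000 records per second with one consumer. Resharding and longer retention both affect cost, so review the estimate before you call apply_scaling.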