Data Firehose - Destination error - Kinesis to S3 - Throttling error encountered when calling Kinesis

I have a Kinesis stream with very low volume (30 small JSON messages each hour) and, connected to it, a Firehose instance that reads from the stream and stores the JSON objects in an S3 bucket. It worked fine for around 8 hours before slowing to a crawl. A windscreen washer, not a firehose.

There are no other consumers, and yet the Firehose instance appears to be hitting some kind of refresh limit. I can't see any setting to change the refresh interval on the Firehose instance. What's the matter? Is the service buggy?

Throttling error encountered when calling Kinesis. This can be due to other applications calling the same APIs as the Firehose delivery stream, or because you have created too many Firehose delivery streams with the same Kinesis stream as the source.

asked 2 months ago, 96 views
2 Answers
Some of the quotas are quite low, such as GetRecords being limited to 5 requests per second per shard (https://docs.aws.amazon.com/streams/latest/dev/service-sizes-and-limits.html). If you look at the "maximum" statistic of the CloudWatch metrics for the stream, do any of the throttling-related metrics show non-zero values? https://docs.aws.amazon.com/streams/latest/dev/monitoring-with-cloudwatch.html
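As a rough sketch of that check, the snippet below asks CloudWatch for the maximum of the stream-level `ReadProvisionedThroughputExceeded` metric and keeps only the non-zero datapoints. The function name `throttled_datapoints`, the 24-hour window, and the 5-minute period are my own assumptions; `cloudwatch` is expected to be a boto3 CloudWatch client, for example `boto3.client("cloudwatch")`.

```python
from datetime import datetime, timedelta, timezone


def throttled_datapoints(cloudwatch, stream_name, hours=24):
    """Return sorted (timestamp, maximum) pairs where read throttling was reported.

    `cloudwatch` is assumed to be a boto3 CloudWatch client.
    """
    end = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/Kinesis",
        MetricName="ReadProvisionedThroughputExceeded",
        Dimensions=[{"Name": "StreamName", "Value": stream_name}],
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
        Period=300,  # 5-minute buckets (assumed granularity)
        Statistics=["Maximum"],
    )
    # Keep only the buckets in which at least one read was throttled.
    hits = [
        (point["Timestamp"], point["Maximum"])
        for point in response["Datapoints"]
        if point["Maximum"] > 0
    ]
    return sorted(hits)
```

An empty result over the period in which Firehose reported the error would suggest the throttling is not visible at the stream level, and the CloudTrail route is the next thing to try.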

GetRecords being called more than five times per second on a shard, for example, should show up as a non-zero value for the ReadProvisionedThroughputExceeded metric. In that case, without any insight into your specific environment, I'd personally continue troubleshooting by creating a temporary CloudTrail trail with data event logging enabled for KDS API calls (https://docs.aws.amazon.com/streams/latest/dev/logging-using-cloudtrail.html) and scanning the logs with Athena (https://docs.aws.amazon.com/athena/latest/ug/create-cloudtrail-table-partition-projection.html). That should show whether excessive numbers of API calls are actually coming from Firehose, which should know better than to do that, and the documentation says that it does (https://docs.aws.amazon.com/firehose/latest/dev/writing-with-kinesis-streams.html), or whether something else is going on that I didn't expect.
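If the trail only covers a short window, a small script over the downloaded log records may be enough before setting up Athena. This is a minimal sketch, not the Athena approach itself: it assumes `records` is the list found under the `"Records"` key of a CloudTrail log file, and it groups Kinesis calls by API name and caller using the standard `eventSource`, `eventName`, and `userIdentity` fields. The function name `count_kinesis_calls` is hypothetical.

```python
from collections import Counter


def count_kinesis_calls(records):
    """Count Kinesis Data Streams API calls per (eventName, caller).

    `records` is assumed to be a list of CloudTrail event dicts.
    """
    counts = Counter()
    for record in records:
        if record.get("eventSource") != "kinesis.amazonaws.com":
            continue  # ignore events from other services
        identity = record.get("userIdentity", {})
        # Calls made by an AWS service on your behalf carry `invokedBy`;
        # otherwise fall back to the caller's ARN.
        caller = identity.get("invokedBy") or identity.get("arn") or "unknown"
        counts[(record.get("eventName"), caller)] += 1
    return counts
```

Calling `count_kinesis_calls(records).most_common(5)` lists the busiest API and caller pairs, which should make an unexpected consumer stand out.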

EXPERT
answered 2 months ago
The issue you're experiencing with your Amazon Kinesis Data Firehose delivery stream is not likely due to a bug in the service, but rather a result of throttling caused by the interaction between Kinesis Data Streams and Kinesis Data Firehose.

Even though your data volume is low (30 small JSON messages per hour), Kinesis Data Firehose still needs to poll the Kinesis data stream regularly to check for new data. This polling can sometimes lead to throttling errors, especially if there are multiple delivery streams reading from the same Kinesis stream or if other applications are also accessing the Kinesis stream.

To address this issue, you have a few options:

  1. Increase the buffer size or buffer interval in your Firehose delivery stream settings. This will reduce the frequency of data retrieval attempts from the Kinesis stream, potentially alleviating the throttling issue.

  2. If possible, consider using Direct PUT to Firehose instead of using a Kinesis data stream as the source. With Direct PUT, your application can write data directly to Firehose, bypassing the Kinesis stream entirely. This approach can be more efficient for low-volume data scenarios like yours.

  3. If you must use a Kinesis data stream as the source, ensure that you don't have multiple Firehose delivery streams reading from the same Kinesis stream, as this can contribute to throttling.

  4. Check if there are any other applications or processes that might be calling Kinesis APIs frequently, as these could be contributing to the throttling.

  5. If the issue persists, you may want to contact AWS support to investigate if there are any account-specific limits or issues affecting your Firehose delivery stream.
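For option 3, one way to confirm that only a single Firehose stream reads from the Kinesis stream is to list the Firehose streams that use a Kinesis source and compare their source ARNs. The sketch below is an assumption-laden illustration: `firehose` is expected to be a boto3 Firehose client, and the function name `firehose_streams_reading_from` and the page size of 10 are hypothetical choices.

```python
def firehose_streams_reading_from(firehose, kinesis_stream_arn):
    """Return names of Firehose streams whose source is the given Kinesis stream.

    `firehose` is assumed to be a boto3 Firehose client.
    """
    matches = []
    start_after = None
    while True:
        kwargs = {"DeliveryStreamType": "KinesisStreamAsSource", "Limit": 10}
        if start_after:
            kwargs["ExclusiveStartDeliveryStreamName"] = start_after
        page = firehose.list_delivery_streams(**kwargs)
        names = page["DeliveryStreamNames"]
        for name in names:
            desc = firehose.describe_delivery_stream(DeliveryStreamName=name)
            source = desc["DeliveryStreamDescription"].get("Source", {})
            arn = source.get("KinesisStreamSourceDescription", {}).get("KinesisStreamARN")
            if arn == kinesis_stream_arn:
                matches.append(name)
        # Stop when the service reports no further pages.
        if not page.get("HasMoreDeliveryStreams") or not names:
            break
        start_after = names[-1]
    return matches
```

More than one name in the result would match the second cause named in the error message.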

Remember, Kinesis Data Firehose is designed to handle much larger volumes of data, so your low-volume use case shouldn't be causing issues under normal circumstances. The throttling error is likely due to the frequency of API calls rather than the volume of data being processed.
Sources
Troubleshoot errors in Amazon Data Firehose - Amazon Data Firehose
Choose source and destination for your Firehose stream - Amazon Data Firehose
Amazon Kinesis Data Firehose Features - Streaming Data Pipeline - Amazon Web Services
Introducing Amazon Data Firehose, formerly known as Amazon Kinesis Data Firehose

answered 2 months ago
  • None of this is relevant in this situation