Kinesis data streams limits


Hi everyone!

Could someone explain the following Kinesis Data Streams limits to me? They are from the documentation: https://docs.aws.amazon.com/streams/latest/dev/service-sizes-and-limits.html

  1. GetRecords can retrieve up to 10 MiB of data per call from a single shard, and up to 10,000 records per call. Each call to GetRecords is counted as one read transaction.
  2. Each shard can support up to five read transactions per second. Each read transaction can provide up to 10,000 records with an upper limit of 10 MiB per transaction.
  3. Each shard can support up to a maximum total data read rate of 2 MiB per second via GetRecords. If a call to GetRecords returns 10 MiB, subsequent calls made within the next 5 seconds throw an exception.

These points seem inconsistent to me, maybe even contradictory. Did I miss something?

I appreciate your help.

asked 5 years ago, 3178 views
1 Answer
Accepted Answer

I think the points are clearer if you go from top to bottom: first look at the broader concept of Kinesis Data Streams, then at the average read capacity that is available to you.

The first thing you need to understand is a shard. Each shard supports a maximum of 1 MB of writes or 2 MB of reads per second, on average. The key word here is "average". It means that if one read call returns 10 MB, further calls made on that shard within the next 5 seconds will be throttled. On the other hand, if a read call returns less than 2 MB, you can make additional read calls, up to a total of 5 calls per second.

What happens if:

  1. Your first read call (GetRecords) returns 10 MB and you issue another read call within the next 5 seconds? Answer: You will get a ProvisionedThroughputExceededException.
  2. You have made 5 calls within one second, returning 1 MB of data in total, and you issue a 6th read call in that same second? Answer: You will get a ProvisionedThroughputExceededException even though you have only read 1 MB of data, because you have exceeded the number of read calls allowed in a one-second interval.
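The two scenarios above can be sketched as a small calculation. This is only a simplified model of the documented per-shard limits, not how the service actually meters reads; the names `cooldown_seconds` and `is_throttled` are made up for illustration, and the limits are taken in MiB as in the documentation quoted in the question.

```python
MIB = 1024 * 1024
MAX_READ_BYTES_PER_SEC = 2 * MIB   # per-shard average read rate via GetRecords
MAX_READS_PER_SEC = 5              # per-shard read transactions per second


def cooldown_seconds(bytes_returned: int) -> float:
    """Seconds of read budget one call consumes under the 2 MiB/s average."""
    return bytes_returned / MAX_READ_BYTES_PER_SEC


def is_throttled(calls_in_last_second: int, bytes_in_last_second: int) -> bool:
    """True if one more call in the same second would exceed either limit."""
    return (calls_in_last_second >= MAX_READS_PER_SEC
            or bytes_in_last_second >= MAX_READ_BYTES_PER_SEC)


# Scenario 1: a single 10 MiB response uses up 5 seconds of read budget.
print(cooldown_seconds(10 * MIB))   # 5.0

# Scenario 2: 5 calls already made this second, only 1 MiB read in total.
print(is_throttled(5, 1 * MIB))     # True

# Two small calls so far: a third call is still allowed.
print(is_throttled(2, 1 * MIB))     # False
```

In this model either limit on its own is enough to trigger throttling, which is why the three statements in the documentation do not contradict each other.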

You might be getting confused by the upper limit on read capacity per shard, which is 2 MB per second. I believe you should treat it as an average read capacity rather than a hard cap on each call, with the average measured from the point at which a call over-consumes the 2 MB limit. So you can make a single call that reads 10 MB (hence the 10 MB limit on GetRecords), but that call uses up 5 seconds' worth of capacity at once, and reads on that shard will be throttled for the next 5 seconds.

Please read the following: https://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetRecords.html

Para 5: The size of the data returned by GetRecords varies depending on the utilisation of the shard. The maximum size of data that GetRecords can return is 10 MiB. If a call returns this amount of data, subsequent calls made within the next 5 seconds throw ProvisionedThroughputExceededException. If there is insufficient provisioned throughput on the stream, subsequent calls made within the next 1 second throw ProvisionedThroughputExceededException. GetRecords doesn't return any data when it throws an exception. For this reason, we recommend that you wait 1 second between calls to GetRecords. However, it's possible that the application will get exceptions for longer than 1 second.
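The recommendation to wait 1 second between calls could look roughly like the loop below. It is a sketch under assumptions, not a reference consumer: `poll_shard` and its parameters are hypothetical names, the doubling backoff capped at 8 seconds is my own choice, and `client` is assumed to behave like a boto3 Kinesis client (`get_records` with `ShardIterator` and `Limit`, and `client.exceptions.ProvisionedThroughputExceededException`).

```python
import time


def poll_shard(client, shard_iterator, max_calls=10, pause=1.0, sleep=time.sleep):
    """Read from one shard, pausing between calls and backing off when throttled."""
    records = []
    backoff = pause
    for _ in range(max_calls):
        if shard_iterator is None:      # no further iterator: nothing left to read
            break
        try:
            resp = client.get_records(ShardIterator=shard_iterator, Limit=10000)
        except client.exceptions.ProvisionedThroughputExceededException:
            # A throttled call returns no data, so retry with the same iterator.
            sleep(backoff)
            backoff = min(backoff * 2, 8.0)   # assumed cap, tune as needed
            continue
        records.extend(resp["Records"])
        shard_iterator = resp.get("NextShardIterator")
        backoff = pause                 # reset after a successful call
        sleep(pause)                    # the recommended 1-second wait
    return records
```

With boto3, `client` would be `boto3.client("kinesis")` and the initial iterator would come from `get_shard_iterator`. The `sleep` parameter exists only so the loop can be exercised without real delays.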

I hope that helps!

MODERATOR
answered 5 years ago
EXPERT
reviewed 21 days ago
