Kinesis data streams limits


Hi everyone!

Could someone explain these Kinesis Data Streams limits, found in the documentation: https://docs.aws.amazon.com/streams/latest/dev/service-sizes-and-limits.html

  1. GetRecords can retrieve up to 10 MiB of data per call from a single shard, and up to 10,000 records per call. Each call to GetRecords is counted as one read transaction.
  2. Each shard can support up to five read transactions per second. Each read transaction can provide up to 10,000 records with an upper limit of 10 MiB per transaction.
  3. Each shard can support up to a maximum total data read rate of 2 MiB per second via GetRecords. If a call to GetRecords returns 10 MiB, subsequent calls made within the next 5 seconds throw an exception.

I find these points inconsistent, maybe even contradictory. Did I miss something?

I appreciate your help.

Asked 5 years ago, 3173 views
1 Answer
Accepted Answer

I think the points are clearer if you go from top to bottom: first look at the broader concept of a shard in Kinesis Data Streams, then at the average read capacity that is available to you.

The first thing you need to understand is a shard. Each shard can support a maximum of 1 MiB of writes or 2 MiB of reads per second, on average. The key word here is "average": if one read call returns 10 MiB, calls made on that shard during the next 5 seconds will be throttled. On the other hand, if a read call returns less than 2 MiB, you can make additional read calls, up to a total of 5 calls per second.
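To put numbers on the "average", here is a rough sketch of the arithmetic, assuming the 2 MiB per second and 5 calls per second figures quoted in the question. The function names are my own, for illustration only.

```python
READ_RATE_MIB_PER_SECOND = 2.0   # per-shard read rate quoted in the docs
MAX_CALLS_PER_SECOND = 5         # per-shard read transactions per second


def seconds_of_capacity_used(response_mib: float) -> float:
    """Seconds of the shard's read allowance consumed by one response."""
    return response_mib / READ_RATE_MIB_PER_SECOND


def max_average_response_mib(calls_per_second: int) -> float:
    """Largest average response size that stays within the read rate."""
    if not 1 <= calls_per_second <= MAX_CALLS_PER_SECOND:
        raise ValueError("a shard supports between 1 and 5 read calls per second")
    return READ_RATE_MIB_PER_SECOND / calls_per_second


print(seconds_of_capacity_used(10))   # 5.0
print(max_average_response_mib(5))    # 0.4
```

So a maximum-size 10 MiB response accounts for 5 seconds of read capacity, and polling at the full 5 calls per second only works if the responses average 0.4 MiB or less.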

What happens if:

  1. Your first read call (GetRecords) returns 10 MiB and you issue another read call within the next 5 seconds? Answer: You will get a ProvisionedThroughputExceededException.
  2. You have made 5 calls within one second, returning 1 MiB of data in total, and you issue a 6th read call in that same second? Answer: You will get a ProvisionedThroughputExceededException even though you have only read 1 MiB of data. This is because you have exceeded the number of read calls you can make during a one-second interval.
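The two scenarios above can be played through with a toy model. This is only a sketch of my reading of the limits, not how AWS actually implements throttling: the class `ShardReadModel`, its method name, and the trailing one-second window are all assumptions made for illustration.

```python
MAX_CALLS_PER_SECOND = 5          # read transactions per shard per second
READ_RATE_MIB_PER_SECOND = 2.0    # average read rate per shard


class ShardReadModel:
    """Toy model of the two per-shard read limits (hypothetical, simplified)."""

    def __init__(self):
        self.accepted = []         # (timestamp, size_mib) of accepted calls
        self.blocked_until = 0.0   # time at which the data read so far is "paid off"

    def try_get_records(self, now: float, size_mib: float) -> bool:
        """Return True if a call at time `now` returning `size_mib` is accepted."""
        # Byte limit: earlier reads used more than the average rate allows.
        if now < self.blocked_until:
            return False
        # Count limit: at most five accepted calls in the trailing second.
        recent = [(t, s) for (t, s) in self.accepted if now - t < 1.0]
        if len(recent) >= MAX_CALLS_PER_SECOND:
            return False
        recent.append((now, size_mib))
        self.accepted = recent
        # If the window now holds 2 MiB or more, block until the average
        # rate over that data drops back to 2 MiB per second.
        window_start = recent[0][0]
        window_mib = sum(s for _, s in recent)
        if window_mib >= READ_RATE_MIB_PER_SECOND:
            self.blocked_until = window_start + window_mib / READ_RATE_MIB_PER_SECOND
        return True
```

In this model, a 10 MiB read at time 0 causes every call before time 5 to be rejected (scenario 1), and five 0.2 MiB reads inside one second cause a sixth call in that second to be rejected even though only 1 MiB was read (scenario 2).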

You might be getting confused by the upper limit on read capacity per shard, which is 2 MiB per second. I believe you should treat it as an "average" read capacity rather than a hard per-call cap, with the average measured from the point at which you make a call that over-consumes the 2 MiB limit. So while you can make a call that reads 10 MiB (hence the 10 MiB limit on GetRecords), doing so means you have used up the shard's read capacity for the next 5 seconds.

Please read the following: https://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetRecords.html

Para 5: The size of the data returned by GetRecords varies depending on the utilisation of the shard. The maximum size of data that GetRecords can return is 10 MiB. If a call returns this amount of data, subsequent calls made within the next 5 seconds throw ProvisionedThroughputExceededException. If there is insufficient provisioned throughput on the stream, subsequent calls made within the next 1 second throw ProvisionedThroughputExceededException. GetRecords doesn't return any data when it throws an exception. For this reason, we recommend that you wait 1 second between calls to GetRecords. However, it's possible that the application will get exceptions for longer than 1 second.
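As a rough illustration of the "wait 1 second between calls" advice, here is a minimal polling sketch. It assumes `client` is a boto3 Kinesis client (`boto3.client("kinesis")`) and that you already have a shard iterator. The function `poll_shard`, its parameters, the `Limit` value, and the doubling back-off capped at 8 seconds are my own choices, not something the documentation prescribes.

```python
import time


def poll_shard(client, shard_iterator, handle_records, max_polls=10,
               wait_seconds=1.0, sleep=time.sleep):
    """Read a shard, pausing between calls and backing off when throttled."""
    backoff = wait_seconds
    for _ in range(max_polls):
        try:
            response = client.get_records(ShardIterator=shard_iterator, Limit=1000)
        except client.exceptions.ProvisionedThroughputExceededException:
            # No data comes back with the exception, so retry the same
            # iterator after a pause; the doubling is an assumption.
            sleep(backoff)
            backoff = min(backoff * 2, 8.0)
            continue
        backoff = wait_seconds
        handle_records(response["Records"])
        shard_iterator = response.get("NextShardIterator")
        if shard_iterator is None:
            # The shard is closed and has been read to the end.
            break
        sleep(wait_seconds)
    return shard_iterator
```

A call might look like `poll_shard(boto3.client("kinesis"), iterator, print)`. Keeping `Limit` modest is one way to avoid a single very large response that would use up several seconds of read capacity at once.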

I hope that helps!

Moderator
Answered 5 years ago
Expert
Reviewed 20 days ago
