[Kinesis] Dynamic sharding


A customer wants to ingest IoT data into Kinesis Data Streams (KDS hereafter). They prefer KDS over Kinesis Firehose (KF hereafter) because KDS can feed Lambda consumers, while KF only supports S3 and other non-computing consumers. They want to consume from KDS so that they can show near-real-time analytical dashboards.

The one inconvenience with Kinesis Data Streams is that we have to predict the number of shards up front. The IoT devices are home appliances, so usage will fluctuate considerably. I searched a bit and found a solution that dynamically adjusts the number of shards: https://medium.com/slalom-data-analytics/amazon-kinesis-data-streams-auto-scaling-the-number-of-shards-105dc967bed5 But should I apply this technique? I am wondering whether others have already done the heavy lifting here.

The peak data ingestion is about 60 MB/sec. This number is calculated as follows. We know that home appliances will not be actively used at night. About 60K - 100K new devices will be sold in the future (60K devices have already been sold). Each device produces about 2 KB of data every 2 seconds, i.e. about 1 KB/sec. That makes the peak about 60K * 1 KB/sec = 60 MB/sec. If we get the batching right, I assume it should consume about 60 shards.
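To make the arithmetic above easy to re-run for other device counts, here is a rough sketch of the estimate. It assumes the per-shard write limits of 1 MB/sec and 1,000 records/sec for a provisioned stream, treats 1 MB as 1000 KB to match the numbers above, and assumes one record per device message. The function name `estimate_shards` is made up for illustration.

```python
import math

# Assumed per-shard write limits for a provisioned Kinesis data stream.
SHARD_WRITE_KB_PER_SEC = 1000       # 1 MB/sec, treated here as 1000 KB/sec
SHARD_WRITE_RECORDS_PER_SEC = 1000  # 1,000 records/sec


def estimate_shards(devices, payload_kb, interval_sec):
    """Estimate shards needed when every device sends payload_kb every interval_sec."""
    kb_per_sec = devices * payload_kb / interval_sec
    records_per_sec = devices / interval_sec  # assumes one record per message
    by_bytes = math.ceil(kb_per_sec / SHARD_WRITE_KB_PER_SEC)
    by_records = math.ceil(records_per_sec / SHARD_WRITE_RECORDS_PER_SEC)
    # Whichever limit is hit first decides the shard count.
    return max(by_bytes, by_records, 1)


if __name__ == "__main__":
    # 60K devices, 2 KB every 2 seconds, as in the question.
    print(estimate_shards(60_000, 2, 2))
```

With these inputs the byte limit, not the record limit, is the binding one, so batching mostly matters for request overhead rather than for the shard count itself. This is a peak estimate with no headroom.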

AWS
Asked 3 years ago · 617 views
1 Answer
Accepted Answer

I am not sure if you saw these blogs/resources on Kinesis auto-scaling, so wanted to pass them along:

  1. Automatically adding/removing shards to a KDS using AWS application auto-scaling - https://aws.amazon.com/blogs/big-data/scaling-amazon-kinesis-data-streams-with-aws-application-auto-scaling/

  2. Under the hood - scaling KDS - https://aws.amazon.com/blogs/big-data/under-the-hood-scaling-your-kinesis-data-streams/

  3. Kinesis Scaling Utility (available on GitHub) - https://github.com/awslabs/amazon-kinesis-scaling-utils

I have not tried these personally, but it looks like these solutions should work.
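For orientation, a minimal sketch of the kind of resize step such a scaler performs, as far as I understand it. It assumes the boto3 Kinesis client's `update_shard_count` call and the documented restriction that a single call cannot go above double or below half of the current shard count. The helper names, the 0.7 target utilization, and the idea of passing in a precomputed `utilization` figure (for example derived from CloudWatch metrics) are my own assumptions and are not taken from the linked posts.

```python
import math


def next_shard_target(current_shards, utilization, target_utilization=0.7):
    """Pick a shard count that brings write utilization near the target.

    utilization is the observed fraction of write capacity in use (0.0 to 1.0).
    The result is clamped to what one resize call is assumed to allow:
    no more than double and no less than half of the current count.
    """
    desired = math.ceil(current_shards * utilization / target_utilization)
    lower = max(1, math.ceil(current_shards / 2))
    upper = current_shards * 2
    return min(max(desired, lower), upper)


def apply_scaling(kinesis_client, stream_name, current_shards, utilization):
    """Resize the stream if the target differs from the current shard count.

    kinesis_client is expected to be a boto3 Kinesis client,
    i.e. boto3.client("kinesis"); it is passed in so the logic can be tested.
    """
    target = next_shard_target(current_shards, utilization)
    if target != current_shards:
        kinesis_client.update_shard_count(
            StreamName=stream_name,
            TargetShardCount=target,
            ScalingType="UNIFORM_SCALING",
        )
    return target
```

A real scaler would also need cooldowns and alarms around this step, which is the part the resources above are meant to handle.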

AWS
Answered 3 years ago
