[AWS Bedrock] ThrottlingException occurs randomly for Claude-3.0 Runtime

We have been using the Anthropic Claude 3.0 Sonnet model for our enterprise workloads. The total number of input and output tokens processed per minute is ~50-60k, which is well within the 200k limit defined for Claude models. In spite of this, we have lately been receiving the error message below frequently:

botocore.exceptions.EventStreamError: An error occurred (throttlingException) when calling the InvokeModelWithResponseStream operation: Too many requests, please wait before trying again. You have sent too many requests. Wait before trying again.

Why might this error be occurring? Is it because the 2M token capacity is not reserved for our enterprise but shared across all Claude users in that region? And what is the best way to resolve this issue? Is there any way to increase the rate limit or to get dedicated compute capacity for our enterprise?

Sourabh
asked a month ago, 562 views
1 Answer
Accepted Answer

Hi Sourabh,

Please see the previously answered re:Post question and the documentation below. They cover a similar ThrottlingException and explain how on-demand quotas are assigned.

  1. https://repost.aws/questions/QU11DRlMZfRDy0ngHxpO1VCw/throttlingexceptions-while-using-on-demand-bedrock-runtime-for-invoking-claude-v2-1
  2. https://docs.aws.amazon.com/bedrock/latest/userguide/prov-throughput.html
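
Editor's note: until a quota increase or Provisioned Throughput is in place, a common client-side mitigation for on-demand throttling is to retry with exponential backoff and jitter. The sketch below is only an illustration under stated assumptions, not an official AWS recommendation for this case. The helper names `is_throttling_error` and `call_with_backoff` and the default delays are hypothetical. It relies on botocore's `ClientError` (of which `EventStreamError` is a subclass) exposing the error code under `response["Error"]["Code"]`.

```python
import random
import time


def is_throttling_error(exc):
    """Return True if exc carries a throttling error code.

    botocore's ClientError (and EventStreamError) store the service error
    code under exc.response["Error"]["Code"]. The comparison is
    case-insensitive because the streaming error in the question reports
    "throttlingException" with a lowercase "t".
    """
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return False
    code = response.get("Error", {}).get("Code", "")
    return str(code).lower() == "throttlingexception"


def call_with_backoff(fn, max_attempts=6, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(), retrying throttling errors with exponential backoff.

    The wait before retry k is drawn uniformly from
    [0, min(max_delay, base_delay * 2**k)] ("full jitter").
    Non-throttling errors, and the final failed attempt, are re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if not is_throttling_error(exc) or last_attempt:
                raise
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))


# Hypothetical usage with boto3 (MODEL_ID and BODY are placeholders):
#
#   client = boto3.client("bedrock-runtime")
#
#   def stream_once():
#       resp = client.invoke_model_with_response_stream(
#           modelId=MODEL_ID, body=BODY)
#       # Consume the stream inside the retried function, because
#       # EventStreamError is raised while iterating over the body.
#       return [event for event in resp["body"]]
#
#   events = call_with_backoff(stream_once)
```

Note that a retry re-sends the whole request, so any partial output already streamed before the error should be discarded. Backoff only smooths bursts; it does not raise the account's on-demand quota.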
AWS
answered a month ago
EXPERT
reviewed a month ago
  • Thanks for the response. The Provisioned Throughput option seems costly for small and medium enterprises. AWS should look into increasing the rate limits for on-demand usage or offering a better upgrade plan in Bedrock.
