[AWS Bedrock] ThrottlingException occurs randomly for Claude-2.1 Runtime


According to the documentation, Claude-2.1 has a Request and Token Limit "Per Minute."

However, my experience with Bedrock's Claude-2.1 API (Tokyo Region) is quite different. The error occurs inconsistently, even when I call the API just once after waiting 30 minutes or a few hours, and with a small number of tokens.

An error occurred (ThrottlingException) when calling the InvokeModel operation (reached max retries: 4): Too many requests, please wait before trying again. You have sent too many requests. Wait before trying again.
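The message above shows that the SDK's built-in retries (4) were exhausted. One common mitigation is to wrap the call in a longer retry loop with exponential backoff and jitter. The following is a minimal sketch, not a confirmed fix for this issue: `call_with_backoff` and `is_throttling_error` are hypothetical helper names, the delay values are arbitrary assumptions, and the error check relies on botocore's `ClientError` carrying the error code under `response["Error"]["Code"]`.

```python
import random
import time


def is_throttling_error(exc):
    """Return True if exc looks like a botocore ClientError for throttling.

    The check is duck-typed on exc.response["Error"]["Code"], so this
    sketch does not need to import boto3 or botocore.
    """
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return False
    return response.get("Error", {}).get("Code") == "ThrottlingException"


def call_with_backoff(call, max_attempts=6, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Run call() and retry on throttling with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            # Re-raise anything that is not throttling, and the final failure.
            if not is_throttling_error(exc) or attempt == max_attempts - 1:
                raise
            # Full jitter: wait a random time up to the capped exponential delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
```

With a `bedrock-runtime` client, this could be used as `call_with_backoff(lambda: client.invoke_model(modelId=model_id, body=body))`, where `client`, `model_id`, and `body` are whatever the application already uses. Backoff only helps if the throttling is transient; it does not explain why throttling occurs at such low volume.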

The same issue persists even though the token and request counts are well below the limits.

Below are my CloudWatch metrics at maximum usage (Statistic set to Sum):

  • InputTokenCount 2,677
  • OutputTokenCount 1,293
  • Invocations 7
  • InvocationLatency 4,391
  • InvocationThrottles 7

The number of requests and token counts seem far below the current quota of Claude requests (100 requests and 200,000 tokens).
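To make the comparison concrete, here is a small sketch of the arithmetic using only the figures quoted in this thread. It assumes the sums above cover the busiest period and that input and output tokens both count toward the token quota; neither assumption is confirmed here.

```python
# Figures copied from the CloudWatch metrics listed above.
input_tokens = 2_677
output_tokens = 1_293
invocations = 7

# Quotas as cited in this thread.
REQUEST_QUOTA = 100
TOKEN_QUOTA = 200_000

# Assumption: input and output tokens both count toward the token quota.
total_tokens = input_tokens + output_tokens
request_share = invocations / REQUEST_QUOTA
token_share = total_tokens / TOKEN_QUOTA

print(f"tokens: {total_tokens} ({token_share:.1%} of quota)")
print(f"requests: {invocations} ({request_share:.1%} of quota)")
```

That is 3,970 tokens, roughly 2% of the token quota, and 7% of the request quota.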

I really hope this issue can be fixed, as I greatly appreciate the precision of Claude-2.1's performance.

Please let me know if there are any suggestions or solutions to this issue, thanks!


Update: Provisioned Throughput is too costly for me at the moment, so I'm looking for a solution that works with the Runtime (On-Demand) option.

hanado
asked 3 months ago · 1650 views
1 Answer

Hi,

As per https://docs.aws.amazon.com/bedrock/latest/userguide/quotas.html, the current quota for Claude is 100 requests and 200,000 tokens per minute.

My suggestion at this point: look at the CloudWatch metrics delivered by Bedrock to see whether you are reaching those limits or experiencing another kind of issue.

See https://docs.aws.amazon.com/bedrock/latest/userguide/monitoring-cw.html

Four metrics will be of interest to you: Invocations, InvocationThrottles, InputTokenCount, and OutputTokenCount. With them, you'll be able to see whether you are reaching any quota.
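As an illustration, the four metrics could be pulled per minute with boto3's CloudWatch `get_metric_statistics` call. This is a sketch under assumptions: `peak_per_minute` is a hypothetical helper, the `AWS/Bedrock` namespace and `ModelId` dimension should be checked against the monitoring page linked above, and the three-hour window is arbitrary.

```python
from datetime import datetime, timedelta, timezone

METRICS = ["Invocations", "InvocationThrottles",
           "InputTokenCount", "OutputTokenCount"]


def peak_per_minute(cloudwatch, model_id, hours=3):
    """Return the largest one-minute Sum of each metric over the window."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    peaks = {}
    for name in METRICS:
        resp = cloudwatch.get_metric_statistics(
            Namespace="AWS/Bedrock",  # assumed namespace for Bedrock metrics
            MetricName=name,
            Dimensions=[{"Name": "ModelId", "Value": model_id}],
            StartTime=start,
            EndTime=end,
            Period=60,  # one-minute buckets, to match a per-minute quota
            Statistics=["Sum"],
        )
        sums = [point["Sum"] for point in resp["Datapoints"]]
        # Metrics with no datapoints in the window are reported as 0.
        peaks[name] = max(sums, default=0)
    return peaks
```

A possible call is `peak_per_minute(boto3.client("cloudwatch", region_name="ap-northeast-1"), "anthropic.claude-v2:1")`, assuming the Tokyo region and the Claude 2.1 model ID. The one-minute period matters because a sum over a longer period can hide a short burst.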

Best,

Didier

AWS EXPERT
answered 3 months ago
  • I have updated the question with the requested CloudWatch metrics, thank you.
