[AWS Bedrock] ThrottlingException occurs randomly for Claude-2.1 Runtime


According to the documentation, Claude-2.1 has a Request and Token Limit "Per Minute."

However, my experience with Bedrock's Claude-2.1 API (Tokyo Region) is quite different. The error occurs inconsistently, even when I call the API just once after a gap of 30 minutes or a few hours, with a small number of tokens.

An error occurred (ThrottlingException) when calling the InvokeModel operation (reached max retries: 4): Too many requests, please wait before trying again. You have sent too many requests. Wait before trying again.

The same issue persists even though the token and request counts are far below the limits.

Below are my CloudWatch metrics at peak usage (Statistic set to Sum):

  • InputTokenCount 2,677
  • OutputTokenCount 1,293
  • Invocations 7
  • InvocationLatency 4,391
  • InvocationThrottles 7

The number of requests and the token counts seem far below the current Claude quota (100 requests and 200,000 tokens per minute).

I really hope this issue can be fixed, as I greatly appreciate the precision of Claude-2.1's performance.

Please let me know if there are any suggestions or solutions to this issue, thanks!


Update: Provisioned Throughput is too costly for me at the moment, so I'm looking for an On-Demand (Runtime) option.
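
One client-side mitigation that stays on On-Demand is to retry throttled calls more patiently. The "reached max retries: 4" in the error suggests the SDK's default retry settings were in effect. Below is a minimal sketch, assuming boto3/botocore and the Tokyo region code `ap-northeast-1`; the helper names (`is_throttling_error`, `call_with_backoff`, `make_bedrock_client`) and the delay values are hypothetical choices for illustration, not anything from the Bedrock documentation. It will not help if the throttling comes from service-side capacity rather than from your own request rate.

```python
import random
import time


def is_throttling_error(exc):
    """Return True if exc looks like a botocore ClientError for throttling."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") == "ThrottlingException"


def call_with_backoff(fn, max_attempts=8, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(), retrying throttling errors with exponential backoff and jitter.

    Any other exception, or a throttle on the final attempt, is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if not is_throttling_error(exc) or attempt == max_attempts:
                raise
            # Upper bound doubles each attempt: 1s, 2s, 4s, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # "Full jitter": sleep a random amount between 0 and the bound.
            sleep(random.uniform(0, delay))


def make_bedrock_client(region="ap-northeast-1"):
    """Hypothetical helper: a bedrock-runtime client with more patient retries."""
    import boto3  # imported lazily so the helpers above work without boto3
    from botocore.config import Config

    return boto3.client(
        "bedrock-runtime",
        region_name=region,
        # Assumed values: raise the attempt cap and use botocore's adaptive mode.
        config=Config(retries={"max_attempts": 10, "mode": "adaptive"}),
    )
```

A call would then look like `call_with_backoff(lambda: client.invoke_model(...))`, with the request arguments left as they are in your existing code.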

hanado
asked 3 months ago · 1846 views
1 Answer

Hi,

As per https://docs.aws.amazon.com/bedrock/latest/userguide/quotas.html, the current quota for Claude requests is 100 requests and 200,000 tokens per minute.

My suggestion at this point: look at the CloudWatch metrics delivered by Bedrock to see whether you are reaching those limits or experiencing another kind of issue.

See https://docs.aws.amazon.com/bedrock/latest/userguide/monitoring-cw.html

Four metrics will be of interest to you: Invocations, InvocationThrottles, InputTokenCount, OutputTokenCount. With them, you'll be able to see whether you reach any quota.
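
Since the quotas are per minute, it helps to read these metrics at a 60-second period and not as one sum over a long window. Here is a rough sketch, assuming boto3, the `AWS/Bedrock` CloudWatch namespace with a `ModelId` dimension, and the Tokyo region code `ap-northeast-1`; the function names are hypothetical, and the model ID you pass in (for example `anthropic.claude-v2:1`) is an assumption to verify in your console.

```python
from datetime import datetime, timedelta, timezone


def minutes_over_limit(datapoints, limit):
    """Return sorted (timestamp, value) pairs whose per-minute Sum exceeds limit."""
    return sorted(
        (dp["Timestamp"], dp["Sum"]) for dp in datapoints if dp["Sum"] > limit
    )


def fetch_per_minute_sum(metric_name, model_id, hours=3, region="ap-northeast-1"):
    """Fetch one Bedrock metric as per-minute sums for the last `hours` hours."""
    import boto3  # imported lazily so minutes_over_limit works without boto3

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    end = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/Bedrock",
        MetricName=metric_name,  # e.g. "Invocations" or "InputTokenCount"
        Dimensions=[{"Name": "ModelId", "Value": model_id}],
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
        Period=60,  # one datapoint per minute, matching the per-minute quota
        Statistics=["Sum"],
    )
    return response["Datapoints"]
```

For example, `minutes_over_limit(fetch_per_minute_sum("Invocations", model_id), 100)` would list any minute with more than 100 requests. An empty list alongside a non-zero InvocationThrottles count would suggest the throttling is not explained by the published request quota.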

Best,

Didier

AWS Expert
answered 3 months ago
  • I have updated the question with the specified CloudWatch metric, thank you.
