[AWS Bedrock] ThrottlingException occurs randomly for Claude-2.1 Runtime

According to the documentation, Claude-2.1 has a Request and Token Limit "Per Minute."

However, my experience with Bedrock's Claude-2.1 API in the Tokyo Region is quite different. The error occurs inconsistently, even when I call the API only once after 30 minutes or a few hours of inactivity, and with a small number of tokens.

An error occurred (ThrottlingException) when calling the InvokeModel operation (reached max retries: 4): Too many requests, please wait before trying again. You have sent too many requests. Wait before trying again.
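
The "(reached max retries: 4)" part of the message suggests that the SDK's built-in retries were exhausted before any attempt succeeded. One common mitigation for on-demand throttling is to wrap the call in a longer exponential backoff with jitter. The following is a minimal sketch rather than an official AWS recipe: `call_with_backoff` and `is_bedrock_throttle` are hypothetical helper names, and the throttle check assumes the exception carries a botocore-style `response` dict. It will not help if every attempt is throttled.

```python
import random
import time


def call_with_backoff(fn, is_throttled, max_attempts=6, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep, rng=random.random):
    """Call fn() and retry throttled calls with exponential backoff and full jitter.

    is_throttled(exc) decides whether an exception is worth retrying.
    sleep and rng are injectable so the helper can be tested without waiting.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_throttled(exc):
                raise
            # Full jitter: wait a random time in [0, min(max_delay, base * 2^attempt)).
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(rng() * cap)


def is_bedrock_throttle(exc):
    """Assumption: exc is a botocore ClientError exposing a .response dict."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") == "ThrottlingException"
```

With a boto3 `bedrock-runtime` client, the InvokeModel call would be passed in as a zero-argument callable, for example a lambda that calls `client.invoke_model` with the usual `modelId` and `body` arguments, together with `is_bedrock_throttle` as the second argument.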

The same issue persists even though the token and request counts are well below the limits.

Below are my CloudWatch metrics at the point of maximum usage (Statistic set to Sum):

  • InputTokenCount 2,677
  • OutputTokenCount 1,293
  • Invocations 7
  • InvocationLatency 4,391
  • InvocationThrottles 7

The number of requests and token counts seem far below the current quota of Claude requests (100 requests and 200,000 tokens).
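
To make the comparison concrete, the arithmetic can be sketched as below. This assumes the quota counts input and output tokens together and that the CloudWatch sums cover a single minute; if the sums span a longer period, the real per-minute usage is lower still. The variable and function names are illustrative only.

```python
# Quota figures quoted in this thread (per minute).
QUOTA_REQUESTS_PER_MIN = 100
QUOTA_TOKENS_PER_MIN = 200_000

# Busiest CloudWatch sums reported above.
input_tokens = 2_677
output_tokens = 1_293
invocations = 7


def utilization(used, quota):
    """Return the fraction of a quota that has been consumed."""
    return used / quota


total_tokens = input_tokens + output_tokens
request_share = utilization(invocations, QUOTA_REQUESTS_PER_MIN)
token_share = utilization(total_tokens, QUOTA_TOKENS_PER_MIN)

print(f"requests: {invocations} of {QUOTA_REQUESTS_PER_MIN} ({request_share:.1%})")
print(f"tokens:   {total_tokens} of {QUOTA_TOKENS_PER_MIN} ({token_share:.1%})")
```

Under these assumptions the usage is 7% of the request quota and roughly 2% of the token quota, which is why the throttling looks unrelated to the published limits.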

I really hope this issue can be fixed, as I greatly appreciate the precision of Claude-2.1's performance.

Please let me know if there are any suggestions or solutions to this issue, thanks!


Update: Provisioned Throughput is too costly for me at the moment, so I'm looking for a solution that works with the Runtime (On-Demand) option.

hanado
asked 3 months ago · 1816 views
1 Answer

Hi,

As per https://docs.aws.amazon.com/bedrock/latest/userguide/quotas.html , the current quota for Claude requests is 100 requests and 200,000 tokens per minute.

My suggestion at this point: look at the CloudWatch metrics delivered by Bedrock to see whether you are reaching those limits or experiencing a different kind of issue.

See https://docs.aws.amazon.com/bedrock/latest/userguide/monitoring-cw.html

Four metrics will be of interest to you: Invocations, InvocationThrottles, InputTokenCount, and OutputTokenCount. With them, you'll be able to see whether you are reaching any quota.
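
As a rough illustration, the four metrics could be pulled per minute with boto3 along these lines. This is a sketch under assumptions, not a verified script: it assumes the metrics are published in the `AWS/Bedrock` namespace with a `ModelId` dimension, and `build_metric_queries` and `fetch_sums` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone

METRICS = ("Invocations", "InvocationThrottles", "InputTokenCount", "OutputTokenCount")


def build_metric_queries(model_id, end=None, window_minutes=60, period_seconds=60):
    """Build one get_metric_statistics request per metric, summed per period."""
    end = end or datetime.now(timezone.utc)
    start = end - timedelta(minutes=window_minutes)
    return [
        {
            "Namespace": "AWS/Bedrock",  # assumed namespace for Bedrock metrics
            "MetricName": name,
            "Dimensions": [{"Name": "ModelId", "Value": model_id}],
            "StartTime": start,
            "EndTime": end,
            "Period": period_seconds,
            "Statistics": ["Sum"],
        }
        for name in METRICS
    ]


def fetch_sums(model_id, region_name):
    """Run the queries with boto3 (imported lazily; needs AWS credentials)."""
    import boto3

    cloudwatch = boto3.client("cloudwatch", region_name=region_name)
    results = {}
    for query in build_metric_queries(model_id):
        datapoints = cloudwatch.get_metric_statistics(**query)["Datapoints"]
        # Datapoints arrive unordered, so sort them by timestamp.
        results[query["MetricName"]] = sorted(
            (point["Timestamp"], point["Sum"]) for point in datapoints
        )
    return results
```

For the case in the question, a call such as `fetch_sums("anthropic.claude-v2:1", "ap-northeast-1")` would return per-minute sums for the last hour; the model ID shown is an assumption and should be checked against the one actually invoked. Comparing the per-minute Invocations and token sums against the quota shows whether any single minute comes close to the limits.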

Best,

Didier

AWS
Expert
answered 3 months ago
  • I have updated the question with the specified CloudWatch metric, thank you.
