[AWS Bedrock] ThrottlingException occurs randomly for Claude-2.1 Runtime


According to the documentation, Claude-2.1 has a Request and Token Limit "Per Minute."

However, my experience with the Bedrock Claude-2.1 API (Tokyo Region) is quite different. The error occurs inconsistently, even when I call the API just once with a small number of tokens after waiting 30 minutes or a few hours.

An error occurred (ThrottlingException) when calling the InvokeModel operation (reached max retries: 4): Too many requests, please wait before trying again. You have sent too many requests. Wait before trying again.
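
The "reached max retries: 4" part of the message suggests the SDK had already retried before giving up. One common client-side mitigation is to wrap the call in a longer exponential backoff with jitter. Below is a minimal sketch of that idea, not a confirmed fix for this particular throttling. The helper names `is_throttling_error` and `call_with_backoff` are hypothetical, and the sketch assumes the raised exception carries a botocore-style `response["Error"]["Code"]` field.

```python
import random
import time


def is_throttling_error(exc):
    """Return True if exc looks like a botocore-style throttling error.

    Assumption: the exception has a `response` dict shaped like
    {"Error": {"Code": "ThrottlingException"}}.
    """
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code", "") == "ThrottlingException"


def call_with_backoff(fn, max_attempts=6, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(), retrying on throttling with exponential backoff and full jitter.

    Non-throttling errors are re-raised immediately. The last throttling
    error is re-raised once max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if not is_throttling_error(exc) or attempt == max_attempts:
                raise
            # Upper bound doubles each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))
```

With a boto3 `bedrock-runtime` client this might be used as `call_with_backoff(lambda: client.invoke_model(modelId=model_id, body=body))`, where `client`, `model_id`, and `body` are whatever the application already uses.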

The same issue persists even though the token and request counts are well below the limits.

Below are my CloudWatch metrics at maximum usage (Statistic set to Sum):

  • InputTokenCount 2,677
  • OutputTokenCount 1,293
  • Invocations 7
  • InvocationLatency 4,391
  • InvocationThrottles 7

The number of requests and the token counts seem far below the current quota for Claude requests (100 requests and 200,000 tokens per minute).
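
To make the gap concrete, the figures above can be compared against the quota with a little arithmetic. This is only a sketch: it assumes, pessimistically, that all of the reported usage fell within a single minute, and that input and output tokens both count toward the token quota. The helper `percent_of_quota` is a hypothetical name.

```python
def percent_of_quota(used, quota):
    """Return usage as a percentage of the quota."""
    return 100.0 * used / quota


# Figures reported in the CloudWatch metrics above.
invocations = 7
total_tokens = 2677 + 1293  # InputTokenCount + OutputTokenCount (assumed both count)

# Quota figures quoted in this thread (per minute).
request_quota = 100
token_quota = 200_000

print(f"requests: {percent_of_quota(invocations, request_quota):.1f}% of quota")
print(f"tokens:   {percent_of_quota(total_tokens, token_quota):.1f}% of quota")
```

Even under that worst-case assumption, both values come out to a small fraction of the per-minute quota.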

I really hope this issue can be fixed, as I greatly appreciate the precision of Claude-2.1's performance.

Please let me know if there are any suggestions or solutions to this issue, thanks!


Update: Provisioned Throughput is too costly for me at the moment, so I'm looking for a Runtime (On-Demand) option.

hanado
Asked 3 months ago, 1844 views
1 Answer

Hi,

As per https://docs.aws.amazon.com/bedrock/latest/userguide/quotas.html, the current quota for Claude requests is 100 requests and 200,000 tokens per minute.

My suggestion at this point: look at the CloudWatch metrics published by Bedrock to see whether you are reaching those limits or experiencing a different kind of issue.

See https://docs.aws.amazon.com/bedrock/latest/userguide/monitoring-cw.html

Four metrics will be of interest to you: Invocations, InvocationThrottles, InputTokenCount, OutputTokenCount. With them, you'll be able to see whether you are reaching any quota.
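
Since the quotas are per minute, it can help to pull these four metrics at one-minute granularity rather than as a single sum. Here is a minimal sketch using the boto3 CloudWatch `get_metric_statistics` call. The `AWS/Bedrock` namespace and `ModelId` dimension follow the monitoring page linked above. The model ID `anthropic.claude-v2:1`, the region `ap-northeast-1` (Tokyo), and the helper names are assumptions for illustration, so adjust them to your setup.

```python
from datetime import datetime, timedelta, timezone

METRICS = ["Invocations", "InvocationThrottles",
           "InputTokenCount", "OutputTokenCount"]


def build_metric_request(metric_name, model_id, end, hours=3, period=60):
    """Build keyword arguments for CloudWatch get_metric_statistics.

    period=60 gives one datapoint per minute, matching the per-minute quota.
    """
    return {
        "Namespace": "AWS/Bedrock",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "ModelId", "Value": model_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period,
        "Statistics": ["Sum"],
    }


def fetch_per_minute_sums(model_id="anthropic.claude-v2:1",
                          region="ap-northeast-1"):
    """Return {metric name: [(timestamp, sum), ...]} sorted by time.

    Requires boto3 and AWS credentials; model_id and region are assumptions.
    """
    import boto3  # imported here so the helper above works without boto3

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    end = datetime.now(timezone.utc)
    results = {}
    for name in METRICS:
        resp = cloudwatch.get_metric_statistics(
            **build_metric_request(name, model_id, end))
        results[name] = sorted(
            (point["Timestamp"], point["Sum"]) for point in resp["Datapoints"])
    return results
```

Comparing the per-minute `Invocations` and token sums against the quota for the minutes where `InvocationThrottles` is non-zero should show whether the throttling lines up with actual usage.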

Best,

Didier

AWS Expert
Answered 3 months ago
  • I have updated the question with the specified CloudWatch metric, thank you.
