ThrottlingException while asynchronously invoking Bedrock Runtime in Lambda


Hi, I am using the AWS Bedrock Runtime client in AWS Lambda. In Python, when I send multiple requests (6) to the Claude 2 model asynchronously using threads, sometimes I get all the results, but often I get a ThrottlingException saying there were too many attempts. Is there a way to avoid this? It happens randomly, even with the same input tokens and everything else unchanged. Is it an infrastructure-related problem? Please guide me in fixing this. Thank you.

1 Answer

There are two ways to consume a model through Bedrock: On-Demand and Provisioned Throughput.

According to the documentation (https://docs.aws.amazon.com/bedrock/latest/userguide/quotas.html#quotas-runtime), latency differs by model and depends directly on the following conditions:

  • The number of input and output tokens
  • The total number of ongoing on-demand requests by all customers at the time.

You can purchase Provisioned Throughput to address your issue.
https://docs.aws.amazon.com/bedrock/latest/userguide/prov-throughput.html

Please have a look at the thread below for a similar issue.
https://repost.aws/questions/QUC82MTlWlQNagsqEG2Hbxlw/aws-bedrock-throttlingexception-occurs-randomly-for-claude-2-1-runtime
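
If you stay on On-Demand, a common client-side mitigation (not part of the documentation quoted above) is to cap how many requests are in flight at once and to retry throttled calls with exponential backoff and jitter. The sketch below is a minimal illustration using only the standard library. `ThrottledError`, `call_with_backoff`, and `run_all` are hypothetical names introduced here, and the delay values and worker count are assumptions to tune, not recommended settings.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling error raised by the service call."""


def call_with_backoff(fn, *args, max_attempts=6, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Call fn(*args), retrying on ThrottledError with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return fn(*args)
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Full jitter: wait a random time up to the capped exponential delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))


def run_all(fn, prompts, max_workers=2, **retry_kwargs):
    """Run fn over prompts with at most max_workers calls in flight; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(call_with_backoff, fn, p, **retry_kwargs) for p in prompts]
        return [f.result() for f in futures]
```

In a Lambda function, `fn` would be your own wrapper around the Bedrock Runtime `invoke_model` call that catches `botocore.exceptions.ClientError`, checks whether `err.response["Error"]["Code"]` is `"ThrottlingException"`, and raises `ThrottledError` in that case. Keep the total retry time below the Lambda timeout. Alternatively, botocore's built-in retry settings (`Config(retries={"max_attempts": ..., "mode": "adaptive"})` passed when creating the client) may be enough without custom code.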

AWS, answered 2 months ago
AWS EXPERT, reviewed 21 days ago
