This error indicates you're encountering a throttling issue with Amazon Bedrock when using Strands Agents with Claude models. The "ThrottlingException" specifically mentions "Too many tokens," suggesting you've exceeded the token rate limits for the model you're using.
For Claude models on Amazon Bedrock with on-demand throughput, quotas apply to both requests per minute and tokens per minute. For example, a Claude model might have a quota such as 100 requests and 200,000 tokens per minute; the actual values vary by model, Region, and account, and you can look them up in the Service Quotas console. When you exceed either limit, Bedrock returns a throttling exception.
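To make the interaction between the two limits concrete, here is a minimal client-side sketch of a sliding-window budget that tracks both requests and tokens over the last 60 seconds. The class name `MinuteBudget` and its methods are hypothetical, the quota numbers are only the illustrative figures from above, and the token count passed in is assumed to be your own estimate (prompt plus expected output), not a value Bedrock reports in advance.

```python
import time
from collections import deque


class MinuteBudget:
    """Hypothetical client-side tracker for per-minute request and token quotas."""

    def __init__(self, max_requests, max_tokens, window=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        self.window = window
        self.clock = clock
        self.events = deque()  # (timestamp, estimated_tokens), oldest first

    def _prune(self, now):
        # Drop events that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            self.events.popleft()

    def wait_time(self, tokens):
        """Seconds to wait before a request of `tokens` fits both quotas."""
        if tokens > self.max_tokens:
            raise ValueError("a single request exceeds the per-minute token quota")
        now = self.clock()
        self._prune(now)
        used_requests = len(self.events)
        used_tokens = sum(t for _, t in self.events)
        wait = 0.0
        # Let the oldest events expire one by one until the request fits.
        for timestamp, event_tokens in self.events:
            fits_requests = used_requests < self.max_requests
            fits_tokens = used_tokens + tokens <= self.max_tokens
            if fits_requests and fits_tokens:
                break
            used_requests -= 1
            used_tokens -= event_tokens
            wait = timestamp + self.window - now
        return max(wait, 0.0)

    def record(self, tokens):
        """Register a request that was just sent."""
        self.events.append((self.clock(), tokens))
```

A caller would sleep for `budget.wait_time(estimate)` seconds, send the request, and then call `budget.record(estimate)`. Because the estimate is approximate and the service applies its own accounting, this reduces throttling rather than eliminating it, so it is best combined with retries.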
To resolve this issue, you can:
- Implement retry logic with exponential backoff in your application to handle these throttling exceptions automatically.
- Reduce the token usage in your prompts and responses. Try to make your prompts more concise and consider limiting the output token length.
- Space out your requests to stay within the per-minute limits.
- Consider using provisioned throughput if you need dedicated capacity, though this is a more costly option.
- Monitor your usage with CloudWatch metrics (Invocations, InvocationThrottles, InputTokenCount, OutputTokenCount) to better understand if you're consistently hitting these limits.
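As an illustration of the first suggestion, the sketch below retries a callable with exponential backoff and full jitter. The helper names `is_throttling_error` and `call_with_backoff` are hypothetical. The check assumes the exception carries a botocore-style `response` dictionary with an error code; whether the exception that reaches your code through Strands Agents has that shape is an assumption you should verify against the traceback you actually see.

```python
import random
import time

# Error codes treated as retryable here (an assumption; adjust to what you observe).
THROTTLE_CODES = {"ThrottlingException", "TooManyRequestsException"}


def is_throttling_error(exc):
    """Return True if exc looks like a botocore ClientError for throttling."""
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return False
    return response.get("Error", {}).get("Code") in THROTTLE_CODES


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(), retrying throttling errors with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if not is_throttling_error(exc) or last_attempt:
                raise
            # The delay cap doubles each attempt: 1s, 2s, 4s, ... up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: pick a random delay in [0, cap] so clients do not retry in lockstep.
            sleep(random.uniform(0, cap))
```

You would wrap the call that triggers the model invocation, for example `call_with_backoff(lambda: agent(prompt))`, where `agent` and `prompt` stand in for your own objects. If you create the boto3 client yourself, botocore's built-in retry configuration (`Config(retries={"max_attempts": 10, "mode": "adaptive"})`) is an alternative to hand-written retries.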
When working with Strands Agents and MCP, be mindful that complex agent interactions and tool usage can consume tokens quickly, especially with features like Claude 4's interleaved thinking, which may use additional tokens for reasoning.
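A rough back-of-the-envelope sketch shows why agent loops add up. It assumes, as a simplification, that every model call in the loop resends the full conversation so far, that each tool round trip adds a fixed number of tokens, and that no prompt caching applies. The function name and all figures are hypothetical and are not measurements from Strands Agents.

```python
def cumulative_input_tokens(base_tokens, tokens_per_turn, turns):
    """Estimate total input tokens across an agent loop.

    Assumes model call n sends the base prompt plus n turns of history.
    """
    total = 0
    history = 0
    for _ in range(turns):
        history += tokens_per_turn      # each turn grows the conversation
        total += base_tokens + history  # the whole context is sent again
    return total


# Illustrative figures only: a 1,000-token system prompt plus tool
# definitions, 500 tokens added per tool round trip, 4 model calls.
estimate = cumulative_input_tokens(1000, 500, 4)
```

Under these assumptions the total grows roughly with the square of the number of turns, so a handful of tool calls in quick succession can approach a per-minute token quota even when each individual message is short.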
Sources
Issue with Bedrock- Claude Sonnet 3.5 | AWS re:Post
[AWS Bedrock] ThrottlingException occurs randomly for Claude-2.1 Runtime | AWS re:Post
Using Strands Agents with Claude 4 Interleaved Thinking | AWS Open Source Blog
