ThrottlingException on AWS Bedrock when using meta.llama3-70b-instruct-v1:0

Hi,

When I use meta.llama3-70b-instruct-v1:0, I consistently get ThrottlingExceptions. I am nowhere near the limit (in fact, every request I make gets throttled). I have checked by enabling CloudWatch logs and S3 logs, and no requests are getting through. If I switch to any other model, everything works fine.

Hristo
asked 3 months ago · 353 views
3 Answers

Hello. A ThrottlingException can occur when invoking models in on-demand mode even when your requests are below the documented quota, because on-demand mode draws on a capacity pool shared across multiple customers. During periods of high demand, when the base model is processing a large number of requests, throttling may occur even if your account limits are sufficient.

In other words, an individual account can be throttled below its expected rate while the shared pool is under heavy load. The internal team is actively working on long-term solutions to expand capacity and address this issue, but no specific timeline is currently available.

To mitigate this issue, you can implement retries with exponential backoff. However, switching to provisioned throughput might be the most effective option, as it reserves capacity specifically for your account. This gives more consistent performance by avoiding the peaks and valleys inherent in on-demand mode.
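As a rough illustration of the retry suggestion, here is a minimal sketch of exponential backoff with jitter in Python. The names `call_with_backoff` and `is_throttling_error` and the default delays are hypothetical, chosen for this example. The only assumption about the SDK is that a throttled call raises an exception carrying a `response["Error"]["Code"]` value of `ThrottlingException`, which is how botocore's `ClientError` reports error codes.

```python
import random
import time


def is_throttling_error(exc):
    """Return True if exc looks like a botocore ClientError for throttling.

    Assumption: the exception exposes response["Error"]["Code"], as
    botocore's ClientError does.
    """
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return False
    return response.get("Error", {}).get("Code") == "ThrottlingException"


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(), retrying throttled calls with exponential backoff.

    The wait before retry n is drawn uniformly from
    [0, min(max_delay, base_delay * 2**n)] ("full jitter").
    Non-throttling errors, and the last failed attempt, are re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if not is_throttling_error(exc) or last_attempt:
                raise
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

With a `bedrock-runtime` client from boto3, usage might look like `call_with_backoff(lambda: client.invoke_model(modelId="meta.llama3-70b-instruct-v1:0", body=payload))`, where `payload` is your request body. Note that backoff only helps when some requests succeed; if every request is throttled, as described in the question, retries alone are unlikely to resolve it.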

Additionally, you could try using a different AWS region to see if that alleviates the throttling issues.

If you need further assistance, please feel free to reach out to AWS Support.

zeekg
answered 3 months ago

I have had the same issue with Llama 3 and had to pull it from our production application because of this. No other models have been an issue. Happy to find this question after wasting 3 days with the support center. Thank you, OP and zeekg.

answered 3 months ago

The only issue here is that you cannot provision llama 3.8 capacity as of now. Hopefully this gets fixed one way or the other.

Hristo
answered 3 months ago
