Hello,
We recently created a new AWS account and have been actively using it for just over a month. Our application relies on the Nova 2 Sonic and Nova Micro models, and everything was functioning as expected under the default quotas.
However, we have suddenly started encountering the following error for all prompts:
ThrottlingException: Too many tokens per day, please wait before trying again.
Upon further investigation, we noticed that across all AWS accounts within our organization, the TPM (tokens per minute) and RPM (requests per minute) quotas for all models and all regions appear to be set to 0.
At the same time, the system still shows the default quota values (e.g., 8,000,000 tokens). When we attempt to request a quota increase for Nova Micro, we receive the following validation error:
"Must be a number greater than your current quota value of 8000000."
This seems inconsistent with the observed behavior (i.e., effective quota being 0 and requests being throttled).
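For anyone who wants to reproduce the comparison, here is a minimal sketch of how the applied and default quota values could be pulled from the Service Quotas API with boto3 and diffed. The service code `bedrock`, the region `us-east-1`, and the helper names are assumptions for illustration, not something confirmed for our setup. It also only shows what the API reports, so it may not reveal an enforced limit of 0 if the API still returns the default value.

```python
from typing import Dict, List, Tuple


def find_mismatches(
    applied: Dict[str, float], defaults: Dict[str, float]
) -> List[Tuple[str, float, float]]:
    """Return (quota_code, applied_value, default_value) wherever the two differ."""
    return sorted(
        (code, value, defaults[code])
        for code, value in applied.items()
        if code in defaults and value != defaults[code]
    )


def fetch_quota_values(
    client, operation: str, service_code: str = "bedrock"
) -> Dict[str, float]:
    """Collect {QuotaCode: Value} from a paginated Service Quotas list call."""
    values: Dict[str, float] = {}
    paginator = client.get_paginator(operation)
    for page in paginator.paginate(ServiceCode=service_code):
        for quota in page["Quotas"]:
            values[quota["QuotaCode"]] = quota["Value"]
    return values


def main(region: str = "us-east-1") -> None:
    # Imported here so the comparison helper has no AWS dependency.
    import boto3

    client = boto3.client("service-quotas", region_name=region)
    # Applied values for this account vs. the AWS-wide defaults.
    applied = fetch_quota_values(client, "list_service_quotas")
    defaults = fetch_quota_values(client, "list_aws_default_service_quotas")
    for code, applied_value, default_value in find_mismatches(applied, defaults):
        print(f"{code}: applied={applied_value} default={default_value}")
```

Calling `main()` with valid credentials prints only the quotas whose applied value differs from the default. It would need to be run per account and per region to cover the whole organization.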
Could you please help clarify:
- Why are the effective TPM/RPM quotas showing as 0 across all accounts and regions?
- Why are we unable to request a quota increase despite encountering throttling errors?
- What steps should we take to restore normal quota functionality?
Any guidance would be greatly appreciated.
Thank you.