I want to use imported models in Amazon Bedrock, but I receive the ModelNotReadyException error.
Short description
Amazon Bedrock uses an internal eviction policy to optimize hardware utilization. The policy removes imported models that haven't been invoked for a certain period, typically an hour. If you invoke an imported model after the policy removed it, then you might receive a ModelNotReadyException error message that looks like the following:
"errorMessage": "Model is not ready for inference. Wait and try your request again."
Note: There's no equivalent to Provisioned Throughput for imported models.
It's a best practice to schedule tasks to maintain a constant load on the model. Also, you can batch similar requests to minimize idle time between model invocations.
Resolution
Verify that you correctly imported the model
Use the Amazon Bedrock console or the AWS API to verify that you imported the model.
Use the Amazon Bedrock console
Complete the following steps:
- Open the Amazon Bedrock console.
- In the navigation pane, expand Foundation models, and then choose Imported models.
- Choose the Jobs tab.
- Select your Job name, and then confirm that the Status is Complete.
Use the AWS API
To verify that you imported the model, call the GetModelImportJob API. To confirm that you successfully imported and deployed the model, check that the status field of the output is Completed. The other possible values are InProgress and Failed.
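The API check above can be sketched with the AWS SDK for Python (Boto3). This is a minimal sketch, not a definitive implementation: the helper names import_job_succeeded and make_bedrock_client are hypothetical, and the job identifier and Region are values that you supply.

```python
def make_bedrock_client(region_name):
    """Create a Bedrock control-plane client (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper below has no hard SDK dependency

    return boto3.client("bedrock", region_name=region_name)


def import_job_succeeded(bedrock_client, job_identifier):
    """Return True when the import job's status is "Completed".

    job_identifier is the name or ARN of your model import job.
    """
    response = bedrock_client.get_model_import_job(jobIdentifier=job_identifier)
    status = response["status"]  # "InProgress", "Completed", or "Failed"
    if status == "Failed":
        # failureMessage is only present when the job failed
        print("Import failed:", response.get("failureMessage", "no message"))
    return status == "Completed"
```

For example, import_job_succeeded(make_bedrock_client("us-east-1"), "my-import-job") returns True only after the job finishes successfully. The Region and job name in that call are placeholders.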
Configure retries
A restoration process begins when you invoke your model for the first time after eviction. The time to restore depends on the availability of the on-demand fleet and the size of the model. If your InvokeModel or InvokeModelWithResponseStream API request returns ModelNotReadyException while the model restores, then the request is automatically retried with exponential backoff by default.
To configure the maximum number of retries, see Handling ModelNotReadyException.
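As an illustration of the retry configuration, the following sketch raises the maximum number of attempts on a Boto3 client. The helper names, the default of 10 attempts, and the {"prompt": ...} request body are assumptions. The body format depends on the model that you imported.

```python
import json


def retry_settings(max_attempts=10):
    """Retry options for botocore's Config; "standard" mode backs off exponentially."""
    return {"total_max_attempts": max_attempts, "mode": "standard"}


def build_runtime_client(region_name, max_attempts=10):
    """Create a bedrock-runtime client with the retry settings applied."""
    import boto3  # imported lazily; requires boto3 and AWS credentials
    from botocore.config import Config

    config = Config(retries=retry_settings(max_attempts))
    return boto3.client("bedrock-runtime", region_name=region_name, config=config)


def invoke(runtime_client, model_id, prompt):
    """Invoke the imported model and return the decoded JSON response body."""
    response = runtime_client.invoke_model(
        modelId=model_id,  # ARN of your imported model
        body=json.dumps({"prompt": prompt}),  # assumed body shape; model specific
        accept="application/json",
        contentType="application/json",
    )
    return json.loads(response["body"].read())
```

With these settings, a request that arrives while the model restores is retried up to the configured number of attempts before the exception reaches your code. A higher limit gives a large model more time to restore, at the cost of a longer wait for the caller.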
Implement a heartbeat strategy
Implement a heartbeat strategy to send a ping request to the model at regular intervals. The ping request tells Amazon Bedrock that the model is still in use. It's a best practice to run a warm-up request before critical operations to prevent a cold start after Amazon Bedrock evicts the model.
To implement a heartbeat strategy, complete the following steps:
- Create an AWS Lambda function that calls the InvokeModel API on the imported model.
- Create a scheduled rule in Amazon EventBridge that invokes the Lambda function at a fixed interval, for example every 30 to 50 minutes, so that the model is invoked again before the eviction period elapses.
- Deploy and test the Lambda function.
- Send Lambda function logs to Amazon CloudWatch Logs.
- Analyze your model's usage metrics in Amazon CloudWatch to determine the ideal heartbeat frequency.
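The heartbeat steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the MODEL_ARN environment variable, the rule name, the target Id, and the {"prompt": "ping"} request body are hypothetical, and the body format depends on your model.

```python
import json
import os


def rate_expression(minutes):
    """Build an EventBridge rate expression such as "rate(30 minutes)"."""
    unit = "minute" if minutes == 1 else "minutes"
    return f"rate({minutes} {unit})"


def send_heartbeat(runtime_client, model_id):
    """Send one small request so that Amazon Bedrock sees the model as in use."""
    try:
        runtime_client.invoke_model(
            modelId=model_id,
            body=json.dumps({"prompt": "ping"}),  # assumed body shape
            accept="application/json",
            contentType="application/json",
        )
        return {"status": "warm"}
    except Exception as err:
        # botocore's ClientError carries the error code in err.response
        code = getattr(err, "response", {}).get("Error", {}).get("Code")
        if code == "ModelNotReadyException":
            return {"status": "restoring"}  # evicted; this call started a restore
        raise


def lambda_handler(event, context):
    """Lambda entry point; MODEL_ARN is a hypothetical environment variable."""
    import boto3  # available by default in the Lambda Python runtime

    client = boto3.client("bedrock-runtime")
    result = send_heartbeat(client, os.environ["MODEL_ARN"])
    print(json.dumps(result))  # Lambda forwards stdout to CloudWatch Logs
    return result


def schedule_heartbeat(events_client, rule_name, function_arn, minutes=30):
    """Create a scheduled EventBridge rule that targets the Lambda function."""
    events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=rate_expression(minutes),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "heartbeat", "Arn": function_arn}],
    )
```

The sketch doesn't grant EventBridge permission to invoke the function. You must add that permission to the Lambda function's resource-based policy separately. Note that every heartbeat is a billable model invocation, so weigh the interval against the cost of an occasional cold start.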
Contact Support
If you still experience issues, then create a support ticket in the Support Center of the AWS Management Console.
Related information
Calculate the cost of running a custom model