Application Inference Profile for Llama 3.2 3B

I’m trying to create an application inference profile for the Llama 3.2 3B model, but I’m getting the following error: ValidationException: The provided foundation model does not support On Demand inference.

We already have inference profiles for other models such as Amazon Nova Lite, Amazon Nova Canvas, Amazon Nova Pro, Amazon Nova Micro, and OpenAI GPT OSS, and those were created successfully. However, for the Llama model, we're encountering this error.

When running the following command:

aws bedrock list-foundation-models --query "modelSummaries[?contains(inferenceTypesSupported, 'ON_DEMAND')].{ModelId:modelId,ModelName:modelName,Provider:providerName,InferenceTypes:inferenceTypesSupported}" --output json

we noticed that the Llama 3.2 3B model does not appear in the list of models that support ON_DEMAND inference.

Is it possible to create an inference profile for this model, or is On-Demand inference not yet supported for Llama 3.2 3B?

Also, could you please clarify what inferenceTypesSupported means, and why some models don't have this attribute or don't support certain inference types? What does ON_DEMAND mean?

1 Answer

Based on the information provided, it appears that the Meta Llama 3.2 3B model does not currently support on-demand throughput in Amazon Bedrock, which is why you're encountering the validation error when trying to create an application inference profile.

When you run the command to list foundation models that support ON_DEMAND inference, the Llama 3.2 3B model doesn't appear in the results because it doesn't support this inference type. This is consistent with the error message you're receiving.
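If you want to confirm this for a single model rather than filtering the whole list, get-foundation-model returns the same attribute. A quick check, assuming the base model ID is meta.llama3-2-3b-instruct-v1:0 in your Region:

aws bedrock get-foundation-model --model-identifier meta.llama3-2-3b-instruct-v1:0 --query "modelDetails.inferenceTypesSupported" --output json

An output that doesn't include ON_DEMAND (for example, only PROVISIONED) confirms that the base model ID can't be invoked on demand.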

To answer your question about the attribute: the inferenceTypesSupported field on each model summary describes how a model can be invoked. ON_DEMAND means you can invoke the model directly and pay per request with no capacity commitment; PROVISIONED means you must first purchase Provisioned Throughput (dedicated model units) before invoking it. Some models in Amazon Bedrock only support provisioned throughput rather than on-demand invocation, and the Llama 3.2 models appear to fall into this category when invoked by their base model ID. This is why you were able to successfully create inference profiles for other models like Amazon Nova Lite, Nova Canvas, Nova Pro, Nova Micro, and OpenAI GPT OSS: those models do support on-demand inference.

To confirm which models support on-demand inference, you can use the ListFoundationModels API with the byInferenceType="ON_DEMAND" filter. This will return only those models that can be used with on-demand throughput.
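With the AWS CLI, that filter looks like the following (the --query projection here is just illustrative):

aws bedrock list-foundation-models --by-inference-type ON_DEMAND --query "modelSummaries[].{ModelId:modelId,Provider:providerName}" --output json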

Unfortunately, if the Llama 3.2 3B model doesn't support on-demand inference, you won't be able to create an application inference profile for it directly from the foundation model ARN. You would need to use a different model that supports on-demand inference, use Provisioned Throughput if it's offered for this model, or check whether a system-defined cross-region inference profile covers it, as shown below.
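For several newer models, on-demand invocation is only exposed through system-defined cross-region inference profiles, and an application inference profile can be created by copying from such a profile's ARN instead of the foundation model ARN. A sketch of that check, assuming a us.meta.llama3-2-3b-instruct-v1:0 profile exists in your Region (the account ID and Region in the ARN below are placeholders):

# List the system-defined (cross-region) inference profiles available in your Region
aws bedrock list-inference-profiles --type-equals SYSTEM_DEFINED

# Copy from the cross-region profile's ARN to create an application inference profile
aws bedrock create-inference-profile --inference-profile-name llama32-3b-app-profile --model-source copyFrom=arn:aws:bedrock:us-east-1:123456789012:inference-profile/us.meta.llama3-2-3b-instruct-v1:0

If list-inference-profiles doesn't show a profile for Llama 3.2 3B, then Provisioned Throughput (where offered) is the remaining route.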
Sources
Community | Finding Amazon Bedrock models for immediate on-demand invocation
meta.llama3-2-11b-instruct-v1:0 not supported for on-demand throughput? | AWS re:Post

answered 12 days ago
EXPERT reviewed 12 days ago
