SageMaker endpoint is not created when deploying a Hugging Face model


I am trying to deploy a Hugging Face model to SageMaker. Here is the link for the model:

I am testing in my personal account, and here is the code:

from sagemaker.huggingface import HuggingFaceModel
import sagemaker

sess = sagemaker.Session()
# sagemaker session bucket -> used for uploading data, models and logs
# sagemaker will automatically create this bucket if it does not exist
if sagemaker_session_bucket == 'sagemaker-hugging-face-model-demo' and sess is not None:
    # set to default bucket if a bucket name is not given
    sagemaker_session_bucket = sess.default_bucket()

role = sagemaker.get_execution_role()
sess = sagemaker.Session(default_bucket=sagemaker_session_bucket)

print(f"sagemaker role arn: {role}")
print(f"sagemaker bucket: {sess.default_bucket()}")
print(f"sagemaker session region: {sess.boto_region_name}")

hub = {

huggingface_model = HuggingFaceModel(
  transformers_version="4.6.1",     # transformers version used
  pytorch_version="1.7",          # pytorch version used

# deploy model to Sagemaker Inference
predictor = huggingface_model.deploy(

When I try to create the SageMaker endpoint, I get this error: ClientError: An error occurred (ValidationException) when calling the CreateModel operation: Requested image not found.

I also need to create a Lambda function that invokes the SageMaker endpoint: it should send a text description and get back a generated image. For example, the text "Sun is shining" should be turned into an image after the Lambda function invokes the SageMaker endpoint.

I also need to know what the ContentType should be for the image.

1 Answer

I see you have an incorrect-looking image_uri commented-out there...

One aspect of the SageMaker Python SDK that can be a little confusing at first is there is no direct correspondence between a "model" in the SDK (e.g. HuggingFaceModel) and a "Model" in the SageMaker APIs (as shown in Inference > Models page of the AWS Console for SageMaker).

The reason for this is that SDK "Model" constructors don't collect quite all the information needed to define API Models. If image_uri is not specified, the SDK doesn't know until you .deploy() or .transformer() to a particular instance_type whether you're using a CPU or GPU instance, and therefore whether it should be using the CPU or GPU container image. A specific container image is needed before it can create the API Model. Because of this:

  • When you first ran the code (I guess with image_uri included), the Model was not actually created in SageMaker API/Console until it reached the .deploy() step
  • In some situations, the SDK might re-use the initially created API Model rather than re-creating it with the new parameters (e.g. are you specifying a specific name that you just removed for publishing the code snippet?)
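
To make the CPU/GPU point concrete, here is a small illustrative sketch of why the container image can't be picked before the instance type is known. The helper name processor_for_instance_type and the rule it applies (instance families starting with "g" or "p" are treated as GPU, everything else as CPU) are assumptions for illustration only, not the SDK's actual lookup code.

```python
import re


def processor_for_instance_type(instance_type: str) -> str:
    """Guess whether an instance type needs the CPU or GPU container image.

    Hypothetical helper: assumes families starting with 'g' or 'p'
    (e.g. ml.g4dn, ml.p3) are GPU instances and everything else is CPU.
    """
    match = re.match(r"^ml\.([a-z0-9]+)\.[a-z0-9]+$", instance_type)
    if match is None:
        raise ValueError(f"Unrecognized instance type: {instance_type!r}")
    family = match.group(1)
    return "gpu" if family[0] in ("g", "p") else "cpu"
```

The same constructor arguments therefore map to different images depending on whether you later deploy to something like ml.m5.xlarge or ml.g4dn.xlarge.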

So if you removed the explicit image_uri and are still seeing the error about an incorrect image URI, I would go into the SageMaker Console and explicitly delete the previous Model to force your code to create it from scratch using the updated params. (Of course there are also API/SDK ways to do this, e.g. huggingface_model.delete_model()). When you just use the HuggingFaceModel class and provide the framework version parameters, without an image_uri, it should be able to look up the correct URI itself.
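
If you'd rather clean up from code than from the Console, a minimal sketch with plain boto3 could look like the following. The function name delete_stale_model and the model name you pass in are placeholders I made up; describe_model and delete_model are the boto3 SageMaker client calls. The client is injectable only so the function can be exercised with a stub.

```python
def delete_stale_model(model_name, sagemaker_client=None):
    """Delete a SageMaker API Model so the next deploy re-creates it.

    Returns the container image URI the stale Model was registered with
    (or None if it has no PrimaryContainer), which helps confirm whether
    the old image_uri was the culprit.
    """
    if sagemaker_client is None:
        import boto3  # imported lazily so a stub client can be passed in tests

        sagemaker_client = boto3.client("sagemaker")
    description = sagemaker_client.describe_model(ModelName=model_name)
    image = description.get("PrimaryContainer", {}).get("Image")
    sagemaker_client.delete_model(ModelName=model_name)
    return image
```

Note this only removes the Model; any endpoint or endpoint config created from it would need to be deleted separately.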

Since AWS Lambda runtimes don't have the high-level SageMaker Python SDK installed by default, I'd suggest using plain boto3 SageMakerRuntime invoke_endpoint there (rather than e.g. predictor.predict() as you'll usually see used in notebooks).
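
As a rough sketch of such a Lambda function: the endpoint name, the ENDPOINT_NAME environment variable, the event["text"] field and the request payload shape are all assumptions here, so adjust them to whatever your endpoint actually accepts. The response body is base64-encoded because a Lambda return value has to be JSON-serializable and the endpoint may return binary image data.

```python
import base64
import json
import os

# Hypothetical endpoint name, read from an environment variable on the function
ENDPOINT_NAME = os.environ.get("ENDPOINT_NAME", "my-huggingface-endpoint")


def lambda_handler(event, context, runtime_client=None):
    """Send the text in event["text"] to the endpoint and return its response."""
    if runtime_client is None:
        import boto3  # available by default in the Lambda Python runtimes

        runtime_client = boto3.client("sagemaker-runtime")

    response = runtime_client.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",  # format of the request body
        Accept="application/json",  # format requested for the response
        Body=json.dumps({"instances": [event["text"]]}),  # assumed payload shape
    )
    raw = response["Body"].read()  # Body is a stream of bytes
    return {
        "statusCode": 200,
        "contentType": response.get("ContentType", "application/octet-stream"),
        "body": base64.b64encode(raw).decode("ascii"),
        "isBase64Encoded": True,
    }
```

The function's execution role would also need permission for sagemaker:InvokeEndpoint on that endpoint.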

I'm not sure yet what format the default pipeline will expect for your image inputs, or even if the default model serving stack is already set up to return images nicely (since Hugging Face has historically mainly been used for text). Possibly you'll need to customize the output processing, which you can do by defining your own output_fn (and even predict_fn, model_fn, input_fn if needed) as documented here. I'd first try sending in your input as application/json similar to { "instances": ["Sun is shining"] } with an application/json Accept header as well, and see what type of response that gets you.
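
If you do end up customizing the output, a minimal output_fn sketch for your inference script might look like this. It assumes your own predict_fn returns the generated image as raw PNG bytes, and the "image_base64" key is a name I picked for illustration; neither is something the default serving stack defines for you.

```python
import base64
import json


def output_fn(prediction, accept):
    """Serialize generated image bytes according to the requested Accept type.

    Assumes `prediction` is raw PNG bytes produced by a custom predict_fn.
    """
    if accept == "image/png":
        # Return the binary image unchanged
        return prediction
    if accept == "application/json":
        # JSON cannot carry raw bytes, so wrap the image as base64 text
        encoded = base64.b64encode(prediction).decode("ascii")
        return json.dumps({"image_base64": encoded})
    raise ValueError(f"Unsupported Accept type: {accept}")
```

With something like this in place, the caller would pick the response format through the Accept header: image/png for the raw bytes, or application/json if it's easier to pass the result through Lambda as text.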

answered a month ago
