Error Creating Endpoint


Hi! The following error happens while trying to create an endpoint from a successfully trained model:

  • In the web console:

The customer:primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint.

  • CloudWatch logs:

exec: "serve": executable file not found in $PATH

I'm deploying the model using a Lambda step, just as in this notebook. The Lambda step succeeds, and I can see in the AWS web console that the model configuration is created successfully.

The exact same error happens when I create an endpoint for the registered model in the AWS web console, under Inference -> Models. In the console I can see that an inference container was created for the model, with the following characteristics:

  • Image: 763104351884.dkr.ecr.eu-west-3.amazonaws.com/tensorflow-training:2.8-cpu-py39
  • Mode: single model
  • Environment variables (Key: Value):

SAGEMAKER_CONTAINER_LOG_LEVEL: 20
SAGEMAKER_PROGRAM: inference.py
SAGEMAKER_REGION: eu-west-3
SAGEMAKER_SUBMIT_DIRECTORY: /opt/ml/model/code

I have absolutely no clue what is wrong, and I could not find anything relevant online about this problem. Is it necessary to provide a custom Docker image for inference, or something?

For more details, please find the pipeline's model step code below. Any help would be much appreciated!

model = Model(
    image_uri=estimator.training_image_uri(),
    model_data=step_training.properties.ModelArtifacts.S3ModelArtifacts,
    sagemaker_session=sagemaker_session,
    role=sagemaker_role,
    source_dir='code',
    entry_point='inference.py'
)
step_model_create = ModelStep(
    name="CreateModelStep",
    step_args=model.create(instance_type="ml.m5.large"),
)

register_args = model.register(
    content_types=["*"],
    response_types=["application/json"],
    inference_instances=["ml.m5.large"],
    transform_instances=["ml.m5.large"],
    model_package_group_name="test",
    approval_status="Approved",
)
step_model_register = ModelStep(name="RegisterModelStep", step_args=register_args)
Asked a year ago · 319 views

1 Answer

Accepted Answer

Hi, the problem here is that your model's container URI 763104351884.dkr.ecr.eu-west-3.amazonaws.com/tensorflow-training:2.8-cpu-py39 points at a training image, not a TensorFlow inference image. Because each image is optimized for its own function, the serve executable is not available in the training container, which is why the ping health check fails.
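You can see the mismatch by asking the SDK for both image types. Here's a minimal sketch using the region, version, and instance type from your post (the exact tags it resolves may vary with your SDK version):

from sagemaker import image_uris

# Training image -- what the pipeline is currently using:
print(image_uris.retrieve(
    framework="tensorflow",
    region="eu-west-3",
    version="2.8",
    py_version="py39",
    instance_type="ml.m5.large",
    image_scope="training",
))  # e.g. .../tensorflow-training:2.8-cpu-py39

# Inference image -- what the endpoint needs; this one actually
# contains the `serve` entry point the health check looks for:
print(image_uris.retrieve(
    framework="tensorflow",
    region="eu-west-3",
    version="2.8",
    instance_type="ml.m5.large",
    image_scope="inference",
))  # e.g. .../tensorflow-inference:2.8-cpu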

Usually, the framework-specific SDK classes will handle this lookup for you (for example TensorFlowModel(...) as used in the notebook you linked, or sagemaker.tensorflow.TensorFlow.deploy(...) called on the Estimator class).
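For instance, here's a minimal sketch that swaps your generic Model for TensorFlowModel, assuming TensorFlow 2.8 and reusing the variables from your pipeline code:

from sagemaker.tensorflow import TensorFlowModel

# No image_uri needed: TensorFlowModel resolves the matching
# *inference* image from framework_version automatically
model = TensorFlowModel(
    model_data=step_training.properties.ModelArtifacts.S3ModelArtifacts,
    role=sagemaker_role,
    framework_version="2.8",
    sagemaker_session=sagemaker_session,
    source_dir="code",
    entry_point="inference.py",
)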

I see here, though, that you're using the generic Model class, so I guess you don't know (or don't want to commit to) the framework and version at the point the Lambda function runs?

My suggestions would be:

  • Can you use the Pipelines ModelStep to create your model with a framework-specific model class (like the TensorFlowModel sketch above) before calling the Lambda deployment function, similarly to how your linked notebook uses CreateModelStep? This would bake the framework and version into the pipeline definition itself, but it should mean that the inference container image is selected properly and automatically.
  • If you really need to be dynamic, I think you will need to find a way of looking up at least the framework from the training job. From my testing, you can use estimator = sagemaker.tensorflow.TensorFlow.attach("training-job-name") followed by model = estimator.create_model(...) to infer the correct inference container version from a training job, but it still relies on knowing that TensorFlow is the right framework; I'm not aware of a framework-agnostic equivalent. So you could, for example, describe the training job, infer which framework it uses from that information, and then use the relevant framework estimator class's attach() method to figure out the specifics and create your model, as sketched below.
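Here's a rough sketch of that dynamic approach, assuming the training job name is known and that matching a substring of the training image URI is enough to identify the framework (both are illustrative assumptions):

import boto3
from sagemaker.tensorflow import TensorFlow

job_name = "training-job-name"  # hypothetical training job name

# Describe the training job and inspect its training image to
# work out which framework was used
desc = boto3.client("sagemaker").describe_training_job(TrainingJobName=job_name)
training_image = desc["AlgorithmSpecification"]["TrainingImage"]

if "tensorflow" in training_image:
    # attach() recovers the framework version and artifacts from the job;
    # create_model() then resolves the matching *inference* container
    estimator = TensorFlow.attach(job_name)
    model = estimator.create_model(
        entry_point="inference.py",
        source_dir="code",
    )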
AWS Expert
Alex_T
Answered a year ago
