Unable to load trained model in SageMaker

I have trained a few models in SageMaker, but I am unable to load them for prediction.

I take the model details from the "Container 1" section under SageMaker > Inference > Models:

image_uri = the value shown under Image
model_data = the value shown under Model data location

I then pass these values to the SageMaker Model function.

When I deploy this model, I get the error "ping health check failed for AllTraffic production variant". The error does not occur when I train a new model and deploy it.
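For reference, a minimal sketch of the deployment described above, assuming the SageMaker Python SDK (`sagemaker.model.Model`). The helper names, the endpoint name, the instance type, and the `s3://` check are illustrative assumptions, not values from my actual setup.

```python
def build_model_kwargs(image_uri, model_data, role):
    """Collect the arguments for sagemaker.model.Model (hypothetical helper).

    image_uri  : value shown under "Image" in the console
    model_data : value shown under "Model data location"
    role       : IAM role ARN the endpoint runs under
    """
    if not model_data.startswith("s3://"):
        # Assumption: the model artifact lives in S3, as shown in the console.
        raise ValueError("model_data should be an S3 URI, got: " + model_data)
    return {"image_uri": image_uri, "model_data": model_data, "role": role}


def deploy_existing_model(image_uri, model_data, role, endpoint_name,
                          instance_type="ml.m5.xlarge"):
    """Create a Model from an existing artifact and deploy it to an endpoint."""
    # Imported here so build_model_kwargs can be used without the SDK installed.
    from sagemaker.model import Model

    model = Model(**build_model_kwargs(image_uri, model_data, role))
    # This is the step that fails with the ping health check error.
    model.deploy(
        initial_instance_count=1,
        instance_type=instance_type,  # placeholder instance type
        endpoint_name=endpoint_name,
    )
    return endpoint_name
```

Calling `deploy_existing_model(...)` with the two console values, a role ARN, and an endpoint name reproduces the failing deployment.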

1 Answer

Issues like this are usually caused by a mismatch between the base image (and the library versions it ships) used for training and the one used by the inference endpoint. A solution similar to the one below should help resolve your issue.
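One way to check for such a mismatch is to compare the image the training job used with the image attached to the model. The following is a rough sketch assuming boto3 and its `describe_training_job` and `describe_model` calls. The function names and the job and model names passed in are hypothetical.

```python
def split_image_uri(image_uri):
    """Split 'registry/repository:tag' (or '@digest') into its three parts."""
    registry, _, rest = image_uri.partition("/")
    # Images can be pinned by digest ('@sha256:...') instead of by tag.
    separator = "@" if "@" in rest else ":"
    repository, _, tag = rest.partition(separator)
    return registry, repository, tag


def compare_images(training_job_name, model_name):
    """Return the parsed training image and inference image for comparison."""
    # Imported here so split_image_uri can be used without boto3 installed.
    import boto3

    sm = boto3.client("sagemaker")
    job = sm.describe_training_job(TrainingJobName=training_job_name)
    model = sm.describe_model(ModelName=model_name)

    training_image = job["AlgorithmSpecification"]["TrainingImage"]
    # A model has either a single PrimaryContainer or a list of Containers.
    container = model.get("PrimaryContainer") or model["Containers"][0]
    return split_image_uri(training_image), split_image_uri(container["Image"])
```

If the tags of the two images point to different framework or library versions, that difference is a likely place to start looking.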

This notebook in a GitHub repo shows one approach: https://github.com/marshmellow77/sm-extend-container/blob/main/02_extend_container.ipynb. It extends the existing Hugging Face DLCs by pulling them from the public ECR and building a simple Dockerfile on top of them that installs the latest available version of transformers.

AWS
answered 2 years ago
