SageMaker MultiDataModel deployment error during inference. ValueError: Exactly one .pth or .pt file is required for PyTorch models: []


Hello, I've been trying to deploy multiple PyTorch models on one SageMaker endpoint from a SageMaker notebook. First I tested deployment of single models on single endpoints to check that everything worked smoothly, and it did. I would create a PyTorchModel first:

import sagemaker
from sagemaker.pytorch import PyTorchModel
from sagemaker import get_execution_role
from sagemaker.multidatamodel import MultiDataModel
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer
import boto3

role = get_execution_role()
sagemaker_session = sagemaker.Session()

pytorch_model = PyTorchModel(
    entry_point='inference.py',
    source_dir='code',
    role=role,
    model_data='s3://***/model/model.tar.gz',
    framework_version='1.11.0',
    py_version='py38',
    name='***-model',
    sagemaker_session=sagemaker_session
)

MultiDataModel inherits properties from the Model class, so I used the same PyTorch model that I used for single-model deployment. Then I would define the MultiDataModel the following way:

models = MultiDataModel(
    name='***-multi-model',
    model_data_prefix='s3://***-sagemaker/model/',
    model=pytorch_model,
    sagemaker_session=sagemaker_session
)

All it should need is the S3 prefix where the model artifacts are saved as .tar.gz files (the same files used for single-model deployment), the previously defined PyTorch model, a name, and a sagemaker_session.

To deploy it:

models.deploy(
    initial_instance_count=1,
    instance_type='ml.m4.xlarge',
    serializer=JSONSerializer(),
    deserializer=JSONDeserializer(),
    endpoint_name='***-multi-model-deployment'
)

The deployment goes well: there are no failures, and the endpoint is InService by the end of this step. However, the error occurs when I try to run inference on one of the models:

import json

body = {"url": "https://***image.jpg"}  # URL to an image online
payload = json.dumps(body)

client = boto3.client('sagemaker-runtime')
response = client.invoke_endpoint(
    EndpointName="***-multi-model-deployment",
    ContentType="application/json",
    TargetModel="/model.tar.gz",
    Body=payload
)

This prompts an error message:

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from model with message "{
  "code": 500,
  "type": "InternalServerException",
  "message": "Failed to start workers for model ec1cd509c40ca81ffc3fb09deb4599e2 version: 1.0"
}
". See https://***.console.aws.amazon.com/cloudwatch/home?region=***#logEventViewer:group=/aws/sagemaker/Endpoints/***-multi-model-deployment in account ***** for more information.

The CloudWatch logs show this error in particular:

2022-09-26T15:51:40,494 [INFO ] W-9000-model_1.0-stdout MODEL_LOG -   File "/opt/conda/lib/python3.8/site-packages/ts/model_service_worker.py", line 210, in <module>
2022-09-26T15:51:40,494 [INFO ] W-9000-model_1.0-stdout MODEL_LOG -     worker.run_server()
2022-09-26T15:51:40,494 [INFO ] W-9000-model_1.0-stdout MODEL_LOG -   File "/opt/conda/lib/python3.8/site-packages/ts/model_service_worker.py", line 181, in run_server
2022-09-26T15:51:40,495 [INFO ] W-9000-model_1.0-stdout MODEL_LOG -     self.handle_connection(cl_socket)
2022-09-26T15:51:40,495 [INFO ] W-9000-model_1.0-stdout MODEL_LOG -   File "/opt/conda/lib/python3.8/site-packages/ts/model_service_worker.py", line 139, in handle_connection
2022-09-26T15:51:40,495 [INFO ] W-9000-model_1.0-stdout MODEL_LOG -     service, result, code = self.load_model(msg)
2022-09-26T15:51:40,495 [INFO ] W-9000-model_1.0-stdout MODEL_LOG -   File "/opt/conda/lib/python3.8/site-packages/ts/model_service_worker.py", line 104, in load_model
2022-09-26T15:51:40,495 [INFO ] W-9000-model_1.0-stdout MODEL_LOG -     service = model_loader.load(
2022-09-26T15:51:40,495 [INFO ] W-9000-model_1.0-stdout MODEL_LOG -   File "/opt/conda/lib/python3.8/site-packages/ts/model_loader.py", line 151, in load
2022-09-26T15:51:40,495 [INFO ] W-9000-model_1.0-stdout MODEL_LOG -     initialize_fn(service.context)
2022-09-26T15:51:40,495 [INFO ] W-9000-model_1.0-stdout MODEL_LOG -   File "/opt/conda/lib/python3.8/site-packages/sagemaker_pytorch_serving_container/handler_service.py", line 51, in initialize
2022-09-26T15:51:40,495 [INFO ] W-9000-model_1.0-stdout MODEL_LOG -     super().initialize(context)
2022-09-26T15:51:40,495 [INFO ] W-9000-model_1.0-stdout MODEL_LOG -   File "/opt/conda/lib/python3.8/site-packages/sagemaker_inference/default_handler_service.py", line 66, in initialize
2022-09-26T15:51:40,495 [INFO ] W-9000-model_1.0-stdout MODEL_LOG -     self._service.validate_and_initialize(model_dir=model_dir)
2022-09-26T15:51:40,495 [INFO ] W-9000-model_1.0-stdout MODEL_LOG -   File "/opt/conda/lib/python3.8/site-packages/sagemaker_inference/transformer.py", line 162, in validate_and_initialize
2022-09-26T15:51:40,495 [INFO ] W-9000-model_1.0-stdout MODEL_LOG -     self._model = self._model_fn(model_dir)
2022-09-26T15:51:40,495 [INFO ] W-9000-model_1.0-stdout MODEL_LOG -   File "/opt/conda/lib/python3.8/site-packages/sagemaker_pytorch_serving_container/default_pytorch_inference_handler.py", line 73, in default_model_fn
2022-09-26T15:51:40,495 [INFO ] W-9000-model_1.0-stdout MODEL_LOG -     raise ValueError(
2022-09-26T15:51:40,496 [INFO ] W-9000-model_1.0-stdout MODEL_LOG - ValueError: Exactly one .pth or .pt file is required for PyTorch models: []

It seems like it's having problems loading the model, saying exactly one .pth or .pt file is required, even though in the invocation call I point to the exact model artifact present at that S3 prefix. I'm having a hard time fixing this issue, so it would be very helpful if anyone had some suggestions!

Instead of giving the MultiDataModel a model, I also tried providing it with an ECR Docker image containing the same inference code, but I got the same error when invoking the endpoint.
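For reference, that alternative attempt looked roughly like the sketch below, passing an image_uri instead of a model (the ECR image URI is a placeholder, not the real one):

models_from_image = MultiDataModel(
    name='***-multi-model',
    model_data_prefix='s3://***-sagemaker/model/',
    image_uri='***.dkr.ecr.***.amazonaws.com/***-inference:latest',  # placeholder ECR image
    role=role,
    sagemaker_session=sagemaker_session
)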

1 Answer

Hey! I see a couple of potential issues that you might want to check carefully.

1/ Note that the CloudWatch logs you're looking at are the errors for the default model worker W-9000-model_1.0, so these messages are irrelevant to your prediction request (see "Fix: Don't load default model in MME mode" for a detailed description of that issue). With a multi-model endpoint, models are lazy-loaded from your model_data_prefix when you make predictions. Check the later log entries and their timestamps to see what happens after the endpoint tries to load your TargetModel = "/model.tar.gz". According to the InternalServerException, the model being executed is ec1cd509c40ca81ffc3fb09deb4599e2, so look for the logs of the worker W-9001-ec1cd509c40ca81ffc3fb09deb4599e2 and you might see some other errors.
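If it helps, here is a minimal sketch (not part of your code) for pulling the log events of the worker that actually served your TargetModel, assuming the default log group naming for the endpoint:

import boto3

logs = boto3.client('logs')
log_group = '/aws/sagemaker/Endpoints/***-multi-model-deployment'

# Filter for the model ID reported in the 500 error to find the worker
# (e.g. W-9001-ec1cd509c40ca81ffc3fb09deb4599e2) that failed to load it.
events = logs.filter_log_events(
    logGroupName=log_group,
    filterPattern='ec1cd509c40ca81ffc3fb09deb4599e2'
)
for event in events['events']:
    print(event['message'])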

2/ There are two types of model.tar.gz in SageMaker: one is the plain model produced by an estimator when you call fit(); the other is created when you deploy the model and it is repackaged together with your inference code (see the SageMaker Python SDK source code fragments: 1 and 2).

Make sure that your model_data_prefix contains repackaged models and that they are not repackaged twice. Look inside your model.tar.gz and make sure it contains both your PyTorch model and a code directory with inference.py.

The location of the repackaged model should be accessible as pytorch_model.repacked_model_data after you have deployed the endpoint.
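A quick way to verify both points, assuming you deploy from the same notebook session, is to print that location, download the artifact, and list its contents (the local paths below are illustrative):

import tarfile
from sagemaker.s3 import S3Downloader

print(pytorch_model.repacked_model_data)  # S3 URI of the repackaged model.tar.gz

# Download and inspect: you should see the .pth/.pt file plus code/inference.py
S3Downloader.download(pytorch_model.repacked_model_data, 'repacked_model')
with tarfile.open('repacked_model/model.tar.gz') as tar:
    print(tar.getnames())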

For better clarity, I recommend creating a separate path for the models in your multi-model endpoint and copying the models to it with the following API:

models.add_model(model_data_source=pytorch_model.repacked_model_data, model_data_path=model_name)

Here model_name can be something like model_1.tar.gz, model_2.tar.gz, etc. Note that a leading slash (/) is not necessary in the model name.
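Putting it together, a sketch of the full flow with an example name (model_1.tar.gz is arbitrary):

# Register the repackaged artifact under a clean name in model_data_prefix
models.add_model(model_data_source=pytorch_model.repacked_model_data,
                 model_data_path='model_1.tar.gz')

# Verify what the endpoint can see
print(list(models.list_models()))

# Invoke by that name; no leading slash needed
response = client.invoke_endpoint(
    EndpointName='***-multi-model-deployment',
    ContentType='application/json',
    TargetModel='model_1.tar.gz',
    Body=payload
)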

Ivan (AWS)
answered 2 years ago
