Error for Compilation job Failed. Reason: ClientError: InputConfiguration: TVM cannot convert the PyTorch model. Invalid model or input-shape mismatch.

Error for Compilation job Failed. Reason: ClientError: InputConfiguration: TVM cannot convert the PyTorch model. Invalid model or input-shape mismatch. Make sure that inputs are lexically ordered and of the correct dimensionality. The following operators are not implemented: ['aten::silu_']
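
The last sentence of that message is the most specific part. It names `aten::silu_`, and in PyTorch a trailing underscore marks the in-place variant of an operator (here SiLU), which suggests the failure may be about operator coverage rather than input shapes. Below is a minimal sketch for pulling that list out of a failure reason string so it can be checked programmatically. `unsupported_operators` is a hypothetical helper, not part of any AWS SDK, and the message format is assumed to match the one quoted above.

```python
import ast
import re


def unsupported_operators(failure_reason):
    """Return the operator names listed in a failure reason, or [] if none."""
    # Look for the Python-style list that follows "not implemented:".
    match = re.search(r"not implemented:\s*(\[[^\]]*\])", failure_reason)
    if match is None:
        return []
    # literal_eval safely parses the quoted list, e.g. "['aten::silu_']".
    return list(ast.literal_eval(match.group(1)))


reason = (
    "ClientError: InputConfiguration: TVM cannot convert the PyTorch model. "
    "The following operators are not implemented: ['aten::silu_']"
)
ops = unsupported_operators(reason)
print(ops)
# Keep only the in-place variants (names ending with an underscore).
print([op for op in ops if op.endswith("_")])
```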

I want to run model compilation jobs on SageMaker Neo for use on Panorama edge devices. I first tried the code in a SageMaker notebook with the default Python version (3.10) and the latest PyTorch version, using a YOLOv5 model. The target device was set to ml_c5 (a cloud instance) with framework version 1.13. The compilation was successful.

After that, using the same code, I changed the target device to jetson_nano, jetson_xavier, jetson_tx2, or x86_win64 (edge devices). Since edge devices only support PyTorch versions 1.7 and 1.8, I changed the framework version to 1.8 and rolled back Python to 3.8 so that I could install PyTorch 1.8. However, the compilation job now fails every time.

Here's my code:

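import torch

# model_path points to the saved YOLOv5 model (set earlier; not shown here)
model = torch.load(model_path)
model.eval()

# Example input with the same shape that is later given to Neo
input_tensor = torch.rand(1, 3, 384, 640)
# model(input_tensor)
traced_script_module = torch.jit.trace(model, input_tensor)
# Save the traced TorchScript module
traced_script_module.save("yolov5_model_jit.pt")
# Reload the TorchScript file to check that it is readable
weights = torch.jit.load("yolov5_model_jit.pt")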

model_path = 'yolov5_model_jit.pt'

import tarfile
with tarfile.open("model_yolov5_.tar.gz", "w:gz") as f:
    f.add(model_path)
import boto3
import sagemaker
import time
from sagemaker.utils import name_from_base

role = sagemaker.get_execution_role()
sess = sagemaker.Session()
region = sess.boto_region_name
bucket = 'bucket_name'

compilation_job_name = name_from_base("name")
prefix = compilation_job_name + "/models"

model_path = sess.upload_data(path="model_yolov5_.tar.gz", key_prefix=prefix)

data_shape = '{"input0":[1,3,384,640]}'
target_device = "jetson_nano"
framework = "PYTORCH"
framework_version = "1.8"
compiled_model_path = "s3://{}/{}/output".format(bucket, compilation_job_name)
from sagemaker.pytorch.model import PyTorchModel
from sagemaker.predictor import Predictor

sagemaker_model = PyTorchModel(
    model_data=model_path,
    predictor_cls=Predictor,
    framework_version=framework_version,
    role=role,
    sagemaker_session=sess,
    entry_point="resnet18.py",
    source_dir="code",
    py_version="py3",
    env={"MMS_DEFAULT_RESPONSE_TIMEOUT": "500"},
)
compiled_model = sagemaker_model.compile(
    target_instance_family=target_device,
    input_shape=data_shape,
    job_name=compilation_job_name,
    role=role,
    framework=framework.lower(),
    framework_version=framework_version,
    output_path=compiled_model_path,
)

Let me know if any information is missing or needed. Thank you for reading; I would appreciate any help with this issue.

1 Answer

Hello,

Thank you for using the AWS SageMaker service and reaching out to us. I understand that you are encountering an error when running a compilation job on SageMaker Neo.

SageMaker Neo requires that machine learning models satisfy a specific input data shape. The input shape required for compilation depends on the deep learning framework you use. Once your model input shape is correctly formatted, save your model according to the requirements in the documentation linked below. Once you have a saved model, compress the model artifacts.

That said, I suspect this error is caused either by the format of the saved model "model_yolov5_.tar.gz" or by the supplied DataInputConfig. I recommend referring to the docs below, as they can help identify the root cause of this issue:

  1. Input shapes SageMaker Neo expects for various frameworks https://docs.aws.amazon.com/sagemaker/latest/dg/neo-compilation-preparing-model.html#neo-job-compilation-expected-inputs

  2. Code examples to show how to save your model to make it compatible with Neo https://docs.aws.amazon.com/sagemaker/latest/dg/neo-compilation-preparing-model.html#neo-job-compilation-how-to-save-model
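
As a rough sketch of the two things those pages cover, assuming the traced model has already been saved as a single .pt file: the first helper builds the JSON shape string in the same dictionary form used in the question, and the second writes the .pt file into a .tar.gz archive by its base name so that it sits at the top level. Both helper names (`make_data_input_config`, `package_model`) are hypothetical and not part of the SageMaker SDK, and the archive layout is an assumption to be checked against the linked documentation.

```python
import json
import os
import tarfile


def make_data_input_config(shapes):
    """Build a shape string such as '{"input0":[1,3,384,640]}'.

    `shapes` lists the input shapes in the order the model takes them.
    The names input0, input1, ... follow the question's example.
    """
    config = {f"input{i}": list(shape) for i, shape in enumerate(shapes)}
    # Compact separators give the same form as the string in the question.
    return json.dumps(config, separators=(",", ":"))


def package_model(model_file, archive_path):
    """Write `model_file` into a gzip-compressed tar archive by base name."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname drops any directory prefix so the file is at the top level.
        tar.add(model_file, arcname=os.path.basename(model_file))
    return archive_path


if __name__ == "__main__":
    print(make_data_input_config([(1, 3, 384, 640)]))
    # {"input0":[1,3,384,640]}
```

Calling `package_model("yolov5_model_jit.pt", "model_yolov5_.tar.gz")` would produce an archive with the same name as the one uploaded in the question.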

Please try creating the compilation job again after reviewing the documentation above. If the issue persists, please reach out to AWS Support [1] (SageMaker) from the account in which you are creating these jobs, describe your issue and use case in detail, and share the relevant AWS resource names. We will troubleshoot accordingly.

[1] https://docs.aws.amazon.com/awssupport/latest/user/case-management.html#creating-a-support-case

Mandeep
answered 6 days ago
