Neo compilation job failed on YOLOv5/v7 model


Hi,

I was trying to use SageMaker Neo compilation to convert a YOLO model (trained on our custom data) to Core ML format, but got an error on the input config:

ClientError: InputConfiguration: Unable to determine the type of the model, i.e. the source framework. Please provide the value of argument "source", from one of ["tensorflow", "pytorch", "mil"]. Note that model conversion requires the source package that generates the model. Please make sure you have the appropriate version of source package installed.

I've tried both the latest YOLOv7 model and a YOLOv5 model, but I get the same error. It seems Neo cannot recognize the YOLO model.

But when I tried the YOLOv4 model from this tutorial post: https://aws.amazon.com/de/blogs/machine-learning/speed-up-yolov4-inference-to-twice-as-fast-on-amazon-sagemaker/, it worked fine.

Any idea whether Neo compilation can work with YOLOv7/v5 models?

asked 2 years ago, 312 views
1 Answer

Hi, I would need to see your code to pinpoint the problem, but I can share the steps required to compile YOLOv5 with Neo. There are two important things here:

1/ the model needs to be traceable with torch.jit.trace
2/ you need to replace some unsupported operators with export-friendly ones

In the following sample I'm using a specific version of the model: v6.2

These dependencies are required if you're running on SageMaker Studio.

## required if you're using the Data Science kernel on SageMaker Studio
!apt update -y && apt install -y libgl1
%pip install torch

Next, install the YOLOv5-specific dependencies.

## Yolov5 dependencies
%pip install -r https://raw.githubusercontent.com/ultralytics/yolov5/v6.2/requirements.txt  # install dependencies

Download a pre-trained COCO80 model version 6.2 from the model zoo.

import torch
import torch.nn as nn

model_type='l'
assert(model_type in ['n', 's', 'm', 'l', 'x'])

x = torch.rand([1, 3, 640, 640], dtype=torch.float32)
model = torch.hub.load('ultralytics/yolov5:v6.2', f'yolov5{model_type}', pretrained=True)
model.eval()

The following block replaces some ops in the model to make it compatible with SageMaker Neo. The model is then traced with torch.jit.trace and saved as model.pth.

# from Yolov5 repo. It will be available after invoking torch.hub.load
import models
from utils.activations import Hardswish, SiLU

# Update model
for k, m in model.named_modules():
    if isinstance(m, models.common.Conv):  
        if isinstance(m.act, nn.Hardswish):
            m.act = Hardswish()
        elif isinstance(m.act, nn.SiLU): # assign export-friendly activations
            m.act = SiLU()

y = model(x) # warmup
try:
    traced = torch.jit.trace(model, x)
    traced.save('model.pth')

    print("Cool! Model is jit traceable")
except Exception as e:
    # show the underlying error so the failing op can be identified
    print(f"Oops. Something went wrong. Model is not traceable: {e}")
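
For context on the activation swap above: as far as I can tell, the replacement classes in the YOLOv5 repo express the same activations using basic ops. The following is only a scalar illustration of the two formulas in plain Python, not the repo's code; the function names `silu` and `hardswish` are made up for this sketch, and the naive `math.exp` call is only suitable for small illustrative inputs.

```python
import math

def silu(x: float) -> float:
    """SiLU (swish): x * sigmoid(x), written as x / (1 + exp(-x))."""
    return x / (1.0 + math.exp(-x))

def hardswish(x: float) -> float:
    """Hardswish: x * clamp(x + 3, 0, 6) / 6."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0

# Print both activations for a few sample inputs.
for v in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"x={v:+.1f}  silu={silu(v):+.4f}  hardswish={hardswish(v):+.4f}")
```

Both functions approach the identity for large positive inputs and approach zero for large negative inputs, which is why one can stand in for the other's built-in module without changing the model's outputs.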

Now, create a .tar.gz file containing the traced model and upload it to S3.

import tarfile
import boto3
import io

bucket_name='<<<YOUR S3 BUCKET HERE>>>'
model_name='yolov5'
key_path=f'models/{model_name}/model.tar.gz'
s3_client = boto3.client('s3')
s3_uri = f"s3://{bucket_name}/{key_path}"

with io.BytesIO() as f:
    with tarfile.open(fileobj=f, mode='w:gz') as tar:
        tar.add('model.pth')
        tar.list()
    f.seek(0)
    s3_client.put_object(Body=f, Bucket=bucket_name, Key=key_path)
print(s3_uri)
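
Before uploading, it can help to confirm that the archive has the layout you expect. The code above adds `model.pth` at the root of the archive; the sketch below builds a gzip tarball in memory in a similar way and lists its members. The helper name `list_archive_members` is hypothetical, and the payload is placeholder bytes, not a real traced model.

```python
import io
import tarfile
from typing import List

def list_archive_members(archive_bytes: bytes) -> List[str]:
    """Return the member names stored in a .tar.gz given as raw bytes."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        return tar.getnames()

# Build a small archive in memory with a placeholder payload.
payload = b"placeholder bytes, not a real traced model"
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
    info = tarfile.TarInfo(name="model.pth")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

# Read the archive back and list what it contains.
members = list_archive_members(buffer.getvalue())
print(members)
```

With the real archive, the same helper could be called on the bytes of the buffer before `put_object`, to check that the model file is not nested inside an unexpected directory.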

Finally, create a compilation job on Neo.

import time
import boto3

role='<<<YOUR ROLE_HERE>>>' # running on SageMaker? import sagemaker; sagemaker.Session(); role = sagemaker.get_execution_role()

sm_client = boto3.client('sagemaker')
compilation_job_name = f'{model_name}-pytorch-{int(time.time()*1000)}'
sm_client.create_compilation_job(
    CompilationJobName=compilation_job_name,
    RoleArn=role,
    InputConfig={
        'S3Uri': s3_uri,
        'DataInputConfig': '{"input": [1,3,640,640]}',
        'Framework': 'PYTORCH'
    },
    OutputConfig={
        'S3OutputLocation': f's3://{bucket_name}/{model_name}-pytorch/optimized/',
        'TargetPlatform': {
            'Os': 'LINUX',
            'Arch': 'ARM64', # change this to X86_64 if you need
            #'Accelerator': 'NVIDIA' # uncomment this if you have an NVIDIA GPU
        },
        # uncomment or change the following line depending on your edge device
        # Jetson Xavier: sm_72; Jetson Nano: sm_53
        #'CompilerOptions': '{"trt-ver": "7.1.3", "cuda-ver": "10.2", "gpu-code": "sm_72"}' # Jetpack 4.4.1        
    },
    StoppingCondition={ 'MaxRuntimeInSeconds': 18000 }
)

# wait for the compilation job to complete
while True:
    resp = sm_client.describe_compilation_job(CompilationJobName=compilation_job_name)    
    if resp['CompilationJobStatus'] in ['STARTING', 'INPROGRESS']:
        print('Running...')
    else:
        print(resp['CompilationJobStatus'], compilation_job_name)
        break
    time.sleep(5)
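
The loop above polls with no upper bound and prints only the final status. One possible variation, sketched below, adds a timeout and prints the `FailureReason` field that `describe_compilation_job` returns, which is usually the quickest way to see why a job failed. The helper `wait_for_compilation` and its parameters are hypothetical; it accepts any callable that returns a describe-style dictionary, so it is exercised here with a stub instead of a real SageMaker client.

```python
import time
from typing import Callable, Dict

def wait_for_compilation(describe: Callable[[], Dict],
                         poll_seconds: float = 5.0,
                         timeout_seconds: float = 3600.0) -> Dict:
    """Poll until the job leaves STARTING/INPROGRESS or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        resp = describe()
        status = resp["CompilationJobStatus"]
        if status not in ("STARTING", "INPROGRESS"):
            if status == "FAILED":
                print("Failed:", resp.get("FailureReason", "no reason reported"))
            return resp
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status} after {timeout_seconds} s")
        time.sleep(poll_seconds)

# Stub standing in for sm_client.describe_compilation_job(...).
_responses = iter([
    {"CompilationJobStatus": "STARTING"},
    {"CompilationJobStatus": "INPROGRESS"},
    {"CompilationJobStatus": "COMPLETED"},
])
result = wait_for_compilation(lambda: next(_responses), poll_seconds=0.01)
print(result["CompilationJobStatus"])
```

With a real client, the callable would be something like `lambda: sm_client.describe_compilation_job(CompilationJobName=compilation_job_name)`.
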
AWS
answered a year ago
