Did the SageMaker PyTorch deployment process change?

It used to be the case that you needed a model.tar.gz in S3 and an inference script stored locally or in git. Now it seems that the inference script must also be part of the model.tar.gz. This is new, right?

From the docs, https://sagemaker.readthedocs.io/en/stable/frameworks/pytorch/using_pytorch.html#for-versions-1-2-and-higher:

*For PyTorch versions 1.2 and higher, the contents of model.tar.gz should be organized as follows:

  • Model files in the top-level directory
  • Inference script (and any other source files) in a directory named code/ (for more about the inference script, see The SageMaker PyTorch Model Server)
  • Optional requirements file located at code/requirements.txt (for more about requirements files, see Using third-party libraries)*
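To make the quoted layout concrete, here is a minimal sketch that inspects a local model.tar.gz and reports whether it follows that structure. The function name `summarize_layout` and the default entry point name `inference.py` are hypothetical, chosen only for illustration; it uses nothing but the Python standard library and is not part of the SageMaker SDK.

```python
import tarfile


def summarize_layout(tar_path, entry_point="inference.py"):
    """Report how a model.tar.gz maps onto the documented layout.

    `entry_point` is a hypothetical script name; pass whatever your
    inference script is actually called.
    """
    with tarfile.open(tar_path, "r:gz") as tar:
        names = []
        for member in tar.getmembers():
            if not member.isfile():
                continue
            name = member.name
            # Archives created with `tar -C dir .` prefix names with "./"
            if name.startswith("./"):
                name = name[2:]
            names.append(name)

    return {
        # Model files are expected in the top-level directory
        "model_files": sorted(n for n in names if "/" not in n),
        # The inference script is expected under code/
        "has_entry_point": f"code/{entry_point}" in names,
        # The requirements file is optional
        "has_requirements": "code/requirements.txt" in names,
    }
```
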

This may be confusing, because this new mode of deployment means that whoever creates the model artifact needs to know in advance what the inference code will look like. The previous design, which kept the artifact and the inference code separate, was more agile.

AWS
EXPERT
asked 4 years ago, 673 views
1 Answer
Accepted Answer

Going by the AWS sample (the BERT sample using torch 1.4), it can seem that advance knowledge of the inference code is necessary. However, if you use the PyTorch classes in the SageMaker SDK to create or deploy the model after it is trained, the SDK automatically repackages the model.tar.gz to include the code files and the inference files. For example, when you use the following script, the model.tar.gz, which initially contains only model files, is repackaged so that the contents of the src directory are added to its code directory. You don't need to know the inference code in advance.

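from sagemaker import get_execution_role
from sagemaker.pytorch import PyTorchModel

role = get_execution_role()

# `estimator` is the already-trained PyTorch estimator;
# model_data is the S3 URI of the model.tar.gz it produced
model_uri = estimator.model_data

# The contents of source_dir, including the entry_point script,
# are added to the code directory when the model is repackaged
model = PyTorchModel(model_data=model_uri,
                     role=role,
                     framework_version='1.4.0',
                     entry_point='serve.py',
                     source_dir='src')

predictor = model.deploy(initial_instance_count=1, instance_type='ml.p3.2xlarge')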
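As a rough local illustration of the repackaging described above, the sketch below copies an existing model.tar.gz into a new archive and adds a source directory under code/. This is an approximation under stated assumptions, not the SDK's actual implementation: the function name `repackage_with_code` is hypothetical, and it works on local files only, whereas the SDK operates on the artifact in S3.

```python
import tarfile


def repackage_with_code(model_tar, source_dir, out_tar):
    """Write out_tar containing everything in model_tar plus source_dir as code/.

    Hypothetical helper for illustration; all paths are local.
    """
    with tarfile.open(model_tar, "r:gz") as src, \
            tarfile.open(out_tar, "w:gz") as dst:
        for member in src.getmembers():
            # Only regular files carry data; directories are added as entries
            fileobj = src.extractfile(member) if member.isfile() else None
            dst.addfile(member, fileobj)
        # Place the inference script and other source files under code/
        dst.add(str(source_dir), arcname="code")
    return out_tar
```

The original archive is left untouched, so the model artifact can be produced without any knowledge of the inference code and combined with it later.
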

In the older versions, you couldn't include additional files or dependencies at inference time unless you built a custom container. The source.tar.gz was used only during training.

AWS
EXPERT
answered 4 years ago
