Did the SageMaker PyTorch deployment process change?


It used to be the case that people needed a model.tar.gz in S3 and an inference script locally or in git. Now it seems that the inference script must also be part of the model.tar.gz. This is new, right?

From the docs, https://sagemaker.readthedocs.io/en/stable/frameworks/pytorch/using_pytorch.html#for-versions-1-2-and-higher:

*For PyTorch versions 1.2 and higher, the contents of model.tar.gz should be organized as follows:

  • Model files in the top-level directory
  • Inference script (and any other source files) in a directory named code/ (for more about the inference script, see The SageMaker PyTorch Model Server)
  • Optional requirements file located at code/requirements.txt (for more about requirements files, see Using third-party libraries)*

This may be confusing, because this new mode of deployment means that people creating the model artifact need to know in advance what the inference is going to look like. The previous design, with its separation of artifact and inference code, was more agile.
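For reference, the layout described in the docs can be produced with a few lines of Python. This is only a minimal sketch; model.pth, inference.py, and requirements.txt are placeholder names for your own files:

import tarfile

# Build model.tar.gz in the layout required for PyTorch >= 1.2.
# model.pth, inference.py and requirements.txt are placeholder names.
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add("model.pth", arcname="model.pth")                     # model files at the top level
    tar.add("inference.py", arcname="code/inference.py")          # inference script under code/
    tar.add("requirements.txt", arcname="code/requirements.txt")  # optional requirements file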

AWS
EXPERT
Asked 4 years ago · 575 views
1 Answer
Accepted Answer

At first glance (for example, in the AWS BERT sample using torch 1.4), advance knowledge of the inference code seems to be necessary. However, if you use the PyTorch SageMaker SDK to create or deploy the model after it is trained, it automatically re-packages the model.tar.gz to include the code files and the inference files. For example, when you use the following script, the model.tar.gz, which initially contains only model files, is repackaged so that the contents of the src directory are automatically added to its code/ directory. You don't need to know the inference code in advance.

from sagemaker.pytorch import PyTorchModel
from sagemaker import get_execution_role

role = get_execution_role()

# S3 URI of the model.tar.gz produced by the training job
# (estimator is the PyTorch Estimator used for training).
model_uri = estimator.model_data

# entry_point and source_dir are packaged into the code/ directory
# of model.tar.gz automatically when the model is created.
model = PyTorchModel(model_data=model_uri,
                     role=role,
                     framework_version='1.4.0',
                     entry_point='serve.py',
                     source_dir='src')

predictor = model.deploy(initial_instance_count=1, instance_type='ml.p3.2xlarge')
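Once the endpoint is up, it can be invoked through the returned predictor and removed when no longer needed. The payload below is purely illustrative; the actual format depends on the input_fn/predict_fn implemented in serve.py:

import numpy as np

# Illustrative payload only; shape and dtype depend on serve.py.
sample_input = np.random.rand(1, 3, 224, 224).astype("float32")
result = predictor.predict(sample_input)
print(result)

# Delete the endpoint afterwards to avoid ongoing charges.
predictor.delete_endpoint()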

For older versions, you couldn't include additional files/dependencies during inference unless you built a custom container; the source.tar.gz was only used during training.

AWS
EXPERT
Answered 4 years ago
