Did the SageMaker PyTorch deployment process change?

It used to be the case that people needed to have a model.tar.gz in S3 and an inference script locally or in git. Now it seems that the inference script must also be part of the model.tar.gz. This is new, right?

From the docs, https://sagemaker.readthedocs.io/en/stable/frameworks/pytorch/using_pytorch.html#for-versions-1-2-and-higher:

For PyTorch versions 1.2 and higher, the contents of model.tar.gz should be organized as follows:

  • Model files in the top-level directory
  • Inference script (and any other source files) in a directory named code/ (for more about the inference script, see The SageMaker PyTorch Model Server)
  • Optional requirements file located at code/requirements.txt (for more about requirements files, see Using third-party libraries)
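
For illustration, a minimal sketch of assembling such an archive by hand with Python's tarfile module (the file names model.pth and inference.py are placeholders for this example, not names mandated by SageMaker):

import tarfile

# Layout per the docs: model files at the top level, inference code under code/
with tarfile.open('model.tar.gz', 'w:gz') as tar:
    tar.add('model.pth', arcname='model.pth')                     # model weights (placeholder name)
    tar.add('inference.py', arcname='code/inference.py')          # inference script
    tar.add('requirements.txt', arcname='code/requirements.txt')  # optional extra dependencies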

This may be confusing, because this new mode of deployment means that the people creating the model artifact need to know in advance what the inference code will look like. The previous design, with separation of artifact and inference code, was more agile.

AWS
EXPERT
asked 4 years ago, 589 views
1 Answer
Accepted Answer

Looking at the AWS Sample - BERT sample using torch 1.4, advance knowledge of the inference code does not seem to be necessary. If you use the PyTorch SageMaker SDK to create or deploy the model after it is trained, it automatically re-packages model.tar.gz to include the code files and the inference files. For example, with the following script, model.tar.gz, which initially contains only model files, is repackaged so that the contents of the src directory are added to its code/ directory. You don't need to know the inference code in advance.

from sagemaker.pytorch import PyTorchModel
from sagemaker import get_execution_role

role = get_execution_role()

# S3 URI of the model.tar.gz produced by the training job
model_uri = estimator.model_data

# entry_point and source_dir are repackaged into code/ inside model.tar.gz
model = PyTorchModel(model_data=model_uri,
                     role=role,
                     framework_version='1.4.0',
                     entry_point='serve.py',
                     source_dir='src')

predictor = model.deploy(initial_instance_count=1, instance_type='ml.p3.2xlarge')
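
Once the endpoint is in service, the returned predictor can be used to invoke it. A minimal sketch (the payload below is an arbitrary placeholder; the real input format depends on your inference script):

result = predictor.predict({'inputs': 'example text'})

# Delete the endpoint when you are done to stop incurring charges
predictor.delete_endpoint()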

For older versions, you couldn't include additional files or dependencies during inference unless you built a custom container. The source.tar.gz was only used during training.

AWS
EXPERT
answered 4 years ago
