How do I configure my own inference.py for two different PyTorch models in a MultiDataModel, to build a single endpoint and call both models from it?


I have referred to this notebook to deploy a PyTorch model, but in that notebook they just call the newly trained model to get predictions by passing an input payload. I want to configure my own inference.py file so I can pass an input there and get predictions from it. Can anyone help me achieve that?

Thanks

asked a year ago · 687 views
2 Answers

I haven't got an up-to-date example, but I tentatively believe this should be possible without having to build a custom container.

Generally with the PyTorch framework containers, if your model.tar.gz contains code/inference.py (i.e. the root of the tarball contains a code subfolder with an inference.py script), it should get picked up automatically. So the approach would be to pack your inference scripts into your model tarballs.
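
For reference, the PyTorch serving stack looks for a standard set of handler functions in that script. A minimal sketch of what code/inference.py could look like (the artifact name, TorchScript loading, and JSON pre/post-processing here are placeholder assumptions, not from the notebook):

    # code/inference.py -- handler sketch for the SageMaker PyTorch container
    import json
    import os
    import torch

    def model_fn(model_dir):
        # Load the serialized model from the extracted model.tar.gz.
        # 'model.pth' is a placeholder for your own artifact name.
        model = torch.jit.load(os.path.join(model_dir, 'model.pth'), map_location='cpu')
        model.eval()
        return model

    def input_fn(request_body, content_type):
        # Deserialize the request payload into a tensor.
        if content_type != 'application/json':
            raise ValueError(f'Unsupported content type: {content_type}')
        return torch.tensor(json.loads(request_body))

    def predict_fn(input_data, model):
        # Run inference without tracking gradients.
        with torch.no_grad():
            return model(input_data)

    def output_fn(prediction, accept):
        # Serialize predictions back to the client.
        return json.dumps(prediction.tolist())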

Specifically with MME, I haven't tried the most recent framework versions, but the last time I tried it out you also needed to use the TorchServe model archiver to package the model ready for deployment.
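
On versions where the plain tarball layout is enough, packing could be as simple as this sketch (the file names are placeholders for your own artifacts):

    import tarfile

    # Weights sit at the tarball root; the handler script goes under code/,
    # which is where the PyTorch serving containers look for it.
    with tarfile.open('model.tar.gz', 'w:gz') as tar:
        tar.add('model.pth', arcname='model.pth')
        tar.add('inference.py', arcname='code/inference.py')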

This sample creates a MultiDataModel in PyTorch with inference scripts, but it is currently pinned at framework version <1.8.1 because of this issue. Hopefully it can still help you get started.
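
Once the endpoint is up, you address each tarball by name at invoke time, along these lines (a sketch; the tarball names are whatever you uploaded under the model_data_prefix):

    # Sketch: route a request to a specific model on the multi-model endpoint.
    result_a = predictor.predict(payload, target_model='model-a.tar.gz')
    result_b = predictor.predict(payload, target_model='model-b.tar.gz')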

AWS
EXPERT
Alex_T
answered a year ago
  • Error:


    AttributeError                            Traceback (most recent call last)
    /tmp/ipykernel_8642/1469107258.py in <cell line: 47>()
         45
         46 print(type(mme))
    ---> 47 predictor = mme.deploy(initial_instance_count=1,
         48                        instance_type='ml.m5.2xlarge',
         49                        endpoint_name=f'mme-pytorch-{current_time}')

    ~/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/sagemaker/multidatamodel.py in deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, tags, kms_key, wait, data_capture_config, **kwargs)
        240         self.sagemaker_session = local.LocalSession()
        241
    --> 242     container_def = self.prepare_container_def(instance_type, accelerator_type=accelerator_type)
        243     self.sagemaker_session.create_model(
        244         self.name,

    ~/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/sagemaker/multidatamodel.py in prepare_container_def(self, instance_type, accelerator_type, serverless_inference_config)
        138         # copied over
        139         if self.model:
    --> 140             container_definition = self.model.prepare_container_def(instance_type, accelerator_type)
        141             image_uri = container_definition["Image"]
        142             environment = container_definition["Environment"]

    ~/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/sagemaker/pytorch/model.py in prepare_container_def(self, instance_type, accelerator_type, serverless_inference_config)
        287         )
        288
    --> 289         region_name = self.sagemaker_session.boto_session.region_name
        290         deploy_image = self.serving_image_uri(
        291             region_name,

    AttributeError: 'NoneType' object has no attribute 'boto_session'

  • Hey @Alex_T, I just tried as per your instructions but I'm getting the error above. Can you please tell me how to fix it? My script:

    import time
    import sagemaker
    from sagemaker.pytorch.model import PyTorchModel
    from sagemaker.multidatamodel import MultiDataModel
    from sagemaker import get_execution_role
    sagemaker_session = sagemaker.Session()
    
    print(sagemaker_session)
    role = get_execution_role()
    BUCKET = 'autofaiss-demo'
    PREFIX = 'huggingface-models'
    
    model_data_prefix = f's3://{BUCKET}/{PREFIX}/mme/'
    #print(model_data_prefix)
    
    output_1 = f's3://{BUCKET}/{PREFIX}/mme/model.tar.gz' # whisper-tiny
    current_time = time.time()
    
    #print(output_1)
    model_1 = PyTorchModel(model_data=output_1, # whisper-tiny 
                                 role=role,
                                 entry_point='inference.py',
                                 framework_version='1.8.0',
                                 py_version="py38")
    #print(model_1)
    mme = MultiDataModel(name=f'mme-pytorch-{current_time}',
                         model_data_prefix=model_data_prefix,
                         model=model_1,
                         sagemaker_session=sagemaker_session)
    
    #print(type(mme))
    predictor = mme.deploy(initial_instance_count=1,
                           instance_type='ml.m5.2xlarge',
                           endpoint_name=f'mme-pytorch-{current_time}')

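The traceback suggests the inner PyTorchModel has no sagemaker_session of its own (MultiDataModel only sets one on itself), so a likely fix, as an untested sketch, is to pass the session to the inner model as well:

    model_1 = PyTorchModel(model_data=output_1,  # whisper-tiny
                           role=role,
                           entry_point='inference.py',
                           framework_version='1.8.0',
                           py_version='py38',
                           sagemaker_session=sagemaker_session)  # was missing above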

Refer to the following link to build your own Multi-Model Endpoint Container: https://docs.aws.amazon.com/sagemaker/latest/dg/build-multi-model-build-container.html

AWS
EXPERT
answered a year ago
