How do I configure my own inference.py for two different PyTorch models in a MultiDataModel, so that a single endpoint serves both models?

0

I have referred to this notebook to deploy a PyTorch model, but it only invokes the newly trained model by passing an input payload. I want to configure my own inference.py file so the input is handled there and predictions are returned from it. Can anyone help me achieve that?

Thanks

asked a year ago · 735 views
2 Answers
0

I don't have an up-to-date example, but I tentatively believe this should be possible without having to build a custom container.

Generally, with the PyTorch framework containers, if your model.tar.gz contains code/inference.py (i.e. the root of the tarball has a code/ subfolder with an inference.py script), it should get picked up automatically. So the approach would be to pack your inference scripts into your model tarballs, as sketched below.
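
For example, a minimal inference.py in that code/ folder could implement the standard handler functions of the SageMaker PyTorch inference toolkit (model_fn, input_fn, predict_fn, output_fn). This is only a sketch: the model.pth filename, the TorchScript loading, and the JSON payload shape are assumptions for illustration.

    # code/inference.py -- minimal handler sketch (file name and JSON shape are assumptions)
    import json
    import os

    import torch


    def model_fn(model_dir):
        # model_dir is the unpacked root of model.tar.gz
        model = torch.jit.load(os.path.join(model_dir, 'model.pth'), map_location='cpu')
        model.eval()
        return model


    def input_fn(request_body, content_type):
        # Deserialize the request payload into a tensor
        payload = json.loads(request_body)
        return torch.tensor(payload['inputs'])


    def predict_fn(input_data, model):
        # Run the forward pass without tracking gradients
        with torch.no_grad():
            return model(input_data)


    def output_fn(prediction, accept):
        # Serialize the model output back to the client
        return json.dumps({'outputs': prediction.tolist()})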

Specifically with MME, I haven't tried the most recent framework versions, but last time I tried it you also needed to use the TorchServe model archiver to package the model ready (a sketch follows).
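
If that archiver step is needed, it would look roughly like this (a sketch only; the model and handler file names are assumptions, and the torch-model-archiver CLI comes from pip install torch-model-archiver):

    # Package the model into a TorchServe .mar archive (file names are assumptions)
    import subprocess

    subprocess.run(
        [
            'torch-model-archiver',
            '--model-name', 'whisper-tiny',
            '--version', '1.0',
            '--serialized-file', 'model.pth',   # your trained weights
            '--handler', 'inference.py',        # your custom handler script
            '--export-path', 'model_store',     # output directory for the .mar file
        ],
        check=True,
    )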

This sample creates a MultiDataModel in PyTorch with inference scripts, but it is currently pinned to framework versions <1.8.1 because of this issue. Hopefully it can still help you get started.

AWS
EXPERT
Alex_T
answered a year ago
  • Error:


    AttributeError                            Traceback (most recent call last)
    /tmp/ipykernel_8642/1469107258.py in <cell line: 47>()
         45
         46 print(type(mme))
    ---> 47 predictor = mme.deploy(initial_instance_count=1,
         48                        instance_type='ml.m5.2xlarge',
         49                        endpoint_name=f'mme-pytorch-{current_time}')

    ~/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/sagemaker/multidatamodel.py in deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, tags, kms_key, wait, data_capture_config, **kwargs)
        240             self.sagemaker_session = local.LocalSession()
        241
    --> 242         container_def = self.prepare_container_def(instance_type, accelerator_type=accelerator_type)
        243         self.sagemaker_session.create_model(
        244             self.name,

    ~/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/sagemaker/multidatamodel.py in prepare_container_def(self, instance_type, accelerator_type, serverless_inference_config)
        138         # copied over
        139         if self.model:
    --> 140             container_definition = self.model.prepare_container_def(instance_type, accelerator_type)
        141             image_uri = container_definition["Image"]
        142             environment = container_definition["Environment"]

    ~/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/sagemaker/pytorch/model.py in prepare_container_def(self, instance_type, accelerator_type, serverless_inference_config)
        287         )
        288
    --> 289         region_name = self.sagemaker_session.boto_session.region_name
        290         deploy_image = self.serving_image_uri(
        291             region_name,

    AttributeError: 'NoneType' object has no attribute 'boto_session'

  • Hey @Alex_T, I just tried this per your instructions but I'm getting the error above. Can you please tell me how to fix it? My script:

    import time
    import sagemaker
    from sagemaker.pytorch.model import PyTorchModel
    from sagemaker.multidatamodel import MultiDataModel
    from sagemaker import get_execution_role
    sagemaker_session = sagemaker.Session()
    
    print(sagemaker_session)
    role = get_execution_role()
    BUCKET = 'autofaiss-demo'
    PREFIX = 'huggingface-models'
    
    model_data_prefix = f's3://{BUCKET}/{PREFIX}/mme/'
    #print(model_data_prefix)
    
    output_1 = f's3://{BUCKET}/{PREFIX}/mme/model.tar.gz' # whisper-tiny
    current_time = time.time()
    
    #print(output_1)
    model_1 = PyTorchModel(model_data=output_1,  # whisper-tiny
                           role=role,
                           entry_point='inference.py',
                           framework_version='1.8.0',
                           py_version='py38')
    #print(model_1)
    mme = MultiDataModel(name=f'mme-pytorch-{current_time}',
                         model_data_prefix=model_data_prefix,
                         model=model_1,
                         sagemaker_session=sagemaker_session)

    #print(type(mme))
    predictor = mme.deploy(initial_instance_count=1,
                           instance_type='ml.m5.2xlarge',
                           endpoint_name=f'mme-pytorch-{current_time}')
    
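  • Note: the traceback shows that self.sagemaker_session is None on the inner PyTorchModel when MultiDataModel calls its prepare_container_def. A likely fix (untested here, so treat it as an assumption) is to pass the session to the inner model as well. Note too that time.time() puts a '.' into the name, which SageMaker model/endpoint names don't allow, so something like int(time.time()) may also be needed:

    model_1 = PyTorchModel(model_data=output_1,  # whisper-tiny
                           role=role,
                           entry_point='inference.py',
                           framework_version='1.8.0',
                           py_version='py38',
                           # Assumption: attaching the session here avoids the
                           # NoneType.boto_session lookup in prepare_container_def
                           sagemaker_session=sagemaker_session)
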
0

Refer to the following link to build your own multi-model endpoint container: https://docs.aws.amazon.com/sagemaker/latest/dg/build-multi-model-build-container.html (a rough sketch of that container contract follows).
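
For reference, a custom MME container has to expose the load/unload/invoke HTTP API described on that page. Below is a minimal sketch (Flask is an assumption, the routes follow the linked docs, and the model-loading logic is purely illustrative):

    # Minimal multi-model container sketch (Flask assumed; loading logic is illustrative)
    import os

    import torch
    from flask import Flask, jsonify, request

    app = Flask(__name__)
    models = {}  # model_name -> loaded model


    @app.route('/ping', methods=['GET'])
    def ping():
        return '', 200  # container health check


    @app.route('/models', methods=['POST'])
    def load_model():
        # SageMaker posts {"model_name": ..., "url": ...} when a model is first invoked
        body = request.get_json()
        models[body['model_name']] = torch.jit.load(
            os.path.join(body['url'], 'model.pth'))  # assumption: TorchScript file
        return '', 200


    @app.route('/models/<name>', methods=['DELETE'])
    def unload_model(name):
        models.pop(name, None)  # free memory when SageMaker evicts the model
        return '', 200


    @app.route('/models/<name>/invoke', methods=['POST'])
    def invoke(name):
        inputs = torch.tensor(request.get_json()['inputs'])
        with torch.no_grad():
            outputs = models[name](inputs)
        return jsonify({'outputs': outputs.tolist()})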

AWS
EXPERT
answered a year ago
