How to run a TensorFlow Neuron model on a SageMaker endpoint in production


We have a HuggingFaceModel for zero-shot classification running on Neuron/Inferentia. It is based on the pretrained Hugging Face zero-shot-classification pipeline (DistilBERT) compiled with TensorFlow 2 Neuron.

We plan to use it in production, since it reduces latency from 1 s to 100 ms.

However, the SageMaker Python SDK HuggingFaceModel does not seem to support TensorFlow 2 Neuron. It gives the error shown below.

My question is: how can we run this TensorFlow 2 Neuron model on SageMaker?

  1. If HuggingFaceModel doesn't support TensorFlow 2, can you provide a PyTorch version for the Hugging Face pipeline? There isn't any example of implementing Neuron for a Hugging Face pipeline.
  2. Is there another way, such as building a custom Dockerfile? Thanks a lot.
# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
   model_data="s3://sagemaker-us-west-2-**********/inf1/model.tar.gz",      # path to your model and script
   role=role,                    # iam role with permissions to create an Endpoint
   transformers_version="4.6.1",  # transformers version used
   tensorflow_version="2.4.1",        # tensorflow version used
   py_version='py37',            # python version used
)
huggingface_model._is_compiled_model = True
# deploy the endpoint
predictor = huggingface_model.deploy(
    initial_instance_count=1,      # number of instances
    instance_type="ml.inf1.xlarge" # AWS Inferentia Instance
)

We got the following response:

Defaulting to the only supported framework/algorithm version: 4.12.3. Ignoring framework/algorithm version: 4.6.1.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_18249/2024686093.py in <module>
      2 predictor = huggingface_model.deploy(
      3     initial_instance_count=1,      # number of instances
----> 4     instance_type="ml.inf1.xlarge" # AWS Inferentia Instance
      5 )

~/anaconda3/envs/amazonei_pytorch_latest_p37/lib/python3.7/site-packages/sagemaker/huggingface/model.py in deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, tags, kms_key, wait, data_capture_config, async_inference_config, serverless_inference_config, volume_size, model_data_download_timeout, container_startup_health_check_timeout, inference_recommendation_id, **kwargs)
    303 
    304         return super(HuggingFaceModel, self).deploy(
--> 305             initial_instance_count,
    306             instance_type,
    307             serializer,

~/anaconda3/envs/amazonei_pytorch_latest_p37/lib/python3.7/site-packages/sagemaker/huggingface/model.py in serving_image_uri(self, region_name, instance_type, accelerator_type, serverless_inference_config)

~/anaconda3/envs/amazonei_pytorch_latest_p37/lib/python3.7/site-packages/sagemaker/workflow/utilities.py in wrapper(*args, **kwargs)
    386 
    387 
--> 388 def execute_job_functions(step_args: _StepArguments):
    389     """Execute the job class functions during pipeline definition construction
    390 

~/anaconda3/envs/amazonei_pytorch_latest_p37/lib/python3.7/site-packages/sagemaker/image_uris.py in retrieve(framework, region, version, py_version, instance_type, accelerator_type, image_scope, container_version, distribution, base_framework_version, training_compiler_config, model_id, model_version, tolerate_vulnerable_model, tolerate_deprecated_model, sdk_version, inference_tool, serverless_inference_config)
    172             )
    173         _validate_arg(full_base_framework_version, list(version_config.keys()), "base framework")
--> 174         version_config = version_config.get(full_base_framework_version)
    175 
    176     py_version = _validate_py_version_and_set_if_needed(py_version, version_config, framework)

~/anaconda3/envs/amazonei_pytorch_latest_p37/lib/python3.7/site-packages/sagemaker/image_uris.py in _validate_arg(arg, available_options, arg_name)
    569     """Creates a tag for the image URI."""
    570     if inference_tool:
--> 571         return "-".join(x for x in (tag_prefix, inference_tool, py_version, container_version) if x)
    572     return "-".join(x for x in (tag_prefix, processor, py_version, container_version) if x)
    573 

ValueError: Unsupported base framework: tensorflow2.4.1. You may need to upgrade your SDK version (pip install -U sagemaker) for newer base frameworks. Supported base framework(s): version_aliases, pytorch1.9.1.
1 Answer

Hi Xin Tong, Thanks for posting the question. HuggingFace Neuron Inference Containers are currently only available for PyTorch. Please file a feature request on https://github.com/aws/deep-learning-containers for TensorFlow 2.x HuggingFace Neuron Inference Container support.
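For reference, a PyTorch-based deployment along these lines should work on Inferentia. This is only a sketch: the bucket path is a placeholder, and the version pins are taken from the supported combination reported in the error message above (`transformers 4.12.3`, `pytorch1.9.1`), so verify them against the currently available HuggingFace Neuron inference containers before using.

```python
# Sketch: deploying a Neuron-compiled PyTorch HuggingFace model on Inferentia.
# Versions below are assumptions based on the error message in the question,
# not verified values -- check the available Neuron inference containers.
from sagemaker.huggingface import HuggingFaceModel

huggingface_model = HuggingFaceModel(
    model_data="s3://<your-bucket>/inf1/model.tar.gz",  # Neuron-compiled model archive
    role=role,                      # IAM role with permission to create an endpoint
    transformers_version="4.12.3",  # version the SDK reported as supported
    pytorch_version="1.9.1",        # base framework the SDK reported as supported
    py_version="py37",
)
huggingface_model._is_compiled_model = True

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.inf1.xlarge",  # AWS Inferentia instance
)
```

Note that the model archive must already be compiled for Neuron; the SDK flag `_is_compiled_model = True` only prevents SageMaker from attempting its own compilation step.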

answered a year ago
