How to run the TensorFlow 2 Neuron model in a SageMaker endpoint for production

We have a HuggingFaceModel for zero-shot classification compiled with Neuron for Inferentia. It is based on the pretrained Hugging Face zero-shot-classification pipeline (DistilBERT), traced with TensorFlow 2 Neuron.

We planned to use it in our production environment since it reduced latency from 1 s to 100 ms.
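For reference, a request to the zero-shot endpoint would look roughly like this. This is a sketch: the parameter names follow the transformers zero-shot-classification pipeline, but the exact schema depends on the inference script packaged with the model, and the input text and labels here are made up.

```python
import json

# Sketch of a zero-shot-classification request body (assumed schema following
# the transformers pipeline; your inference.py may expect a different shape).
payload = {
    "inputs": "The new quarterly report shows record revenue.",
    "parameters": {"candidate_labels": ["finance", "sports", "politics"]},
}
body = json.dumps(payload)  # sent to the endpoint as Content-Type: application/json
```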

However, the SageMaker Python SDK's HuggingFaceModel does not seem to support TensorFlow 2 Neuron. It gives the error below.

My question is: how can we run this TensorFlow 2 Neuron model on SageMaker?

  1. If HuggingFaceModel doesn't support TensorFlow 2, can you provide a PyTorch version for the Hugging Face pipeline? There isn't any example of implementing Neuron for a Hugging Face pipeline.
  2. Is there any other way, such as creating a Dockerfile? Thanks a lot.
# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
   model_data="s3://sagemaker-us-west-2-**********/inf1/model.tar.gz",      # path to your model and script
   role=role,                    # iam role with permissions to create an Endpoint
   transformers_version="4.6.1",  # transformers version used
   tensorflow_version="2.4.1",        # tensorflow version used
   py_version='py37',            # python version used
)
huggingface_model._is_compiled_model = True
# deploy the endpoint
predictor = huggingface_model.deploy(
    initial_instance_count=1,      # number of instances
    instance_type="ml.inf1.xlarge" # AWS Inferentia Instance
)

We got this response:

Defaulting to the only supported framework/algorithm version: 4.12.3. Ignoring framework/algorithm version: 4.6.1.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_18249/2024686093.py in <module>
      2 predictor = huggingface_model.deploy(
      3     initial_instance_count=1,      # number of instances
----> 4     instance_type="ml.inf1.xlarge" # AWS Inferentia Instance
      5 )

~/anaconda3/envs/amazonei_pytorch_latest_p37/lib/python3.7/site-packages/sagemaker/huggingface/model.py in deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, tags, kms_key, wait, data_capture_config, async_inference_config, serverless_inference_config, volume_size, model_data_download_timeout, container_startup_health_check_timeout, inference_recommendation_id, **kwargs)
    303 
    304         return super(HuggingFaceModel, self).deploy(
--> 305             initial_instance_count,
    306             instance_type,
    307             serializer,

~/anaconda3/envs/amazonei_pytorch_latest_p37/lib/python3.7/site-packages/sagemaker/huggingface/model.py in serving_image_uri(self, region_name, instance_type, accelerator_type, serverless_inference_config)

~/anaconda3/envs/amazonei_pytorch_latest_p37/lib/python3.7/site-packages/sagemaker/workflow/utilities.py in wrapper(*args, **kwargs)
    386 
    387 
--> 388 def execute_job_functions(step_args: _StepArguments):
    389     """Execute the job class functions during pipeline definition construction
    390 

~/anaconda3/envs/amazonei_pytorch_latest_p37/lib/python3.7/site-packages/sagemaker/image_uris.py in retrieve(framework, region, version, py_version, instance_type, accelerator_type, image_scope, container_version, distribution, base_framework_version, training_compiler_config, model_id, model_version, tolerate_vulnerable_model, tolerate_deprecated_model, sdk_version, inference_tool, serverless_inference_config)
    172             )
    173         _validate_arg(full_base_framework_version, list(version_config.keys()), "base framework")
--> 174         version_config = version_config.get(full_base_framework_version)
    175 
    176     py_version = _validate_py_version_and_set_if_needed(py_version, version_config, framework)

~/anaconda3/envs/amazonei_pytorch_latest_p37/lib/python3.7/site-packages/sagemaker/image_uris.py in _validate_arg(arg, available_options, arg_name)
    569     """Creates a tag for the image URI."""
    570     if inference_tool:
--> 571         return "-".join(x for x in (tag_prefix, inference_tool, py_version, container_version) if x)
    572     return "-".join(x for x in (tag_prefix, processor, py_version, container_version) if x)
    573 

ValueError: Unsupported base framework: tensorflow2.4.1. You may need to upgrade your SDK version (pip install -U sagemaker) for newer base frameworks. Supported base framework(s): version_aliases, pytorch1.9.1.
asked a year ago · 305 views
1 Answer

Hi Xin Tong, thanks for posting the question. HuggingFace Neuron inference containers are currently only available for PyTorch. Please file a feature request at https://github.com/aws/deep-learning-containers for TensorFlow 2.x HuggingFace Neuron inference container support.
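In the meantime, a minimal sketch of what the PyTorch route could look like with the SageMaker SDK is below. This is not an official example: the S3 path and IAM role are placeholders, and the version pins are assumptions taken from the error output in the question (the SDK defaulted to transformers 4.12.3 and listed pytorch1.9.1 as the only supported base framework) — verify them against your installed SDK release before deploying.

```python
# Hedged sketch: deploy a *PyTorch* HuggingFace Neuron model on inf1,
# since no TensorFlow 2 Neuron inference container exists.

def build_neuron_model_kwargs(model_data, role):
    """Keyword arguments for sagemaker.huggingface.HuggingFaceModel on inf1.

    Versions are assumptions based on the SDK error in the question.
    """
    return {
        "model_data": model_data,          # S3 path to a Neuron-traced model.tar.gz
        "role": role,                      # IAM role allowed to create endpoints
        "transformers_version": "4.12.3",  # version the SDK defaulted to
        "pytorch_version": "1.9.1",        # only base framework listed as supported
        "py_version": "py37",
    }

kwargs = build_neuron_model_kwargs(
    "s3://<your-bucket>/inf1/model.tar.gz",          # placeholder path
    "arn:aws:iam::<account>:role/<sagemaker-role>",  # placeholder role
)

# With the SageMaker SDK installed and AWS credentials configured:
# from sagemaker.huggingface import HuggingFaceModel
# model = HuggingFaceModel(**kwargs)
# model._is_compiled_model = True  # model.tar.gz is already Neuron-compiled
# predictor = model.deploy(initial_instance_count=1,
#                          instance_type="ml.inf1.xlarge")
```

The PyTorch model inside model.tar.gz would need to be traced with torch-neuron first, since the existing TensorFlow 2 Neuron artifact cannot be reused directly.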

answered a year ago
