How do I pass inference parameters to a Hugging Face model hosted in SageMaker?


I created a model resource in SageMaker. The model is a tar file, downloaded from Hugging Face and fine-tuned. I followed the documentation (sample code below). The sample passes the HF_TASK inference parameter, and I assume this is Hugging Face specific. Is it possible to pass other parameters as well, such as padding, truncation, and max_length? For example: padding: True, truncation: True, max_length: 512 ...

How do I pass these values?

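import sagemaker
from sagemaker.huggingface import HuggingFaceModel

# Environment variables read by the Hugging Face inference container
hub = {
   'HF_TASK' : 'text2text-generation'
}
role = sagemaker.get_execution_role()

# Remaining arguments are elided, as in the original sample
huggingface_model = HuggingFaceModel(transformers_version='4.6.1', env=hub, ...)

predictor = huggingface_model.deploy(...)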
  • If you are using a pretrained model, you may not be able to tweak parameters such as padding. I am not sure why you want to do that at inference time.
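
One possible approach, offered as a sketch rather than a confirmed answer: as far as I know, the SageMaker Hugging Face inference toolkit accepts a request body with an "inputs" key and an optional "parameters" key, and forwards the parameters to the underlying transformers pipeline call. Which keys are actually honored (for example truncation, max_length, or padding) depends on the task and the transformers version, so this needs to be checked against the deployed container. The helper names build_payload and predict_with_parameters are hypothetical, and predictor is assumed to be the object returned by huggingface_model.deploy(...).

```python
import json


def build_payload(text, **parameters):
    """Build a request body with "inputs" and an optional "parameters" object.

    Assumption: the endpoint forwards "parameters" to the pipeline call.
    """
    payload = {"inputs": text}
    if parameters:
        payload["parameters"] = parameters
    return payload


def predict_with_parameters(predictor, text, **parameters):
    """Send the payload through a predictor.

    Assumption: `predictor` exposes predict(dict), like the object
    returned by huggingface_model.deploy(...).
    """
    return predictor.predict(build_payload(text, **parameters))


if __name__ == "__main__":
    # Show the JSON body that would be sent to the endpoint.
    body = build_payload("some input text", truncation=True, max_length=512)
    print(json.dumps(body))
```

With a deployed endpoint, the call would look like predict_with_parameters(predictor, "some input text", truncation=True, max_length=512). If a parameter is rejected or ignored by the pipeline, the other route I am aware of is a custom inference script packaged with the model, where the tokenizer call is under your control.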

asked 3 months ago, 28 views
No Answers
