Deploy a LLaVA model on AWS SageMaker using HuggingFaceModel


Describe the issue

import:

import sagemaker
from sagemaker.huggingface.model import HuggingFaceModel 

code:

sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()

hub = {
  'HF_MODEL_ID': 'llava-hf/llava-1.5-7b-hf',  # model_id from hf.co/models
  'HF_TASK': 'image-to-text'                  # task you want to use for predictions
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
   env=hub,                                                # configuration for loading model from Hub
   role=role,                                              # IAM role with permissions to create an endpoint
   transformers_version="4.26",                             # Transformers version used
   pytorch_version="1.13",                                  # PyTorch version used
   py_version='py39',                                      # Python version used
)


deploy model to SageMaker Inference:

predictor = huggingface_model.deploy(
   initial_instance_count=1,
   instance_type="ml.m5.4xlarge"
)

request:

data_url = {'inputs':'https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg'}
predictor.predict(data_url)
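
For context, the Hugging Face predictor defaults to a JSON serializer, so the endpoint receives the dict above as a JSON body; the image-to-text handler is then expected to fetch the image from the URL itself. A stdlib-only sketch of the exact bytes sent over the wire (the URL is the one from the snippet above):

```python
import json

# What the default JSON serializer produces from the dict passed to predict();
# the endpoint, not the client, downloads the image at this URL.
data_url = {'inputs': 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg'}
body = json.dumps(data_url)
print(body)
```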

error:

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from model with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "\u0027llava\u0027"
}
". See https://ap-southeast-1.console.aws.amazon.com/cloudwatch/home?region=ap-southeast-1#logEventViewer:group=/aws/sagemaker/Endpoints/huggingface-pytorch-inference-2024-03-16-14-04-39-322 in account 601804951188 for more information.
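
Decoding the `\u0027` escapes in the 400 body shows the message is just `'llava'` in quotes — the string form of a KeyError, which is what older transformers releases raise when they do not recognize `"llava"` as a `model_type` in the model's `config.json`. The container pinned above ships transformers 4.26, which predates the llava architecture (added to transformers in later releases). A stdlib-only check of the error body copied from the log above:

```python
import json

# The 400 body returned by the endpoint, copied from the ModelError above.
body = '{"code": 400, "type": "InternalServerException", "message": "\\u0027llava\\u0027"}'

# Decoding shows the message is the repr of a KeyError on the unknown
# model_type "llava" — i.e., the container's transformers is too old.
print(json.loads(body)["message"])  # 'llava'
```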

#Attempted Solutions Consulted https://github.com/sungeuns/gen-ai-sagemaker/blob/main/MultiModal/02-llava-sagemaker-endpoint.ipynb https://github.com/haotian-liu/LLaVA/issues/600 https://github.com/haotian-liu/LLaVA/issues/907


guo
asked a month ago · 119 views
2 Answers

Hi There

You might find more info in the CloudWatch logs (see the bottom of your error message).

I have seen similar issues when trying to deploy other versions of LLaVA. In the meantime, here is a working notebook for deploying LLaVA 1.5 on SageMaker. I have this running in my own account and it works great.

https://github.com/aws-samples/multimodal-rag-on-slide-decks/blob/main/Blog1-TitanEmbeddings-LVM/notebooks/0_deploy_llava.ipynb

Matt-B (AWS EXPERT)
answered a month ago

I encountered similar issues trying to deploy https://huggingface.co/liuhaotian/llava-v1.5-7b/tree/main to a SageMaker endpoint. I'm not sure it works out of the box with SageMaker. However, I found https://huggingface.co/anymodality/llava-v1.5-7b, which is a fork of the model with explicit SageMaker support. That README gives instructions for downloading the model and includes a Jupyter notebook you can follow to deploy the endpoint.

answered 10 days ago
