Deploy a LLaVA model on AWS SageMaker using HuggingFaceModel

Describe the issue

import:

import sagemaker
from sagemaker.huggingface.model import HuggingFaceModel 

code:

sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()

hub = {
  'HF_MODEL_ID': "llava-hf/llava-1.5-7b-hf",  # model_id from hf.co/models
  'HF_TASK': 'image-to-text'                  # task used for predictions
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
   env=hub,                                                # configuration for loading model from Hub
   role=role,                                              # IAM role with permissions to create an endpoint
   transformers_version="4.26",                             # Transformers version used
   pytorch_version="1.13",                                  # PyTorch version used
   py_version='py39',                                      # Python version used
)
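For reference, the `llava` architecture was only added to transformers in v4.36, so a container pinned to 4.26 cannot resolve it (which matches the `KeyError`-style message below). A minimal sketch of pinning a newer DLC instead; the exact version strings are assumptions and should be checked against the published Hugging Face DLC list for your region:

```python
# Sketch: build HuggingFaceModel kwargs pinned to a LLaVA-capable DLC.
# Assumption: a newer inference DLC (e.g. transformers 4.37 / PyTorch 2.1 /
# py310) is available in your region -- verify before relying on it.

def llava_model_kwargs(role):
    """Return HuggingFaceModel kwargs for llava-1.5-7b on a newer DLC."""
    return dict(
        env={
            "HF_MODEL_ID": "llava-hf/llava-1.5-7b-hf",
            "HF_TASK": "image-to-text",
        },
        role=role,
        transformers_version="4.37",  # assumption: DLC line with LLaVA support
        pytorch_version="2.1",
        py_version="py310",
    )

# Usage (requires AWS credentials):
# from sagemaker.huggingface.model import HuggingFaceModel
# model = HuggingFaceModel(**llava_model_kwargs(role))
```

Note also that a 7B multimodal model is usually deployed on a GPU instance (e.g. the `ml.g5` family) rather than `ml.m5.4xlarge`.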


deploy model to SageMaker Inference:

predictor = huggingface_model.deploy(
   initial_instance_count=1,
   instance_type="ml.m5.4xlarge"
)

request:

data_url = {'inputs': 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg'}
predictor.predict(data_url)
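Unlike BLIP-style captioners, LLaVA generally needs a prompt in addition to the image. A hedged sketch of a fuller payload, assuming the endpoint runs a transformers version whose image-to-text pipeline accepts a `prompt` parameter and LLaVA-1.5's `USER: <image> ... ASSISTANT:` template:

```python
# Sketch of a LLaVA-style request payload. The "parameters" block is passed
# through to the pipeline call by the Hugging Face inference toolkit; the
# prompt template is the one LLaVA-1.5 was trained with.

def build_llava_payload(image_url, question, max_new_tokens=200):
    """Build an image-to-text payload carrying a LLaVA chat prompt."""
    prompt = "USER: <image>\n{} ASSISTANT:".format(question)
    return {
        "inputs": image_url,
        "parameters": {"prompt": prompt, "max_new_tokens": max_new_tokens},
    }

payload = build_llava_payload(
    "https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg",
    "What is shown in this image?",
)
# predictor.predict(payload)  # requires the deployed endpoint
```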

error:

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from model with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "\u0027llava\u0027"
}
". See https://ap-southeast-1.console.aws.amazon.com/cloudwatch/home?region=ap-southeast-1#logEventViewer:group=/aws/sagemaker/Endpoints/huggingface-pytorch-inference-2024-03-16-14-04-39-322 in account 601804951188 for more information.

Attempted solutions (consulted):

https://github.com/sungeuns/gen-ai-sagemaker/blob/main/MultiModal/02-llava-sagemaker-endpoint.ipynb
https://github.com/haotian-liu/LLaVA/issues/600
https://github.com/haotian-liu/LLaVA/issues/907


guo
asked a month ago · 141 views
2 answers

Hi there,

You might find more info in the CloudWatch logs (see the bottom of your error message).
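A quick sketch of pulling those log events with boto3; the log group name follows the standard SageMaker endpoint convention, and the endpoint name below is the one from the error message:

```python
# Sketch: fetch recent CloudWatch log events for a SageMaker endpoint.

def log_group_for(endpoint_name):
    """SageMaker endpoints write to this CloudWatch log group."""
    return "/aws/sagemaker/Endpoints/" + endpoint_name

def fetch_endpoint_logs(endpoint_name, region="ap-southeast-1", limit=50):
    """Return recent log messages for an endpoint (needs AWS credentials)."""
    import boto3  # AWS SDK; only needed when actually calling AWS
    logs = boto3.client("logs", region_name=region)
    resp = logs.filter_log_events(
        logGroupName=log_group_for(endpoint_name), limit=limit
    )
    return [event["message"] for event in resp["events"]]

# Usage (requires AWS credentials):
# for line in fetch_endpoint_logs(
#         "huggingface-pytorch-inference-2024-03-16-14-04-39-322"):
#     print(line)
```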

I have seen similar issues when trying to deploy other versions of LLaVA. In the meantime, here is a working notebook for deploying LLaVA 1.5 on SageMaker; I have it running in my own account and it works well.

https://github.com/aws-samples/multimodal-rag-on-slide-decks/blob/main/Blog1-TitanEmbeddings-LVM/notebooks/0_deploy_llava.ipynb

AWS
EXPERT
Matt-B
answered a month ago

I encountered similar issues trying to deploy https://huggingface.co/liuhaotian/llava-v1.5-7b/tree/main to a SageMaker endpoint; I'm not sure it works out of the box with SageMaker. However, I found https://huggingface.co/anymodality/llava-v1.5-7b, a fork of the model with explicit SageMaker support. Its README gives instructions for downloading the model and includes a Jupyter notebook you can follow to deploy the endpoint.
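If you go the repackaging route, the general SageMaker pattern (independent of that particular fork, so treat the file contents here as illustrative assumptions) is to ship a custom handler under `code/` inside `model.tar.gz`:

```python
# Sketch of the repackaging convention SageMaker's Hugging Face container
# uses: custom handlers live under code/ inside model.tar.gz. The paths are
# the SageMaker convention; the model_fn body is illustrative only.

def custom_code_paths():
    """Paths the inference toolkit scans inside the model archive."""
    return ["code/inference.py", "code/requirements.txt"]

# code/inference.py would override the default handlers, e.g.:
def model_fn(model_dir):
    """Load LLaVA with a transformers build that knows the architecture."""
    from transformers import AutoProcessor, LlavaForConditionalGeneration
    model = LlavaForConditionalGeneration.from_pretrained(model_dir)
    processor = AutoProcessor.from_pretrained(model_dir)
    return model, processor
```

`code/requirements.txt` is where you would pin a transformers version recent enough to include LLaVA (v4.36 or later).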

answered 21 days ago
