Deploy a LLaVA model on AWS SageMaker using HuggingFaceModel


Describe the issue

import:

import sagemaker
from sagemaker.huggingface.model import HuggingFaceModel 

code:

sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()

hub = {
  'HF_MODEL_ID': "llava-hf/llava-1.5-7b-hf",  # model_id from hf.co/models
  'HF_TASK': 'image-to-text'                  # pipeline task to use for predictions
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
   env=hub,                                                # configuration for loading model from Hub
   role=role,                                              # IAM role with permissions to create an endpoint
   transformers_version="4.26",                             # Transformers version used
   pytorch_version="1.13",                                  # PyTorch version used
   py_version='py39',                                      # Python version used
)


Deploy the model to SageMaker Inference:

predictor = huggingface_model.deploy(
   initial_instance_count=1,
   instance_type="ml.m5.4xlarge"
)

request:

data_url = {'inputs':'https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg'}
predictor.predict(data_url)
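For reference, when `predict()` is given a plain dict, the Hugging Face predictor JSON-serializes it by default, so the body the container receives can be inspected locally. A minimal sketch, assuming the default serializer; the richer `payload` shape below is hypothetical, based on the fact that `llava-hf` checkpoints are usually prompted with an `<image>` placeholder in the text rather than an image URL alone:

```python
import json

# Body produced by predict() for the dict above (the Hugging Face
# predictor JSON-serializes dict inputs by default):
data_url = {'inputs': 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg'}
body = json.dumps(data_url)

# Hypothetical richer payload: a handler that supports llava-hf
# checkpoints would likely expect an image plus a prompt containing
# an "<image>" placeholder, closer to this shape:
payload = {
    'inputs': {
        'image': 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg',
        'prompt': 'USER: <image>\nWhat is shown in this image? ASSISTANT:',
    }
}
print(json.dumps(payload))
```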

error:

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from model with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "\u0027llava\u0027"
}
". See https://ap-southeast-1.console.aws.amazon.com/cloudwatch/home?region=ap-southeast-1#logEventViewer:group=/aws/sagemaker/Endpoints/huggingface-pytorch-inference-2024-03-16-14-04-39-322 in account 601804951188 for more information.
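The `\u0027llava\u0027` in the message decodes to the string `'llava'`, which looks like the container raising a `KeyError` on the model type: transformers 4.26 (pinned above) predates LLaVA support, so the hub config's `model_type` is likely not in that release's auto-model mapping. A quick sketch of decoding the message (the JSON fragment is copied from the error above):

```python
import json

# JSON fragment returned by the endpoint (copied from the error above)
raw = '{"code": 400, "type": "InternalServerException", "message": "\\u0027llava\\u0027"}'
err = json.loads(raw)
print(err["message"])  # 'llava' -- the unrecognized model_type, quoted as in a KeyError
```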

Attempted solutions — consulted:

https://github.com/sungeuns/gen-ai-sagemaker/blob/main/MultiModal/02-llava-sagemaker-endpoint.ipynb
https://github.com/haotian-liu/LLaVA/issues/600
https://github.com/haotian-liu/LLaVA/issues/907


guo
asked a month ago · 141 views
2 Answers

Hi there,

You might find more information in the CloudWatch logs (see the link at the bottom of your error message).

I have seen similar issues when trying to deploy other versions of LLaVA. In the meantime, here is a working notebook for deploying LLaVA 1.5 on SageMaker. I have this running in my own account and it works great.

https://github.com/aws-samples/multimodal-rag-on-slide-decks/blob/main/Blog1-TitanEmbeddings-LVM/notebooks/0_deploy_llava.ipynb

AWS
EXPERT
Matt-B
answered a month ago

I encountered similar issues trying to deploy https://huggingface.co/liuhaotian/llava-v1.5-7b/tree/main to a SageMaker endpoint. I'm not sure it works out of the box with SageMaker. However, I found https://huggingface.co/anymodality/llava-v1.5-7b, which is a fork of the model with explicit SageMaker support. Its README gives instructions for downloading the model and includes a Jupyter notebook you can follow to deploy the endpoint.

answered 21 days ago
