ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{ "code": 400, .....


Hi. Problem context: I am trying to deploy a fine-tuned LLM (Falcon-7B) using SageMaker. Below is the code from my SageMaker notebook:

from sagemaker.huggingface.model import HuggingFaceModel
import sagemaker
import boto3
import json

# Define your SageMaker role
try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='Sagemaker-ExecutionRole')['Role']['Arn']

print(f"sagemaker role arn: {role}")

    
trust_remote_code = True  # note: defined here but never passed to the model or container

# Hub model configuration <https://huggingface.co/models>
hub = {
    'HF_MODEL_ID': 'tdicommons/falcon_28_06_23',  # model_id from hf.co/models
    # 'HF_TASK': 'text-generation',               # NLP task you want to use for predictions
    'SM_NUM_GPUS': json.dumps(1),
    'HF_API_TOKEN': "hf_yclNrVDnzcDjAgYZffDJFukzKSr*********",
    'model_type': 'RefinedWebModel',
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    env=hub,                        # configuration for loading model from Hub
    role=role,                      # IAM role with permissions to create an endpoint
    transformers_version="4.10.2",  # Transformers version used
    pytorch_version="1.9.0",        # PyTorch version used
    py_version='py38',              # Python version used
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    container_startup_health_check_timeout=300,
)

# example request: you always need to define "inputs"
# prompt = f"""
# : hey how are you?
# :
# """.strip()

# hyperparameters for llm: https://huggingface.co/blog/sagemaker-huggingface-llm#4-run-inference-and-chat-with-our-model
# payload = {
#     "inputs": prompt,
# }
# response = predictor.predict(payload)

# send request to endpoint
predictor.predict({
    "inputs": "hey how are you?",
})
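For context on the error in the title: predictor.predict() is a thin wrapper around the SageMaker runtime InvokeEndpoint API, so the same 400 ModelError can be reproduced with plain boto3. A minimal sketch (the endpoint name comes from the predictor object; everything else here is standard boto3 usage):

import boto3
import json

# predictor.predict() ultimately calls InvokeEndpoint; calling it directly
# shows where the ModelError in the title is raised.
runtime = boto3.client("sagemaker-runtime")

response = runtime.invoke_endpoint(
    EndpointName=predictor.endpoint_name,  # or the name shown in the SageMaker console
    ContentType="application/json",
    Body=json.dumps({"inputs": "hey how are you?"}),
)
print(response["Body"].read().decode("utf-8"))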
============================================================================
When I check CloudWatch, I see the following:

2023-08-26 13:45:06,652 [INFO ] W-tdicommons__falcon_28_06_-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - KeyError: 'RefinedWebModel'
2023-08-26 13:45:06,653 [INFO ] W-tdicommons__falcon_28_06_-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle -     raise PredictionException(str(e),
2023-08-26 13:45:06,653 [INFO ] W-tdicommons__falcon_28_06_-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - mms.service.PredictionException: 'RefinedWebModel' : 400
2023-08-26 13:49:07,830 [INFO ] pool-1-thread-3 ACCESS_LOG - /169.254.178.2:57684 "GET /ping HTTP/1.1" 200
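The KeyError in these logs points at transformers' auto-config lookup: the container's transformers version (4.10.2 above) predates Falcon, so the model_type "RefinedWebModel" is missing from its config registry. A minimal sketch that reproduces the lookup failure (this assumes the internal CONFIG_MAPPING dict, which is what AutoConfig consults; it is not a public API):

# On a transformers version that predates Falcon support, the auto-config
# registry has no entry for "RefinedWebModel", so the lookup raises the
# same KeyError that MMS wraps into PredictionException ... : 400.
from transformers.models.auto.configuration_auto import CONFIG_MAPPING

try:
    CONFIG_MAPPING["RefinedWebModel"]
except KeyError as e:
    print(f"KeyError: {e}")  # KeyError: 'RefinedWebModel'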

My questions:

1. Am I passing something incorrectly in the hub variable, i.e. hub = { 'HF_MODEL_ID': ...[my fine-tuned model], 'HF_TASK': 'text-generation', ... }?
2. Why can I run inference with this model on SageMaker without issue, but get this error as soon as I deploy the same model behind an endpoint?
3. Is my way of passing input to the model wrong (see the payload sketch after this list)?

predictor.predict({
    "inputs": "hey how are you",
})
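Regarding question 3: the request shape itself matches what the blog post linked in the code comments uses; the payload there additionally carries a "parameters" dict. A sketch of that shape (the generation parameter values below are placeholders, not tuned settings):

# Payload shape from the linked Hugging Face LLM blog post; the parameter
# values here are placeholders for illustration only.
payload = {
    "inputs": "hey how are you?",
    "parameters": {
        "max_new_tokens": 64,
        "temperature": 0.7,
        "do_sample": True,
    },
}
response = predictor.predict(payload)
print(response)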