An error occurred (ModelError) when calling the InvokeEndpoint operation (llama-2-7b jumpstart)


I started an endpoint hosting llama-2-7b. Over the last few months my code worked fine, and the endpoint could be called with a payload like this: { "inputs": [ [ {"role": "user", "content": "what is the recipe of mayonnaise?"}, ] ], "parameters": {"max_new_tokens": 512, "top_p": 0.9, "temperature": 0.6}, }
See details here https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/jumpstart-foundation-models/llama-2-chat-completion.ipynb.

But yesterday, when I tried to use it again, I got this error: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (422) from primary with message "Failed to deserialize the JSON body into the target type: inputs: invalid type: sequence, expected a string at line 1 column 11".

I could not find a workaround. Is this caused by the payload format or by some other issue? Thank you.

Vincent
asked 5 months ago, 510 views
1 Answer

The issue was resolved. It was caused by the payload format.

The previous payload, shown below, no longer works:

payload = { "inputs": [[ {"role": "system", "content": payload_system}, {"role": "user", "content": result}, ]], "parameters": {"max_new_tokens": 512, "top_p": 0.5, "temperature": 0.5} }

As the error message says, "inputs" must now be a single string rather than a list of messages. The string has to follow the Llama-2 prompt format:

<s>[INST] <<SYS>>
{{ system_prompt }}
<</SYS>>

{{ user_message }} [/INST]

See the details of the Llama-2 format at https://huggingface.co/blog/llama2#how-to-prompt-llama-2.

The following function transforms the "inputs" above into the correct form.

from typing import Dict, List


def format_messages(messages: List[Dict[str, str]]) -> str:
    """Format messages for Llama-2 chat models.

    The model only supports 'system', 'user' and 'assistant' roles, starting with 'system', then 'user' and
    alternating (u/a/u/a/u...). The last message must be from 'user'.
    """
    prompt: List[str] = []

    # Fold an optional leading system message into the first user message.
    if messages[0]["role"] == "system":
        content = "".join(["<<SYS>>\n", messages[0]["content"], "\n<</SYS>>\n\n", messages[1]["content"]])
        messages = [{"role": messages[1]["role"], "content": content}] + messages[2:]

    # Emit each completed (user, assistant) exchange as one <s>...</s> segment.
    for user, answer in zip(messages[::2], messages[1::2]):
        prompt.extend(["<s>", "[INST] ", (user["content"]).strip(), " [/INST] ", (answer["content"]).strip(), "</s>"])

    # The final user message is left open so the model generates the reply.
    prompt.extend(["<s>", "[INST] ", (messages[-1]["content"]).strip(), " [/INST] "])

    # Return a single string, which is what the endpoint expects for "inputs".
    return "".join(prompt)
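For completeness, here is a minimal sketch of how the formatted string could be wrapped in the new payload and sent to the endpoint. The helper names build_payload and invoke, the example prompt text, and the endpoint name are my own assumptions for illustration; the boto3 call is the standard sagemaker-runtime invoke_endpoint, but the shape of the response may differ between model versions, so the sketch returns the parsed JSON as-is.

```python
import json
from typing import Any, Dict


def build_payload(prompt: str, max_new_tokens: int = 512,
                  top_p: float = 0.9, temperature: float = 0.6) -> Dict[str, Any]:
    """Wrap an already formatted Llama-2 prompt string in the request body."""
    # The 422 error above is raised when "inputs" is a list, so fail early here.
    if not isinstance(prompt, str):
        raise TypeError("'inputs' must be a single string, not a list of messages")
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "top_p": top_p,
            "temperature": temperature,
        },
    }


def invoke(endpoint_name: str, payload: Dict[str, Any]) -> Any:
    """Send the payload to a SageMaker endpoint and return the parsed JSON body."""
    import boto3  # imported lazily so the payload helper works without AWS set up

    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    return json.loads(response["Body"].read())


# Example single-turn prompt in the Llama-2 format (illustrative text only).
prompt = (
    "<s>[INST] <<SYS>>\nYou are a helpful cook.\n<</SYS>>\n\n"
    "What is the recipe of mayonnaise? [/INST]"
)
payload = build_payload(prompt)
body = json.dumps(payload)

# Hypothetical endpoint name; replace with your own before uncommenting.
# result = invoke("my-llama-2-7b-endpoint", payload)
```

In practice the prompt string would come from format_messages(...) above rather than being written by hand.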
Vincent
answered 5 months ago
