An error occurred (ModelError) when calling the InvokeEndpoint operation (llama-2-7b jumpstart)

I started an endpoint hosting llama-2-7b. For the last few months my code worked fine, and the endpoint could be called with a payload in this format:
{ "inputs": [ [ {"role": "user", "content": "what is the recipe of mayonnaise?"}, ] ], "parameters": {"max_new_tokens": 512, "top_p": 0.9, "temperature": 0.6}, }
See the details here: https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/jumpstart-foundation-models/llama-2-chat-completion.ipynb

But yesterday, when I tried to use it again, the call failed with: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (422) from primary with message "Failed to deserialize the JSON body into the target type: inputs: invalid type: sequence, expected a string at line 1 column 11."

I could not find a workaround. Is this caused by the payload format or by something else? Thank you.

Vincent
asked 5 months ago, 566 views
1 Answer

The issue is resolved. It was caused by the payload format.

The previous payload, shown below, no longer works:
payload = { "inputs": [[ {"role": "system", "content": payload_system}, {"role": "user", "content": result}, ]], "parameters": {"max_new_tokens": 512, "top_p": 0.5, "temperature": 0.5} }
The "inputs" value must instead be a single string in the Llama-2 prompt format:
<s>[INST] <<SYS>>
{{ system_prompt }}
<</SYS>>

{{ user_message }} [/INST]
See the details of the Llama-2 format at https://huggingface.co/blog/llama2#how-to-prompt-llama-2
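As a minimal sketch of the template above for one system prompt and one user message, the string can be assembled as follows. The helper name build_single_turn_prompt and the example system prompt are hypothetical and only serve as illustration; the parameter values are the ones from the question.

```python
def build_single_turn_prompt(system_prompt: str, user_message: str) -> str:
    """Assemble a single-turn Llama-2 chat prompt as one string."""
    return (
        "<s>[INST] <<SYS>>\n"
        f"{system_prompt}\n"
        "<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )


# "inputs" is now one string instead of a nested list of role/content dicts.
payload = {
    "inputs": build_single_turn_prompt(
        "You are a helpful cooking assistant.",  # hypothetical system prompt
        "what is the recipe of mayonnaise?",
    ),
    "parameters": {"max_new_tokens": 512, "top_p": 0.9, "temperature": 0.6},
}
```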

The following function converts the "inputs" message list above into the correct string form.

from typing import Dict, List


def format_messages(messages: List[Dict[str, str]]) -> str:
    """Format messages for Llama-2 chat models.

    The model only supports 'system', 'user' and 'assistant' roles, optionally starting with 'system',
    then 'user' and alternating (u/a/u/a/u...). The last message must be from 'user'.
    """
    prompt: List[str] = []

    # Fold an optional system message into the first user message.
    if messages[0]["role"] == "system":
        content = "".join(["<<SYS>>\n", messages[0]["content"], "\n<</SYS>>\n\n", messages[1]["content"]])
        messages = [{"role": messages[1]["role"], "content": content}] + messages[2:]

    # Emit each completed user/assistant exchange as its own <s>...</s> segment.
    for user, answer in zip(messages[::2], messages[1::2]):
        prompt.extend(["<s>", "[INST] ", (user["content"]).strip(), " [/INST] ", (answer["content"]).strip(), "</s>"])

    # Leave the final user message open so the model generates the reply.
    prompt.extend(["<s>", "[INST] ", (messages[-1]["content"]).strip(), " [/INST] "])

    return "".join(prompt)
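Once the prompt is a single string, it can be wrapped in the payload and sent to the endpoint. The sketch below assumes boto3 is installed and AWS credentials are configured. The names build_payload and invoke_llama and the endpoint name in the usage comment are hypothetical, and the response is returned as parsed JSON without assuming its exact shape.

```python
import json
from typing import Any, Dict


def build_payload(prompt: str, max_new_tokens: int = 512,
                  top_p: float = 0.5, temperature: float = 0.5) -> Dict[str, Any]:
    """Wrap an already formatted prompt string in the request payload."""
    if not isinstance(prompt, str):
        # The 422 error in the question appears when "inputs" is a list, so fail early.
        raise TypeError("prompt must be a single string, not a list of messages")
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "top_p": top_p,
            "temperature": temperature,
        },
    }


def invoke_llama(endpoint_name: str, payload: Dict[str, Any]) -> Any:
    """Send the payload to a SageMaker endpoint and return the parsed JSON body."""
    import boto3  # imported here so build_payload can be used without boto3

    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload).encode("utf-8"),
    )
    return json.loads(response["Body"].read())


# Usage (hypothetical endpoint name):
# prompt = format_messages(messages)
# result = invoke_llama("my-llama-2-7b-endpoint", build_payload(prompt))
```

Some JumpStart Llama 2 model versions may also expect CustomAttributes="accept_eula=true" on the invoke_endpoint call; whether yours does depends on the model version you deployed.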
Vincent
answered 5 months ago
