Meta Llama 3 8B results differ when invoked from Lambda vs. the Bedrock Playground and other test environments

Hey AWS folks! I'm testing out one of the Bedrock foundation models. Specifically, I'm trying to invoke the Meta Llama 3 8B model from a Lambda function. Throughout my personal testing with this model on the llama2.ai website (which now runs Llama 3 8B by default) and in the Bedrock Playground, I've gotten pretty consistent and exciting results. So, I assumed it would probably be the winner when I baked an auto-regressive LLM into my business use case. However, when I tested this model from a Lambda function with identical model parameters, I got significantly different results. When run from the Lambda function, the results often include superfluous text that's unwanted from both technical and cost standpoints. Below are my model parameters (excluding the prompt, which you'll have to trust me is the same in both cases :) ):

max_gen_len = 512
temperature = 0.75
top_p = 0.9

I understand that, given the nature of LLMs and the temperature I've set, there will be variability from one model execution to the next. However, the superfluous content in the results (for example: long sequences of unnecessary pipe characters, lengthy prose before and after the result I actually want, and other content that doesn't make sense given the prompt) is a real problem, and it goes well beyond the variability I've seen in my testing with the model in the Bedrock Playground and outside of AWS.

Any ideas as to what might be happening here?

Asked 3 months ago, viewed 361 times
3 Answers
Hello,

It seems the issue might be due to differences in hidden system prompts or configurations between Lambda and the Bedrock Playground. Double-check that both environments are using identical prompt setups, including any system prompts or hidden parameters that might be influencing the output. For detailed guidance on how system prompts can impact results, you can refer to this article: https://repost.aws/articles/AR-LV1HoR_S0m-qy89wXwHmw/the-leverage-of-llm-system-prompt-by-knowledge-bases-for-bedrock-in-rag-workflows
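To make the system-prompt point concrete, here is a minimal Python sketch of how an optional system message would change the text the model actually receives. It assumes the Llama 3 Instruct chat template published by Meta; the helper name `format_llama3_prompt` and the sample system prompt are hypothetical, and whether the Playground or any third-party site adds a system prompt at all is not something this thread confirms.

```python
def format_llama3_prompt(user_prompt, system_prompt=None):
    """Build a single-turn Llama 3 Instruct prompt (hypothetical helper).

    Assumes Meta's published chat template: each message is wrapped in
    header tokens and terminated with <|eot_id|>.
    """
    parts = ["<|begin_of_text|>"]
    if system_prompt:
        # A hidden system message would be inserted here, before the user turn.
        parts.append(
            f"<|start_header_id|>system<|end_header_id|>\n\n{system_prompt}<|eot_id|>"
        )
    parts.append(
        f"<|start_header_id|>user<|end_header_id|>\n\n{user_prompt}<|eot_id|>"
    )
    # End with an open assistant header so the model writes the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


# Same user prompt, with and without an (illustrative) system prompt.
plain = format_llama3_prompt("Summarize this ticket in one sentence.")
guided = format_llama3_prompt(
    "Summarize this ticket in one sentence.",
    system_prompt="You are a concise assistant. Reply with the answer only.",
)
```

Comparing `plain` and `guided` shows that two callers can send the "same" prompt and the same sampling parameters while the model still sees different input.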

Answered 3 months ago by an Expert; reviewed 3 months ago by an Expert
Hi,

Are you sure that you are prompting in exactly the same way from Lambda and on the Meta website? Are you sure, for example, that Meta doesn't include a system prompt that you don't see but that provides guidance to the LLM?

See my article for a sense of how much a system prompt with very detailed guidance can impact results: https://repost.aws/articles/AR-LV1HoR_S0m-qy89wXwHmw/the-leverage-of-llm-system-prompt-by-knowledge-bases-for-bedrock-in-rag-workflows

Best,

Didier

Answered 3 months ago by an AWS Expert; reviewed 3 months ago by two Experts
Accepted Answer

Yes, I'm sure my prompt is the same across the places where I'm executing the model. I figured it out. Hopefully, this helps the next guy who runs into this. Simply put, for this model to work properly when invoked from a Lambda function, the prompt needs to be nested inside Llama 3's instruction-format tokens, as in the Python example below. Without this, the model can produce erratic results. A full code example can be found here (https://docs.aws.amazon.com/bedrock/latest/userguide/bedrock-runtime_example_bedrock-runtime_InvokeModel_MetaLlama3_section.html).

AWS folks - It's worth noting that part of my confusion here stemmed from the fact that the Bedrock documentation in the AWS console has an "API Request" section at the bottom of each foundation model's page. In the Meta Llama 3 8B case, at least, that section was sort of misleading. That is, if you want to run the model successfully, you need more than the set of parameters listed in Bedrock for the FM: the prompt itself also has to be wrapped in the model's instruction format.

Parting thoughts: I'm guessing that both the Bedrock Playground and the llama2.ai website programmatically format user prompts as below. That would explain the discrepancy.

# Embed the prompt in Llama 3's instruction format.
formatted_prompt = f"""
<|begin_of_text|>
<|start_header_id|>user<|end_header_id|>
{prompt}
<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
"""
Answered 3 months ago