Bedrock llama2-13b-chat-v1 model throwing InvokeModel operation error


I'm getting an error while accessing the Bedrock chat model:

raise ValueError(f"Error raised by bedrock service: {e}") ValueError: Error raised by bedrock service: An error occurred (ValidationException) when calling the InvokeModel operation: Malformed input request: 2 schema violations found, please reformat your input and try again.

I'm using the below versions:

boto3 == 1.29.3
langchain == 0.0.338

Below is the code:

def build_chain():
    region = os.environ["AWS_REGION"]
    kendra_index_id = os.environ["KENDRA_INDEX_ID"]
    credentials_profile_name = os.environ["AWS_PROFILE"]

    model_id = "meta.llama2-13b-chat-v1"

    llm = Bedrock(
        credentials_profile_name=credentials_profile_name,
        region_name=region,
        model_id=model_id,
    )

    retriever = AmazonKendraRetriever(index_id=kendra_index_id, top_k=3, region_name=region)

    prompt_template = """
Human: This is a friendly conversation between a human and an AI. The AI is talkative and provides specific details from its context but limits it to 240 tokens. If the AI does not know the answer to a question, it truthfully says it does not know.

Assistant: OK, got it, I'll be a talkative truthful AI assistant.

Human: Here are a few documents in <documents> tags: <documents> {context} </documents> Based on the above documents, provide a detailed answer for, {question} Answer "don't know" if not present in the document.

Assistant: """
    PROMPT = PromptTemplate(template=prompt_template, input_variables=["context", "question"])

    condense_qa_template = """Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question.

Chat History: {chat_history}
Follow Up Input: {question}
Standalone question:"""
    standalone_question_prompt = PromptTemplate.from_template(condense_qa_template)

    qa = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=retriever,
        condense_question_prompt=standalone_question_prompt,
        return_source_documents=True,
        combine_docs_chain_kwargs={"prompt": PROMPT},
    )

    return qa

def run_chain(chain, prompt: str, history=[]):
    return chain({"question": prompt, "chat_history": history})

if __name__ == "__main__":
    chat_history = []
    qa = build_chain()
    query = "what is sagemaker"
    result = run_chain(qa, query, chat_history)

What could be the solution for this?

1 Answer


You may want to start from an existing code sample like this one to get the right invocation structure in place:

import json, boto3

region = "us-east-1"
client = boto3.client("bedrock-runtime", region)
prompt = "write a poem"
body = {
    "prompt": prompt,
    "temperature": 0.5,
    "top_p": 0.9,
    "max_gen_len": 512,
}
response = client.invoke_model(
    modelId="meta.llama2-13b-chat-v1", body=json.dumps(body)
)

response_body = json.loads(response["body"].read())
completion = response_body["generation"]

And then incrementally adapt it to your own needs in a few iterations, keeping it working at all times.
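If the goal is to keep the LangChain setup from the question, this class of "schema violation" error usually means the request body sent to the model does not match the fields Llama 2 expects. A minimal sketch below builds the body exactly as in the snippet above; the commented LangChain wiring via the wrapper's `model_kwargs` parameter is illustrative and not verified against a live endpoint:

```python
import json

# Assumption: Llama 2 chat models on Bedrock accept exactly these body
# fields; unexpected or missing keys trigger "Malformed input request".
llama2_model_kwargs = {
    "temperature": 0.5,
    "top_p": 0.9,
    "max_gen_len": 512,
}

def build_llama2_body(prompt: str, **overrides) -> str:
    """Serialize a request body matching the Llama 2 schema shown above."""
    body = {"prompt": prompt, **llama2_model_kwargs, **overrides}
    return json.dumps(body)

# Hypothetical wiring for the question's build_chain(): forward the same
# kwargs through the LangChain wrapper so it sends an accepted body.
# llm = Bedrock(
#     credentials_profile_name=credentials_profile_name,
#     region_name=region,
#     model_id="meta.llama2-13b-chat-v1",
#     model_kwargs=llama2_model_kwargs,
# )
```

Checking `json.dumps(body)` locally against the schema in the working boto3 sample is a quick way to see which of the two reported violations remain.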



AWS
answered 2 months ago
  • How can I do it with LangChain? Or, if I invoke the model with boto3, how would I pass the Kendra index for the context?
