Using Kendra as the RAG technique for a Q&A GenAI app, but the generated answers seem to be limited to the indexed documents only

0

Hi, I am experimenting with Kendra as the RAG technique for a Q&A GenAI app, as described in this blog: https://aws.amazon.com/blogs/machine-learning/quickly-build-high-accuracy-generative-ai-applications-on-enterprise-data-using-amazon-kendra-langchain-and-large-language-models/

After playing with the prompts, it does seem able to generate relevant answers based on the indexed documents (most of the time). However, it doesn't seem able to use information that is NOT in the indexed documents. For example, when I asked 'what is pokemon?', a topic that appears in none of the indexed documents, it generated garbage responses.

My question is: does this Kendra-RAG technique respond with information from the indexed documents ONLY? My understanding is that Kendra is there to supplement the external information. What do I need in order to create a Q&A bot that answers questions with internal AND external intel?

Do I use a vector store DB technique like this one: https://aws.amazon.com/blogs/machine-learning/question-answering-using-retrieval-augmented-generation-with-foundation-models-in-amazon-sagemaker-jumpstart/

Meaning I should not use Kendra at all? Thanks for the advice.

Clara
asked 9 months ago · 1723 views
5 Answers
1

Hello!

So this seems to be a prompt engineering problem. In the back end of this architecture, the workflow is orchestrated by an open-source Python library called LangChain. If you take a look at the Python code and the way LangChain orchestrates the RAG architecture, you will see a section called 'Prompt Template'. In this prompt template you can see the background prompt and where the "context" (excerpts and documents from Kendra) and the "question" (user input) go. It can look something like this:

prompt_template = """
Human: This is a friendly conversation between a human and an AI. The AI is talkative and provides specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

A: OK, got it, I'll be a talkative truthful AI assistant.

H: Here are a few documents in <documents> tags:
<documents>
{context}
</documents>
Based on the above documents, provide a detailed answer for, {question} Answer "don't know" if not present in the document.

A:"""

As you can see, the line "Answer "don't know" if not present in the document." is there as a safeguard to keep the model from hallucinating. BUT it also means that any query unrelated to the information in your indexed documents will be met with the answer "don't know". If you would like more general answers, you can remove that line, along with anything else in the prompt that tells the LLM to answer ONLY based on the documents provided. This will likely increase hallucinations, but it will also allow you to ask general questions.

I hope this helps! -Moh

mtahsin
answered 8 months ago
EXPERT
reviewed a month ago
0

I was exploring this topic recently and was recommended the following blog post to read more on this.

Link to blog post

AWS
Vijay
answered 9 months ago
  • Thank you, Vijay, for the pointer. I ran into errors when trying out the sample notebook. Were you able to import the sagemakerEmbedding library?

0

I was able to resolve this problem by increasing the temperature. As the temperature (creativity) increases, the model seems more willing to draw on its general knowledge beyond the indexed documents.

Clara
answered 9 months ago
  • I'm glad it is working.

0

You are right. Kendra is limited to the knowledge in the documents it has indexed. This is why you will often see chatbots and Q&A systems use an LLM as well, to provide more human-like answers. Please see this example. It describes a sample chat that uses both Kendra and an LLM. The LLM is from SageMaker JumpStart, but you can modify the code to work with Bedrock.

https://github.com/aws-samples/generative-ai-on-aws-immersion-day/blob/main/lab4/rag-lab.ipynb

Michael
answered 8 months ago
0

If you want to get answers based on the context, your prompt (in Python) might be something like this:

f"""{context}
Answer the following question based on the context above:
{question}"""

However, if you want to ask a question without Retrieval Augmented Generation, then your prompt would just be the question itself:

f"{question}"
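The two prompts above can be combined into one helper that falls back to the bare question when there is nothing to put in the context. This is only a sketch: `make_prompt` is a hypothetical name, and it assumes your retriever hands back a list of excerpt strings that is empty when nothing matches, which may not hold in practice (a retriever can return weakly related excerpts instead).

```python
from typing import List


def make_prompt(question: str, excerpts: List[str]) -> str:
    """Build a RAG prompt when excerpts exist, else return the bare question."""
    # Drop empty or whitespace-only excerpts.
    usable = [e.strip() for e in excerpts if e and e.strip()]
    if not usable:
        # No context available: ask the model directly, without RAG.
        return f"{question}"
    context = "\n\n".join(usable)
    return f"""{context}
Answer the following question based on the context above:
{question}"""
```

With this shape, `make_prompt("what is pokemon?", [])` sends only the question to the model, while a non-empty excerpt list yields the context-grounded prompt.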
AWS
answered 8 months ago
