Issue with metadata filtering in Amazon Bedrock


I am experiencing an issue with metadata filtering in Amazon Bedrock. I set up a new collection and index in Amazon OpenSearch Serverless and created a knowledge base in Amazon Bedrock. The collection type is "vectorsearch" and the index uses the "faiss" engine. The knowledge base uses a fixed-size chunking strategy with a chunk size of 1024 and is configured to use that OpenSearch collection.

I uploaded sample PDF files and their corresponding metadata files to an S3 bucket. The metadata files have the following format:

{ "metadataAttributes": { "caseId": "2" } }
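To make the file format above concrete, here is a minimal Python sketch that builds the metadata sidecar for a document. The helper names (`sidecar_key`, `build_metadata`) and the sample S3 key are hypothetical, and the `<file>.metadata.json` naming is my understanding of the Bedrock convention, so verify it against the documentation. The point of interest is that "2" and 2 serialize to different JSON types.

```python
import json


def sidecar_key(document_key: str) -> str:
    """Return the assumed metadata file name for a source document in S3."""
    return f"{document_key}.metadata.json"


def build_metadata(attributes: dict) -> str:
    """Serialize attributes into the {"metadataAttributes": {...}} wrapper."""
    return json.dumps({"metadataAttributes": attributes})


# A string value keeps its quotes in the JSON output; a number does not.
as_string = build_metadata({"caseId": "2"})
as_number = build_metadata({"caseId": 2})

print(sidecar_key("cases/report.pdf"))
print(as_string)
print(as_number)
```

Whichever form is written to the file, the type declared for the field in the vector index and the type used in the filter presumably need to agree with it.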

I followed the documentation to configure the data source and sync it to the knowledge base. However, in the "Test knowledge base" module of the Bedrock console, when I try to filter documents by a metadata attribute (e.g., caseId = "2"), I receive the following error:

"failed to create query: Rewrite first".

If I remove the quotation marks around "2", that error no longer appears, but the chatbot responds with:

"Sorry, I am unable to assist you with this request."

I have ensured that:

  1. The metadata files are correctly named and formatted.
  2. The metadata files are in the same folder as their corresponding PDF files in S3.
  3. The data source is synced in the Bedrock knowledge base.

Despite these steps, the issue persists.

Does anyone have any insight into how to fix this issue?

1 Answer
Accepted Answer

Hello.

According to the documentation below, I don't think metadata values of the "Number" type need double quotes.
https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-ds.html#kb-ds-metadata

{ "metadataAttributes": { "caseId": 2 } }

So, after changing the metadata as shown above, omit the double quotes around the value when filtering from the console, for example caseId = 2.
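For anyone filtering through the API rather than the console, the same filter might look like the sketch below. It assumes the boto3 `bedrock-agent-runtime` client and its `retrieve` operation; the knowledge base ID and query text are placeholders you would supply, and `build_retrieval_config` and `retrieve_for_case` are hypothetical helpers. The value is passed as a Python int so that it is sent as a JSON number, not a string.

```python
def build_retrieval_config(key: str, value, top_k: int = 5) -> dict:
    """Build a retrievalConfiguration dict with an equality metadata filter."""
    return {
        "vectorSearchConfiguration": {
            "numberOfResults": top_k,
            "filter": {"equals": {"key": key, "value": value}},
        }
    }


def retrieve_for_case(knowledge_base_id: str, query: str, case_id: int) -> list:
    """Query the knowledge base, returning only chunks whose caseId matches."""
    import boto3  # imported here so the config helper works without boto3

    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": query},
        retrievalConfiguration=build_retrieval_config("caseId", case_id),
    )
    return response["retrievalResults"]


# Building the config alone needs no AWS credentials.
config = build_retrieval_config("caseId", 2)
print(config)
```

Keeping the filter construction separate from the network call makes it easy to inspect exactly what type is being sent before querying.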

EXPERT
answered a month ago
EXPERT
reviewed 25 days ago
  • Hello,

    Thank you very much for your response. I apologize for not making the formatting of my question clearer. I did try removing the quotes around the "2". When I do this, the LLM chatbot responds with, "Sorry, I am unable to assist you with this request."

    My hypothesis is that when I remove the quotes, Bedrock does not retrieve the documents whose metadata has case = 2, because I set the data type of case to string when I set up the metadata fields in the vector index. I would expect an unquoted 2 to be treated as a number rather than a string, though that may not be what happens here.

    Furthermore, please take a look at this documentation page: https://docs.aws.amazon.com/bedrock/latest/userguide/kb-test-config.html. In the "note" box underneath the "logical operators" table, the documentation states, "You must surround strings with quotation marks."

    If I set the data type of case to string, should I leave the quotation marks out?

    Thank you for your help,

    Jordan

  • I deleted the collection, index, and knowledge base. I changed the metadata JSON to the format { "metadataAttributes": { "caseId": 2 } }, without the double quotes around the value. When recreating the metadata fields in the index, I made sure to set the case data type to integer. I think this was the determining factor. Now, when I set the metadata filter to case = 2 while testing the knowledge base, the query returns a response with the correctly filtered documents! Thank you!
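The fix described above comes down to keeping the value type consistent between the metadata (or filter) and the index field. A rough Python sketch of that check follows; the function names, the type labels ("string", "integer", and so on), and the use of caseId as the field name are all illustrative assumptions, not part of Bedrock or OpenSearch.

```python
def json_type_label(value) -> str:
    """Map a parsed JSON value to a coarse type label."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    return type(value).__name__


def find_type_mismatches(attributes: dict, index_types: dict) -> dict:
    """Return {field: (value_type, index_type)} for fields that disagree."""
    mismatches = {}
    for field, value in attributes.items():
        declared = index_types.get(field)
        actual = json_type_label(value)
        if declared is not None and declared != actual:
            mismatches[field] = (actual, declared)
    return mismatches


# Unquoted value 2 against a field declared as string in the index.
before = find_type_mismatches({"caseId": 2}, {"caseId": "string"})
# Numeric value 2 against a field declared as integer in the index.
after = find_type_mismatches({"caseId": 2}, {"caseId": "integer"})
print(before)
print(after)
```

Running a check like this on the metadata files before syncing could catch the mismatch without having to rebuild the collection to find it.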
