Bedrock Data Sources Mixed Up

I am experiencing an issue with Amazon Bedrock Knowledge Base data source filtering. When I query my Knowledge Base and apply a filter to restrict results to a single data source, I am receiving documents from other data sources. This occurs both in my application (using the Bedrock API) and in the Bedrock Console’s “Test” playground.

Filter Used: { equals: { key: 'x-amz-bedrock-kb-data-source-id', value: 'DATA-SOURCE-ID' } }

How tested: In my app, I send the above filter in the RetrieveAndGenerateCommand. In the Bedrock Console “Test” playground, I set a manual filter with Attribute: x-amz-bedrock-kb-data-source-id and Value: DATA-SOURCE-ID.
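
For reference, here is a minimal sketch of how the request described above might be assembled. The function name `buildRetrieveAndGenerateInput` and its argument names are hypothetical; the nesting follows the RetrieveAndGenerate request shape as I understand it, so check it against the current API reference before relying on it.

```javascript
// Built-in metadata key used in the question's filter.
const DATA_SOURCE_KEY = "x-amz-bedrock-kb-data-source-id";

// Hypothetical helper: builds the input object for RetrieveAndGenerateCommand,
// restricted to a single data source. All argument values are placeholders.
function buildRetrieveAndGenerateInput({ knowledgeBaseId, modelArn, dataSourceId, query }) {
  return {
    input: { text: query },
    retrieveAndGenerateConfiguration: {
      type: "KNOWLEDGE_BASE",
      knowledgeBaseConfiguration: {
        knowledgeBaseId,
        modelArn,
        retrievalConfiguration: {
          vectorSearchConfiguration: {
            // Same filter as in the question: exact match on the data source ID.
            filter: { equals: { key: DATA_SOURCE_KEY, value: dataSourceId } },
          },
        },
      },
    },
  };
}
```

The returned object would be passed to `new RetrieveAndGenerateCommand(...)` from `@aws-sdk/client-bedrock-agent-runtime`. Logging it just before sending makes it easy to confirm that the filter reaches Bedrock unchanged.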

The Knowledge Base was created a few days ago and the three data sources were then freshly ingested and synced. The number of documents in each data source matches what I expect, and the S3 prefixes are non-overlapping.

What I’ve Checked:
• My application code is correct and uses only dynamic data source IDs fetched from Bedrock.
• There is no static mapping or client-side filtering.
• The filter sent to Bedrock is correct and matches the documentation.
• S3 prefixes for each data source are non-overlapping and correct.
• The issue is reproducible in the AWS Console, not just my app.

I previously had a knowledge base with the same three data sources. Is it possible there's some remnant or caching issue in Bedrock's back end?

1 Answer

Based on what you've described, you're seeing unexpected behavior with Amazon Bedrock Knowledge Base data source filtering. Documents from other data sources should not appear when you filter on a specific data source ID, so this points to a potential problem with the Knowledge Base or its underlying data.

Given that you've thoroughly checked your application code, ensured correct filter usage, and verified non-overlapping S3 prefixes, the issue likely lies within the Bedrock service itself. The fact that you can reproduce this in both your application and the Bedrock Console's "Test" playground further supports this conclusion.

Your suggestion about a remnant or caching issue in Bedrock's backend is plausible, especially since you previously had a knowledge base with the same three data sources. Bedrock should keep knowledge bases and their data sources clearly separated, but residual data or index information could be causing this behavior.

To address this issue, you could try the following steps:

  1. Perform a full re-sync of all data sources in your current knowledge base. This might help refresh the index and clear any potential caching issues.

  2. If the problem persists, consider creating a new knowledge base from scratch and ingesting your data sources into it. This could help eliminate any potential issues related to the previous knowledge base configuration.

  3. Double-check that all your data sources are properly synced and that there are no ongoing or failed sync jobs that might be affecting the results.

  4. If possible, try creating a small test knowledge base with a subset of your data to see if the issue reproduces in a fresh environment.
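
For the sync check in step 3, a small helper like the one below could flag data sources whose most recent ingestion job has not finished cleanly. This is only a sketch: `findUnsettledJobs` is a hypothetical name, and it assumes job summaries shaped like the ones `ListIngestionJobsCommand` (from `@aws-sdk/client-bedrock-agent`) returns, with `dataSourceId`, `status`, and `startedAt` fields. The status values are my understanding of the ingestion job status enum.

```javascript
// Statuses that mean the latest sync did not complete cleanly (assumed enum values).
const UNSETTLED_STATUSES = new Set(["STARTING", "IN_PROGRESS", "FAILED", "STOPPING", "STOPPED"]);

// Hypothetical helper: given ingestion job summaries for a knowledge base,
// keep only the most recent job per data source and return those that are unsettled.
function findUnsettledJobs(ingestionJobSummaries) {
  const latestBySource = new Map();
  for (const job of ingestionJobSummaries) {
    const previous = latestBySource.get(job.dataSourceId);
    if (!previous || new Date(job.startedAt) > new Date(previous.startedAt)) {
      latestBySource.set(job.dataSourceId, job);
    }
  }
  return [...latestBySource.values()].filter((job) => UNSETTLED_STATUSES.has(job.status));
}
```

An empty result means every data source's latest job reported `COMPLETE`. It does not prove the index is consistent, but it rules out an in-flight or failed sync as the cause.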

If none of these steps resolves the issue, contact AWS Support. They can investigate the backend and determine whether an internal issue with the Bedrock service is causing this behavior. When you contact them, include your knowledge base configuration, your data sources, the exact filter you sent, and the steps you've taken to reproduce and troubleshoot the issue.
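
To give Support concrete evidence, it may help to tally which data source each returned chunk claims to come from. The sketch below assumes results shaped like the `retrievalResults` array from the Retrieve API, where each item carries a `metadata` map containing `x-amz-bedrock-kb-data-source-id` and a `location.s3Location.uri`; treat those field names as assumptions to verify against an actual response. `summarizeLeaks` is a hypothetical name.

```javascript
// Hypothetical helper: counts results per data source ID and lists every
// result whose data source differs from the one the filter asked for.
function summarizeLeaks(retrievalResults, expectedDataSourceId) {
  const key = "x-amz-bedrock-kb-data-source-id";
  const countsBySource = {};
  const unexpected = [];
  for (const result of retrievalResults) {
    // Results without the metadata key are grouped under "(missing)".
    const sourceId = result.metadata?.[key] ?? "(missing)";
    countsBySource[sourceId] = (countsBySource[sourceId] ?? 0) + 1;
    if (sourceId !== expectedDataSourceId) {
      unexpected.push({ sourceId, uri: result.location?.s3Location?.uri ?? null });
    }
  }
  return { countsBySource, unexpected };
}
```

If `unexpected` is non-empty for a filtered query, the list of source IDs and S3 URIs is a compact reproduction to attach to the support case. If the metadata shows the expected ID but the S3 URI sits under another data source's prefix, that would point toward stale index entries rather than a filter that is being ignored.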
Sources
Issue with metadata filtering in Amazon Bedrock | AWS re:Post
Bedrock knowledge base data source sync is not working | AWS re:Post
Bedrock Data source sync does not work | AWS re:Post

answered 10 months ago
