AWS Bedrock Knowledge Base Incremental Sync Issues


I'm trying to sync files from S3 to a knowledge base by updating the data source and then triggering start_ingestion_job using Boto3. There are 2,245 text files in S3, but after syncing, only 2,207 files are indexed in the knowledge base. There are no logs indicating any failures or errors in indexing the files. The relevant logs are provided below. Any insights on why this discrepancy might be occurring?

'statistics': {
    'numberOfDocumentsDeleted': 0,
    'numberOfDocumentsFailed': 0,
    'numberOfDocumentsScanned': 2207,
    'numberOfMetadataDocumentsModified': 0,
    'numberOfMetadataDocumentsScanned': 0,
    'numberOfModifiedDocumentsIndexed': 0,
    'numberOfNewDocumentsIndexed': 1327},
'status': 'COMPLETE',
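For readers who want to reproduce the sync step, here is a minimal sketch of starting an ingestion job and polling until it finishes, so the statistics shown above can be read from the final response. It assumes a `bedrock-agent` Boto3 client is passed in; the function name `run_ingestion_job`, the polling interval, and the set of terminal statuses are illustrative choices, not part of the original post.

```python
import time

# Statuses treated as final in this sketch (an assumption; check the API
# reference for the full list of ingestion job statuses).
TERMINAL_STATUSES = {"COMPLETE", "FAILED", "STOPPED"}


def run_ingestion_job(client, knowledge_base_id, data_source_id, poll_seconds=10.0):
    """Start an ingestion job and poll until it reaches a terminal status.

    `client` is expected to behave like boto3.client("bedrock-agent").
    Returns the final 'ingestionJob' dict, which includes 'statistics'.
    """
    job = client.start_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
    )["ingestionJob"]

    while job["status"] not in TERMINAL_STATUSES:
        time.sleep(poll_seconds)
        job = client.get_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]
    return job
```

With real credentials this could be called as `run_ingestion_job(boto3.client("bedrock-agent"), kb_id, ds_id)`, where `kb_id` and `ds_id` are placeholders for your own identifiers. Comparing `job["statistics"]["numberOfDocumentsScanned"]` with the S3 object count gives the gap described in the question (2,245 - 2,207 = 38 objects).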

2 Answers
Accepted Answer

Hi everyone, I found the issue. The count mismatch was due to empty files in the S3 bucket. They weren’t being scanned, which is why I initially missed them. Thanks for your help!
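A quick way to confirm this kind of mismatch is to list the bucket and count zero-byte objects. The sketch below assumes the standard `list_objects_v2` response shape (`Key` and `Size` per object); the helper names `iter_s3_objects` and `find_empty_keys` are hypothetical.

```python
from typing import Iterable, Iterator


def iter_s3_objects(bucket: str, prefix: str = "") -> Iterator[dict]:
    """Yield object summaries (dicts with 'Key' and 'Size') under a prefix."""
    import boto3  # imported here so find_empty_keys works without AWS access

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # 'Contents' is absent when a page has no objects.
        yield from page.get("Contents", [])


def find_empty_keys(objects: Iterable[dict]) -> list[str]:
    """Return the keys of all zero-byte objects, in listing order."""
    return [obj["Key"] for obj in objects if obj["Size"] == 0]
```

For example, `len(find_empty_keys(iter_s3_objects("my-bucket", "docs/")))` (bucket and prefix are placeholders) should match the gap between the S3 object count and `numberOfDocumentsScanned` if empty files are the only cause. Note that "folder" placeholders created in the S3 console are also zero-byte objects whose keys end in `/`, so they will show up in this list too.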

Archana
answered 25 days ago

Check that all the files in your S3 bucket are in the supported format and are less than 50 MB in size (see the list in the documentation).
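This check can be scripted against an S3 listing. The sketch below works on object summaries shaped like `list_objects_v2` entries (`Key`, `Size`). The extension set and the 50 MiB interpretation of the size limit are assumptions to verify against the current documentation, and `find_ineligible` is a hypothetical helper name.

```python
from pathlib import PurePosixPath

# Assumed list of supported extensions; confirm against the documentation.
ASSUMED_EXTENSIONS = {".txt", ".md", ".html", ".doc", ".docx",
                      ".csv", ".xls", ".xlsx", ".pdf"}
# Assumes "50 MB" means 50 MiB; adjust if the service counts differently.
ASSUMED_MAX_BYTES = 50 * 1024 * 1024


def find_ineligible(objects, allowed=ASSUMED_EXTENSIONS, max_bytes=ASSUMED_MAX_BYTES):
    """Return (key, reason) pairs for objects that fail the format or size check."""
    problems = []
    for obj in objects:
        key = obj["Key"]
        # Metadata sidecar files are not documents, so skip them here.
        if key.endswith(".metadata.json"):
            continue
        suffix = PurePosixPath(key).suffix.lower()
        if suffix not in allowed:
            problems.append((key, "unsupported extension"))
        elif obj["Size"] > max_bytes:
            problems.append((key, "too large"))
    return problems
```

An empty result means every listed object passes both checks, in which case the cause of a count mismatch lies elsewhere.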

AWS
EXPERT
answered a month ago
  • Thank you for your response. All the files in S3 are text files, each under 50MB, so the issue likely lies elsewhere.
