
Bedrock Knowledge Base Creation: Failure to Ingest from S3


Hi! I am a new AWS user who is trying to build a prototype/dummy RAG model in AWS Bedrock. I am using the Console right now as a starting point. I have uploaded various files to S3; I then created a Knowledge Base (and Data Source within that) in Bedrock. I go to Add Documents (from S3 Location), select the documents, hit Add. After a few minutes of a blue banner saying "Please wait while the permissions are being processed", I end up with a blue "Document ingestion in progress." immediately followed by a red banner of "Failed to ingest the document s3://xyz/Test.txt."

Things I have tested so far:

  1. Document Name and Composition - tried two different PDFs, a DOCX and ultimately a TXT that just said "Mickey Mouse" as the only value...none of it worked. Also removed spaces from document names in S3 - didn't help
  2. Privileges - I am currently using an AdministratorAccess role (that I created from my initial Root account) so I'm presuming I have privileges to everything I need
  3. Region and Model Selection - all regions aligned to us-east-1 and I have permission for the relevant FMs

Thank you in advance for any feedback or help you're able to give me!!

2 Answers
Accepted Answer

This is probably the important part that you need:

S3 Bucket Permissions: Even though you're using an AdministratorAccess role, it's important to ensure that the Bedrock service has the necessary permissions to access your S3 bucket. You may need to add a bucket policy that explicitly allows Bedrock to read from your bucket.

See also: https://docs.aws.amazon.com/AmazonS3/latest/userguide/add-bucket-policy.html

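You as a user might have AdministratorAccess, but the services that you use don't inherit your permissions. Bedrock reads the bucket through the Knowledge Base's own service role, so that role (or a bucket policy that names it) has to allow reading the objects.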
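As a rough illustration of what "Bedrock needs its own permissions" can look like, here is a minimal Python sketch that builds two policy documents for a Knowledge Base service role: a trust policy that lets the Bedrock service assume the role, and a read-only S3 policy. The function name, the account ID `111122223333`, and the exact statement layout are assumptions for illustration only; compare them with the role the console created for your Knowledge Base rather than treating them as the required policy.

```python
import json


def knowledge_base_role_policies(bucket_name: str, account_id: str) -> dict:
    """Build example policy documents for a Knowledge Base service role.

    Hypothetical helper: it only assembles JSON, it does not call AWS.
    """
    # Trust policy: lets the Bedrock service assume the role,
    # restricted to requests made on behalf of this account.
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "bedrock.amazonaws.com"},
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"aws:SourceAccount": account_id}
                },
            }
        ],
    }
    # Permissions policy: list the bucket and read the objects in it.
    s3_read_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }
    return {"trust": trust_policy, "s3_read": s3_read_policy}


if __name__ == "__main__":
    # "xyz" is the bucket from the question; the account ID is a placeholder.
    policies = knowledge_base_role_policies("xyz", "111122223333")
    print(json.dumps(policies, indent=2))
```

If the role attached to your Knowledge Base is missing statements like these for the bucket you are ingesting from, that is a likely cause of the failure.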

answered a year ago
EXPERT
reviewed a year ago

Thank you for providing details about your issue with ingesting documents into your Amazon Bedrock Knowledge Base. Based on the information you've shared, there are a few potential reasons for the ingestion failure:

  1. S3 Bucket Permissions: Even though you're using an AdministratorAccess role, Bedrock does not act with your user permissions. The service role attached to the Knowledge Base needs permission to list your S3 bucket and read its objects, either through the role's own policy or through a bucket policy that explicitly allows it.

  2. File Format and Size: While you've tried various file formats, make sure that the files are within the supported size limits for Bedrock Knowledge Base ingestion. Even for text files, there might be size restrictions.

  3. Embedding Model Access: Ensure that you have access not only to the foundation models but also specifically to the embedding model used by the Knowledge Base. Often this is a Titan embedding model, and access to it has to be granted in the Bedrock "Model access" section.

  4. OpenSearch Serverless Configuration: If you're using Amazon OpenSearch Serverless for vector storage, verify that it's correctly set up and that Bedrock has the necessary permissions to interact with it.

  5. Sync Process: Sometimes, the console may not reflect the actual state of operations. Try refreshing the page or waiting a few minutes to see if the ingestion status updates.
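One way to test the embedding model point from the list above is to call the model directly. The sketch below assumes a boto3 `bedrock-runtime` client is passed in, that the Knowledge Base uses the Titan text embedding model ID `amazon.titan-embed-text-v2:0`, and that the request body takes an `inputText` field; swap in whichever embedding model your Knowledge Base is configured with. The function name is hypothetical. Note that this runs with your own credentials, so it checks account-level model access, not the Knowledge Base role's permissions.

```python
import json


def can_invoke_embedding_model(runtime_client,
                               model_id="amazon.titan-embed-text-v2:0"):
    """Try one embedding call and report (ok, detail).

    runtime_client is expected to behave like
    boto3.client("bedrock-runtime"); it is injected so the function
    can also be exercised with a stub.
    """
    try:
        response = runtime_client.invoke_model(
            modelId=model_id,
            body=json.dumps({"inputText": "Mickey Mouse"}),
            contentType="application/json",
            accept="application/json",
        )
    except Exception as exc:  # botocore errors carry a .response dict
        error = getattr(exc, "response", {}).get("Error", {})
        return False, error.get("Code") or type(exc).__name__
    # The response body is a stream containing a JSON document.
    payload = json.loads(response["body"].read())
    return True, f"embedding length {len(payload.get('embedding', []))}"


if __name__ == "__main__":
    import boto3  # imported here so the helper itself has no hard dependency

    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    print(can_invoke_embedding_model(client))
```

An access-denied error code here would suggest the model has not been enabled for the account in that region.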

To troubleshoot further:

  1. Check CloudWatch logs for more detailed error messages related to Bedrock and OpenSearch Serverless.

  2. Verify in the Bedrock console under "Model access" that you have access to the Titan embedding model (or whichever embedding model you're using).

  3. If possible, try initiating a sync using the AWS CLI or SDK with the StartIngestionJob API call, then read the job's status and failure reasons with GetIngestionJob. This might provide more detailed error information than the console banner.

  4. Ensure that your S3 bucket and the Bedrock service are in the same AWS region.
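The sync step from the troubleshooting list can be sketched in Python along these lines. It assumes a boto3 `bedrock-agent` client is passed in and uses its `start_ingestion_job` and `get_ingestion_job` calls; the function name, polling interval, and the ID placeholders are illustrative choices, not values from your account.

```python
import time

# Statuses after which the job will not change any further.
TERMINAL_STATES = {"COMPLETE", "FAILED", "STOPPED"}


def run_ingestion_job(client, knowledge_base_id, data_source_id,
                      poll_seconds=5.0, max_polls=60):
    """Start a sync and poll until it finishes or max_polls is reached.

    client is expected to behave like boto3.client("bedrock-agent").
    Returns (status, failure_reasons).
    """
    job = client.start_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
    )["ingestionJob"]

    for _ in range(max_polls):
        if job["status"] in TERMINAL_STATES:
            break
        time.sleep(poll_seconds)
        job = client.get_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]

    # failureReasons is only populated when something went wrong.
    return job["status"], job.get("failureReasons", [])


if __name__ == "__main__":
    import boto3  # imported here so the helper itself has no hard dependency

    agent = boto3.client("bedrock-agent", region_name="us-east-1")
    # Replace the placeholders with the IDs shown in the Bedrock console.
    status, reasons = run_ingestion_job(agent, "YOUR_KB_ID", "YOUR_DATA_SOURCE_ID")
    print(status)
    for reason in reasons:
        print(" -", reason)
```

The failure reasons returned for a failed job are usually more specific than "Failed to ingest the document" and often point straight at a permissions or model access problem.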

If these steps don't resolve the issue, you may need to contact AWS Support for more specific assistance, as they can provide guidance based on your account's exact configuration and any potential backend issues.
Sources
Bedrock knowledge base data source sync is not working | AWS re:Post
Amazon Bedrock Data Source syncing error | AWS re:Post
Data sync failed error in Amazon Bedrock Knowledgebase | AWS re:Post

answered a year ago
EXPERT
reviewed a year ago
