bedrock data automation failing

I have a Bedrock Knowledge Base that uses Data Automation to access an S3 bucket. If I try to sync .mp3 or .mp4 files, I get the following error:

Encountered error: Ignored 2 files as their file format was not supported. [***.mp3, ***.mp4]. Call to Customer Source did not succeed.

I have tried different files and formats, but none succeeded. I also tried using those same files in the Demo of the Data Automation section in the console, and they worked correctly, so I don't think file format is the real reason. Is there some other permission that is required for media files to be accessed?

I have tried both with and without metadata.json files.

Syncing is successful on other file types (pdf, txt, jpg) in the same folder.

1 Answer

The failure to sync .mp3 and .mp4 files in your Bedrock Knowledge Base is probably not a file-format problem, since the same files work in the Data Automation demo. Permissions or configuration settings are more likely causes.

Here are a few things to check and try:

  1. Ensure that the IAM role associated with the Bedrock Knowledge Base can read the media files in your S3 bucket. It needs s3:ListBucket on the bucket and s3:GetObject on the objects being synced.

  2. If you're using server-side encryption on your S3 bucket with a KMS key, make sure the IAM role has kms:Decrypt permissions for that key.

  3. Verify that the files don't exceed the maximum ingestion job file size specified in Amazon Bedrock endpoints and quotas.

  4. Check that your knowledge base configuration (vector embeddings model, vector store, and the data source's parsing settings) is set up to handle these media file types.

  5. If you're using Amazon OpenSearch Serverless as your vector store, confirm that the vector index is configured with the 'faiss' engine.

  6. Ensure that your S3 bucket and the knowledge base are in the same AWS region.

  7. Try refreshing the Bedrock console page or waiting a few minutes, as sometimes the console may not immediately reflect the actual state of operations.
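For points 1 and 2, it can help to write out the policy you expect the role to have and compare it with what is actually attached. Below is a minimal Python sketch that builds such a policy document. The function name `build_kb_read_policy`, the bucket name, and the key ARN are placeholders I made up for illustration, and the sketch assumes the standard `aws` partition. Your role may need more than this depending on your setup.

```python
import json


def build_kb_read_policy(bucket_name, kms_key_arn=None):
    """Build an IAM policy document granting read access to one bucket.

    Hypothetical helper: the names and layout are illustrative only.
    """
    statements = [
        {
            # ListBucket applies to the bucket itself.
            "Sid": "ListBucket",
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": [f"arn:aws:s3:::{bucket_name}"],
        },
        {
            # GetObject applies to the objects inside the bucket.
            "Sid": "GetObjects",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
        },
    ]
    if kms_key_arn:
        # Only needed when the bucket uses SSE-KMS with this key.
        statements.append(
            {
                "Sid": "DecryptObjects",
                "Effect": "Allow",
                "Action": ["kms:Decrypt"],
                "Resource": [kms_key_arn],
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    # Placeholder bucket name; replace with your own.
    print(json.dumps(build_kb_read_policy("my-kb-bucket"), indent=2))
```

Note that the permissions are per bucket and per object, not per file type, so if pdf and txt files in the same folder sync correctly, the S3 statements are probably already in place.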

If the issue persists after checking these points, try initiating the sync from the AWS CLI or an SDK instead of the console: call UpdateDataSource if you need to change the data source configuration, then call StartIngestionJob. This rules out console interface issues, and the ingestion job response may include more detailed failure reasons.
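Here is a rough sketch of the SDK route in Python. It assumes a boto3 `bedrock-agent` client and uses its `start_ingestion_job` and `get_ingestion_job` calls. The helper name `run_ingestion_job`, the polling interval, and the set of terminal statuses are my own choices, not something taken from your setup.

```python
import time

# Statuses treated as final here; this set is an assumption of the sketch.
TERMINAL_STATUSES = {"COMPLETE", "FAILED", "STOPPED"}


def run_ingestion_job(client, knowledge_base_id, data_source_id,
                      poll_seconds=5.0, sleep=time.sleep):
    """Start a sync and poll until the job reaches a terminal status.

    `client` is expected to behave like boto3.client("bedrock-agent").
    """
    started = client.start_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
    )
    job = started["ingestionJob"]

    while job["status"] not in TERMINAL_STATUSES:
        sleep(poll_seconds)
        job = client.get_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]

    # Return only the fields that are useful for troubleshooting.
    return {
        "status": job["status"],
        "statistics": job.get("statistics", {}),
        "failure_reasons": job.get("failureReasons", []),
    }
```

To use it, create the client with `boto3.client("bedrock-agent")` in the knowledge base's region and pass in your own knowledge base ID and data source ID. The `failure_reasons` list is the part to look at when the console only shows the generic "file format was not supported" message.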

If none of these solutions work, it would be advisable to contact AWS Support for further assistance. They can provide more specific troubleshooting based on your account setup and configuration.

Sources
Bedrock knowledge base data source sync is not working | AWS re:Post
Bedrock Data source sync does not work | AWS re:Post

answered a year ago
EXPERT
reviewed a year ago
