Bedrock metadata limitations?

0

I have metadata files that have failed ingestion. All 30K of them 🥲

Below is an example metadata file. They are all 2 KB and look like this:

[image]

The error seen here: [image]

says: "Encountered error: Ignored 1 files due to invalid metadata attributes. Check that the attribute keys and values don't exceed the character quota, and that the attribute values are acceptable data types (strings, numbers, Booleans, or String Lists). Then retry your request [Files: s3://s<censured>/test/<censure>.txt]. Call to Customer Source did not succeed."

I also tried creating a new KB, assigning it a new collection in OpenSearch, and uploading the .txt and metadata files without the URLs (100 or so characters each), and got the same error.

What is wrong here? Thank you for your time! 😊

3 Answers
1

The ingestion error you're encountering with Bedrock metadata likely stems from invalid attribute keys or values in your metadata files.

Ensure that your metadata attributes adhere to the following guidelines: keys and values should not exceed the character limits, values must be of acceptable data types (strings, numbers, Booleans, or String Lists), and metadata should be properly formatted.

Double-check that the attributes conform to Bedrock's specifications and retry the ingestion process after making any necessary corrections.

EXPERT
answered 10 days ago
0
Accepted Answer

Can't have metadata with empty strings.
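A minimal sketch for spotting this, assuming each metadata file is JSON with a top-level `metadataAttributes` object. The original example was posted as an image, so the attribute names below (`title`, `url`, `tags`) are hypothetical, as is the helper name `find_empty_attributes`:

```python
import json


def find_empty_attributes(metadata_text: str) -> list[str]:
    """Return attribute keys whose value is "" or a list containing ""."""
    attributes = json.loads(metadata_text).get("metadataAttributes", {})
    empty_keys = []
    for key, value in attributes.items():
        # Treat a scalar as a one-element list so both shapes share one check.
        values = value if isinstance(value, list) else [value]
        if any(isinstance(v, str) and v == "" for v in values):
            empty_keys.append(key)
    return empty_keys


# Hypothetical file content: "url" is empty and "tags" contains an empty item.
sample = '{"metadataAttributes": {"title": "Report", "url": "", "tags": ["a", ""]}}'
print(find_empty_attributes(sample))  # ['url', 'tags']
```

Any key this returns is a candidate to remove from the file, or to fill with a real value, before retrying the sync.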

Antoine
answered 10 days ago
0

Ensure that all attribute keys and values in your metadata files do not exceed the character limits set by the service. Typically, keys and values should be within a certain character length, usually around 255 characters. Review your metadata files for any unusually long attribute names or values. Confirm that all attribute values are of acceptable data types: strings, numbers, Booleans, or String Lists.

Write a Validation Script:

Create a script (in Python, Node.js, or your preferred language) to read each metadata file. Validate each attribute against the required schema (character length, data type, etc.). Log or flag any files that fail validation.
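As a rough sketch of such a script in Python: it assumes the files are named `*.metadata.json` and hold a top-level `metadataAttributes` object. The `MAX_CHARS` value is simply the approximate figure quoted above, not a confirmed quota, and the function names are made up for illustration. It also flags empty strings, which were reported elsewhere in this thread as a cause of the same error.

```python
import json
from pathlib import Path

MAX_CHARS = 255  # assumption: approximate figure from this answer, not a verified limit


def validate_attributes(attributes: dict, max_chars: int = MAX_CHARS) -> list[str]:
    """Return human-readable problems found in one metadataAttributes object."""
    problems = []
    for key, value in attributes.items():
        if len(key) > max_chars:
            problems.append(f"key starting {key[:20]!r} is longer than {max_chars} characters")
        if isinstance(value, (bool, int, float)):
            continue  # numbers and Booleans need no length check
        if isinstance(value, str):
            strings = [value]
        elif isinstance(value, list) and all(isinstance(v, str) for v in value):
            strings = value
        else:
            # Covers null, nested objects, and lists with non-string items.
            problems.append(f"{key!r}: unsupported type {type(value).__name__}")
            continue
        for s in strings:
            if s == "":
                problems.append(f"{key!r}: empty string")
            elif len(s) > max_chars:
                problems.append(f"{key!r}: value longer than {max_chars} characters")
    return problems


def validate_directory(directory: str) -> dict[str, list[str]]:
    """Map each failing *.metadata.json file under `directory` to its problems."""
    failures = {}
    for path in sorted(Path(directory).rglob("*.metadata.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
            problems = validate_attributes(data.get("metadataAttributes", {}))
        except (ValueError, AttributeError) as exc:
            # ValueError covers malformed JSON and bad encodings;
            # AttributeError covers files whose top level is not an object.
            problems = [f"not a valid metadata JSON object: {exc}"]
        if problems:
            failures[str(path)] = problems
    return failures
```

Calling `validate_directory("path/to/local/copy")` on a local copy of the bucket returns only the failing files, so with 30K files the report stays readable.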

Correct Invalid Files:

Optionally, the script could also correct some common issues automatically, such as truncating long strings or converting data types where possible. For more complex corrections, the script can generate a report so you can manually review and fix only the problematic files.
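One possible shape for the automatic part, again only a sketch: `clean_attributes` is a hypothetical helper that drops empty strings and truncates over-long ones, and the 255-character default is the approximate figure from this answer rather than a confirmed quota. Truncation is lossy (a cut-off URL is no longer a working URL), so every change is recorded in a list of notes for manual review.

```python
def clean_attributes(attributes: dict, max_chars: int = 255) -> tuple[dict, list[str]]:
    """Return a cleaned copy of the attributes plus notes describing each change."""
    cleaned, notes = {}, []
    for key, value in attributes.items():
        if isinstance(value, str):
            if value == "":
                notes.append(f"dropped {key!r}: empty string")
                continue
            if len(value) > max_chars:
                notes.append(f"truncated {key!r} to {max_chars} characters")
                value = value[:max_chars]
        elif isinstance(value, list):
            kept = [v for v in value if v != ""]
            if len(kept) != len(value):
                notes.append(f"removed empty items from {key!r}")
            if not kept:
                notes.append(f"dropped {key!r}: no items left")
                continue
            value = kept
        # Numbers, Booleans, and anything unrecognized pass through unchanged.
        cleaned[key] = value
    return cleaned, notes


# Hypothetical attributes for illustration.
cleaned, notes = clean_attributes({"title": "Report", "url": "", "summary": "x" * 300})
print(sorted(cleaned))  # ['summary', 'title']
print(len(notes))       # 2
```

Writing the cleaned attributes to a new location instead of overwriting the originals keeps the fix reversible.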

Retry the Ingestion:

After running the script and correcting the files, retry the ingestion process.

AWS
EXPERT
Deeksha
answered a month ago
  • Thanks for sharing. None of my strings is longer than 97 characters (the URLs), and every value is a string.
