Bedrock fine tuning ValidationError


When I try to create a custom model fine-tuning job (model Cohere:command), I get this error: "Validation error: After preprocessing, duplicates and large examples were removed.At least 32 examples needed if passing only train file"

The training file contains more than 100 rows in JSONL format. The average length of the 'completion' values is 187 characters, and the average length of the 'prompt' values is 500 characters.

I have made multiple attempts with different datasets, all with a similar result.

2 Answers

The error indicates that after preprocessing, the number of examples in the training file dropped below the minimum required for model training. A few things to check:

  1. Make sure the training file contains at least 32 valid examples after preprocessing. Empty rows, duplicate rows, and (per the error message) overly large examples are removed during this step.
  2. Check that each example follows the format given in the documentation. Each example should be a JSON object on its own line with non-empty 'prompt' and 'completion' fields.
  3. Try increasing the number of examples in the training file. AWS recommends a minimum of 1000 examples for optimal model performance.
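
To see how many rows are likely to survive, you could run a rough local check before uploading. The sketch below is only an approximation of the checks listed above, not Bedrock's actual preprocessing: the function name `count_usable_examples` and the `MAX_CHARS` threshold are made up for illustration (the real size limit is not stated in the error message), and only the minimum of 32 comes from the error text.

```python
import json
import sys

MIN_EXAMPLES = 32   # taken from the error message
MAX_CHARS = 4000    # hypothetical placeholder, NOT a documented Bedrock limit


def count_usable_examples(lines, max_chars=MAX_CHARS):
    """Return (kept, dropped), where dropped maps a reason to a row count."""
    seen = set()
    kept = 0
    dropped = {"blank": 0, "invalid_json": 0, "missing_field": 0,
               "duplicate": 0, "too_large": 0}
    for raw in lines:
        line = raw.strip()
        if not line:
            dropped["blank"] += 1
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            dropped["invalid_json"] += 1
            continue
        # Each row must be an object with non-empty string fields.
        prompt = record.get("prompt") if isinstance(record, dict) else None
        completion = record.get("completion") if isinstance(record, dict) else None
        if not (isinstance(prompt, str) and prompt.strip()
                and isinstance(completion, str) and completion.strip()):
            dropped["missing_field"] += 1
            continue
        if (prompt, completion) in seen:
            dropped["duplicate"] += 1
            continue
        # Assumed size check; adjust max_chars to whatever limit applies to you.
        if len(prompt) + len(completion) > max_chars:
            dropped["too_large"] += 1
            continue
        seen.add((prompt, completion))
        kept += 1
    return kept, dropped


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_jsonl.py <path to your training file>
    with open(sys.argv[1], encoding="utf-8") as f:
        kept, dropped = count_usable_examples(f)
    print(f"usable examples: {kept} (minimum {MIN_EXAMPLES})")
    print(f"dropped: {dropped}")
```

If the usable count is well below the number of rows in the file, the `dropped` breakdown shows which check removed the most rows, which is usually the place to start.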

Referring to the AWS documentation on data format and requirements can help troubleshoot issues with example count or format causing validation errors. Let me know if preprocessing the data differently helps resolve the error.

answered 3 months ago
reviewed 3 months ago

In addition to the suggestions above, if the issue persists, please feel free to open a case with AWS Premium Support for further investigation.

answered 3 months ago
