AWS Textract Training Adapter Error


I am using AWS Textract to scan documents, and I am trying to train an adapter to improve the accuracy of the output.

I am running into an issue at the 'train adapter' step. The error message I receive is 'Adapter training initialization failed because of annotation format check failure. Verify annotations and try training the Adapter again.'

When I run the training, the status changes to failed and I get "Manifest file contains invalid records. Consult validation error file at OutputConfig path for more details."

The error I see in the JSON validation file is: [{"code": "ERROR_QUERY_RESULT_TEXT_LENGTH_LIMIT_EXCEEDED", "message": "QUERY_RESULT Text length is greater than the maximum length."}]}

When I annotate, the query output is never more than 50 characters long, so I am confused as to why I am receiving this validation error.

Could someone please suggest any debugging steps I should try?

Many thanks in advance.

Max

asked 5 months ago

Hi Max, thank you for using the Custom Queries feature of Textract!

The maximum query response length in an annotation is 128 characters (see https://docs.aws.amazon.com/textract/latest/dg/limits-document.html), so you will have to truncate any longer query result text in the annotation file before training.
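To find which annotations are over the limit, a small script along these lines may help. It is only a sketch: it assumes the annotation file is JSON and that query answers are stored as objects with "BlockType": "QUERY_RESULT" and a "Text" field (an assumption based on the QUERY_RESULT name in the error message, so check it against your own file). The function name and the command-line usage are made up for illustration.

```python
import json
import sys

# Limit quoted in this answer; confirm it against the linked limits page.
MAX_LEN = 128


def find_long_query_results(node, max_len=MAX_LEN):
    """Walk any nested JSON structure and collect QUERY_RESULT objects
    whose Text is longer than max_len characters."""
    hits = []
    if isinstance(node, dict):
        # Assumption: answers look like {"BlockType": "QUERY_RESULT", "Text": ...}
        if node.get("BlockType") == "QUERY_RESULT":
            text = node.get("Text") or ""
            if len(text) > max_len:
                hits.append({
                    "id": node.get("Id"),
                    "chars": len(text),
                    # Byte length is shown too, in case non-ASCII or hidden
                    # characters make the text longer than it looks.
                    "utf8_bytes": len(text.encode("utf-8")),
                    "preview": text[:40],
                })
        for value in node.values():
            hits.extend(find_long_query_results(value, max_len))
    elif isinstance(node, list):
        for item in node:
            hits.extend(find_long_query_results(item, max_len))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python check_annotations.py annotations.json
    with open(sys.argv[1], encoding="utf-8") as fh:
        data = json.load(fh)
    for hit in find_long_query_results(data):
        print(hit)
```

If the script reports entries you did not type yourself, they may be answers that were filled in automatically rather than by hand, which would explain an error even when your own answers are short.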

If you still see the error even though none of your query results is longer than 50 characters, please feel free to open a support case and we can investigate further for you. Hope this helps, thanks!

yifanx
answered 5 months ago

The error message you're encountering with AWS Textract during the training of an adapter indicates an issue with the annotation format or the content length within the annotations. Here are steps to help debug and potentially resolve this issue:

  1. Review Annotation Format: Ensure that the annotations you're using for training the adapter conform to the required format specified by Textract. Check if the annotation structure, such as JSON format or specific fields, aligns with the expected format.

  2. Check Annotation Length: Although you mentioned the query output length is within the limit, there might be other fields or elements within the annotation that exceed the specified maximum length. Double-check all fields, including metadata or additional information associated with annotations.

  3. Validate Annotation Content: Review each annotation record meticulously, ensuring that there are no unexpected characters, special symbols, or formatting issues that might cause validation errors.

  4. Check Query Results in Detail: Investigate the query results thoroughly to ensure that there are no discrepancies between the actual content and the annotations being used for training. The error might stem from mismatches between the query results and the annotations.

  5. Consider Text Preprocessing: If there are certain fields or elements causing the length issues, consider preprocessing the text data to trim or adjust the content length to comply with the specified limits before using it for training.

  6. Reach Out to AWS Support: If you're confident about the correctness of your annotations and the query results, but the error persists, consider reaching out to AWS Support for further assistance. They might provide insights or offer solutions based on a deeper analysis of your specific case.

  7. Experiment with Smaller Dataset: Try training the adapter with a smaller subset of your dataset to isolate the problematic records or annotations that might be causing the issue. This can help identify specific entries causing the validation errors.

  8. Validate Your Data via LabelGPT: Try connecting your data to LabelGPT to build a data validation and correction pipeline, with an autoscheduler that feeds the AWS training module.

By thoroughly reviewing the annotation format, content length, and ensuring alignment between annotations and query results, you can identify and potentially resolve the validation error you're encountering during the adapter training process in AWS Textract. If the issue persists, AWS Support can provide tailored assistance based on the specifics of your dataset and use case.
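Step 7 (training on a smaller subset) can be sketched as follows. This assumes the manifest is a JSON Lines file with one record per line, which you should confirm for your own manifest. The function name and the output file names are hypothetical.

```python
import sys
from pathlib import Path


def split_manifest(lines, parts=2):
    """Split manifest records into up to `parts` contiguous chunks of
    roughly equal size. Blank lines are ignored."""
    records = [ln for ln in lines if ln.strip()]
    if not records:
        return []
    size = -(-len(records) // parts)  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python split_manifest.py manifest.jsonl
    src = Path(sys.argv[1])
    with src.open(encoding="utf-8") as fh:
        chunks = split_manifest(fh.readlines())
    for n, chunk in enumerate(chunks, start=1):
        # Writes manifest.part1.jsonl, manifest.part2.jsonl, ... next to the source.
        out = src.with_name(f"{src.stem}.part{n}{src.suffix}")
        out.write_text("".join(chunk), encoding="utf-8")
        print(f"{out}: {len(chunk)} records")
```

Training on each part in turn, then splitting the failing part again, narrows the search down to the records that trigger the validation error.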

answered 4 months ago
