Inconsistent results from Textract

Hello,

I have a question about the inconsistent results coming from Textract for some images.

I'm trying to extract handwritten text from images. The images contain text with some words crossed out. In our tests, Textract generally has trouble recognizing crossed-out words. When I ran a couple of tests a few weeks ago, Textract recognized at least some of the words correctly, but when I test again now with the same image, I get mostly gibberish.

Here is the screenshot of that test: https://gyazo.com/162513b77655d016cbe1c2a1df41c92c

Interestingly, when I break the image down into smaller chunks (a single line of text each), Textract recognizes many of the words correctly.

Here is a sample: https://gyazo.com/62b77f4a4ab4686d7748ca411b4330a7
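
The chunking workaround described above could be sketched roughly as follows. This is a minimal sketch, not a recommended pipeline: it assumes `boto3` and Pillow are installed and AWS credentials are configured, and the helper names (`strip_bounds`, `detect_lines_in_strips`) and the default strip height and overlap are hypothetical values that would need tuning to the handwriting size.

```python
import io


def strip_bounds(height, strip_height, overlap=0):
    """Return (top, bottom) pixel rows for horizontal strips covering the image."""
    if strip_height <= 0 or not 0 <= overlap < strip_height:
        raise ValueError("need strip_height > 0 and 0 <= overlap < strip_height")
    bounds = []
    top = 0
    while top < height:
        bottom = min(top + strip_height, height)
        bounds.append((top, bottom))
        if bottom == height:
            break
        # Step back by `overlap` rows so a text line cut by one strip
        # edge appears whole in the next strip.
        top = bottom - overlap
    return bounds


def detect_lines_in_strips(image_path, strip_height=200, overlap=40):
    """Crop the image into strips and collect LINE blocks from each strip.

    strip_height and overlap are illustrative defaults (assumptions).
    """
    # Imported lazily so strip_bounds stays usable without these packages.
    import boto3
    from PIL import Image

    client = boto3.client("textract")
    image = Image.open(image_path).convert("RGB")
    results = []
    for top, bottom in strip_bounds(image.height, strip_height, overlap):
        buffer = io.BytesIO()
        image.crop((0, top, image.width, bottom)).save(buffer, format="PNG")
        response = client.detect_document_text(
            Document={"Bytes": buffer.getvalue()}
        )
        for block in response["Blocks"]:
            if block["BlockType"] == "LINE":
                results.append((top, block["Text"], block["Confidence"]))
    return results
```

Because the strips overlap, the same line of text can be returned twice, so the results would still need de-duplicating before use.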

I'm not sure whether any major updates were pushed to the Textract service recently that could explain these weird and inconsistent results. I'd really appreciate any suggestions or help in getting better and more consistent results from Textract.

Thanks!

asked 2 years ago, 757 views
3 Answers

Thank you for providing feedback and the sample documents. I investigated and found that one of our pre-processing components mistakenly flips this image upside down, which is why we see bad results. I'm really sorry for the inconvenience. I have forwarded your feedback to the science team, and they are currently working on an improvement.
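
Until a fix is available, one possible client-side check is to submit the image both as-is and rotated by 180 degrees, then keep whichever result has the higher average word confidence. This is only a sketch of that idea and not an AWS-documented remedy. The helper names (`mean_confidence`, `best_orientation`, `textract_detect`) are hypothetical, the caller is assumed to supply the rotated image bytes (for example, produced with Pillow's `Image.rotate(180)`), and `textract_detect` assumes `boto3` and configured AWS credentials.

```python
def mean_confidence(blocks):
    """Average Confidence over WORD blocks; 0.0 when no words were found."""
    scores = [b["Confidence"] for b in blocks if b.get("BlockType") == "WORD"]
    return sum(scores) / len(scores) if scores else 0.0


def best_orientation(image_bytes_by_angle, detect):
    """Run `detect` on each candidate image and return (angle, blocks)
    for the candidate with the highest mean word confidence.

    image_bytes_by_angle maps a rotation angle to encoded image bytes.
    detect is any callable that takes image bytes and returns Textract
    blocks, which keeps this function testable without AWS access.
    """
    best = None
    for angle, image_bytes in image_bytes_by_angle.items():
        blocks = detect(image_bytes)
        score = mean_confidence(blocks)
        if best is None or score > best[0]:
            best = (score, angle, blocks)
    if best is None:
        raise ValueError("no candidate images supplied")
    return best[1], best[2]


def textract_detect(image_bytes):
    """Call DetectDocumentText on raw bytes (needs boto3 and credentials)."""
    import boto3

    client = boto3.client("textract")
    response = client.detect_document_text(Document={"Bytes": image_bytes})
    return response["Blocks"]
```

A call such as `best_orientation({0: original_bytes, 180: rotated_bytes}, textract_detect)` would then return the angle and blocks of the better-scoring orientation. Average confidence is only a heuristic here, so it is worth spot-checking the chosen result against the image.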

To get the best results from your documents, I recommend using the best practices provided in our documentation: https://docs.aws.amazon.com/textract/latest/dg/textract-best-practices.html.

Thank you for your feedback!

AWS
Wenzhu
answered 2 years ago

Thank you for the feedback. The image in the linked screenshot appears to be partially in shadow. To investigate what led to the inconsistency, could you share the sample image you used for testing so that we can reproduce the results?

answered 2 years ago

Sure, here are the sample images that I have used in my tests:

https://drive.google.com/drive/folders/17PPnmXvMpFAVXQwhhnpiU-OW5eLGXC9s?usp=sharing

I understand your point about the images not being bright enough or being partly in shadow. However, I was getting better results with the same images a few weeks ago. Also, when I crop the large image into smaller ones, the results are noticeably better.

Thanks for looking into this matter.

answered 2 years ago
