Inconsistent results from Textract

Hello,

I have a question about the inconsistent results coming from Textract for some images.

I'm trying to extract handwritten text from images. The images contain text with some words crossed out. In our tests, we have generally found that Textract has trouble recognizing crossed-out words. When I ran a couple of tests a few weeks ago, Textract recognized at least some of the words correctly, but now, when I test it again with the same image, I get gibberish.

Here is the screenshot of that test: https://gyazo.com/162513b77655d016cbe1c2a1df41c92c

Interestingly, when I break the image down into smaller chunks (just a single line each), Textract recognizes many of the words correctly.

Here is a sample: https://gyazo.com/62b77f4a4ab4686d7748ca411b4330a7
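The chunking approach described above can be sketched roughly as follows. This is only an illustration, not something taken from the original test: the helper names `strip_boxes` and `detect_lines_in_strips`, the number of strips, and the overlap value are all assumptions, and the second function assumes that Pillow is installed and that boto3 has AWS credentials and a region configured.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower) in pixels


def strip_boxes(width: int, height: int, strips: int, overlap: int = 0) -> List[Box]:
    """Split an image of the given size into horizontal strips.

    Each strip is extended by `overlap` pixels above and below so that a
    line of handwriting cut by a boundary still appears whole in one strip.
    """
    if strips < 1:
        raise ValueError("strips must be >= 1")
    boxes = []
    for i in range(strips):
        top = max(0, i * height // strips - overlap)
        bottom = min(height, (i + 1) * height // strips + overlap)
        boxes.append((0, top, width, bottom))
    return boxes


def detect_lines_in_strips(path: str, strips: int, overlap: int = 20) -> List[str]:
    """Crop the image into strips and send each strip to Textract separately."""
    import io

    import boto3           # assumption: credentials and region are configured
    from PIL import Image  # assumption: Pillow is installed

    client = boto3.client("textract")
    image = Image.open(path).convert("RGB")
    lines: List[str] = []
    for box in strip_boxes(image.width, image.height, strips, overlap):
        buffer = io.BytesIO()
        image.crop(box).save(buffer, format="PNG")
        response = client.detect_document_text(Document={"Bytes": buffer.getvalue()})
        # Keep only LINE blocks; WORD blocks repeat the same text word by word.
        lines += [b["Text"] for b in response["Blocks"] if b["BlockType"] == "LINE"]
    return lines
```

With a non-zero overlap, a line that falls near a strip boundary may be returned twice, so the output may need de-duplication. Equal-height strips are also a simplification; the single-line crops in the sample were presumably made by hand.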

I'm not sure whether a major update was recently pushed to the Textract service that could explain these inconsistent results. I'd really appreciate any suggestions or help in getting better and more consistent results from Textract.

Thanks!

Asked 2 years ago · Viewed 784 times
3 Answers

Thank you for providing feedback and the sample documents. I investigated and found that one of our pre-processing components mistakenly flips this image upside down, which is why we see bad results. Sorry for the inconvenience. I have forwarded your feedback to the science team, and they are currently working on an improvement.

To get the best results from your documents, I recommend using the best practices provided in our documentation: https://docs.aws.amazon.com/textract/latest/dg/textract-best-practices.html.

Thank you for your feedback!

AWS
Wenzhu
Answered 2 years ago

Thank you for the feedback. The screenshot in the link appears to be partially in shadow. To investigate what led to the inconsistency, could you share the sample image you used for testing so that we can reproduce the results?

Answered 2 years ago

Sure, here are the sample images that I have used in my tests:

https://drive.google.com/drive/folders/17PPnmXvMpFAVXQwhhnpiU-OW5eLGXC9s?usp=sharing

I understand your point about the images not being bright enough or being partly in shadow. However, I was getting better results with the same images a few weeks ago, and when I crop the big image into smaller ones, the results are comparatively better.

Thanks for looking into this matter.

Answered 2 years ago
