Inconsistent results from Textract


Hello,

I have a question about the inconsistent results coming from Textract for some images.

I'm trying to extract handwritten text from images. The image contains text with some words crossed out. In our tests, we have generally found that Textract struggles to recognize crossed-out words. When I ran a couple of tests a few weeks ago, Textract got at least some of the words right, but now, when I test again with the same image, I get gibberish.

Here is the screenshot of that test: https://gyazo.com/162513b77655d016cbe1c2a1df41c92c

Interestingly, when I break the image down into smaller chunks (just a single line each), Textract recognizes many of the words correctly.

Here is a sample: https://gyazo.com/62b77f4a4ab4686d7748ca411b4330a7
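
For reference, the cropping approach described above can be sketched roughly as follows. This is a minimal sketch, not the exact code used in these tests: it assumes Pillow and boto3 are installed and AWS credentials are configured, and the function names `strip_boxes` and `detect_lines_per_strip`, the strip height, the overlap, and the region are all illustrative choices.

```python
from io import BytesIO


def strip_boxes(width, height, strip_height, overlap=0):
    """Return (left, top, right, bottom) crop boxes that cover the image
    in horizontal strips, each overlapping the previous one by `overlap` pixels."""
    if strip_height <= 0 or not 0 <= overlap < strip_height:
        raise ValueError("need strip_height > 0 and 0 <= overlap < strip_height")
    boxes = []
    top = 0
    while top < height:
        bottom = min(top + strip_height, height)
        boxes.append((0, top, width, bottom))
        if bottom == height:
            break
        top = bottom - overlap
    return boxes


def detect_lines_per_strip(image_path, strip_height=200, overlap=20,
                           region_name="us-east-1"):
    """Crop the image into strips and call DetectDocumentText on each strip.

    Hypothetical helper: the default values are assumptions, not tuned settings.
    """
    # Third-party imports are local so strip_boxes can be used without them.
    import boto3
    from PIL import Image

    client = boto3.client("textract", region_name=region_name)
    lines = []
    with Image.open(image_path) as img:
        for box in strip_boxes(img.width, img.height, strip_height, overlap):
            buf = BytesIO()
            img.crop(box).convert("RGB").save(buf, format="PNG")
            response = client.detect_document_text(
                Document={"Bytes": buf.getvalue()}
            )
            # Keep only LINE blocks; WORD blocks repeat the same text.
            lines.extend(
                block["Text"]
                for block in response["Blocks"]
                if block["BlockType"] == "LINE"
            )
    return lines
```

With a nonzero overlap, a line that straddles a strip boundary can be returned twice, so the combined output may need de-duplication.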

I'm not sure whether any major updates were recently pushed to the Textract service that might be causing these weird and inconsistent results. I'd really appreciate any suggestions or help in getting Textract to produce better and more consistent results.

Thanks!

Asked 2 years ago, 782 views
3 Answers

Thank you for providing feedback and the sample documents. I investigated and found that one of our pre-processing components mistakenly flips this image upside down, which is why we see bad results. I'm really sorry for the inconvenience. I have forwarded your feedback to the science team, and they are currently working on an improvement.
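
In the meantime, one possible way to check whether orientation affects a given image is to submit it both as-is and rotated by 180 degrees, then compare the average confidence of the returned LINE blocks. This is only a diagnostic sketch under assumptions, not an official workaround: it assumes Pillow and boto3 are installed and AWS credentials are configured, the names `mean_line_confidence` and `compare_orientations` and the region are hypothetical, and it is not confirmed that rotating the input avoids the pre-processing issue described above.

```python
from io import BytesIO


def mean_line_confidence(response):
    """Average the Confidence of LINE blocks in a Textract response.

    Returns 0.0 when the response contains no LINE blocks.
    """
    scores = [
        block["Confidence"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]
    return sum(scores) / len(scores) if scores else 0.0


def compare_orientations(image_path, region_name="us-east-1"):
    """Run DetectDocumentText on the image at 0 and 180 degrees.

    Hypothetical helper: returns {angle: mean LINE confidence}.
    """
    # Third-party imports are local so mean_line_confidence works without them.
    import boto3
    from PIL import Image

    client = boto3.client("textract", region_name=region_name)
    results = {}
    with Image.open(image_path) as img:
        for angle in (0, 180):
            buf = BytesIO()
            img.rotate(angle).convert("RGB").save(buf, format="PNG")
            response = client.detect_document_text(
                Document={"Bytes": buf.getvalue()}
            )
            results[angle] = mean_line_confidence(response)
    return results
```

A clearly higher score for one orientation would suggest that orientation handling plays a role for that image. Similar scores for both would point to other factors, such as lighting or the crossed-out words themselves.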

To get the best results from your documents, I recommend following the best practices in our documentation: https://docs.aws.amazon.com/textract/latest/dg/textract-best-practices.html.

Thank you for your feedback!

AWS
Wenzhu
Answered 2 years ago

Thank you for the feedback. The screenshot in the link appears to be partially in shadow. To investigate what led to the inconsistency, could you share the sample image you used for testing so that we can reproduce the results?

Answered 2 years ago

Sure, here are the sample images that I used in my tests:

https://drive.google.com/drive/folders/17PPnmXvMpFAVXQwhhnpiU-OW5eLGXC9s?usp=sharing

I understand your point about the images not being bright enough or being partly in shadow. However, I was getting better results with the same images a few weeks ago. Also, when I crop the big image into smaller ones, Textract produces comparatively better results.

Thanks for looking into this matter.

Answered 2 years ago
