Textract doesn't detect a few lines from the top of the page


Hi,

Sometimes Textract does not return OCR text for the top of the page (an example image was attached). The input image is 300 DPI. Is this a bug, or am I missing a pre-processing step or setting?

Thanks

asked 2 years ago, 244 views
1 Answer

Thanks for bringing up the issue. 300 DPI seems like a reasonable resolution at which OCR should work fine. However, I can confirm that some of the details at the top are missing. In this case, could you please reach out to the Textract team via a support case, citing a quality issue, and provide the redacted document to help the team debug what exactly is happening with it? Thanks.

AWS
Rohan_K
answered 2 years ago
  • Hi,

    I found there is a bug in the Textract API. If the page has a barcode or QR code at the bottom, it won't pick up a few lines from the top. If I remove the barcode from the page, it reports all the text in the document.

    Thanks.
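
For anyone who wants to reproduce the comparison described in the comment above, here is a minimal sketch in Python. It assumes you already have two Textract responses as dictionaries (for example from boto3's `detect_document_text`): one for the original page and one for the same page with the barcode removed. The helper names `top_lines` and `missing_lines` and the 15% band are hypothetical choices for illustration, not part of the Textract API.

```python
# Sketch only: compares two Textract responses to show which lines are lost.
# Assumes the response shape returned by DetectDocumentText, where each block
# has "BlockType", optional "Text", and "Geometry" -> "BoundingBox" -> "Top"
# (a ratio of the page height, 0.0 = top edge).

def _line_blocks(response):
    """Yield only the LINE blocks of a Textract response dict."""
    for block in response.get("Blocks", []):
        if block.get("BlockType") == "LINE":
            yield block


def top_lines(response, band=0.15):
    """Return the text of LINE blocks that start within the top `band` of the page.

    `band` is a hypothetical threshold (fraction of page height).
    """
    return [
        block.get("Text", "")
        for block in _line_blocks(response)
        if block["Geometry"]["BoundingBox"]["Top"] < band
    ]


def missing_lines(with_barcode, without_barcode):
    """Return lines found in the barcode-free run but absent from the original run.

    Results keep the order in which the blocks appear in the response.
    """
    seen = {block.get("Text", "") for block in _line_blocks(with_barcode)}
    return [
        block.get("Text", "")
        for block in _line_blocks(without_barcode)
        if block.get("Text", "") not in seen
    ]


# Example usage (requires AWS credentials and boto3, so it is left commented out):
# import boto3
# client = boto3.client("textract")
# original = client.detect_document_text(Document={"Bytes": original_bytes})
# masked = client.detect_document_text(Document={"Bytes": masked_bytes})
# print(missing_lines(original, masked))
```

If `top_lines` is empty for the original page but not for the barcode-free one, that is consistent with the behaviour reported here, and the output of `missing_lines` may be useful evidence to attach to the support case.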
