How does Textract deal with actual text in a PDF?


How does Textract deal with PDF documents that contain actual text, that is, text you can copy and paste into a Notepad document (as opposed to an image in which Textract recognises text)? Does Textract simply take the text verbatim, or does it render the text as an image and then OCR it?

Asked 2 years ago · 325 views
1 Answer
Accepted Answer

Textract does the latter: it first renders the PDF as an image and then performs OCR on that image.
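For anyone who wants to check this against their own documents, here is a minimal sketch rather than a definitive implementation. It assumes `boto3` is installed, AWS credentials are configured, and the synchronous `detect_document_text` call accepts your single-page PDF as raw bytes. The helper names `extract_lines` and `detect_text` are made up for illustration. Each detected line comes back with a confidence score, which is consistent with the text being recognised rather than copied from the PDF's text layer.

```python
from __future__ import annotations

from typing import Any


def extract_lines(response: dict[str, Any]) -> list[tuple[str, float]]:
    """Return (text, confidence) for every LINE block in a Textract response."""
    return [
        (block["Text"], block["Confidence"])
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def detect_text(pdf_path: str) -> list[tuple[str, float]]:
    """Send a single-page PDF to Textract and return its detected lines.

    Hypothetical helper: assumes credentials and region are already configured.
    """
    import boto3  # imported lazily so extract_lines works without the AWS SDK

    client = boto3.client("textract")
    with open(pdf_path, "rb") as f:
        response = client.detect_document_text(Document={"Bytes": f.read()})
    return extract_lines(response)
```

Comparing the returned lines with the text you can copy out of the same PDF should show whether small differences appear, as you would expect from OCR.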

AWS
Answered 2 years ago
Expert
Reviewed 9 months ago
