How does Textract deal with actual text in a PDF?


How does Textract deal with PDF documents that contain actual text, as in text that you can copy and paste into a Notepad document (as opposed to an image that Textract can recognise text in)? Does Textract simply take the text verbatim, or does it render the text as an image and then OCR it?

asked 8 months ago, 78 views
1 Answer
Accepted Answer

Textract does the latter: it first renders the PDF as an image and then performs OCR on that image, rather than taking the embedded text verbatim.

answered 8 months ago
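
One way to see this for yourself is to send a text-based PDF to Textract and look at what comes back. The sketch below is only an illustration, not an official recipe: `ocr_lines` and `main` are hypothetical helper names, and it assumes a single-page PDF sent through boto3's synchronous `detect_document_text` call (multi-page PDFs would need the asynchronous, S3-based API instead). Each returned line carries a `Confidence` score, which is what you would expect from OCR output rather than from text copied out of the file.

```python
from typing import Any, Dict, List, Tuple


def ocr_lines(client: Any, pdf_bytes: bytes) -> List[Tuple[str, float]]:
    """Return (text, confidence) for every LINE block in the Textract response.

    `client` is expected to behave like boto3.client("textract").
    """
    response: Dict[str, Any] = client.detect_document_text(
        Document={"Bytes": pdf_bytes}
    )
    return [
        (block["Text"], block["Confidence"])
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def main(path: str) -> None:
    """Hypothetical entry point: print each detected line with its confidence."""
    # boto3 is the third-party AWS SDK; it needs credentials and a region
    # configured. It is imported here so ocr_lines stays usable without it.
    import boto3

    with open(path, "rb") as f:  # assumed to be a single-page PDF
        data = f.read()
    for text, confidence in ocr_lines(boto3.client("textract"), data):
        print(f"{confidence:5.1f}  {text}")
```

Calling `main("document.pdf")` (the file name is a placeholder) and comparing the printed lines against text copied straight from the PDF should show whether the two differ anywhere, for example in unusual fonts or small print.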
