How does Textract deal with actual text in a PDF?


How does Textract deal with PDF documents that contain actual text, that is, text you can copy and paste into a Notepad document (as opposed to an image in which Textract can recognise text)? Does Textract simply take the text verbatim, or does it render the text as an image and then OCR it?

Asked 2 years ago, viewed 322 times
1 Answer
Accepted Answer

Textract does the latter: it first renders the PDF as an image and then performs OCR on that image.
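Because the text comes from OCR rather than from the PDF's embedded text layer, each detected line carries a confidence score. Here is a minimal sketch of reading those lines with boto3, assuming a single-page PDF passed as bytes to the synchronous `detect_document_text` call. The helper names `extract_lines` and `ocr_pdf_bytes` and the sample response are made up for illustration and are not real Textract output.

```python
from typing import Any


def extract_lines(response: dict[str, Any]) -> list[tuple[str, float]]:
    """Return (text, confidence) for every LINE block in a Textract response."""
    return [
        (block["Text"], block["Confidence"])
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def ocr_pdf_bytes(pdf_bytes: bytes) -> list[tuple[str, float]]:
    """Send a single-page PDF to Textract and return its OCR'd lines.

    Assumes AWS credentials and a region are already configured.
    """
    import boto3  # imported lazily so extract_lines works without the AWS SDK

    client = boto3.client("textract")
    response = client.detect_document_text(Document={"Bytes": pdf_bytes})
    return extract_lines(response)


# Hand-written sample shaped like a DetectDocumentText response (illustrative only).
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "Invoice 1001", "Confidence": 99.2},
        {"BlockType": "WORD", "Text": "Invoice", "Confidence": 99.5},
        {"BlockType": "LINE", "Text": "Tota1 due", "Confidence": 71.4},
    ]
}

lines = extract_lines(sample_response)
for text, confidence in lines:
    print(f"{confidence:5.1f}  {text}")
```

If you need the embedded text exactly as it is stored in the PDF, a PDF text-extraction library is probably the better fit; the confidence values here are a reminder that Textract output can contain OCR misreads even for born-digital PDFs.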

AWS
answered 2 years ago
Expert
reviewed 8 months ago
