1 Answer
- Newest
- Most votes
- Most comments
0
For multi-column documents, I'd suggest trying getLineClustersInReadingOrder()
from the JavaScript/TypeScript version of the Amazon Textract Response parser (reading-order feature doc & code here on GitHub, package on NPM). It's only an approximate/rule-based heuristic though, so might struggle a bit with the warping in your example or need tuning the parameters a little.
For images like your second sample that are a bit more visually busy and light on text, it might be worth trying whether Rekognition DetectText can perform any better?
Relevant content
- asked 5 years ago
- Accepted Answerasked 10 months ago
- AWS OFFICIALUpdated 2 years ago
- AWS OFFICIALUpdated 2 years ago
- AWS OFFICIALUpdated 3 months ago