How does Textract determine when to segment text vertically or horizontally?
How does Textract determine when to segment text vertically or horizontally? Is there a way to help Textract do this?
For example, this one https://gyazo.com/358819c526664c00922c6d231ea38fe5
vs this one https://gyazo.com/0279b6e040a91392e2ccad1f6379ae31
At the moment, I am not aware of any parameter that controls this. However, if you are mostly expecting horizontal text, you might want to try the image above with Amazon Rekognition text detection instead: https://docs.aws.amazon.com/rekognition/latest/dg/text-detection.html
As of now, Textract does not support vertical text alignment. Please refer to https://docs.aws.amazon.com/textract/latest/dg/limits.html for more information. The Textract algorithm performs better on images with sparser text, so we recommend padding the image to improve results.
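To make the padding suggestion concrete, here is a minimal sketch in plain Python. It assumes the image is held as a grid of grayscale pixel rows and that the background is white (255); the helper names `pad_pixels` and `box_to_original` are hypothetical and not part of any AWS SDK. The second helper is there because Textract reports each `BoundingBox` (`Left`, `Top`, `Width`, `Height`) as a ratio of the image it was given, so boxes found on the padded image need to be shifted back if you want coordinates on the original.

```python
def pad_pixels(rows, pad, fill=255):
    """Return a copy of a grayscale pixel grid with `pad` fill pixels on every side."""
    width = len(rows[0]) if rows else 0
    blank = [fill] * (width + 2 * pad)
    # Blank rows above, then each original row with left/right margins, then blank rows below.
    padded = [list(blank) for _ in range(pad)]
    for row in rows:
        padded.append([fill] * pad + list(row) + [fill] * pad)
    padded.extend(list(blank) for _ in range(pad))
    return padded


def box_to_original(box, orig_w, orig_h, pad):
    """Map a normalized box on the padded image to pixel coordinates on the original."""
    padded_w = orig_w + 2 * pad
    padded_h = orig_h + 2 * pad
    return {
        "Left": box["Left"] * padded_w - pad,
        "Top": box["Top"] * padded_h - pad,
        "Width": box["Width"] * padded_w,
        "Height": box["Height"] * padded_h,
    }
```

In practice you would probably do the padding with an image library rather than on raw lists, and how much padding helps (if at all) is something to find out by experiment on your own documents.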