Hi, I helped build the JavaScript/TypeScript version of the TRP library you mentioned. In general, I'd say we think these libraries are pretty helpful for simplifying post-processing logic on Textract results, and we'd encourage GitHub issues/feature requests if you see any shortcomings!
1/ Detecting multiple text orientations on the same page
Yes, but in my experience the accuracy is usually a bit worse than if all text on a page is similarly aligned... I believe this is a pretty common challenge for all OCR solutions.
2/ Relying on page ordering of results
Ideally you might follow the various trees of relationships (e.g. here) from a PAGE block to the types of content it contains... But yes, you should be able to rely on the sequencing: all of a page's content comes before the next PAGE block in the list. The old version of Python TRP depended on this, and the JavaScript/TypeScript version of the library still does.
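As a rough illustration of relying on that sequencing, here is a minimal Python sketch that groups a stream of blocks into document pages without buffering the whole result set. It is not code from either TRP library: `iter_pages` is a hypothetical helper name, and it assumes only that each block is a dict with a `BlockType` key and that a PAGE block precedes all of its page's content.

```python
from typing import Any, Dict, Iterable, Iterator, List

Block = Dict[str, Any]


def iter_pages(blocks: Iterable[Block]) -> Iterator[List[Block]]:
    """Yield one list of blocks per document page, in the order received.

    Assumption: every PAGE block appears before the rest of that page's
    content, and all of that content appears before the next PAGE block.
    """
    current: List[Block] = []
    for block in blocks:
        # A new PAGE block closes the previous page (if any).
        if block.get("BlockType") == "PAGE" and current:
            yield current
            current = []
        current.append(block)
    # Emit whatever is left as the final page.
    if current:
        yield current


sample_blocks = [
    {"BlockType": "PAGE", "Id": "p1"},
    {"BlockType": "LINE", "Id": "l1"},
    {"BlockType": "WORD", "Id": "w1"},
    {"BlockType": "PAGE", "Id": "p2"},
    {"BlockType": "LINE", "Id": "l2"},
]
pages = list(iter_pages(sample_blocks))
```

Because the helper accepts any iterable, the blocks from successive paginated API responses could be chained into it one response at a time, so only the current page needs to be held in memory.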
For the sake of clarity, it's worth pointing out that within a page the order of Textract WORD blocks might not naturally align with human reading order, especially for e.g. multi-column documents. TRP provides some basic client-side heuristics to estimate reading order if you want, but the best performance should come from enabling the Layout analysis feature in Amazon Textract, and taking the returned order of layout blocks as your best-guess guide for reading order.
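A minimal sketch of that idea in Python, assuming the Layout feature was enabled and the blocks for one page are already loaded as dicts. The helper name `lines_in_layout_order` is hypothetical, and the sketch assumes layout blocks have a `BlockType` starting with `LAYOUT_` and reference their LINE blocks through `CHILD` relationships; it is not how TRP itself does it.

```python
from typing import Any, Dict, List

Block = Dict[str, Any]


def lines_in_layout_order(blocks: List[Block]) -> List[str]:
    """Return LINE text in the order implied by the returned layout blocks."""
    by_id = {b["Id"]: b for b in blocks}
    texts: List[str] = []
    for block in blocks:
        # Only layout blocks drive the ordering; skip everything else.
        if not block.get("BlockType", "").startswith("LAYOUT_"):
            continue
        for rel in block.get("Relationships") or []:
            if rel.get("Type") != "CHILD":
                continue
            for child_id in rel.get("Ids", []):
                child = by_id.get(child_id)
                # Collect direct LINE children only; nested layout blocks
                # are assumed to appear in the list in their own right.
                if child is not None and child.get("BlockType") == "LINE":
                    texts.append(child.get("Text", ""))
    return texts


# Two-column example: LINE blocks arrive right column first,
# but the layout blocks list the left column first.
sample_page = [
    {"BlockType": "PAGE", "Id": "p1"},
    {"BlockType": "LINE", "Id": "l1", "Text": "Right column"},
    {"BlockType": "LINE", "Id": "l2", "Text": "Left column"},
    {"BlockType": "LAYOUT_TEXT", "Id": "a",
     "Relationships": [{"Type": "CHILD", "Ids": ["l2"]}]},
    {"BlockType": "LAYOUT_TEXT", "Id": "b",
     "Relationships": [{"Type": "CHILD", "Ids": ["l1"]}]},
]
ordered_text = lines_in_layout_order(sample_page)
```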
3/ Ordering of word polygon points
Yes, the word polygon point array should go from the top-left in a clockwise direction around the word - and this is what both the Python and JS/TS response parser libraries use to judge the "orientation" of a word.
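To illustrate how the point ordering can be turned into an orientation estimate, here is a small Python sketch. It is an assumption about one reasonable way to do it, not the exact calculation either library performs: `word_angle_degrees` is a hypothetical helper, and it assumes each polygon point is a dict with `X` and `Y` keys, with Y increasing down the page.

```python
import math
from typing import Dict, List

Point = Dict[str, float]


def word_angle_degrees(polygon: List[Point]) -> float:
    """Estimate a word's rotation from its first two polygon points.

    Assumption: point 0 is the word's top-left corner and point 1 its
    top-right corner, so the vector between them follows the text's
    top edge. 0 means upright; with Y increasing downward, positive
    angles are clockwise rotations as seen on the page.
    """
    p0, p1 = polygon[0], polygon[1]
    return math.degrees(math.atan2(p1["Y"] - p0["Y"], p1["X"] - p0["X"]))


upright = [
    {"X": 0.1, "Y": 0.1}, {"X": 0.3, "Y": 0.1},
    {"X": 0.3, "Y": 0.2}, {"X": 0.1, "Y": 0.2},
]
rotated_90 = [
    {"X": 0.5, "Y": 0.1}, {"X": 0.5, "Y": 0.3},
    {"X": 0.4, "Y": 0.3}, {"X": 0.4, "Y": 0.1},
]
upright_angle = word_angle_degrees(upright)
rotated_angle = word_angle_degrees(rotated_90)
```

If the coordinates are ratios of page width and height, intermediate angles will be skewed on non-square pages unless X and Y are first scaled by the page dimensions; multiples of 90 degrees are unaffected.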

I only gave the Python version a quick look, but my concern so far is that it looks like it's based on pages of Textract results rather than document pages. The examples seem to work on entire Textract result sets, which doesn't seem ideal for my use-case. I suppose I could pipe the Textract results through this, store them on disk, then process each file or something?
Anyway, if I end up attempting to use this library I'll be sure to post if I run into anything!
Good to know. So far for my use-case it seems like multiple alignments on one page are rare, but I'll keep this in mind if we decide we need to address these better.
Yeah, I agree that would be ideal/safest. I was hoping to avoid needing to buffer all Textract results in memory if I could avoid it, and it sounds like I can, which is great!