Textract: Combining Header and Tabular Information

Hello,

I'm extracting header and table information from a PDF using Textract and need to combine them into an XML file. Currently, I'm using "AnalyzeDocument - Forms" for the header and "AnalyzeDocument - Tables" for the tabular data.

I'm facing a challenge in linking the header and table information, as the unique ID required is only in the header.

Any thoughts on how to handle this? I planned to start with the UI and then switch to the API, but perhaps it's only possible through the API.

Best, Brian

Brian
asked 9 months ago · 169 views
1 Answer
Are you looking for the table header? If so, consider using the TABLE_TITLE block type, which is described in the Textract documentation on tables: https://docs.aws.amazon.com/textract/latest/dg/how-it-works-tables.html. It would also help if you could share the document so we can better diagnose the issue. Please also refer to our best practices to optimize how you use Textract.
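If the goal is to keep the header fields and the table rows together, one possible approach is to request both feature types in a single AnalyzeDocument call (FeatureTypes=["FORMS", "TABLES"]), so that the key-value pairs and the tables arrive in the same response and can be written under one XML root. The following is only a rough sketch under that assumption. The function names and the XML element names (document, header, field, table, row, cell) are made up for illustration, and the code works on an already-fetched response dictionary rather than calling the service itself.

```python
import xml.etree.ElementTree as ET


def block_text(block, by_id):
    """Join the text of the WORD blocks listed as CHILD of this block."""
    words = []
    for rel in block.get("Relationships", []):
        if rel["Type"] == "CHILD":
            for child_id in rel["Ids"]:
                child = by_id[child_id]
                if child["BlockType"] == "WORD":
                    words.append(child["Text"])
    return " ".join(words)


def response_to_xml(response):
    """Build one XML tree holding both form fields and tables.

    `response` is assumed to be the dict returned by AnalyzeDocument
    when called with FeatureTypes=["FORMS", "TABLES"].
    """
    blocks = response["Blocks"]
    by_id = {b["Id"]: b for b in blocks}
    root = ET.Element("document")  # hypothetical element names throughout

    # Header: every KEY block points at its VALUE block(s).
    header = ET.SubElement(root, "header")
    for b in blocks:
        if b["BlockType"] != "KEY_VALUE_SET" or "KEY" not in b.get("EntityTypes", []):
            continue
        values = []
        for rel in b.get("Relationships", []):
            if rel["Type"] == "VALUE":
                values.extend(block_text(by_id[i], by_id) for i in rel["Ids"])
        field = ET.SubElement(header, "field", name=block_text(b, by_id))
        field.text = " ".join(values)

    # Tables: group CELL blocks by RowIndex, order by ColumnIndex.
    for b in blocks:
        if b["BlockType"] != "TABLE":
            continue
        table = ET.SubElement(root, "table")
        rows = {}
        for rel in b.get("Relationships", []):
            if rel["Type"] == "CHILD":
                for i in rel["Ids"]:
                    cell = by_id[i]
                    if cell["BlockType"] == "CELL":
                        rows.setdefault(cell["RowIndex"], []).append(cell)
            elif rel["Type"] == "TABLE_TITLE":
                # Only present when Textract detected a title for the table.
                titles = (block_text(by_id[i], by_id) for i in rel["Ids"])
                table.set("title", " ".join(titles))
        for row_index in sorted(rows):
            row = ET.SubElement(table, "row")
            for cell in sorted(rows[row_index], key=lambda c: c["ColumnIndex"]):
                ET.SubElement(row, "cell").text = block_text(cell, by_id)

    return root
```

Calling ET.tostring(response_to_xml(response), encoding="unicode") would then give a single XML string in which the header field carrying the unique ID sits next to the table rows. Merged cells, selection elements (checkboxes), and multi-page documents are not handled here. For multi-page PDFs the asynchronous StartDocumentAnalysis and GetDocumentAnalysis operations would likely be needed instead of the synchronous call.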

AWS
answered 9 months ago
