Textract: Combining Header and Tabular Information

Hello,

I'm extracting header and table information from a PDF using Textract and need to combine them into an XML file. Currently, I'm using "AnalyzeDocument - Forms" for the header and "AnalyzeDocument - Tables" for the tabular data.

I'm facing a challenge in linking the header and table information, as the unique ID required is only in the header.

Any thoughts on how to handle this? I planned to start with the UI and then switch to the API, but perhaps it's only possible through the API.

Best, Brian

Brian
Asked 10 months ago · 185 views
1 Answer

Are you looking for the table header? If so, consider using TABLE_TITLE. More information is available in the documentation: https://docs.aws.amazon.com/textract/latest/dg/how-it-works-tables.html. It would also help if you could share the document so we can better diagnose the issue. Please also refer to our best practices to optimize how you use Textract.

AWS
Answered 10 months ago
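
One possible way to link the header fields to the table rows is to request both feature types in a single AnalyzeDocument call, so the key-value pairs and the table cells arrive in the same response and can be written under one XML root. The sketch below is only an illustration of that idea, not an official recipe. The element names (`document`, `header`, `field`, `table`, `row`, `cell`), the helper names, and the key label "Invoice ID" are assumptions made for the example; substitute whatever label your header actually uses. The `analyze` helper needs boto3 and AWS credentials; multi-page PDFs would likely have to go through the asynchronous StartDocumentAnalysis / GetDocumentAnalysis operations instead.

```python
import xml.etree.ElementTree as ET


def analyze(document_bytes):
    """Call Textract once with both features (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the parsing helpers below work offline

    client = boto3.client("textract")
    return client.analyze_document(
        Document={"Bytes": document_bytes},
        FeatureTypes=["FORMS", "TABLES"],
    )


def block_text(block, by_id):
    """Join the text of a block's WORD children into one string."""
    words = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def to_xml(response, id_key="Invoice ID"):
    """Build one XML string from a response; id_key is a hypothetical label."""
    blocks = response["Blocks"]
    by_id = {b["Id"]: b for b in blocks}

    # Header: pair each KEY block with the VALUE block(s) it points to.
    header = {}
    for b in blocks:
        if b["BlockType"] == "KEY_VALUE_SET" and "KEY" in b.get("EntityTypes", []):
            value = ""
            for rel in b.get("Relationships", []):
                if rel["Type"] == "VALUE":
                    value = " ".join(block_text(by_id[v], by_id) for v in rel["Ids"])
            header[block_text(b, by_id)] = value

    # The unique ID from the header becomes an attribute of the shared root.
    root = ET.Element("document", id=header.get(id_key, ""))
    head = ET.SubElement(root, "header")
    for name, value in header.items():
        ET.SubElement(head, "field", name=name).text = value

    # Tables: group CELL children by RowIndex, ordered by ColumnIndex.
    for b in blocks:
        if b["BlockType"] != "TABLE":
            continue
        table = ET.SubElement(root, "table")
        cells = [
            by_id[cid]
            for rel in b.get("Relationships", [])
            if rel["Type"] == "CHILD"
            for cid in rel["Ids"]
            if by_id[cid]["BlockType"] == "CELL"
        ]
        rows = {}
        for cell in sorted(cells, key=lambda c: (c["RowIndex"], c["ColumnIndex"])):
            if cell["RowIndex"] not in rows:
                rows[cell["RowIndex"]] = ET.SubElement(table, "row")
            ET.SubElement(rows[cell["RowIndex"]], "cell").text = block_text(cell, by_id)

    return ET.tostring(root, encoding="unicode")
```

With credentials configured, something like `to_xml(analyze(open("page.pdf", "rb").read()))` would produce the combined XML for a single-page document. Because both feature types come back in one response, no separate join step between the Forms output and the Tables output is needed.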
