Textract: Combining Header and Tabular Information


Hello,

I'm extracting header and table information from a PDF using Textract and need to combine them into an XML file. Currently, I'm using "AnalyzeDocument - Forms" for the header and "AnalyzeDocument - Tables" for the tabular data.

I'm facing a challenge in linking the header and table information, as the unique ID required is only in the header.

Any thoughts on how to handle this? I planned to start with the UI and then switch to the API, but perhaps it's only possible through the API.

Best, Brian

Brian
asked 10 months ago, 185 views
1 answer

Are you looking for the table header? If so, consider using TABLE_TITLE. More information is available at https://docs.aws.amazon.com/textract/latest/dg/how-it-works-tables.html. It would also help if you could share the document so we can better diagnose the issue. Please also refer to our best practices to optimize how you use Textract.

AWS
answered 10 months ago
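
On the linking question itself: one possible approach through the API is to request FORMS and TABLES in a single AnalyzeDocument call, so the key-value pairs and the table cells arrive in the same response and can be joined in code. The sketch below is only an illustration under assumptions, not a confirmed solution for this document. It assumes the standard Textract block layout (KEY_VALUE_SET, TABLE, CELL and WORD blocks linked by CHILD and VALUE relationships). The function names, the XML element names, and the key text "Invoice ID:" are hypothetical and should be replaced with whatever the real header contains.

```python
import xml.etree.ElementTree as ET

# The response would come from a call such as (not executed here):
#   boto3.client("textract").analyze_document(
#       Document={"Bytes": file_bytes}, FeatureTypes=["FORMS", "TABLES"])
# Assumption: multi-page PDFs may need the asynchronous
# start_document_analysis / get_document_analysis pair instead.


def _text(block, by_id):
    """Join the WORD children of a block into one string."""
    words = []
    for rel in block.get("Relationships", []):
        if rel["Type"] == "CHILD":
            for child_id in rel["Ids"]:
                child = by_id[child_id]
                if child["BlockType"] == "WORD":
                    words.append(child["Text"])
    return " ".join(words)


def header_fields(blocks, by_id):
    """Map each form key text to its value text."""
    fields = {}
    for block in blocks:
        is_key = (block["BlockType"] == "KEY_VALUE_SET"
                  and "KEY" in block.get("EntityTypes", []))
        if not is_key:
            continue
        value_ids = [i for rel in block.get("Relationships", [])
                     if rel["Type"] == "VALUE" for i in rel["Ids"]]
        fields[_text(block, by_id)] = " ".join(
            _text(by_id[i], by_id) for i in value_ids)
    return fields


def table_rows(table, by_id):
    """Return the table as a list of rows, each a list of cell strings."""
    cells = {}
    for rel in table.get("Relationships", []):
        if rel["Type"] == "CHILD":
            for cell_id in rel["Ids"]:
                cell = by_id[cell_id]
                if cell["BlockType"] == "CELL":
                    key = (cell["RowIndex"], cell["ColumnIndex"])
                    cells[key] = _text(cell, by_id)
    rows = {}
    for (row, _col), text in sorted(cells.items()):
        rows.setdefault(row, []).append(text)
    return [rows[r] for r in sorted(rows)]


def to_xml(response, id_key):
    """Build one XML document that carries the header ID and every table.

    id_key is the (hypothetical) header label whose value is the unique ID.
    """
    blocks = response["Blocks"]
    by_id = {b["Id"]: b for b in blocks}
    fields = header_fields(blocks, by_id)

    root = ET.Element("document", id=fields.get(id_key, ""))
    header = ET.SubElement(root, "header")
    for name, value in fields.items():
        ET.SubElement(header, "field", name=name).text = value
    for block in blocks:
        if block["BlockType"] == "TABLE":
            table = ET.SubElement(root, "table")
            for row in table_rows(block, by_id):
                row_el = ET.SubElement(table, "row")
                for text in row:
                    ET.SubElement(row_el, "cell").text = text
    return ET.tostring(root, encoding="unicode")
```

Calling `to_xml(response, "Invoice ID:")` would then return a single XML string in which the header ID is an attribute of the root element and each table sits beneath it, so the two stay linked. Whether the console UI can produce a combined export is not something this sketch addresses.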
