Specify pages to extract from PDF with C# AWS SDK in Textract


I have a program that calls Textract to extract tables from multi-page PDFs. This has been working great so far. The problem I have run into is that I now have PDFs where I only need certain tables on specific pages, and I am having trouble figuring out how to set the "Pages" property in the QueriesConfig in the StartDocumentAnalysisRequest. A simple example of this, from the StartDocumentAnalysisRequest level, would be sufficient. The program is written in C# using Amazon.Textract and Amazon.Textract.Model.

Thank you!

1 Answer

Thank you for using Textract, and sorry to hear that you are facing issues. Currently, the Pages parameter applies only to the QUERIES feature type, where it is set on each Query inside QueriesConfig. The TABLES feature always runs on all pages of the document. The recommendation is to split the document and then call Textract with the TABLES feature only for the pages that you are interested in.
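For reference, here is a minimal sketch of how Pages can be set from the StartDocumentAnalysisRequest level when the QUERIES feature is used. It assumes the AWSSDK.Textract package and a document already stored in S3; the bucket name, object key, query text, and alias are placeholders, not values from the question. Note that this restricts only the queries to those pages and does not limit TABLES.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Amazon.Textract;
using Amazon.Textract.Model;

public static class TextractPagesExample
{
    // Builds a request that runs a single query against pages 2 and 5-7 only.
    // The query text and alias are illustrative placeholders.
    public static StartDocumentAnalysisRequest BuildRequest(string bucket, string key)
    {
        return new StartDocumentAnalysisRequest
        {
            DocumentLocation = new DocumentLocation
            {
                S3Object = new S3Object { Bucket = bucket, Name = key }
            },
            // Pages is honored only for the QUERIES feature type.
            FeatureTypes = new List<string> { "QUERIES" },
            QueriesConfig = new QueriesConfig
            {
                Queries = new List<Query>
                {
                    new Query
                    {
                        Text = "What is the invoice total?",
                        Alias = "INVOICE_TOTAL",
                        // Page numbers and ranges are passed as strings;
                        // "*" selects every page.
                        Pages = new List<string> { "2", "5-7" }
                    }
                }
            }
        };
    }

    public static async Task Main()
    {
        // Uses the default credential and region resolution of the SDK.
        using var client = new AmazonTextractClient();

        // "my-bucket" and "docs/report.pdf" are placeholder values.
        var request = BuildRequest("my-bucket", "docs/report.pdf");
        var response = await client.StartDocumentAnalysisAsync(request);

        Console.WriteLine($"Started job {response.JobId}");
    }
}
```

Pages is a property of each Query rather than of QueriesConfig itself, so different queries in the same request can target different pages.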

answered 2 months ago
