Hi, appreciate this question is from some time ago now but hope the following might still be useful:
1/ Scanning the whole document: Per the Amazon Textract documentation, Textract will only scan the first page (`["1"]`) for your Queries by default. If you want to scan all pages, you can set the `Pages` parameter of your query to `["*"]`.
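As a rough sketch of what that looks like in practice, the helper below builds the request arguments for `StartDocumentAnalysis` with `Pages` set to `["*"]` on every query. The helper name, bucket, key, and question are placeholders I made up for illustration; the actual service call is left as a comment because it needs boto3 and AWS credentials.

```python
def build_queries_request(bucket, key, questions, pages=("*",)):
    """Build keyword arguments for a Textract StartDocumentAnalysis call using Queries.

    `questions` maps an alias to the question text (hypothetical helper, not an AWS API).
    """
    return {
        "DocumentLocation": {"S3Object": {"Bucket": bucket, "Name": key}},
        "FeatureTypes": ["QUERIES"],
        "QueriesConfig": {
            "Queries": [
                # Pages=["*"] asks Textract to look at every page, not just page 1.
                {"Text": text, "Alias": alias, "Pages": list(pages)}
                for alias, text in questions.items()
            ]
        },
    }


request = build_queries_request(
    bucket="my-example-bucket",    # placeholder bucket name
    key="contracts/sample.pdf",    # placeholder object key
    questions={"CONTRACT_DATE": "What is the contract date?"},
)

# To actually submit the job (requires boto3 and AWS credentials):
#   import boto3
#   job = boto3.client("textract").start_document_analysis(**request)
#   print(job["JobId"])
```

As far as I know, multi-page documents need the asynchronous API shown here, since the synchronous `AnalyzeDocument` call only handles single-page input.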
2/ Checkboxes marked as 'X': Today, the selection elements feature is not customizable or fine-tunable. It should generally recognise a range of elements as outlined in the doc, from checkboxes to radio buttons to circled or crossed text. Assuming you have something like `[X]` as often used in Markdown, I'd tentatively expect it to detect these pretty well. But if it's not working for your particular documents, you'd need to explore a post-processing solution like Amazon Comprehend, rules-based logic, or custom ML models.
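If you do end up going the rules-based route, a minimal sketch might look like the following. It assumes your checkboxes come through in the detected line text as literal `[X]` / `[ ]` markers, which you would need to verify on your own documents; the pattern and function name are my own, not anything Textract provides.

```python
import re

# Matches "[X]", "[x]", "[ ]" or "[]" followed by a label on the same line.
CHECKBOX = re.compile(r"\[\s*([xX]?)\s*\]\s*(.+)")


def parse_checkbox_line(line):
    """Return (label, is_selected) for a text-style checkbox line, or None if no match."""
    match = CHECKBOX.search(line)
    if match is None:
        return None
    mark, label = match.groups()
    # An empty mark means the brackets were blank, i.e. not selected.
    return label.strip(), bool(mark)
```

You would run this over the text of each detected line and keep the non-`None` results.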
3/ Automatically attach queries: Today, you need to specify your input queries on each call to Amazon Textract (e.g. `AnalyzeDocument` or `StartDocumentAnalysis`). To run a fixed set of queries, you would handle this on your application side, perhaps managing the configuration in a store like AWS AppConfig, SSM Parameter Store, or DynamoDB rather than hard-coding it in your app.
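One way this could look, as a sketch only: keep the queries as a JSON list in your config store and turn it into a `QueriesConfig` before each call. The JSON shape (`alias`, `text`, optional `pages`), the function name, and the parameter name in the comment are all assumptions of mine for illustration.

```python
import json


def queries_config_from_json(raw, default_pages=("*",)):
    """Convert a JSON list of {"alias", "text", optional "pages"} into a QueriesConfig dict."""
    entries = json.loads(raw)
    return {
        "Queries": [
            {
                "Text": entry["text"],
                "Alias": entry["alias"],
                # Fall back to scanning all pages when the entry does not say otherwise.
                "Pages": list(entry.get("pages", default_pages)),
            }
            for entry in entries
        ]
    }


# Stand-in for a value fetched from a config store, for example with boto3:
#   raw = boto3.client("ssm").get_parameter(
#       Name="/my-app/textract-queries"   # hypothetical parameter name
#   )["Parameter"]["Value"]
raw = '[{"alias": "INVOICE_NO", "text": "What is the invoice number?"}]'
queries_config = queries_config_from_json(raw)
```

The resulting dict can then be passed as the `QueriesConfig` argument on each Textract request, so changing the queries only means updating the stored value.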
4/ Searching the whole document, including tables etc: As mentioned in 1/, you can configure the `Pages` parameter of your query to scan all pages of your document. Queries should already use the visual/layout information from the page when trying to answer your questions, so there should be no need to explicitly configure it to work together with the `FORMS`/`TABLES` features (or to enable those features on the request if you don't need them).
5/ Adding text to query answers: In general, Textract Queries answers questions by extracting content from the source document. If you just want to add fixed text to your detected results, I'd suggest doing this on the application side, which should be pretty straightforward. If you want more open-ended question answering that transforms the source text (for example "What's the contract date in YYYY-MM-DD format?", or "Summarize the document", or "Is this invoice from before 2020?"), your use case might be better suited to a Generative AI-based technology like Anthropic Claude 3+ on Amazon Bedrock.
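For the fixed-text case, here is a small sketch of the application-side step. It pairs each `QUERY` block with its `QUERY_RESULT` answer and then wraps the answer in a template. The function names, templates, and the hand-built sample blocks are illustrative assumptions, not real Textract output.

```python
def query_answers(blocks):
    """Map each query alias (or its text, if no alias) to its highest-confidence answer."""
    by_id = {block["Id"]: block for block in blocks}
    answers = {}
    for block in blocks:
        if block["BlockType"] != "QUERY":
            continue
        name = block["Query"].get("Alias") or block["Query"]["Text"]
        for rel in block.get("Relationships", []):
            if rel["Type"] != "ANSWER":
                continue
            results = [by_id[i] for i in rel["Ids"] if i in by_id]
            if results:
                best = max(results, key=lambda r: r.get("Confidence", 0.0))
                answers[name] = best["Text"]
    return answers


def decorate(answers, templates):
    """Wrap each answer in its template; answers without a template pass through unchanged."""
    return {name: templates.get(name, "{}").format(text) for name, text in answers.items()}


# Hand-built sample in the shape of Textract's Blocks list (illustrative only).
sample_blocks = [
    {
        "Id": "q1",
        "BlockType": "QUERY",
        "Query": {"Text": "What is the total?", "Alias": "TOTAL"},
        "Relationships": [{"Type": "ANSWER", "Ids": ["a1"]}],
    },
    {"Id": "a1", "BlockType": "QUERY_RESULT", "Text": "42.00", "Confidence": 98.0},
]

decorated = decorate(query_answers(sample_blocks), {"TOTAL": "Total due: {} USD"})
```

With the sample above, `decorated` is `{"TOTAL": "Total due: 42.00 USD"}`. Queries that return no answer are simply left out of the result.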