All Content tagged with Amazon Textract
Amazon Textract is a machine learning (ML) service that automatically extracts text, handwriting, and data from scanned documents.
349 results
Which is better: performing the OCR and querying only with AWS Textract, or separating it into two steps, with Textract for OCR only and another model for understanding?
I find myself in need of extractin...
We are looking to build a solution that extracts data from PDF files and turns it into more structured data that we can use in our application.
We tried the DocumentAnalysis from Textract which rea...
https://docs.aws.amazon.com/textract/latest/dg/limits-document.html
In the above document, under "File Size and Page Count Limits", it states there are different quotas for "Document" Size versus "Fi...
Textract is not detecting BlockType 'QUERY' or 'QUERY_RESULT' in some PDF files. I have uploaded them in the AWS Textract environment on the webpage and am getting the output for the query question. But same ...
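For readers comparing console output with API output, here is a minimal sketch of how `QUERY` and `QUERY_RESULT` blocks can be paired in a response. It assumes the response's `Blocks` list has already been retrieved (all pages of results merged, for an asynchronous job). The helper name `collect_query_answers` and the sample data are hypothetical, and this does not diagnose the poster's specific issue.

```python
from typing import Any


def collect_query_answers(blocks: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Map each query's text to the text of its linked QUERY_RESULT blocks."""
    by_id = {block["Id"]: block for block in blocks}
    answers: dict[str, list[str]] = {}
    for block in blocks:
        if block.get("BlockType") != "QUERY":
            continue
        question = block["Query"]["Text"]
        answers.setdefault(question, [])
        # A QUERY block points at its results through an ANSWER relationship.
        for rel in block.get("Relationships", []):
            if rel.get("Type") != "ANSWER":
                continue
            for result_id in rel.get("Ids", []):
                result = by_id.get(result_id)
                if result and result.get("BlockType") == "QUERY_RESULT":
                    answers[question].append(result.get("Text", ""))
    return answers


# Hypothetical, hand-written blocks shaped like a Textract response.
sample_blocks = [
    {"Id": "q1", "BlockType": "QUERY",
     "Query": {"Text": "What is the invoice number?"},
     "Relationships": [{"Type": "ANSWER", "Ids": ["r1"]}]},
    {"Id": "r1", "BlockType": "QUERY_RESULT", "Text": "INV-001"},
    {"Id": "q2", "BlockType": "QUERY",
     "Query": {"Text": "What is the due date?"}},
]

print(collect_query_answers(sample_blocks))
```

A query with an empty answer list, like the second one in the sample, means a `QUERY` block was returned without any linked `QUERY_RESULT`. That is a different situation from the `QUERY` block being absent altogether.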
Hi, I need help with getting the Python code for extracting section_headers from a multi-page PDF.
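One possible starting point, sketched under the assumption that the document was analyzed with the `LAYOUT` feature type and that the `Blocks` from every page of results have been merged into one list. The helper name `section_headers` and the sample data are hypothetical.

```python
from typing import Any


def section_headers(blocks: list[dict[str, Any]]) -> list[tuple[int, str]]:
    """Return (page, text) for every LAYOUT_SECTION_HEADER block."""
    by_id = {block["Id"]: block for block in blocks}
    headers: list[tuple[int, str]] = []
    for block in blocks:
        if block.get("BlockType") != "LAYOUT_SECTION_HEADER":
            continue
        # The header's text lives in its child LINE blocks.
        child_ids = [
            child_id
            for rel in block.get("Relationships", [])
            if rel.get("Type") == "CHILD"
            for child_id in rel.get("Ids", [])
        ]
        text = " ".join(
            by_id[cid].get("Text", "") for cid in child_ids if cid in by_id
        )
        # Fall back to page 1 when the block carries no Page field.
        headers.append((block.get("Page", 1), text))
    return headers


# Hypothetical, hand-written blocks shaped like a Textract response.
sample_layout_blocks = [
    {"Id": "h1", "BlockType": "LAYOUT_SECTION_HEADER", "Page": 1,
     "Relationships": [{"Type": "CHILD", "Ids": ["l1"]}]},
    {"Id": "l1", "BlockType": "LINE", "Page": 1, "Text": "1. Introduction"},
    {"Id": "t1", "BlockType": "LAYOUT_TEXT", "Page": 1,
     "Relationships": [{"Type": "CHILD", "Ids": ["l2"]}]},
    {"Id": "l2", "BlockType": "LINE", "Page": 1, "Text": "Body text."},
    {"Id": "h2", "BlockType": "LAYOUT_SECTION_HEADER", "Page": 2,
     "Relationships": [{"Type": "CHILD", "Ids": ["l3", "l4"]}]},
    {"Id": "l3", "BlockType": "LINE", "Page": 2, "Text": "2. Terms and"},
    {"Id": "l4", "BlockType": "LINE", "Page": 2, "Text": "Conditions"},
]

print(section_headers(sample_layout_blocks))
```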
I am trying to build a document comparison application as part of my work. The documents I work with contain a lot of text and tabular data. They are PDF files with around 100 pages. I want the a...
Hi, I used Textract bulk document uploader to process over 9k documents. Now I need to download the processed files. However, I found I can only download 50 files each time. Is there any way to downlo...
Hello
Amazon Textract is quite a useful service.
Pricing is clearly articulated here - https://aws.amazon.com/textract/pricing/
The only challenge is how to track utilization.
When I look ...
1. Assume we have 50-250 data points that need to be extracted from PDF files. Each PDF file may be 4-15 pages.
2. The format and layout of each PDF file may be different. A datapoint we're searching fo...
Amazon Textract's pricing page says that the Free Tier lasts for 3 months, but does it count 3 full months from the time you have the service? Or is it 3 calendar months? I started a free trial in the...
I am developing a Bank Statement PDF Converter that converts PDF files into CSV format. I have resolved many bugs in the output and am nearing completion.
However, during testing with sample files, I...
Issue: Textract identifies an enumerated paragraph number as a different layout box and misses the enumeration
Example:
![phone contract example - `layout text 2` should be part of `layout text `](/medi...