Textract all image data in one file

0

Hello everyone. After using bulk file upload, is there a way to make a bulk download of the results? Having to open several zip files takes too much time.

Thanks in advance.

asked a year ago, 305 views
3 Answers
3
Accepted Answer

There is no easy built-in way to do this, but there is a workaround.

You can store all the output in S3:

https://aws.amazon.com/blogs/machine-learning/store-output-in-custom-amazon-s3-bucket-and-encrypt-using-aws-kms-for-multi-page-document-processing-with-amazon-textract/

Then use AWS Lambda or a similar service to combine all the result files into one file, for example:

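import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('your_bucket_name')  # replace with your bucket name
output_key = 'combined_results.txt'

parts = []
for obj in bucket.objects.all():
    # Skip the combined file itself so a rerun doesn't fold it back in.
    if obj.key == output_key:
        continue
    body = obj.get()['Body'].read()
    # Assumes UTF-8 text files. You'll need to adjust for other formats.
    parts.append(body.decode('utf-8') + "\n")

# Write everything back to the bucket as a single object.
bucket.put_object(Key=output_key, Body="".join(parts).encode('utf-8'))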

Finally, download the combined file.
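If the objects in the bucket are raw Textract JSON responses rather than plain text, the text has to be pulled out of each response before concatenating. Here is a minimal sketch of that step, assuming each file holds a JSON document with a top-level `Blocks` list whose `LINE` blocks carry a `Text` field. The helper names `extract_lines` and `combine_documents` are hypothetical, and the input is a plain dict of file name to JSON string, so reading the objects from S3 is left out.

```python
import json
from typing import Dict, List


def extract_lines(response: dict) -> List[str]:
    """Return the text of every LINE block in one Textract-style response."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE" and "Text" in block
    ]


def combine_documents(raw_documents: Dict[str, str]) -> str:
    """Merge several JSON documents (name -> JSON string) into one text blob.

    Each document's lines are preceded by a header naming the source file,
    so the origin of every line stays visible in the combined output.
    """
    sections = []
    for name in sorted(raw_documents):
        lines = extract_lines(json.loads(raw_documents[name]))
        sections.append(f"=== {name} ===\n" + "\n".join(lines))
    return "\n\n".join(sections)
```

The string returned by `combine_documents` could then be uploaded as a single object, in the same way as the combined text file above.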

EXPERT
answered a year ago
2

To open several zip files at once, you can use command-line tools.

Windows:

@echo off
for /R "C:\path\to\your\zips" %%I in (*.zip) do (
    "C:\Program Files\7-Zip\7z.exe" x -o"C:\path\to\extract\to" "%%~fI"
)
pause

Linux:

find /path/to/your/zips -name '*.zip' -exec unzip {} -d /path/to/extract/to \;
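If you would rather end up with one archive than one extraction folder, the same idea can be sketched in Python with only the standard library. This is a rough example, not a tested tool: `merge_zips` is a hypothetical helper name, and it assumes the downloaded archives have distinct file names, since each one's contents go under a folder named after it.

```python
import zipfile
from pathlib import Path


def merge_zips(source_dir: str, target_zip: str) -> int:
    """Copy every member of every .zip under source_dir into one new archive.

    Members are stored under a folder named after their source archive so
    that files with the same name in different archives do not collide.
    Returns the number of members written.
    """
    target = Path(target_zip).resolve()
    written = 0
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as merged:
        for archive in sorted(Path(source_dir).rglob("*.zip")):
            if archive.resolve() == target:
                continue  # skip the output file if it lives inside source_dir
            with zipfile.ZipFile(archive) as source:
                for info in source.infolist():
                    if info.is_dir():
                        continue  # directory entries carry no data
                    merged.writestr(f"{archive.stem}/{info.filename}", source.read(info))
                    written += 1
    return written
```

For example, `merge_zips("/path/to/your/zips", "/path/to/all_results.zip")` would leave a single zip to open instead of several.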

If you're processing documents in bulk using Amazon Textract and want to store the results for later use, you would typically set up an Amazon S3 bucket to store the documents and the results. When you call Textract to process a document, you can specify the bucket where the document is located, and then store the returned data in another object in the bucket.

To download the results in bulk, you could then download the objects from your S3 bucket. The AWS CLI includes a sync command that can be used to download all objects in a bucket:

aws s3 sync s3://mybucket .
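
Once the results have been synced to a local folder, they can be packed into one zip so there is only a single file to pass around. A small sketch using Python's standard library; `zip_directory` is a hypothetical helper name, and the archive should be written outside the results folder so it does not try to include itself.

```python
import shutil


def zip_directory(results_dir: str, archive_base: str) -> str:
    """Pack everything under results_dir into <archive_base>.zip and return its path."""
    return shutil.make_archive(archive_base, "zip", root_dir=results_dir)
```

For example, `zip_directory("/path/to/synced/results", "/path/to/all_results")` writes `/path/to/all_results.zip`.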
EXPERT
answered a year ago
0

Thanks for your answer, but is there a way to get all the results in one file/zip? Having 7-8 different files takes too much time to process.

answered a year ago
