Textract all image data in one file


Hello everyone. After using bulk file upload, is there a way to make a bulk download of the results? Having to open several zip files takes too much time.

Thanks in advance.

Asked a year ago, 323 views
3 Answers
Accepted answer

There is no easy built-in way to do this, but there is a workaround.

First, store all of the output in S3:

https://aws.amazon.com/blogs/machine-learning/store-output-in-custom-amazon-s3-bucket-and-encrypt-using-aws-kms-for-multi-page-document-processing-with-amazon-textract/

Then use AWS Lambda or a similar service to combine all of the result files into one file, for example:

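import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('your_bucket_name')  # replace with your bucket name
combined_key = 'combined_results.txt'

parts = []

for obj in bucket.objects.all():
    # Skip the combined file itself so a rerun does not fold the old result back in.
    if obj.key == combined_key:
        continue
    body = obj.get()['Body'].read()
    # Assumes UTF-8 text files. You'll need to adjust for other formats.
    parts.append(body.decode('utf-8') + "\n")

# Join once at the end and upload the result as UTF-8 bytes.
bucket.put_object(Key=combined_key, Body="".join(parts).encode('utf-8'))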

Finally, download the combined file.

EXPERT
answered a year ago

To extract several zip files at once, you can use command-line tools.

Windows (batch file, assuming 7-Zip is installed in its default location):

@echo off
for /R "C:\path\to\your\zips" %%I in (*.zip) do (
    "C:\Program Files\7-Zip\7z.exe" x -o"C:\path\to\extract\to" "%%~fI"
)
pause

Linux:

find /path/to/your/zips -name '*.zip' -exec unzip {} -d /path/to/extract/to \;
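
For a version that behaves the same on Windows and Linux, the two commands above can be sketched in Python with only the standard library. This is a minimal sketch, not a tested tool: the function name `extract_all_zips` is made up for illustration, and unlike the commands above it extracts each archive into its own subfolder (named after the zip file) so that files with the same name do not overwrite each other.

```python
import zipfile
from pathlib import Path


def extract_all_zips(zip_root, dest):
    """Extract every .zip found under zip_root into its own subfolder of dest."""
    dest = Path(dest)
    extracted = []
    for zip_path in sorted(Path(zip_root).rglob("*.zip")):
        # One subfolder per archive, named after the zip file (hypothetical layout).
        # Two archives with the same file name in different folders would share a subfolder.
        target = dest / zip_path.stem
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(target)
        extracted.append(target)
    return extracted
```

With the placeholder paths used above, the call would be `extract_all_zips("/path/to/your/zips", "/path/to/extract/to")`.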

If you're processing documents in bulk using Amazon Textract and want to store the results for later use, you would typically set up an Amazon S3 bucket to store the documents and the results. When you call Textract to process a document, you can specify the bucket where the document is located, and then store the returned data in another object in the bucket.
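
As a rough illustration of the setup described above, one way to have Textract write the results to S3 itself is the `OutputConfig` parameter of the asynchronous `StartDocumentTextDetection` call in boto3. The sketch below is an assumption about how you might wire this up, not the method used by the bulk uploader: the helper names `build_text_detection_request` and `start_job` are hypothetical, and the bucket and key values are supplied by the caller.

```python
def build_text_detection_request(input_bucket, document_key, output_bucket, output_prefix):
    """Build keyword arguments for Textract's asynchronous text detection call."""
    return {
        # Where the source document lives.
        "DocumentLocation": {"S3Object": {"Bucket": input_bucket, "Name": document_key}},
        # Where Textract should write the result files.
        "OutputConfig": {"S3Bucket": output_bucket, "S3Prefix": output_prefix},
    }


def start_job(request):
    """Submit the request and return the job ID. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so building the request works without the SDK

    textract = boto3.client("textract")
    return textract.start_document_text_detection(**request)["JobId"]
```

Using one output prefix for every document in a batch keeps all of the results under a single S3 path, which makes the bulk download described next simpler.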

To download the results in bulk, you could then download the objects from your S3 bucket. The AWS CLI includes a sync command that can be used to download all objects in a bucket:

aws s3 sync s3://mybucket .
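
If a single archive is wanted after the sync, the downloaded folder can be zipped locally. The following is a minimal sketch using Python's standard library; the function name `bundle_folder` is made up for illustration, and it assumes the results were synced into a dedicated local folder rather than a directory that holds other files.

```python
import shutil
from pathlib import Path


def bundle_folder(folder, archive_base):
    """Zip the contents of folder into archive_base + '.zip' and return the archive path."""
    folder = Path(folder).resolve()
    archive_base = Path(archive_base).resolve()
    if not folder.is_dir():
        raise NotADirectoryError(str(folder))
    # Writing the archive inside the folder being zipped could make it include itself.
    if folder in archive_base.parents:
        raise ValueError("write the archive outside the folder being bundled")
    return shutil.make_archive(str(archive_base), "zip", root_dir=str(folder))
```

For example, after running the sync into a folder named `results` (a hypothetical name), `bundle_folder("results", "textract_results")` would create `textract_results.zip` in the current directory.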
EXPERT
answered a year ago

Thanks for your answer, but is there a way to get all the results in one file/zip? Having 7-8 different files takes too much time to process.

answered a year ago
