Textract all image data in one file

0

Hello everyone. After using bulk file upload, is there a way to make a bulk download of the results? Having to open several zip files takes too much time.

Thanks in advance.

Asked 1 year ago · Viewed 323 times
3 Answers
3
Accepted Answer

There is no built-in way to do this, but there is a workaround.

First, have Textract store all output in S3:

https://aws.amazon.com/blogs/machine-learning/store-output-in-custom-amazon-s3-bucket-and-encrypt-using-aws-kms-for-multi-page-document-processing-with-amazon-textract/

Then use AWS Lambda or a similar service to combine all the result files into one file, for example:

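import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('your_bucket_name')  # placeholder: the bucket holding your results
output_key = 'combined_results.txt'

parts = []

for obj in bucket.objects.all():
    if obj.key == output_key:
        continue  # skip the combined file left over from a previous run
    body = obj.get()['Body'].read()
    # Assumes UTF-8 text files. You'll need to adjust for other formats.
    parts.append(body.decode('utf-8') + "\n")

# Join once at the end instead of growing a string inside the loop.
bucket.put_object(Key=output_key, Body="".join(parts).encode('utf-8'))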

Finally, download the combined file.
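The snippet above assumes plain-text objects. If the stored results are Textract JSON responses instead, a minimal sketch along these lines could pull out the text of each LINE block before combining. The function names, the bucket name, the prefix, and the output key are all hypothetical, and the sketch assumes each object holds one JSON response with a top-level "Blocks" list:

```python
import json


def lines_from_textract_json(payload: bytes) -> list[str]:
    """Return the text of every LINE block in one Textract JSON response."""
    response = json.loads(payload)
    if not isinstance(response, dict):
        return []  # not a response object, nothing to extract
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE" and "Text" in block
    ]


def combine_json_results(bucket_name: str, prefix: str, output_key: str) -> None:
    """Merge the LINE text of all JSON results under `prefix` into one object."""
    import boto3  # imported here so the parser above works without boto3

    bucket = boto3.resource("s3").Bucket(bucket_name)
    lines: list[str] = []
    for obj in bucket.objects.filter(Prefix=prefix):
        if obj.key == output_key:
            continue  # skip the combined file from a previous run
        try:
            lines.extend(lines_from_textract_json(obj.get()["Body"].read()))
        except (ValueError, UnicodeDecodeError):
            continue  # not valid JSON (for example a marker file): skip it
    bucket.put_object(Key=output_key, Body="\n".join(lines).encode("utf-8"))
```

Calling combine_json_results('your_bucket_name', 'textract-output/', 'combined_results.txt') would then leave a single text object to download. Objects are read in the order S3 lists them, so page order may need extra sorting.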

Expert
Answered 1 year ago
2

To extract several zip files at once, you can use command-line tools.

Windows:

@echo off
for /R "C:\path\to\your\zips" %%I in (*.zip) do (
    "C:\Program Files\7-Zip\7z.exe" x -o"C:\path\to\extract\to" "%%~fI"
)
pause

Linux:

find /path/to/your/zips -name '*.zip' -exec unzip {} -d /path/to/extract/to \;
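If the goal is a single archive rather than an extracted folder, the downloaded zips can also be merged locally. The following is only a sketch using Python's standard zipfile module; the function name merge_zips and both paths are placeholders:

```python
import zipfile
from pathlib import Path


def merge_zips(source_dir: str, target_zip: str) -> int:
    """Copy every member of every .zip under source_dir into one archive.

    Members are stored under a folder named after their source archive, so
    files with the same name in different zips do not overwrite each other.
    (Archives with the same file name in different subfolders would still
    share a folder.) Returns the number of members copied.
    """
    target = Path(target_zip).resolve()
    copied = 0
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as merged:
        for archive in sorted(Path(source_dir).rglob("*.zip")):
            if archive.resolve() == target:
                continue  # do not read the archive we are writing
            with zipfile.ZipFile(archive) as source:
                for info in source.infolist():
                    if info.is_dir():
                        continue
                    merged.writestr(f"{archive.stem}/{info.filename}", source.read(info))
                    copied += 1
    return copied
```

For example, merge_zips('/path/to/your/zips', '/path/to/all_results.zip') would produce one zip with a subfolder per original download.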

If you're processing documents in bulk with Amazon Textract and want to keep the results for later use, you would typically set up an Amazon S3 bucket to hold both the documents and the results. When you call Textract to process a document, you specify the bucket where the document is located, and you can then store the returned data as another object in that bucket.
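As a rough illustration of that setup, the asynchronous boto3 call start_document_text_detection accepts a DocumentLocation for the input and an OutputConfig for where results should be written. The helper names, bucket names, key, and prefix below are placeholders, and this is a sketch rather than a complete pipeline (it does not wait for the job to finish):

```python
def build_text_detection_request(
    input_bucket: str, document_key: str, output_bucket: str, output_prefix: str
) -> dict:
    """Build the keyword arguments for an asynchronous text-detection job."""
    return {
        # Where Textract reads the document from.
        "DocumentLocation": {"S3Object": {"Bucket": input_bucket, "Name": document_key}},
        # Where Textract writes the results.
        "OutputConfig": {"S3Bucket": output_bucket, "S3Prefix": output_prefix},
    }


def start_job(
    input_bucket: str, document_key: str, output_bucket: str, output_prefix: str
) -> str:
    """Start the job and return its JobId."""
    import boto3  # imported here so the request builder works without boto3

    textract = boto3.client("textract")
    request = build_text_detection_request(
        input_bucket, document_key, output_bucket, output_prefix
    )
    response = textract.start_document_text_detection(**request)
    return response["JobId"]
```

Pointing every job at the same output prefix keeps all results in one place, which makes the bulk download below a single command.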

To download the results in bulk, you could then download the objects from your S3 bucket. The AWS CLI includes a sync command that can be used to download all objects in a bucket:

aws s3 sync s3://mybucket .
Expert
Answered 1 year ago
0

Thanks for your answer, but is there a way to get all the results in one file or zip? Having 7-8 different files takes too much time to process.

Answered 1 year ago
