Error message when calling the DetectDocumentText operation


Good morning,

I use boto3 to extract text from images with Amazon Textract (part of my code is below). If the test file is 'image.png' I get a result; with any other file I get the following error message: "when calling the DetectDocumentText operation: Unable to get object metadata from S3. Check object key, region and/or access permissions." The access keys in my account are active. What is the problem? Thanks for your feedback.

My code:

import boto3
from trp import Document

s3BucketName = "textract-console-eu-west-3-a6efc625-6d9c-44ec-a867-7d946e8a0c29"
textractmodule = boto3.client('textract', region_name='eu-west-3')

# `test` holds the S3 object key (file name) of the document to read
response = textractmodule.detect_document_text(
    Document={
        'S3Object': {
            'Bucket': s3BucketName,
            'Name': test
        }
    })
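The error text names three things to check: the object key, the region, and access permissions. A minimal sketch of checking each one is shown here. The helper name `diagnose_s3_object` and its messages are hypothetical; it only assumes an object that behaves like `boto3.client("s3")` and uses its `get_bucket_location` and `head_object` calls.

```python
def diagnose_s3_object(s3_client, bucket, key, textract_region):
    """Return a list of likely reasons Textract cannot read s3://bucket/key.

    s3_client is expected to behave like boto3.client("s3"); only
    get_bucket_location and head_object are called on it.
    """
    problems = []

    # 1. Region: the bucket should be in the region the Textract client uses.
    location = s3_client.get_bucket_location(Bucket=bucket).get("LocationConstraint")
    bucket_region = location or "us-east-1"  # S3 reports None for us-east-1
    if bucket_region != textract_region:
        problems.append(
            f"region mismatch: bucket is in {bucket_region}, "
            f"Textract client uses {textract_region}"
        )

    # 2. Key and permissions: head_object fails if the key is wrong or unreadable.
    try:
        s3_client.head_object(Bucket=bucket, Key=key)
    except Exception as exc:
        # botocore's ClientError carries the error code in exc.response
        code = getattr(exc, "response", {}).get("Error", {}).get("Code", "unknown")
        if code in ("404", "NoSuchKey", "NotFound"):
            problems.append(f"object not found: check the exact key {key!r}")
        elif code in ("403", "AccessDenied", "Forbidden"):
            # S3 can also answer 403 for a missing key when s3:ListBucket is not granted
            problems.append(f"access denied for key {key!r}: check s3:GetObject permission")
        else:
            problems.append(f"unexpected error code {code} for key {key!r}")

    return problems


def diagnose_with_boto3(bucket, key, textract_region):
    """Run the same checks against real AWS using the default credentials."""
    import boto3  # imported here so the helper above can be tested without AWS access
    return diagnose_s3_object(boto3.client("s3"), bucket, key, textract_region)
```

Calling `diagnose_with_boto3(s3BucketName, test, 'eu-west-3')` with the failing file name would be one way to see which of the three causes applies. Object keys are case sensitive and include any folder prefix, so a file that sits next to `image.png` in the console can still have a different key than expected.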

asked a year ago · 728 views
3 Answers

Your code is the same as mine.

Here are the details of my role/user: Id; last used: within the last hour; region: eu-west-3; last used service: textract; status: Active. I don't know why it works with image.png but fails with another file.

answered a year ago


Thanks a lot for your response. I found a solution and it works.

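The code:

from PIL import Image
import boto3
import io
import pandas as pd
from trp import Document

# `images` is defined earlier in my script (not shown here).
# The call names on the next four statements were lost in the post;
# image.save / Image.open / im.save are the most likely reading.
image = images[0]
image.save("image.png", format="png")

# Re-open the saved image and encode it as PNG bytes in memory
im = Image.open("image.png")
buffered = io.BytesIO()
im.save(buffered, format='PNG')

# Send the bytes directly instead of pointing Textract at an S3 object
client = boto3.client('textract')
response = client.analyze_document(
    Document={'Bytes': buffered.getvalue()},
    FeatureTypes=['TABLES']
)

# Collect the text and bounding box of every LINE block
Detail_page = []
for item in response["Blocks"]:
    if item["BlockType"] == "LINE":
        tata = item["Geometry"]["BoundingBox"]
        X0, Y0, width, height = tata['Left'], tata['Top'], tata['Width'], tata['Height']
        dim = item["Text"].upper(), X0, Y0, width, height
        Detail_page.append(dim)

df = pd.DataFrame(Detail_page, columns=['text', 'X0', 'Y0', 'width', 'height'])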
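Passing the image as `Bytes` means Textract never looks the file up in S3, which is probably why the S3 metadata error no longer appears. The two ways of building the `Document` argument can be sketched side by side as follows. The helper names `document_from_file` and `document_from_s3` are hypothetical, and the sketch assumes the local file is already in a format Textract accepts (such as PNG or JPEG), in which case re-encoding through PIL should not be needed.

```python
from pathlib import Path


def document_from_file(path):
    """Build the Document argument from a local file, sent as raw bytes."""
    return {"Bytes": Path(path).read_bytes()}


def document_from_s3(bucket, key):
    """Build the Document argument that points Textract at an S3 object."""
    return {"S3Object": {"Bucket": bucket, "Name": key}}


# Usage with a Textract client (needs AWS credentials, so left commented out):
# import boto3
# client = boto3.client("textract")
# response = client.detect_document_text(Document=document_from_file("image.png"))
```

The bytes route sidesteps the S3 lookup but does not explain why the original object could not be read, so the key, region, and permissions are still worth checking if the files need to stay in S3.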

answered a year ago
  • Good to hear that. May I ask you to accept the answer, so that it also helps others? Thanks.


Please try this way:

import boto3 
#import Document

# Add your file to your bucket and change the 2 lines below
s3BucketName = "dus-idp-textract" 
document = "The_river_effect_in_justified_text.jpg"
textractmodule = boto3.client('textract',region_name='eu-west-1')

response = textractmodule.detect_document_text( Document={ 'S3Object': { 'Bucket': s3BucketName, 'Name': document } })

I just tried it and it works. Also, please add the required permissions (a Textract policy) to the role/user you are using, so that it is allowed to call the Amazon Textract APIs.
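As a rough illustration of what such permissions could look like, here is a sketch that builds an IAM policy document in Python. The function name `textract_s3_policy` is hypothetical, and the choice of actions is an assumption based on the calls used in this thread: `textract:DetectDocumentText` and `textract:AnalyzeDocument` for the API calls, plus `s3:GetObject` on the bucket, since the error message mentions access permissions on the S3 object.

```python
import json


def textract_s3_policy(bucket):
    """Return a minimal IAM policy document (as a dict) for Textract plus S3 read."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow the two synchronous Textract calls used in this thread
                "Effect": "Allow",
                "Action": [
                    "textract:DetectDocumentText",
                    "textract:AnalyzeDocument",
                ],
                "Resource": "*",
            },
            {
                # Allow reading the documents stored in the given bucket
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


# Print the JSON so it can be pasted into the IAM console
print(json.dumps(textract_s3_policy("dus-idp-textract"), indent=2))
```

AWS also offers managed policies such as AmazonTextractFullAccess; whether a managed policy or a narrower custom one like this sketch is the better fit depends on your account setup.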

If you consider that this answer helped, please accept it

AWS
answered a year ago
