Error message when calling the DetectDocumentText operation


Good morning,

I use boto3 to extract text from images with AWS Textract (see part of my code below). If the test file is 'image.png' I get a result; with any other file I get the following error message: "when calling the DetectDocumentText operation: Unable to get object metadata from S3. Check object key, region and/or access permissions." The access keys in my account are active. What is the problem? Thanks for your feedback.

My code:

import boto3
from trp import Document

s3BucketName = "textract-console-eu-west-3-a6efc625-6d9c-44ec-a867-7d946e8a0c29"
textractmodule = boto3.client('textract', region_name='eu-west-3')

# `test` holds the name (S3 object key) of the file under test, e.g. 'image.png'
response = textractmodule.detect_document_text(
    Document={
        'S3Object': {
            'Bucket': s3BucketName,
            'Name': test
        }
    })

asked a year ago · 719 views
3 Answers

Your code is the same as mine.

Here is my role/user: Id - Last used (within the last hour) - Region (eu-west-3) - Last used service (textract) - Status (Active). I don't know why it works with image.png and fails with another file.

answered a year ago


Thanks a lot for your response. I found a solution and it works:

The code:

from PIL import Image
import boto3
import io
import pandas as pd
from trp import Document

# `images` is a list of PIL images built earlier in the script (not shown here)
image = images[0]
image.save("image.png", format="png")

# Re-encode the image as PNG bytes in memory
im = Image.open("image.png")
buffered = io.BytesIO()
im.save(buffered, format='PNG')

# Send the image bytes directly instead of referencing an S3 object
client = boto3.client('textract')
response = client.analyze_document(
    Document={'Bytes': buffered.getvalue()},
    FeatureTypes=['TABLES']
)

# Collect the text and bounding box of every detected line
Detail_page = []
for item in response["Blocks"]:
    if item["BlockType"] == "LINE":
        tata = item["Geometry"]["BoundingBox"]
        X0, Y0, width, height = tata['Left'], tata['Top'], tata['Width'], tata['Height']
        dim = item["Text"].upper(), X0, Y0, width, height
        Detail_page.append(dim)

df = pd.DataFrame(Detail_page, columns=['text', 'X0', 'Y0', 'width', 'height'])
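A note on the X0, Y0, width and height columns: Textract reports each BoundingBox as ratios of the page width and height, not as pixels. If pixel coordinates are needed, a conversion along the following lines may help. This is only a sketch; the helper name `to_pixels` and the rounding to whole pixels are my own choices, not part of the Textract API.

```python
def to_pixels(bounding_box, image_width, image_height):
    """Convert a Textract BoundingBox (ratios between 0 and 1) to pixel values.

    `bounding_box` is the dict found under item["Geometry"]["BoundingBox"].
    Returns (left, top, width, height) rounded to whole pixels.
    """
    left = round(bounding_box["Left"] * image_width)
    top = round(bounding_box["Top"] * image_height)
    width = round(bounding_box["Width"] * image_width)
    height = round(bounding_box["Height"] * image_height)
    return left, top, width, height


# Example with a hypothetical 800x600 image
box = {"Left": 0.25, "Top": 0.5, "Width": 0.5, "Height": 0.25}
print(to_pixels(box, 800, 600))
```

With a PIL image, the two size arguments can be taken from `im.size`, which is a `(width, height)` tuple.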

answered a year ago
  • Good to hear that. Could you please accept the answer so that it helps others as well? Thanks.


Please try this way:

import boto3 
#import Document

# Add your file to your bucket and change the two lines below
s3BucketName = "dus-idp-textract" 
document = "The_river_effect_in_justified_text.jpg"
textractmodule = boto3.client('textract',region_name='eu-west-1')

response = textractmodule.detect_document_text(
    Document={
        'S3Object': {
            'Bucket': s3BucketName,
            'Name': document
        }
    })

I just tried it and it works. Also, please attach the Textract permissions (Textract policy) to the Role/User you are using so that it is allowed to call the Amazon Textract APIs.
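Since the error message names three possible causes (object key, region, access permissions), it may help to check them one at a time before calling Textract. The sketch below is one possible way to do that, not an official procedure. The function names `bucket_region` and `check_s3_object` are hypothetical; `get_bucket_location` and `head_object` are regular boto3 S3 client calls. It assumes the caller is allowed to run `s3:GetBucketLocation` and that S3 errors carry a `response` dict, as botocore's `ClientError` does.

```python
def bucket_region(s3_client, bucket):
    """Return the region of `bucket` as reported by S3."""
    location = s3_client.get_bucket_location(Bucket=bucket)["LocationConstraint"]
    if location is None:  # S3 reports us-east-1 as None
        return "us-east-1"
    if location == "EU":  # legacy alias for eu-west-1
        return "eu-west-1"
    return location


def check_s3_object(s3_client, bucket, key, textract_region):
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    # 1. The bucket should be in the same region as the Textract client.
    region = bucket_region(s3_client, bucket)
    if region != textract_region:
        problems.append(
            f"bucket is in {region}, but the Textract client uses {textract_region}"
        )

    # 2. The key must exist exactly as written (case and any prefix included).
    try:
        s3_client.head_object(Bucket=bucket, Key=key)
    except Exception as exc:
        # botocore's ClientError exposes the error code under exc.response
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code is None:
            raise  # not an S3 service error, let it propagate
        if code in ("404", "NoSuchKey"):
            problems.append(f"object key {key!r} was not found in {bucket}")
        elif code == "403":
            problems.append(f"access to {key!r} was denied (check permissions)")
        else:
            problems.append(f"S3 returned error code {code} for {key!r}")
    return problems


# Example usage (needs boto3 and valid credentials; names are placeholders):
#   import boto3
#   s3 = boto3.client("s3")
#   print(check_s3_object(s3, "my-bucket", "my-file.png", "eu-west-3"))
```

An empty list only means these two checks passed; the identity that calls Textract still needs permission to read the object and to call the Textract API.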

If you consider that this answer helped, please accept it

AWS
answered a year ago
