error message: when calling the DetectDocumentText operation


Good morning,

I use boto3 to extract text from images with AWS Textract (part of my code is below). If the test file is 'image.png' I get a result; with any other file I get the following error message: "when calling the DetectDocumentText operation: Unable to get object metadata from S3. Check object key, region and/or access permissions." The access keys in my account are active. What is the problem? Thanks for your feedback.

My code:

import boto3
from trp import Document

s3BucketName = "textract-console-eu-west-3-a6efc625-6d9c-44ec-a867-7d946e8a0c29"
textractmodule = boto3.client('textract', region_name='eu-west-3')

# test holds the object key of the file in the bucket, e.g. 'image.png'
response = textractmodule.detect_document_text(
    Document={
        'S3Object': {
            'Bucket': s3BucketName,
            'Name': test
        }
    })

asked a year ago, 634 views
3 Answers

Your code is the same as mine.

Here is my role/user: Id, last used (within the last hour), region (eu-west-3), last used service (textract), status (Active). I don't know why it works with image.png and fails with another file.
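The error text names three things to check: the object key, the region, and the access permissions. One way to narrow it down is to ask S3 directly about the object and the bucket before calling Textract. The following is only a minimal sketch: `diagnose_s3_document` and `bucket_region` are hypothetical helper names, not part of boto3, and the check assumes that the bucket has to be in the same region as the Textract client.

```python
def bucket_region(s3_client, bucket):
    """Return the bucket's region as reported by S3 GetBucketLocation."""
    location = s3_client.get_bucket_location(Bucket=bucket).get("LocationConstraint")
    # S3 reports None for buckets located in us-east-1.
    return location or "us-east-1"


def diagnose_s3_document(s3_client, bucket, key, textract_region):
    """Hypothetical helper: list likely causes of the Textract S3 error.

    An empty list means none of these checks found a problem.
    """
    problems = []

    # Object key / permissions: HeadObject typically fails with 404 or 403.
    try:
        s3_client.head_object(Bucket=bucket, Key=key)
    except Exception as exc:  # botocore's ClientError carries a .response dict
        error = getattr(exc, "response", {}).get("Error", {})
        code = error.get("Code", "unknown")
        problems.append(f"cannot read s3://{bucket}/{key} (error code {code})")

    # Region: compare the bucket's region with the Textract client's region.
    try:
        region = bucket_region(s3_client, bucket)
    except Exception:
        problems.append(f"cannot look up the region of bucket {bucket}")
    else:
        if region != textract_region:
            problems.append(
                f"bucket is in {region} but Textract is called in {textract_region}"
            )

    return problems
```

With real credentials it could be called as `diagnose_s3_document(boto3.client('s3'), s3BucketName, test, 'eu-west-3')`. If only `image.png` was ever uploaded to the bucket, the first check would report the missing key for any other file name.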

answered a year ago

Hi,

Thanks a lot for your response. I found a solution and it works:

The code:

from PIL import Image
import boto3
import io
import pandas as pd
from trp import Document

# images and path are defined earlier (not shown)
image = images[0]
image.save(path + "image.png", format="png")

# Reload the saved image and serialize it to PNG bytes in memory
im = Image.open(path + "image.png")
buffered = io.BytesIO()
im.save(buffered, format='PNG')

# Send the image bytes directly instead of referencing an S3 object
client = boto3.client('textract')
response = client.analyze_document(
    Document={'Bytes': buffered.getvalue()},
    FeatureTypes=['TABLES']
)

# Collect the text and bounding box of every LINE block
Detail_page = []
for item in response["Blocks"]:
    if item["BlockType"] == "LINE":
        tata = item["Geometry"]["BoundingBox"]
        X0, Y0, width, height = tata['Left'], tata['Top'], tata['Width'], tata['Height']
        dim = item["Text"].upper(), X0, Y0, width, height
        Detail_page.append(dim)

df = pd.DataFrame(Detail_page, columns=['text', 'X0', 'Y0', 'width', 'height'])
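The loop over `response["Blocks"]` can also be written as a small function, which makes it possible to try the logic without calling Textract. This is just a sketch: `lines_to_rows` is a hypothetical name, and `sample_response` is a made-up fragment that only mimics the fields the loop reads (the bounding box values are ratios of the page width and height).

```python
def lines_to_rows(response):
    """Collect (text, left, top, width, height) for every LINE block."""
    rows = []
    for block in response.get("Blocks", []):
        if block.get("BlockType") != "LINE":
            continue  # skip PAGE, WORD, TABLE, CELL, ... blocks
        box = block["Geometry"]["BoundingBox"]
        rows.append(
            (block["Text"].upper(), box["Left"], box["Top"], box["Width"], box["Height"])
        )
    return rows


# Made-up response fragment for illustration only, not real Textract output.
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {
            "BlockType": "LINE",
            "Text": "Hello world",
            "Geometry": {
                "BoundingBox": {"Left": 0.1, "Top": 0.2, "Width": 0.5, "Height": 0.05}
            },
        },
    ]
}

rows = lines_to_rows(sample_response)
```

The resulting list can be passed to `pd.DataFrame(rows, columns=['text', 'X0', 'Y0', 'width', 'height'])` in the same way as `Detail_page`.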

answered a year ago
  • Good to hear that. May I ask you to accept the answer, so that it also helps others? Thanks.


Please try this way:

import boto3 
#import Document

# Add your file to your bucket and change the 2 lines below
s3BucketName = "dus-idp-textract" 
document = "The_river_effect_in_justified_text.jpg"
textractmodule = boto3.client('textract',region_name='eu-west-1')

response = textractmodule.detect_document_text(
    Document={
        'S3Object': {
            'Bucket': s3BucketName,
            'Name': document
        }
    })
response

I just tried it and it works. Also, please add the Textract permissions (Textract policy) to the role/user you are using, so that it is allowed to call the Amazon Textract APIs.
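As a rough illustration of what such permissions could look like, here is a sketch that builds a small identity policy as a Python dict. `build_textract_policy` is a hypothetical helper, and the statement list is an assumption about a minimal setup: it allows the two Textract calls used in this thread plus `s3:GetObject` on the bucket, because the caller also needs read access to the object that Textract is asked to fetch.

```python
import json


def build_textract_policy(bucket):
    """Hypothetical helper: minimal policy for Textract on files in one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # The two synchronous Textract operations used in this thread
                "Effect": "Allow",
                "Action": ["textract:DetectDocumentText", "textract:AnalyzeDocument"],
                "Resource": "*",
            },
            {
                # Read access to the documents stored in the bucket
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


policy = build_textract_policy("dus-idp-textract")
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The printed JSON can be pasted into the policy editor for the role/user; adjust the bucket name and the list of actions to your own setup.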

If you consider that this answer helped, please accept it.

AWS
answered a year ago
