Textract API with Lambda - Getting InvalidS3ObjectException error

Hi, I am trying to run the same code in two ways:

  1. directly from AWS CloudShell with the command 'python3 textract_doc_analysis.py',
  2. through Lambda.

I modified the code for each case, but neither worked.

For the Lambda role, I added the S3 and Textract full-access policies in addition to the Lambda credentials. I also explicitly added the S3 object paths.

----------- Lambda Role --------------

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "logs:CreateLogGroup",
            "Resource": "arn:aws:logs:us-east-2:xxxxxxxxx:"
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": [
                "arn:aws:logs:us-east-2:xxxxxxxx:log-group:/aws/lambda/textract_doc_analysis:"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::learn-textract-bucket-20230626*",
                "arn:aws:s3:::learn-textract-bucket-20230626/*",
                "arn:aws:s3:::learn-textract-bucket-20230626/pdf-invoices3.pdf",
                "arn:aws:s3:::learn-textract-bucket-20230626/pdf-sample1.pdf"
            ]
        }
    ]
}

---- error ------

"errorMessage": "An error occurred (InvalidS3ObjectException) when calling the StartDocumentAnalysis operation: Unable to get object metadata from S3. Check object key, region and/or access permissions.",
"errorType": "InvalidS3ObjectException",
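The error message points at the object key, the region, and the permissions. On the key side, one thing worth ruling out is a leading slash: S3 compares keys literally, so "/pdf-files/sample_pay_stub.pdf" and "pdf-files/sample_pay_stub.pdf" name two different objects. Here is a minimal sketch, assuming the file was uploaded under the pdf-files/ prefix without a leading slash. The thread does not confirm that this was the cause, and normalize_s3_key is a hypothetical helper name.

```python
def normalize_s3_key(path: str) -> str:
    """Drop leading slashes from a path before using it as an S3 object key.

    Assumption: the object was stored without a leading slash, which is
    what the S3 console and most upload tools produce.
    """
    return path.lstrip("/")


# The key used in the question, with the leading slash removed.
key = normalize_s3_key("/pdf-files/sample_pay_stub.pdf")
print(key)
```

The region is the other easy check: the Textract client and the bucket need to be in the same region (us-east-2 in the question).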

----- code -------

import json
import boto3


def lambda_handler(event, context):
    boto3.set_stream_logger(name='botocore')

    # s = boto3.Session(profile_name="default")
    s = boto3.Session()  # ???? Not sure whether this is right ?????

    tx = s.client("textract", region_name='us-east-2')
    doc = "/pdf-files/sample_pay_stub.pdf"
    bucket = "learn-textract-bucket-20230626"

    resp = tx.start_document_analysis(
        DocumentLocation={
            "S3Object": {
                "Bucket": bucket,
                "Name": doc
            }
        },
        FeatureTypes=["TABLES"]
    )

    print(resp)

    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }

  • Ignore the post. I fixed it.
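
A side note for anyone reusing the code above: start_document_analysis is asynchronous, so the printed response contains only a JobId, not the tables. The results come from get_document_analysis once the job has finished. Below is a rough sketch of polling for them, assuming a boto3 Textract client is passed in. wait_for_analysis and its parameters are hypothetical names, and a real Lambda would also have to stay within its own timeout.

```python
import time


def wait_for_analysis(textract, job_id, delay_seconds=5.0, max_attempts=60,
                      sleep=time.sleep):
    """Poll GetDocumentAnalysis until the job finishes, then return all blocks."""
    for _ in range(max_attempts):
        page = textract.get_document_analysis(JobId=job_id)
        status = page["JobStatus"]
        if status != "IN_PROGRESS":
            break
        sleep(delay_seconds)
    else:
        # The loop ran out of attempts without seeing a final status.
        raise TimeoutError(f"Textract job {job_id} is still in progress")

    if status not in ("SUCCEEDED", "PARTIAL_SUCCESS"):
        raise RuntimeError(
            f"Textract job {job_id} ended with status {status}: "
            f"{page.get('StatusMessage')}"
        )

    # Results are paginated; follow NextToken until it is no longer returned.
    blocks = list(page.get("Blocks", []))
    while "NextToken" in page:
        page = textract.get_document_analysis(
            JobId=job_id, NextToken=page["NextToken"]
        )
        blocks.extend(page.get("Blocks", []))
    return blocks
```

In the handler above this would be called as wait_for_analysis(tx, resp["JobId"]).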

asked 10 months ago · 291 views
2 Answers

Please refer to the sample Lambda function that uses Textract: https://docs.aws.amazon.com/textract/latest/dg/lambda.html

In its simplest form, it should look like this:

import boto3
import json
import os


def lambda_handler(event, context):
    print(event)
    # Create the Textract client
    textract = boto3.client('textract')
    # Call Amazon Textract
    response = textract.detect_document_text(
        Document={
            'S3Object': {
                'Bucket': os.environ['BUCKET_NAME'],
                'Name': event['Records'][0]['s3']['object']['key']
            }
        })
    # Print detected text
    print(response)
    return response
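
One detail when the function is triggered by an S3 event, as in the handler above: the object key in the event record is URL-encoded, so a file name with spaces or parentheses will not match the stored key unless it is decoded first. Here is a small sketch of reading the bucket and key from the event, assuming the standard S3 notification layout; s3_location_from_event is a hypothetical helper name.

```python
from urllib.parse import unquote_plus


def s3_location_from_event(event: dict) -> tuple:
    """Return (bucket, key) from the first record of an S3 notification event."""
    s3_info = event["Records"][0]["s3"]
    bucket = s3_info["bucket"]["name"]
    # Keys arrive URL-encoded, with spaces sent as '+', so decode before use.
    key = unquote_plus(s3_info["object"]["key"])
    return bucket, key
```

The decoded key can then be passed as 'Name' in the S3Object argument. The bucket name can come from the event as well, instead of the BUCKET_NAME environment variable.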

AWS
answered 10 months ago

Yes. Got it. Thanks.

answered 10 months ago
