Hi,
I am trying to run the same code in two ways:
- directly from AWS CloudShell with the 'python3 textract_doc_analysis.py' command,
- through Lambda.
In both cases I modified the code, but neither worked.
For the Lambda role, I added the S3 and Textract full-access policies, in addition to the Lambda credentials.
I also explicitly added the S3 object paths.
----------- Lambda Role --------------
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "logs:CreateLogGroup",
            "Resource": "arn:aws:logs:us-east-2:xxxxxxxxx:"
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": [
                "arn:aws:logs:us-east-2:xxxxxxxx:log-group:/aws/lambda/textract_doc_analysis:"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::learn-textract-bucket-20230626*",
                "arn:aws:s3:::learn-textract-bucket-20230626/*",
                "arn:aws:s3:::learn-textract-bucket-20230626/pdf-invoices3.pdf",
                "arn:aws:s3:::learn-textract-bucket-20230626/pdf-sample1.pdf"
            ]
        }
    ]
}
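As a sanity check on the S3 statement, the Resource patterns can be compared against the ARN of the object being analyzed. This is only a rough sketch: it assumes IAM's `*` wildcard behaves like a shell-style glob for these simple patterns, and it uses Python's `fnmatch` as a stand-in rather than IAM's real policy evaluator. The helper names `object_arn` and `is_covered` are hypothetical.

```python
from fnmatch import fnmatchcase

# Resource patterns copied from the S3 statement of the role policy.
S3_RESOURCES = [
    "arn:aws:s3:::learn-textract-bucket-20230626*",
    "arn:aws:s3:::learn-textract-bucket-20230626/*",
    "arn:aws:s3:::learn-textract-bucket-20230626/pdf-invoices3.pdf",
    "arn:aws:s3:::learn-textract-bucket-20230626/pdf-sample1.pdf",
]


def object_arn(bucket, key):
    """Build the ARN of an S3 object from its bucket and key."""
    return f"arn:aws:s3:::{bucket}/{key}"


def is_covered(arn, patterns=S3_RESOURCES):
    """Return True if any pattern glob-matches the ARN (approximation only)."""
    return any(fnmatchcase(arn, pattern) for pattern in patterns)


if __name__ == "__main__":
    arn = object_arn("learn-textract-bucket-20230626", "pdf-files/sample_pay_stub.pdf")
    print(arn, is_covered(arn))
```

If the pay-stub object comes back as covered, the wildcard entry already matches it and the two explicit object paths add nothing.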
---- error ------
"errorMessage": "An error occurred (InvalidS3ObjectException) when calling the StartDocumentAnalysis operation: Unable to get object metadata from S3. Check object key, region and/or access permissions.",
"errorType": "InvalidS3ObjectException",
----- code -------
import json
import boto3


def lambda_handler(event, context):
    # Verbose botocore logging to help debug the request.
    boto3.set_stream_logger(name='botocore')

    # In Lambda the execution role supplies credentials, so no profile_name is needed.
    s = boto3.Session()  # Not sure whether this is right
    tx = s.client("textract", region_name='us-east-2')

    # Assumption: the object sits under the "pdf-files/" prefix.
    # A key written with a leading "/" names a different object.
    doc = "pdf-files/sample_pay_stub.pdf"
    bucket = "learn-textract-bucket-20230626"

    # Start an asynchronous analysis job; the response carries a JobId.
    resp = tx.start_document_analysis(
        DocumentLocation={
            "S3Object": {
                "Bucket": bucket,
                "Name": doc
            }
        },
        FeatureTypes=["TABLES"]
    )
    print(resp)

    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }
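Since the error text points at the object key, one thing worth ruling out is a leading slash in the key: "/pdf-files/sample_pay_stub.pdf" names a different object than "pdf-files/sample_pay_stub.pdf". Below is a minimal sketch of normalizing the key before building the DocumentLocation argument. The helper names `normalize_s3_key` and `build_document_location` are hypothetical, and it assumes the object was uploaded without a leading slash.

```python
def normalize_s3_key(key):
    """Strip leading slashes so the key has the form 'prefix/name.pdf'."""
    return key.lstrip("/")


def build_document_location(bucket, key):
    """Build the DocumentLocation dict passed to start_document_analysis."""
    return {"S3Object": {"Bucket": bucket, "Name": normalize_s3_key(key)}}


if __name__ == "__main__":
    location = build_document_location(
        "learn-textract-bucket-20230626",
        "/pdf-files/sample_pay_stub.pdf",
    )
    print(location)
```

The resulting dict can be passed as DocumentLocation=location in the start_document_analysis call.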
Ignore the post. I fixed it.