Hi there. According to my testing, the API call you are making needs only the following S3 permission:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::rafaxu-bucket/example.pdf"
        }
    ]
}
There are a few other things you can check, though:
- Are there any S3 bucket policies that limit access?
- Is a KMS key applied to the object? If so, your IAM user/role may also need the related KMS permissions.
- You can add boto3.set_stream_logger(name='botocore') to your code to get debug output, which may help.
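The KMS check from the list above could be sketched roughly like this. It is only an illustration, not something from my test setup: the helper name `kms_key_for_object` is made up, and it assumes you pass in a boto3 S3 client that is allowed to call `head_object` on the object (which itself needs `s3:GetObject`). If a key ID comes back, the identity calling Textract will most likely also need `kms:Decrypt` on that key.

```python
def kms_key_for_object(s3_client, bucket, key):
    """Return the KMS key ID if the object uses SSE-KMS, otherwise None.

    `s3_client` is assumed to be a boto3 S3 client, for example
    boto3.Session(profile_name="default").client("s3").
    `kms_key_for_object` is a hypothetical helper name.
    """
    head = s3_client.head_object(Bucket=bucket, Key=key)
    # HeadObject reports "aws:kms" (or "aws:kms:dsse") for KMS-encrypted
    # objects and "AES256" for SSE-S3 encrypted objects.
    if head.get("ServerSideEncryption", "").startswith("aws:kms"):
        return head.get("SSEKMSKeyId")
    return None
```

With the bucket and key from this thread, the call would be `kms_key_for_object(s3, "rafaxu-bucket", "example.pdf")`.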
I recommend separating the S3 upload and the Textract API call into different code snippets for troubleshooting purposes.
Here is my test code, a working example:
import boto3

boto3.set_stream_logger(name='botocore')

s = boto3.Session(profile_name="default")
tx = s.client("textract")

doc = "example.pdf"
bucket = "rafaxu-bucket"

resp = tx.start_document_analysis(
    DocumentLocation={
        "S3Object": {
            "Bucket": bucket,
            "Name": doc
        }
    },
    FeatureTypes=["TABLES"]
)
print(resp)
Here is the IAM policy attached to my IAM user:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::rafaxu-bucket/example.pdf"
        }
    ]
}
- Textract full access, covering the Textract API only.
If I remove the policy, I do get this error: botocore.errorfactory.InvalidS3ObjectException: An error occurred (InvalidS3ObjectException) when calling the StartDocumentAnalysis operation: Unable to get object metadata from S3. Check object key, region and/or access permissions.
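To make that error easier to act on, the call can be wrapped so it prints a short checklist before re-raising. This is a rough sketch under a few assumptions: the helper names `is_invalid_s3_object_error` and `start_analysis_with_hints` are hypothetical, the client passed in is a boto3 Textract client, and the error code is read from the `response` attribute that botocore's `ClientError` carries.

```python
def is_invalid_s3_object_error(err):
    """True if `err` looks like a botocore ClientError whose error code
    is InvalidS3ObjectException."""
    response = getattr(err, "response", None) or {}
    return response.get("Error", {}).get("Code") == "InvalidS3ObjectException"


def start_analysis_with_hints(textract_client, bucket, key):
    """Call StartDocumentAnalysis; print a checklist if S3 access fails."""
    try:
        return textract_client.start_document_analysis(
            DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}},
            FeatureTypes=["TABLES"],
        )
    except Exception as err:
        # Caught broadly so this sketch needs no botocore import; in real
        # code, `except botocore.exceptions.ClientError` is the narrower choice.
        if is_invalid_s3_object_error(err):
            print(f"Textract could not read s3://{bucket}/{key}. Check:")
            print("- the object key is spelled correctly")
            print("- the Textract client uses the same region as the bucket")
            print("- the caller has s3:GetObject on the object")
        raise  # always re-raise so the original traceback is kept
```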
Hope that helps.
Another possible cause is that the document's bucket and the Textract client are in different AWS regions. Make sure the Textract call is made against the same region as the bucket.
# when your bucket is in us-east-2
textract_client = boto3.client('textract', region_name='us-east-2')
Hi, thank you for the response. Incredibly, that seems to have worked... Doesn't instantiating both the s3 and textract clients from the same session object ensure they all use the same region?
@danem: The bucket region is defined when the bucket is created, not when the boto3 client session is instantiated. So every S3 bucket is 'bound' to a specific region. Textract on the other hand is available in most regions and when a boto3 client session is instantiated, it will execute the Textract API call against that region.
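If you would rather not hard-code the region, one option is to look up the bucket's region first and build the Textract client from it. The following is a minimal sketch, not an official pattern: the helper names are hypothetical, it assumes the caller is allowed `s3:GetBucketLocation`, and it relies on S3 returning an empty `LocationConstraint` for us-east-1 and the legacy value `EU` for eu-west-1.

```python
def region_from_location_constraint(location_constraint):
    """Map the LocationConstraint from S3 GetBucketLocation to a region name."""
    if not location_constraint:
        # S3 reports no constraint for buckets in us-east-1.
        return "us-east-1"
    if location_constraint == "EU":
        # Legacy value that some older eu-west-1 buckets report.
        return "eu-west-1"
    return location_constraint


def textract_client_for_bucket(session, bucket):
    """Create a Textract client in the same region as `bucket`.

    `session` is assumed to be a boto3.Session, for example
    boto3.Session(profile_name="default").
    """
    s3 = session.client("s3")
    location = s3.get_bucket_location(Bucket=bucket).get("LocationConstraint")
    region = region_from_location_constraint(location)
    return session.client("textract", region_name=region)
```

For the bucket in this thread that would be `textract_client_for_bucket(boto3.Session(profile_name="default"), "rafaxu-bucket")`.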
Sorry, I'm not sure where that IAM policy needs to be applied. Not on the bucket, right? I've created a new user with the S3FullAccess and TextractFullAccess policies applied, and I'm now using that as the account executing the code, but still running into the same issue. Thank you for your help.