Call to Textract API issue - UnknownOperationException

0

Hello,

I want to call the Amazon Textract DetectDocumentText API from Salesforce, passing a base64 string that represents a black PNG.

I tried the following HTTP request body:

{
"Document": {
"Bytes": "iVBORw0KGgoAAAANSUhEUgAAAGQAAAAKCAQAAADoFTP1AAAAIklEQVR42mNk+MkwLADjqEdGPTLqkVGPjHpk1COjHmFgAAAiwgnFqUltyAAAAABJRU5ErkJggg=="
}
}

I also tried with a file in an S3 bucket, using this JSON:

{
"Document": {
"S3Object": {
"Bucket": "mybucket",
"Name": "filename.jpeg"
}
}
}
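For anyone assembling these two body shapes in code, here is a minimal Python sketch. Python is used only for illustration (the same JSON can be built in Apex), and the function names document_from_bytes and document_from_s3 are hypothetical, not part of any AWS library.

```python
import base64
import json


def document_from_bytes(image_bytes: bytes) -> str:
    """Return a JSON request body that embeds the image as base64 text."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"Document": {"Bytes": encoded}})


def document_from_s3(bucket: str, name: str) -> str:
    """Return a JSON request body that points at an object in S3."""
    return json.dumps({"Document": {"S3Object": {"Bucket": bucket, "Name": name}}})


if __name__ == "__main__":
    # The eight-byte PNG file signature, used here as stand-in image data.
    print(document_from_bytes(b"\x89PNG\r\n\x1a\n"))
    print(document_from_s3("mybucket", "filename.jpeg"))
```

Only one of Bytes or S3Object is populated per request in this sketch, matching the two attempts above.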

Result: HTTP 200 with the POST method, and this response body: {"Output":{"__type":"com.amazon.coral.service#UnknownOperationException"},"Version":"1.0"}

Separately, I called the S3 API from Salesforce and successfully uploaded a file. I am using the same authorization method there, so I assume that part is fine.

I am not sure what I am doing wrong or whether something is missing from my HTTP request.

Thanks for your help.

Edited by: AgustinB on Aug 31, 2021 3:12 PM

Edited by: AgustinB on Aug 31, 2021 3:18 PM

asked 3 years ago, 917 views
8 Answers
0

Thank you for using Amazon Textract. Can you share the code you're using to make the HTTP request so that we can help you better? The endpoint shouldn't have a per-API path appended to it. Instead, select the API with the header x-amz-target: Textract.DetectDocumentText. Could you try this and let us know how it goes?

We also encourage you to use the AWS SDKs. Are you facing any issues with the SDK, or is there a use case that you can't achieve with it?
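To illustrate the point about the endpoint and the target header, here is a small Python sketch. The helper names textract_endpoint and target_header are hypothetical; the URL pattern is the one used elsewhere in this thread for us-east-2.

```python
def textract_endpoint(region: str) -> str:
    """Regional endpoint; the operation name is not part of the URL path."""
    return f"https://textract.{region}.amazonaws.com"


def target_header(operation: str) -> dict:
    """Header that tells the service which Textract operation to invoke."""
    return {"X-Amz-Target": f"Textract.{operation}"}


if __name__ == "__main__":
    print(textract_endpoint("us-east-2"))
    print(target_header("DetectDocumentText"))
```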

AWS
answered 3 years ago
0

Thanks for your reply raghuataws.

I tried adding x-amz-target: Textract.DetectDocumentText to the headers, but I get the same error.

HTTP request from Salesforce:

HttpRequest req = new HttpRequest();
req.setEndpoint('callout:TEXTRACT');
req.setMethod('POST');
req.setHeader('Content-Type', 'application/json');
req.setHeader('X-Amz-Target', 'Textract.DetectDocumentText');
req.setBody('{ "Document": { "Bytes": "", "S3Object": { "Bucket": "", "Name": "", "Version": "" } } }');

Note that the endpoint is set as a Named Credential in Salesforce with the following details:
URL: https://textract.us-east-2.amazonaws.com
Identity Type: Named Principal
Authentication Protocol: AWS Signature Version 4
AWS Access and Secret key from my user in AWS
AWS Region: us-east-2
AWS Service: Textract

This automatically generates an authorization header when making the callout.

Thank you!

answered 3 years ago
0

Can you please share the request ID and region so that we can take a look at our logs? Alternatively, share the complete code snippet so that we can run it and reproduce the issue.

AWS
answered 3 years ago
0

Thanks for your reply, awscaesar.

Request ID: 311e933c-e658-4773-8a66-8486e65ec812
Region: us-east-2

To clarify, I am not doing this through the AWS SDK. I want to call the Amazon Textract API directly from Salesforce with an HTTP callout.
As explained previously, I was able to call the S3 API directly from Salesforce and upload a file. So, unless the AWS SDK is mandatory for Textract, something must be wrong with the request I am sending.

Thanks again.

Regards,

answered 3 years ago
0

Hi AgustinB, according to our logs, the request doesn't appear to contain an access key or account ID, so the request syntax probably needs fixing. This is also a miss on the service side: Textract should return a 400/4xx status with a helpful debugging message rather than 200.

I'll report the issue to the right service team to fix it.
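Until that is fixed, a client cannot rely on the status code alone. The Python sketch below shows one way to check the body for an error type. The function name error_type is hypothetical. The nested "Output" shape comes from the response quoted in the question; checking for a top-level "__type" as well is an assumption.

```python
import json


def error_type(response_text: str):
    """Return the short error name found in a response body, or None."""
    try:
        payload = json.loads(response_text)
    except ValueError:
        return None
    if not isinstance(payload, dict):
        return None
    # Assumption: an error may also carry "__type" at the top level.
    raw = payload.get("__type")
    output = payload.get("Output")
    if raw is None and isinstance(output, dict):
        # Shape seen in this thread: {"Output": {"__type": "..."}}
        raw = output.get("__type")
    if not isinstance(raw, str):
        return None
    # Drop the namespace prefix before the "#", if there is one.
    return raw.split("#")[-1]


if __name__ == "__main__":
    sample = '{"Output":{"__type":"com.amazon.coral.service#UnknownOperationException"},"Version":"1.0"}'
    print(error_type(sample))
```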

AWS
answered 3 years ago
0

Hello awscaesar,

Thanks for taking a look into this.

I don't understand why the access key and account ID are missing from my request.

For authorization I am using the AWS Signature Version 4 method with the following parameters:

  • AccessKey = ******
  • SecretKey = ******
  • AWS Region = us-east-2
  • Service Name = textract

Can the Textract service be authorized with this method? It works for the S3 service.

Thanks also for reporting the 200 status issue from the service side.

Best Regards,

answered 3 years ago
0

Hi AgustinB,

Can you try setting the content type to application/x-amz-json-1.1? This is described in https://docs.aws.amazon.com/translate/latest/dg/API_Reference.html. S3 may support more protocol types, but AWS APIs generally require the content headers specified on that page; otherwise a POST request returns UnknownOperationException.

Additionally, since you are not using the AWS SDK, please make sure you sign the request with SigV4 as described in https://docs.aws.amazon.com/general/latest/gr/signature-version-4.html.
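As a rough illustration of one part of SigV4, the Python sketch below derives the signing key and the credential scope. It does not build the canonical request or the Authorization header, so it is not a complete signer. The secret key is a made-up placeholder, and the date is only an example. The service name is written in lowercase ("textract") because it is hashed as-is and a different spelling yields a different key.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """HMAC-SHA256 of a text message under a binary key."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Chain four HMACs: date, region, service, then the fixed terminator."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def credential_scope(date_stamp: str, region: str, service: str) -> str:
    """Scope string that accompanies the access key in the Authorization header."""
    return f"{date_stamp}/{region}/{service}/aws4_request"


if __name__ == "__main__":
    # Placeholder secret for illustration only; never hard-code a real key.
    key = derive_signing_key("EXAMPLE-SECRET", "20210831", "us-east-2", "textract")
    print(key.hex())
    print(credential_scope("20210831", "us-east-2", "textract"))
```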

Another way to find the required headers is to call the API with the AWS CLI in --debug mode, which prints the header information on screen. For example:

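# Base64-encode the image as a single line (strip any line wrapping)
pic=$(base64 < "$YOUR_IMAGE_PATH" | tr -d '\n')
# Build the Document JSON: {"Bytes": "<base64>"}
doc_string=$(jq -n --arg p "$pic" '{Bytes:$p}')
# --debug prints the request details, including the headers the CLI sends
aws textract detect-document-text --region us-east-2 --document="${doc_string}" --debug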

Thanks,
Siqi

AWS
answered 3 years ago
0

Hi Siqi-AWS,

Changing the Content-Type to application/x-amz-json-1.1, as you suggested, did the trick. The call now works when sending an S3 object to the DetectDocumentText API from Salesforce. Just to clarify, the request was already using SigV4.

Really appreciate your help. Many thanks.
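For anyone landing here later, the working request can be summarized in a short Python sketch. The function name detect_document_text_request is hypothetical, the request is left unsigned (in this thread the Salesforce Named Credential adds the SigV4 Authorization header), and the bucket and file names are the placeholders from the question.

```python
import json


def detect_document_text_request(region: str, bucket: str, name: str) -> dict:
    """Describe the unsigned HTTP request for DetectDocumentText on an S3 object."""
    return {
        "method": "POST",
        # No per-operation path on the endpoint.
        "url": f"https://textract.{region}.amazonaws.com",
        "headers": {
            # The content type that resolved the UnknownOperationException here.
            "Content-Type": "application/x-amz-json-1.1",
            # The operation is selected through this header.
            "X-Amz-Target": "Textract.DetectDocumentText",
        },
        "body": json.dumps(
            {"Document": {"S3Object": {"Bucket": bucket, "Name": name}}}
        ),
    }


if __name__ == "__main__":
    request = detect_document_text_request("us-east-2", "mybucket", "filename.jpeg")
    for key, value in request.items():
        print(key, "=>", value)
```

In Apex, the equivalent change to the snippet posted earlier is the Content-Type value passed to req.setHeader.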

answered 3 years ago
