I need help with Textract's Detecting Text.

I'm trying to call the Textract API so that it returns the same results as the Forms view in the Demo. I have attached a screenshot of the Demo's Forms output for reference.

**Currently, I am using the following code.**

import boto3
import json

class TextractWrapper:
    def __init__(self):
        self.textract_client = boto3.client(
            'textract',
            aws_access_key_id='###############',
            aws_secret_access_key='##################',
        )

    def analyze_document(self, feature_types, document_bytes):
        try:
            response = self.textract_client.analyze_document(
                Document={'Bytes': document_bytes}, FeatureTypes=feature_types)
            print("Detected {} blocks.".format(len(response['Blocks'])))
        except self.textract_client.exceptions.ServiceException as e:
            print("Error analyzing the document: {}".format(e))
            raise

        return response

textract = TextractWrapper()

document_file_name = r"C:\Users\cvict\Desktop\FinGlobal\DOC.jpg"
with open(document_file_name, 'rb') as document_file:
    document_bytes = document_file.read()

feature_types = ['FORMS']

response = textract.analyze_document(feature_types, document_bytes)
response_json = json.dumps(response)
print(response_json)

My problem is that when I run the code, it does not print the key-value dictionaries. I have tried several methods, but I have not been able to achieve the same result as the demo.

1 Answer
import json

import boto3
from botocore.exceptions import ClientError

class TextractWrapper:
    def __init__(self):
        # Placeholder credentials from the question. boto3 can also read them
        # from environment variables or the shared credentials file.
        self.textract_client = boto3.client(
            'textract',
            aws_access_key_id='###############',
            aws_secret_access_key='##################',
        )

    def analyze_document(self, feature_types, document_bytes):
        try:
            response = self.textract_client.analyze_document(
                Document={'Bytes': document_bytes}, FeatureTypes=feature_types)
            print("Detected {} blocks.".format(len(response['Blocks'])))
        except ClientError as e:
            # The Textract client does not expose a ServiceException class;
            # service-side failures surface as botocore's ClientError.
            print("Error analyzing the document: {}".format(e))
            raise

        return response

textract = TextractWrapper()

# Read the image as raw bytes; Textract accepts them in Document['Bytes'].
document_file_name = r"C:\Users\cvict\Desktop\FinGlobal\DOC.jpg"
with open(document_file_name, 'rb') as document_file:
    document_bytes = document_file.read()

# 'FORMS' asks Textract to return KEY_VALUE_SET blocks for form fields.
feature_types = ['FORMS']

response = textract.analyze_document(feature_types, document_bytes)
response_json = json.dumps(response)
print(response_json)
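
The code above only prints the raw response. With `FeatureTypes=['FORMS']`, the form fields come back as `KEY_VALUE_SET` blocks inside `response['Blocks']`, not as a ready-made dictionary, so the key-value pairs shown in the Demo have to be assembled by following each block's `Relationships`. Below is a minimal sketch of one way to do that, assuming the response follows the documented block structure. The helper names `get_text` and `extract_key_values` and the hand-written `sample_blocks` data are hypothetical and exist only for illustration; they are not part of boto3.

```python
def get_text(block, blocks_by_id):
    """Join the text of a block's CHILD blocks (words and checkboxes)."""
    parts = []
    for rel in block.get('Relationships', []):
        if rel['Type'] != 'CHILD':
            continue
        for child_id in rel['Ids']:
            child = blocks_by_id[child_id]
            if child['BlockType'] == 'WORD':
                parts.append(child['Text'])
            elif child['BlockType'] == 'SELECTION_ELEMENT':
                # Checkboxes carry SELECTED / NOT_SELECTED instead of text.
                parts.append(child['SelectionStatus'])
    return ' '.join(parts)


def extract_key_values(blocks):
    """Build a {key text: value text} dict from Textract blocks."""
    blocks_by_id = {b['Id']: b for b in blocks}
    result = {}
    for block in blocks:
        is_key = (block['BlockType'] == 'KEY_VALUE_SET'
                  and 'KEY' in block.get('EntityTypes', []))
        if not is_key:
            continue
        key_text = get_text(block, blocks_by_id)
        value_parts = []
        # A KEY block points at its VALUE block(s) via a 'VALUE' relationship.
        for rel in block.get('Relationships', []):
            if rel['Type'] == 'VALUE':
                for value_id in rel['Ids']:
                    value_parts.append(
                        get_text(blocks_by_id[value_id], blocks_by_id))
        # Note: a repeated key label overwrites the earlier entry.
        result[key_text] = ' '.join(value_parts)
    return result


# Hand-written sample mimicking the shape of response['Blocks'] (assumed data).
sample_blocks = [
    {'Id': 'k1', 'BlockType': 'KEY_VALUE_SET', 'EntityTypes': ['KEY'],
     'Relationships': [{'Type': 'VALUE', 'Ids': ['v1']},
                       {'Type': 'CHILD', 'Ids': ['w1']}]},
    {'Id': 'v1', 'BlockType': 'KEY_VALUE_SET', 'EntityTypes': ['VALUE'],
     'Relationships': [{'Type': 'CHILD', 'Ids': ['w2', 'w3']}]},
    {'Id': 'w1', 'BlockType': 'WORD', 'Text': 'Name:'},
    {'Id': 'w2', 'BlockType': 'WORD', 'Text': 'Jane'},
    {'Id': 'w3', 'BlockType': 'WORD', 'Text': 'Doe'},
]

print(extract_key_values(sample_blocks))  # {'Name:': 'Jane Doe'}
```

With a real response, the call would be `extract_key_values(response['Blocks'])`. A plain dict keeps the sketch short; if the form repeats a label, collecting the values in a list per key would avoid losing entries.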


answered a year ago
