amazon-textract-response-parser (Python) throws "ValidationError: {'_schema': ['Invalid input type.']}" on an Amazon Textract expense analysis response

Hi,

I tried following the instructions at https://aws.amazon.com/fr/blogs/machine-learning/announcing-expanded-support-for-extracting-data-from-invoices-and-receipts-using-amazon-textract/ to parse the Textract response from a call to the get_expense_analysis API (asynchronous API call on a PDF file).

I get a response from the API that appears to be valid JSON and compliant with the Textract Expense API documentation.

But when I execute the following code, which I copied from the article:

from trp.trp2_expense import TAnalyzeExpenseDocument, TAnalyzeExpenseDocumentSchema
t_doc = TAnalyzeExpenseDocumentSchema().load(out)
# out is the JSON output from the expense analysis API call (I tried both the text form and the parsed dictionary form to be sure)

I get the following error:

---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
/tmp/ipykernel_7538/447086103.py in <cell line: 2>()
      1 from trp.trp2_expense import TAnalyzeExpenseDocument, TAnalyzeExpenseDocumentSchema
----> 2 t_doc = TAnalyzeExpenseDocumentSchema().load(out)

~/anaconda3/envs/python3/lib/python3.8/site-packages/marshmallow/schema.py in load(self, data, many, partial, unknown)
    717             if invalid data are passed.
    718         """
--> 719         return self._do_load(
    720             data, many=many, partial=partial, unknown=unknown, postprocess=True
    721         )

~/anaconda3/envs/python3/lib/python3.8/site-packages/marshmallow/schema.py in _do_load(self, data, many, partial, unknown, postprocess)
    902             exc = ValidationError(errors, data=data, valid_data=result)
    903             self.handle_error(exc, data, many=many, partial=partial)
--> 904             raise exc
    905 
    906         return result

ValidationError: {'_schema': ['Invalid input type.']}

According to the article, this is supposed to work out of the box.

I am using Python 3.8.12 and successfully installed amazon-textract-response-parser-0.1.30, botocore-1.24.46 and marshmallow-3.14.1.

What am I missing? (I can share the PDF or the output of the API call if needed.)

Any help welcome.

Christian.

c_ds
asked 2 years ago · 621 views
1 Answer

Can you file a ticket against the repository? https://github.com/aws-samples/amazon-textract-response-parser/issues Ideally, attach the JSON that causes the issue; that will help with debugging. Thank you.
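
Before filing, it may be worth ruling out one possible cause. marshmallow reports 'Invalid input type.' under '_schema' when the object passed to load() is not a dict at the top level, for example a JSON string (possibly double-encoded) or a list of page responses. The following is only a minimal sketch of such a check: ensure_mapping is a hypothetical helper, not part of amazon-textract-response-parser, and the sample response is a stand-in rather than real Textract output.

```python
import json
from collections.abc import Mapping


def ensure_mapping(out):
    """Return `out` as a dict, decoding JSON text first if needed.

    Hypothetical helper for diagnosis only; it is not part of the parser library.
    """
    if isinstance(out, (str, bytes, bytearray)):
        # A JSON string must be decoded before it is handed to Schema.load().
        out = json.loads(out)
    if not isinstance(out, Mapping):
        # A list (e.g. several page responses) or a still-encoded string ends up here.
        raise TypeError(
            f"expected a JSON object at the top level, got {type(out).__name__}"
        )
    return dict(out)


# Stand-in for a real response; the content is illustrative only.
sample = '{"DocumentMetadata": {"Pages": 1}, "ExpenseDocuments": []}'
doc = ensure_mapping(sample)
print(type(doc).__name__, sorted(doc))
```

If the check passes, doc is what would be handed to TAnalyzeExpenseDocumentSchema().load(doc). If it raises TypeError, the reported type is useful detail to include in the ticket.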

AWS
answered 2 years ago
  • Hi Martin. Done. Thanks!
