Hi,
I tried following the instructions at https://aws.amazon.com/fr/blogs/machine-learning/announcing-expanded-support-for-extracting-data-from-invoices-and-receipts-using-amazon-textract/ to parse the Textract response from a call to the get_expense_analysis API (an asynchronous API call on a PDF file).
The API returns a response that appears to be valid JSON and compliant with the Textract Expense API documentation.
But when I execute the following code, which I copied from the article:
from trp.trp2_expense import TAnalyzeExpenseDocument, TAnalyzeExpenseDocumentSchema
t_doc = TAnalyzeExpenseDocumentSchema().load(out)
# out is the JSON output from the analyze expense API call (I tried both the raw text and the parsed dictionary, to be sure)
I get the following error:
---------------------------------------------------------------------------
ValidationError Traceback (most recent call last)
/tmp/ipykernel_7538/447086103.py in <cell line: 2>()
1 from trp.trp2_expense import TAnalyzeExpenseDocument, TAnalyzeExpenseDocumentSchema
----> 2 t_doc = TAnalyzeExpenseDocumentSchema().load(out)
~/anaconda3/envs/python3/lib/python3.8/site-packages/marshmallow/schema.py in load(self, data, many, partial, unknown)
717 if invalid data are passed.
718 """
--> 719 return self._do_load(
720 data, many=many, partial=partial, unknown=unknown, postprocess=True
721 )
~/anaconda3/envs/python3/lib/python3.8/site-packages/marshmallow/schema.py in _do_load(self, data, many, partial, unknown, postprocess)
902 exc = ValidationError(errors, data=data, valid_data=result)
903 self.handle_error(exc, data, many=many, partial=partial)
--> 904 raise exc
905
906 return result
ValidationError: {'_schema': ['Invalid input type.']}
It is supposed to work out of the box as documented in the article.
I am using Python 3.8.12 and have successfully installed amazon-textract-response-parser 0.1.30, botocore 1.24.46, and marshmallow 3.14.1.
What am I missing? (I can share the PDF or the output of the API call if needed.)
Any help welcome.
Christian.
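One thing worth checking: marshmallow reports `{'_schema': ['Invalid input type.']}` when the object given to `load()` is not a dict-like mapping at the top level, for example a JSON string or a list of paginated responses. Below is a minimal sketch, assuming `out` may arrive in one of those forms. `normalize_expense_response` is a hypothetical helper name, not part of amazon-textract-response-parser, and combining pages by concatenating `ExpenseDocuments` is an assumption about how paginated results should be merged. The sketch does not verify that the schema then accepts every key of the asynchronous response.

```python
import json
from collections.abc import Mapping


def normalize_expense_response(out):
    """Return a single dict that can be handed to a marshmallow Schema.load().

    Hypothetical helper (not part of the trp package). Accepts a JSON string,
    a single response dict, or a list of paginated response dicts.
    """
    # A JSON string must be parsed first; load() expects a mapping, not text.
    if isinstance(out, (str, bytes, bytearray)):
        out = json.loads(out)

    # A single response is already in the right shape.
    if isinstance(out, Mapping):
        return dict(out)

    # Assumption: a list holds one dict per page of an asynchronous job,
    # and the pages are combined by concatenating their ExpenseDocuments.
    if isinstance(out, list):
        merged = {"DocumentMetadata": {}, "ExpenseDocuments": []}
        for page in out:
            if not isinstance(page, Mapping):
                raise TypeError(f"page is {type(page).__name__}, expected a dict")
            merged["DocumentMetadata"] = page.get(
                "DocumentMetadata", merged["DocumentMetadata"]
            )
            merged["ExpenseDocuments"].extend(page.get("ExpenseDocuments", []))
        return merged

    raise TypeError(f"unsupported response type: {type(out).__name__}")


# Quick check of what is actually being passed to load():
# print(type(out))
# t_doc = TAnalyzeExpenseDocumentSchema().load(normalize_expense_response(out))
```

If `type(out)` already prints `dict`, the helper returns it unchanged and the cause lies elsewhere.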
Hi Martin. Done. Thanks!