A2I for Named Entity Recognition in PDFs

Hi! I developed my own custom Named Entity Recognizer in Comprehend by following these 2 posts on the blog:

https://aws.amazon.com/pt/blogs/machine-learning/custom-document-annotation-for-extracting-named-entities-in-documents-using-amazon-comprehend/

https://aws.amazon.com/pt/blogs/machine-learning/extract-custom-entities-from-documents-in-their-native-format-with-amazon-comprehend/

Now I would like to make A2I work with this recognizer. My idea is simple: check the confidence scores of the recognized entities and, if a score falls below a specified threshold, send the PDF document to A2I. In the review interface, I would then highlight the parts of the PDF that should be labeled as the desired entity. This is similar to the approach in this other post: https://aws.amazon.com/pt/blogs/machine-learning/setting-up-human-review-of-your-nlp-based-entity-recognition-models-with-amazon-sagemaker-ground-truth-amazon-comprehend-and-amazon-a2i/. The difference is that in my workflow the entire document should appear on screen to be highlighted and corrected, instead of only the text as in that post. Is this possible?
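To make the threshold idea concrete, here is a minimal sketch of the check. It assumes each recognized entity is a dict with a `Score` field between 0 and 1, as in Comprehend entity results; the function names, the 0.8 threshold, and the sample entity types are made up for illustration.

```python
from typing import Any


def entities_below_threshold(
    entities: list[dict[str, Any]], threshold: float
) -> list[dict[str, Any]]:
    """Return the entities whose confidence score is below the threshold.

    Assumption: each entity carries a "Score" key. A missing score is
    treated as 0.0 so that the entity is flagged for review.
    """
    return [e for e in entities if e.get("Score", 0.0) < threshold]


def needs_human_review(entities: list[dict[str, Any]], threshold: float = 0.8) -> bool:
    """True if at least one entity falls below the threshold."""
    return len(entities_below_threshold(entities, threshold)) > 0


# Hypothetical recognizer output for one PDF.
sample_entities = [
    {"Text": "ACME Ltd", "Type": "COMPANY", "Score": 0.97},
    {"Text": "2021-03-04", "Type": "DUE_DATE", "Score": 0.55},
]

if needs_human_review(sample_entities):
    print("Send this document to A2I")
```

Only documents for which `needs_human_review` returns `True` would be sent to a human loop; the rest would skip review.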

1 Answer
Hi

If you start a custom task, you can build your own HTML template.

The key is how you build the template. You can use an iframe together with the grant_read_access filter to generate a temporary pre-signed URL that shows the PDF from S3. For example: https://docs.aws.amazon.com/sagemaker/latest/dg/sms-ui-template-crowd-classifier.html

But I prefer to use a JS library like pdf.js to render the PDF.

Please also read this documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/a2i-custom-templates.html
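As a rough sketch of the calling side, the snippet below starts a human loop and passes the S3 location of the PDF to a custom template. The key names `pdf_s3_uri` and `entities`, the `ner-review-` name prefix, and both function names are assumptions chosen for this example. Inside the template, the URI would then be available as `{{ task.input.pdf_s3_uri | grant_read_access }}`.

```python
import json
import uuid
from typing import Any


def build_human_loop_input(pdf_s3_uri: str, entities: list[dict[str, Any]]) -> str:
    """Serialize the data the worker template receives as task.input.

    The key names here are hypothetical; they only need to match what
    the custom template reads.
    """
    return json.dumps({"pdf_s3_uri": pdf_s3_uri, "entities": entities})


def start_review(
    flow_definition_arn: str, pdf_s3_uri: str, entities: list[dict[str, Any]]
) -> str:
    """Start an A2I human loop for one PDF and return the human loop ARN."""
    # Imported here so build_human_loop_input can be used without boto3.
    import boto3

    client = boto3.client("sagemaker-a2i-runtime")
    response = client.start_human_loop(
        # Loop names must be unique; a UUID suffix is one simple way to do that.
        HumanLoopName=f"ner-review-{uuid.uuid4()}",
        FlowDefinitionArn=flow_definition_arn,
        HumanLoopInput={"InputContent": build_human_loop_input(pdf_s3_uri, entities)},
    )
    return response["HumanLoopArn"]
```

`start_review` needs AWS credentials and an existing flow definition that uses the custom template, so it is not called here.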

Answered 2 years ago
