A2I for Named Entity Recognition in PDFs


Hi! I developed my own custom Named Entity Recognizer in Comprehend by following these 2 posts on the blog:

https://aws.amazon.com/pt/blogs/machine-learning/custom-document-annotation-for-extracting-named-entities-in-documents-using-amazon-comprehend/

https://aws.amazon.com/pt/blogs/machine-learning/extract-custom-entities-from-documents-in-their-native-format-with-amazon-comprehend/

Now I would like to make A2I work with this recognizer. My idea is simple: check the confidence scores of the recognized entities and, when a score falls below a specified threshold, send the PDF document to A2I. In the review interface, I would then highlight the parts of the PDF that should be labeled as the desired entity. This is similar to the approach in this other post: https://aws.amazon.com/pt/blogs/machine-learning/setting-up-human-review-of-your-nlp-based-entity-recognition-models-with-amazon-sagemaker-ground-truth-amazon-comprehend-and-amazon-a2i/. The difference is that in my workflow the entire document should appear on screen to be highlighted and corrected, instead of just the text as in that post. Is this possible?
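To make the intended trigger concrete, here is a minimal Python sketch of the confidence check described above. It assumes the entities are dicts with a `Score` field, as in Comprehend entity responses. The input keys `taskObject` and `entities`, the helper names, and the 0.8 default threshold are all hypothetical and chosen only for illustration. The call to `start_human_loop` is only reached when at least one entity is below the threshold.

```python
import json


def low_confidence_entities(entities, threshold=0.8):
    """Return the entities whose confidence Score is below the threshold."""
    return [e for e in entities if e.get("Score", 0.0) < threshold]


def build_human_loop_input(pdf_s3_uri, entities):
    """Serialize the task input that the worker template will receive.

    The keys "taskObject" and "entities" are assumptions; a template would
    read them as task.input.taskObject and task.input.entities.
    """
    return json.dumps({"taskObject": pdf_s3_uri, "entities": entities})


def send_for_review_if_needed(pdf_s3_uri, entities, flow_definition_arn,
                              loop_name, threshold=0.8):
    """Start an A2I human loop only if some entity is below the threshold."""
    if not low_confidence_entities(entities, threshold):
        return None  # every entity is confident enough, no review needed

    import boto3  # imported lazily so the pure helpers work without AWS access

    a2i = boto3.client("sagemaker-a2i-runtime")
    return a2i.start_human_loop(
        HumanLoopName=loop_name,
        FlowDefinitionArn=flow_definition_arn,
        HumanLoopInput={
            "InputContent": build_human_loop_input(pdf_s3_uri, entities)
        },
    )
```

Sending the full entity list, not only the flagged ones, would let the reviewer see the model's confident predictions in context; that is a design choice of this sketch, not a requirement.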

Asked 2 years ago, viewed 418 times
1 answer

Hi

If you start a custom task, you can build your own HTML template.

The key is how you build the template. You can also use an iframe together with the grant_read_access filter, which generates a temporary signed URL, to show the PDF from S3. For example: https://docs.aws.amazon.com/sagemaker/latest/dg/sms-ui-template-crowd-classifier.html

But I prefer to use a JS library such as pdf.js to render the PDF.

Please also read this documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/a2i-custom-templates.html
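As a rough illustration of the iframe approach mentioned above, the sketch below keeps a custom worker template in a Python string and registers it with `create_human_task_ui`. It assumes the task input carries the PDF's S3 URI under a hypothetical `taskObject` key. The field name `corrections` and the function name are also made up for this example. It only displays the PDF and collects corrections as free text; actual highlighting on the page would need custom JavaScript, for example on top of pdf.js, which is not shown here.

```python
# Custom worker template (HTML + Liquid), kept as a plain string.
# "task.input.taskObject" is an assumed input key holding the PDF's S3 URI;
# grant_read_access turns it into a temporary signed URL for the worker.
TEMPLATE = """
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
<crowd-form>
  <iframe src="{{ task.input.taskObject | grant_read_access }}"
          style="width: 100%; height: 600px;"></iframe>
  <crowd-text-area name="corrections"
                   label="Corrected entities (one per line)"
                   required></crowd-text-area>
</crowd-form>
"""


def register_template(ui_name, template=TEMPLATE):
    """Create a human task UI from the template and return its ARN."""
    import boto3  # imported lazily so the template can be inspected offline

    sagemaker = boto3.client("sagemaker")
    response = sagemaker.create_human_task_ui(
        HumanTaskUiName=ui_name,
        UiTemplate={"Content": template},
    )
    return response["HumanTaskUiArn"]
```

The returned ARN would then be referenced from the flow definition used by the human loop. Whether a PDF renders inline in the iframe depends on the worker's browser, so it is worth testing in the worker portal before relying on it.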

Answered 2 years ago
