Can Comprehend find text?

Can the Comprehend model return the full text of a passage, given a phrase from it that I search for?

For example, suppose the text below is one section of a newspaper page (in PDF format) that covers several subjects. This section is about weddings and has a title and body text:

Wedding 2022

We had a very traditional wedding and it was extremely expensive, but it was worth it. Carol and I only paid half. Her parents paid for everything else. We got married in church. Carol wore a white dress and she looked fantastic. I wore a suit and I think I looked quite good too! We had a big reception. We had 200 guests. The reception was in a wonderful hotel. We took lots of pictures. It was just great!

If I send the model “Wedding 2022” and “We got married in church”, will it be able to find this passage among the other topics, and will I receive the full text below?

Wedding 2022

We had a very traditional wedding and it was extremely expensive, but it was worth it. Carol and I only paid half. Her parents paid for everything else. We got married in church. Carol wore a white dress and she looked fantastic. I wore a suit and I think I looked quite good too! We had a big reception. We had 200 guests. The reception was in a wonderful hotel. We took lots of pictures. It was just great!

Is Comprehend the best tool to try to solve this problem?

1 Answer

Comprehend is not a search tool. It is an API that makes it easy to:

  • Detect the dominant language
  • Detect named entities
  • Detect key phrases
  • Determine sentiment
  • Analyze targeted sentiment
  • Detect syntax
  • Detect events
  • Perform topic modeling

from documents you provide through the real-time or batch APIs. It returns a JSON-formatted response containing the inferred elements. For instance:

{
    "LanguageCode": "en",
    "KeyPhrases": [
        {
            "Text": "today",
            "Score": 0.89,
            "BeginOffset": 14,
            "EndOffset": 19
        },
        {
            "Text": "Seattle",
            "Score": 0.91,
            "BeginOffset": 23,
            "EndOffset": 30
        }
    ]
}

Notice that the response contains BeginOffset and EndOffset, which tell you where each key phrase was detected in the document, should you want to pull that text (or more text around it) from the document. If your objective is natural-language full-text search over documents, I'd recommend looking into Amazon Kendra (https://aws.amazon.com/kendra/).
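As a rough sketch of the offset idea, the helper below widens a detected span to the paragraph that contains it. It assumes the PDF has already been converted to plain text and that paragraphs are separated by blank lines; neither is something Comprehend guarantees. The names `enclosing_paragraph` and `key_phrase_paragraphs` are illustrative, and `detect_key_phrases` wraps the boto3 call that would return a response shaped like the JSON above.

```python
from __future__ import annotations


def enclosing_paragraph(text: str, begin: int, end: int) -> str:
    """Return the blank-line-delimited paragraph that contains text[begin:end]."""
    # Search backwards for the previous blank line; rfind returns -1 if none.
    start = text.rfind("\n\n", 0, begin)
    start = 0 if start == -1 else start + 2
    # Search forwards for the next blank line; find returns -1 if none.
    stop = text.find("\n\n", end)
    stop = len(text) if stop == -1 else stop
    return text[start:stop].strip()


def key_phrase_paragraphs(text: str, response: dict) -> list[tuple[str, str]]:
    """Pair each detected key phrase with the paragraph it was found in."""
    return [
        (kp["Text"], enclosing_paragraph(text, kp["BeginOffset"], kp["EndOffset"]))
        for kp in response.get("KeyPhrases", [])
    ]


def detect_key_phrases(text: str, language_code: str = "en") -> dict:
    """Call Comprehend's real-time API (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work offline

    client = boto3.client("comprehend")
    return client.detect_key_phrases(Text=text, LanguageCode=language_code)
```

With this layout the title is its own paragraph, so returning "Wedding 2022" together with the body would also mean including the paragraph before the match. Note that Comprehend only tells you where phrases it detected are located; it does not look up a phrase you supply.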

If you want to see both of these services working together to provide knowledge extraction and AI/ML-powered natural-language search, check out the Document Understanding Solution: https://aws.amazon.com/solutions/implementations/document-understanding-solution/
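For the Kendra route suggested above, a query could look roughly like the sketch below. It assumes a Kendra index already exists and has ingested the newspaper PDFs; the index ID is a placeholder and `search_passages` is a hypothetical helper name. The client is passed in so the function can be tried with a stub before wiring it to AWS.

```python
from __future__ import annotations


def search_passages(kendra_client, index_id: str, query: str) -> list[dict]:
    """Run a natural-language query and return title/excerpt/URI for each hit."""
    response = kendra_client.query(IndexId=index_id, QueryText=query)
    results = []
    for item in response.get("ResultItems", []):
        results.append({
            "title": item.get("DocumentTitle", {}).get("Text", ""),
            "excerpt": item.get("DocumentExcerpt", {}).get("Text", ""),
            "uri": item.get("DocumentURI", ""),
        })
    return results


# Example wiring (needs boto3, AWS credentials, and a real index ID):
#   import boto3
#   hits = search_passages(boto3.client("kendra"), "YOUR-INDEX-ID",
#                          "Wedding 2022 married in church")
```

As far as I know, Kendra returns relevant excerpts rather than the whole article, so getting the complete text would still mean fetching the source document from the returned URI.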

answered 6 months ago
