Error: The provided entity lists contained only generic or high-frequency words. Please provide additional unique entity names for training


I am trying to train Comprehend to identify my custom entity types. When I submit a training job for my custom entities, it fails with the error below.

"The provided entity lists contained only generic or high-frequency words. Please provide additional unique entity names for training."

What does the error really mean?

I created 5 entity types, each with 200 different text entries, and about 40 documents of training data. Each training document is stored as a separate file in the S3 bucket. I provided separate S3 bucket names for entity.csv and for the 40 documents, which are stored as txt files.

Any help will be appreciated.

Asked 2 years ago, 473 views
1 Answer

Comprehend will filter out stop words, such as "the", from the provided list of entity names. If Comprehend finds no samples for a given entity type after this filtering, then it will result in the error message mentioned. Please make sure you aren't using stop words for entity names.
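
As a rough way to check this locally before resubmitting, something like the sketch below can flag entity types that would be left with no entries once stop words are removed. It assumes the entity list is a CSV with `Text` and `Type` columns. The stop-word set is only an illustrative sample, since the exact list Comprehend filters against is not given here, and `find_empty_entity_types` is a hypothetical helper, not part of any AWS SDK.

```python
import csv
import io
from collections import defaultdict

# Illustrative sample only (assumption): the real list Comprehend uses may differ.
ASSUMED_STOP_WORDS = {"the", "a", "an", "and", "or", "of", "in", "on", "to", "is", "it", "for"}


def find_empty_entity_types(csv_text, stop_words=ASSUMED_STOP_WORDS):
    """Return the entity types whose entries consist only of stop words."""
    kept = defaultdict(list)   # entity type -> entries that survive filtering
    seen_types = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        text = row["Text"].strip()
        entity_type = row["Type"].strip()
        seen_types.add(entity_type)
        tokens = text.lower().split()
        # Keep an entry if at least one of its words is not a stop word.
        if tokens and not all(token in stop_words for token in tokens):
            kept[entity_type].append(text)
    return sorted(t for t in seen_types if not kept[t])


sample = "Text,Type\nthe,ARTICLE\nof,ARTICLE\nAcme Corp,COMPANY\n"
print(find_empty_entity_types(sample))
```

Any type this reports is a candidate for the error above. Adding more distinctive entity names for that type, or dropping the type, would be the next thing to try.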

AWS
Answered 2 years ago
