How to improve Comprehend results when the model returns incorrect results?


What is the best way to improve the results I receive back from Comprehend when using Custom Entities?

I've trained the model, but the results are still falling short. I'm not sure if training the model with more annotation examples will improve my results. It would be great to be able to correct the model when it gets something wrong. What is the best way to accomplish my goal of improving the output over time?

For example, AWS returned "select common flexbase" as a single "material" entity. It should have been three separate "material" entities (each was annotated separately hundreds, if not thousands, of times in the training annotations).

In another example from the same sample email, AWS returned "Rowlett @ Bickers Drive & Westover Drive Export Trees Trash Concrete Asphalt Dirt Import Select Common Flexbase (Crushed Conc) Flexbase (Crushed Limestone)......." as a single "address" entity. Only "Rowlett @ Bickers Drive & Westover Drive" is the address; most of the remainder consists of individual "material" entities.

I have a lot of other examples, but it's a bit difficult explaining them without the benefit of attaching the source files and the AWS Comprehend output.

edbyu
asked 5 years ago, 334 views
3 Answers

Hi,
I am from the Comprehend team and have sent you a private message requesting your dataset. We can analyse the model and the error samples to suggest next steps.

Thanks

answered 5 years ago

If anyone else is struggling with this, I was able to achieve better results by making each document a single line in the training file (rather than preserving the line breaks of the source document).

Example: if the source document is a 10-line email, put the entire email on one single line (no line breaks) in your training file, not on 10 different lines.
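In case it helps, here is a minimal Python sketch of that flattening step. The function names `flatten_document` and `write_training_file` are hypothetical helpers of mine, not part of any AWS SDK, and joining the lines with a single space is an assumption; use whatever separator suits your data.

```python
import re


def flatten_document(text: str) -> str:
    """Collapse a multi-line document into one line (hypothetical helper)."""
    # Replace every run of line breaks, plus any whitespace around it,
    # with a single space so the whole document fits on one line.
    return re.sub(r"\s*[\r\n]+\s*", " ", text.strip())


def write_training_file(documents, path):
    """Write one flattened document per line (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        for doc in documents:
            line = flatten_document(doc)
            if line:  # skip documents that are empty after flattening
                f.write(line + "\n")


email = "Hello,\r\nPlease quote:\n\nSelect\nCommon\nFlexbase\n"
print(flatten_document(email))
```

Note that if you train with an annotations file rather than an entity list, the line numbers and character offsets in it would most likely need to be recomputed against the flattened text.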

I hope this makes sense.

edbyu
answered 5 years ago

Hi,

If I follow your suggestion of one document per line, then after training and getting the F1 score, how can I find out in detail which lines have problems?

Thanks.

Best regards

P-san
answered 3 years ago
