How to improve accuracy of speaker diarization?


What steps can I take to improve AWS Transcribe's Speaker Diarization accuracy?

Unfortunately, the algorithm is not doing a great job of correctly identifying who is talking, even with a clean audio recording. It's even having trouble distinguishing between a man's and a woman's voice.

Much appreciated!

boogie
asked 2 years ago · viewed 534 times
1 Answer

Beyond starting from a clean audio recording, I'd tune the following factors:

  • When using custom vocabularies: keep the list small, and provide IPA pronunciations if you can.
  • When using real-time streams: two to five speakers seem best.
  • When using an audio source: set the "Maximum number of speakers" to the actual number of speakers in the file.
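
For batch jobs submitted through the API rather than the console, the last point can be sketched roughly as follows. This is a minimal example, not a definitive setup: it assumes boto3 is installed and credentials are configured, and the helper names (`build_diarization_request`, `start_job`), the job name, and the S3 URI are placeholders made up for illustration. `ShowSpeakerLabels` and `MaxSpeakerLabels` are the `Settings` fields of `start_transcription_job` that turn on diarization and cap the speaker count.

```python
def build_diarization_request(job_name, media_uri, num_speakers, language_code="en-US"):
    """Build keyword arguments for a batch transcription job with diarization.

    Hypothetical helper: the function name and argument names are illustrative.
    """
    # The API rejects MaxSpeakerLabels below 2; check the current docs for the upper limit.
    if num_speakers < 2:
        raise ValueError("num_speakers must be at least 2 for speaker diarization")
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "Media": {"MediaFileUri": media_uri},
        "Settings": {
            "ShowSpeakerLabels": True,         # enable speaker diarization
            "MaxSpeakerLabels": num_speakers,  # match the real speaker count
        },
    }


def start_job(request):
    """Submit the request; needs boto3 and valid AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    return boto3.client("transcribe").start_transcription_job(**request)


# Placeholder job name and S3 URI, for illustration only.
request = build_diarization_request(
    "interview-diarization-demo", "s3://example-bucket/interview.wav", num_speakers=2
)
# start_job(request)  # uncomment to actually submit the job
```

Building the request separately from submitting it makes it easy to check the settings before a job is started.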

For sources and more information:
https://aws.amazon.com/transcribe/faqs/
https://docs.aws.amazon.com/transcribe/latest/dg/diarization.html

answered 2 years ago
