How to explicitly choose from several pronunciations


One frustrating thing about using Polly is finding that small changes to a sentence have a dramatic impact on the way words in the sentence are pronounced. For example, consider the following sentences:

  • Your watch will now be activated.
  • Your car will now be activated.

If you have Polly say these using the English (Indian) voice Raveena, the word "activated" is pronounced very differently in the two sentences. In fact, "will now" is also pronounced a bit differently.

Is there anything I can do, aside from manually defining a phoneme for the word "activated", to get Polly to use one of the pronunciations instead of the other, regardless of the structure of the sentence? I realize there's a lot of smarts involved in getting a sentence to be spoken, but it's frustrating to know she can say "activated" better than she does in my particular sentence.


asked 3 years ago, 13 views
2 Answers

Hi DanGoyette,

Thanks for reporting this issue. Unfortunately, it looks like this is a unit selection issue that is only reproducible in this instance. Currently, there is no easy way to fix the root cause. But you can work around the issue by wrapping the word in a phoneme tag with slightly tweaked phonemes:

```
Your watch will now be <phoneme alphabet="x-sampa" ph='%ak.tIv."eIt.Id'>activated</phoneme>.
```
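
For anyone who wants to try this workaround from code, here is a minimal sketch, assuming Python with boto3 and configured AWS credentials. The helper names `phoneme_ssml` and `synthesize` and the output file name are made up for illustration; `synthesize_speech` with `TextType="ssml"` is the boto3 Polly call. The phoneme string is the one from the answer above, and the text is wrapped in `<speak>` because Polly expects that root element for SSML input.

```python
from xml.sax.saxutils import escape, quoteattr


def phoneme_ssml(before, word, ph, after="", alphabet="x-sampa"):
    """Build an SSML document that pins the pronunciation of one word.

    Hypothetical helper: quoteattr picks a quote style that is safe for
    phoneme strings containing a double quote (the X-SAMPA stress marker).
    """
    tag = (
        f"<phoneme alphabet={quoteattr(alphabet)} ph={quoteattr(ph)}>"
        f"{escape(word)}</phoneme>"
    )
    return f"<speak>{escape(before)}{tag}{escape(after)}</speak>"


def synthesize(ssml, voice_id="Raveena", path="activated.mp3"):
    """Send the SSML to Polly and save the audio (needs AWS credentials)."""
    import boto3  # imported here so the SSML helper works without boto3

    polly = boto3.client("polly")
    response = polly.synthesize_speech(
        Text=ssml,
        TextType="ssml",
        VoiceId=voice_id,
        OutputFormat="mp3",
    )
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())


ssml = phoneme_ssml("Your watch will now be ", "activated", '%ak.tIv."eIt.Id', ".")
print(ssml)
# To hear the result: synthesize(ssml)
```

Building the tag in one place makes it easy to reuse the same phoneme string in both the "watch" and "car" sentences, so the word should be rendered from the same phonemes in each.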

answered 3 years ago

Thank you for the help. It raises one other question: in order to make it easier to author phoneme tags, is there a way to output speech as phonemes? That way, if I like the way the system pronounces something, I could ideally copy/paste the phoneme it's using.

I tried changing the output type to Speech Marks and selecting SSML as the Speech Mark type, hoping that it would output text with phoneme tags. However, it appears to output only an empty file. I wouldn't be surprised if phoneme tags are used only as input to your system, and the system doesn't generate phoneme tags as part of its text-to-speech conversion process.

Anyway, the more general question is whether there's an easier way to author phoneme tags that sound correct, other than trial-and-error?
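
For reference, a rough sketch of what the Speech Marks output looks like when requested through boto3. As far as the documentation describes it, the `ssml` mark type only reports `<mark>` tags present in the input, which would explain the empty file, and the available types are `sentence`, `word`, `viseme`, and `ssml`. Visemes describe mouth shapes rather than phonemes, so they are at best a loose hint. The helper names below are hypothetical, and the sample payload is hand-written in the documented line-delimited JSON shape, not real Polly output.

```python
import json


def parse_speech_marks(payload):
    """Parse Polly speech marks: one JSON object per non-empty line."""
    return [json.loads(line) for line in payload.splitlines() if line.strip()]


def fetch_speech_marks(text, voice_id="Raveena", mark_types=("word", "viseme")):
    """Request speech marks instead of audio (needs AWS credentials)."""
    import boto3  # imported here so the parser works without boto3

    polly = boto3.client("polly")
    response = polly.synthesize_speech(
        Text=text,
        VoiceId=voice_id,
        OutputFormat="json",  # speech marks are returned as JSON lines
        SpeechMarkTypes=list(mark_types),
    )
    return parse_speech_marks(response["AudioStream"].read().decode("utf-8"))


# Hand-written sample in the speech-mark shape, for illustration only.
sample = (
    '{"time":0,"type":"word","start":0,"end":4,"value":"Your"}\n'
    '{"time":0,"type":"viseme","value":"i"}\n'
)
marks = parse_speech_marks(sample)
visemes = [mark["value"] for mark in marks if mark["type"] == "viseme"]
print(visemes)
# With credentials: fetch_speech_marks("Your watch will now be activated.")
```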


answered 3 years ago
