Skip to content

Voice in Amazon Lex still sounds robotic and delayed even after setting Joanna (NTTS)

0

I'm building a voice bot using Amazon Lex V2 integrated with Amazon Connect and Amazon Bedrock for generative AI responses. I have configured the voice in the Lex bot's conversational profile to use Joanna (Neural Text-to-Speech) , but the voice still sounds robotic instead of natural.

Additionally, there is a noticeable delay between the user input and the system's spoken response.

I’ve verified that:

The conversational profile is correctly associated with the Lex bot. Joanna (NTTS) is selected under Audio generation in the profile settings. The integration with Bedrock seems to be functioning, as responses are generated and passed back to Lex. Despite this, the quality of the voice output is not matching expectations , and the latency affects the user experience .

Has anyone experienced similar issues with NTTS voices sounding robotic or delays in response when integrating Lex + Bedrock? Are there any optimizations or configurations I might be missing?

Any advice on improving voice quality and reducing latency would be greatly appreciated!

Thanks.

2 Answers
1

Have you try to use Lex without Bedrock to see if the delay and voice quality is still there? Make sure in your Amazon Connect instance configuration to choose "Use the best available voice" https://docs.aws.amazon.com/connect/latest/adminguide/update-instance-settings.html#update-contact-flow-settings

also double check if you have the right "Set Voice" block settings to work with LexV2 in Connect https://docs.aws.amazon.com/connect/latest/adminguide/set-voice.html#set-voice-lexv2bot

In addition, you need to use proper alias and versioning and NOT use the testbotalias to improve of speed and delay https://docs.aws.amazon.com/connect/latest/adminguide/amazon-lex.html#lex-bot-create-bot-version

answered 2 days ago
0

When using Amazon Lex with neural voices like Joanna (NTTS), there are a few factors that could be affecting your voice quality and response time.

First, regarding the voice quality sounding robotic: While you've selected Joanna (Neural TTS), you should verify that the engine parameter is explicitly set to "neural" in your voice settings. The default engine is "standard" if not specified, which could explain why you're still getting robotic-sounding speech despite selecting a neural voice.

Amazon Polly's Neural TTS engine uses a different approach than standard voices - it employs a neural network to convert phonemes into spectrograms and then uses a vocoder to convert those into audio signals. This produces more natural-sounding speech compared to the concatenative synthesis used by standard voices.

For the latency issue, this is a common challenge when integrating multiple services (Lex + Bedrock + Polly). The delay occurs because:

  1. Bedrock needs time to generate the AI response
  2. The response then needs to be processed by Lex
  3. Lex needs to use Polly to convert the text to speech

Each of these steps adds to the overall latency. Additionally, neural voices, while higher quality, can sometimes introduce slightly more processing time compared to standard voices.

To improve your implementation:

  1. Ensure the engine parameter is explicitly set to "neural" in your voice settings
  2. Consider optimizing your Bedrock prompts to generate more concise responses when appropriate
  3. Review your network configuration to minimize latency between services
  4. If latency is a critical concern and you can't optimize further, you might need to consider the tradeoff between voice quality and response time

It's worth noting that Amazon Polly now also offers generative voices, which provide even more natural-sounding speech. However, this is a relatively new feature and may not yet be fully integrated with all Lex implementations.
Sources
Neural voices - Amazon Polly
Amazon Lex – Migrate interactive voice response (IVR) flows to Amazon Lex – Amazon Web Services
VoiceSettings - Amazon Lex API Reference
Use generative voice with Lex | AWS re:Post

answered 4 days ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.