StartSpeechSynthesisTask is too slow

Question

It looks like the audio becomes available on S3 only after it has been fully synthesized, so I will keep using my text-chunking code to start playback sooner. Amazon, is it not possible to stream incomplete audio?
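
For readers who want to see what such a chunking workaround might look like, here is a minimal sketch (not the asker's actual code). It assumes a boto3 Polly client is passed in as `polly`; the helper names `split_into_chunks` and `synthesize_chunks`, the `Joanna` voice, and the 1000-character chunk size are illustrative choices, not values from this thread.

```python
import re


def split_into_chunks(text, max_chars=1000):
    """Greedily pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole rather than cut mid-sentence.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def synthesize_chunks(polly, text, voice_id="Joanna", max_chars=1000):
    """Yield MP3 bytes chunk by chunk, so playback can start after the first chunk."""
    for chunk in split_into_chunks(text, max_chars):
        response = polly.synthesize_speech(
            Text=chunk, OutputFormat="mp3", VoiceId=voice_id
        )
        yield response["AudioStream"].read()
```

With a real client this could be driven by something like `for audio in synthesize_chunks(boto3.client("polly"), long_text): play(audio)`, where `play` stands for whatever audio output the application uses.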

Answer

Hi zdanevich-vitaly-andreevich,  
  
Thanks for contacting us!  
  
By design, the audio file produced by StartSpeechSynthesisTask becomes available for download only once the task has finished.  
Amazon S3 never exposes partial objects, so an object cannot appear in S3 until it has been completely uploaded.  
If your requirement is low latency, I recommend using the SynthesizeSpeech operation instead.  
  
On the other hand, if you are synthesizing very long texts and still care about latency, you can combine the two: synthesize the first chunk with SynthesizeSpeech and the rest with StartSpeechSynthesisTask.  
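
A minimal sketch of that combination, assuming a boto3 Polly client is passed in as `polly` and that the caller has already split the text into `first_chunk` and `remainder`. The function names, the `Joanna` voice, and the polling interval are illustrative assumptions, not part of the service.

```python
import time


def synthesize_hybrid(polly, first_chunk, remainder, bucket, voice_id="Joanna"):
    """Return (audio bytes for the first chunk, task id for the remainder)."""
    # Synchronous call: the audio comes back directly in the response.
    first = polly.synthesize_speech(
        Text=first_chunk, OutputFormat="mp3", VoiceId=voice_id
    )
    first_audio = first["AudioStream"].read()

    # Asynchronous call: the output lands in S3 only when the task completes.
    task = polly.start_speech_synthesis_task(
        Text=remainder,
        OutputFormat="mp3",
        VoiceId=voice_id,
        OutputS3BucketName=bucket,
    )
    return first_audio, task["SynthesisTask"]["TaskId"]


def wait_for_task(polly, task_id, poll_seconds=2.0, timeout_seconds=300.0):
    """Poll the task until it finishes and return the S3 URI of the audio file."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        task = polly.get_speech_synthesis_task(TaskId=task_id)["SynthesisTask"]
        if task["TaskStatus"] == "completed":
            return task["OutputUri"]
        if task["TaskStatus"] == "failed":
            raise RuntimeError(task.get("TaskStatusReason", "synthesis task failed"))
        time.sleep(poll_seconds)
    raise TimeoutError(f"task {task_id} did not finish in {timeout_seconds} seconds")
```

The idea is to start playing `first_audio` immediately and call `wait_for_task` in the background, so the remainder is ready in S3 by the time the first chunk finishes playing.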
  
Thanks,  
Hubert

Answer

Hi,
I wanted to use audio files synthesized by Polly in my Alexa skill, but had to abandon the idea.
Alexa waits about 8 seconds for a response and then stops responding, whereas StartSpeechSynthesisTask takes at least 20-30 seconds even for short sentences of 60-80 characters.
I also found in the Alexa Skills development documentation that the scenario I described is not allowed by AWS for security reasons: audio files used with the SSML <audio /> element must be publicly accessible, with no authentication required: https://developer.amazon.com/en-US/docs/alexa/custom-skills/speech-synthesis-markup-language-ssml-reference.html#audio
