AWS Polly Speech Generation: Is it possible to produce speech plus speech marks in one call?


I need both speech output (as .MP3) and speechMarks output (as .JSON). Currently, I'm using two calls from the CLI based on the documentation, one call to generate each one for the same text.

I believe that means I'm getting billed twice for the same text. Is this correct? Are you billed twice, once for audio, and a second time for speech marks?

Making two calls also takes more time and seems to duplicate effort, since the server is presumably doing the same computation twice.

Is there a way to make a single call that generates both speech audio output (mp3) and SpeechMarks (json) in a single call, and/or a way to pay once rather than twice for the same text?

Related question: I also need multiple speech variants of the same text to allow for end-user preferences (e.g. different voices, different speed/prosody). Is there a way to batch-generate multiple sets of speech output from a single call to reduce the cost, rather than paying for the full text again for each small variant?

I would prefer to do this using the CLI, but I'm also fine with using tasks, the JS API, the Python API, etc.

Here are the docs with examples of the calls to generate audio and SpeechMarks:

https://docs.amazonaws.cn/en_us/polly/latest/dg/using-speechmarks.html

https://docs.amazonaws.cn/en_us/polly/latest/dg/get-started-cli-exercise.html

Thanks for your help

ccccc
Asked 2 years ago · 306 views
1 Answer

Hello,

It's not possible to get both speech audio output (mp3) and SpeechMarks (json) from a single API call. One API call returns either audio output or speech marks output, not both. It's also currently not possible to request multiple outputs for different input parameters in the same API call. For different speech variants, a separate API call needs to be made for each one.
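Since each output needs its own request, the two calls can at least be wrapped in one helper. The following is a minimal sketch, not an official recipe: it assumes a boto3 Polly client is passed in, and the function names `synthesize_audio_and_marks` and `synthesize_variants`, the default voice `Joanna`, and the chosen speech mark types are illustrative assumptions. It still issues two `synthesize_speech` requests per voice.

```python
import json


def synthesize_audio_and_marks(polly, text, voice_id="Joanna",
                               mark_types=("sentence", "word")):
    """Return (mp3_bytes, speech_marks) using two SynthesizeSpeech requests.

    `polly` is expected to be a boto3 Polly client (assumption); any object
    with a compatible synthesize_speech method works.
    """
    # Request 1: the audio itself.
    audio_resp = polly.synthesize_speech(
        Text=text, VoiceId=voice_id, OutputFormat="mp3"
    )
    audio = audio_resp["AudioStream"].read()

    # Request 2: speech marks. OutputFormat must be "json" for this.
    marks_resp = polly.synthesize_speech(
        Text=text,
        VoiceId=voice_id,
        OutputFormat="json",
        SpeechMarkTypes=list(mark_types),
    )
    # Speech marks arrive as one JSON object per line, not a JSON array.
    raw = marks_resp["AudioStream"].read().decode("utf-8")
    marks = [json.loads(line) for line in raw.splitlines() if line.strip()]
    return audio, marks


def synthesize_variants(polly, text, voice_ids):
    """Generate audio and marks for several voices, one pair of requests each."""
    return {v: synthesize_audio_and_marks(polly, text, v) for v in voice_ids}


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   polly = boto3.client("polly")
#   audio, marks = synthesize_audio_and_marks(polly, "Hello world")
#   with open("hello.mp3", "wb") as f:
#       f.write(audio)
```

Passing the client in as an argument keeps the helper independent of credentials and region configuration, and makes it easy to substitute a stub when testing.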

AWS
Support Engineer
Ayush_S
Answered 2 years ago
