Getting User Utterance as Text in Amazon Sumerian


I have a Sumerian Host that acts basically as a front-end for a Lex chatbot.

However, in some cases I need to do some processing based on the actual user utterance (that is, the text of what the user says). Is there a way to use the "Send Audio to Lex" action (or a different one) to get a text version of the user's audio, in other words to perform speech-to-text?

Maxi
Asked 2 years ago · 295 views
1 Answer
Accepted Answer

Hello Maxi,

If you only need to transcribe user input, Amazon Transcribe would be a better fit. However, if you need to process the user input in the context of the bot, you can attach a Lambda function to the bot and read the "inputTranscript" field to get the text of what the user said.
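As a rough illustration of the Lambda approach, here is a minimal sketch of a handler that reads "inputTranscript" from the event and passes a processed copy back to the client through session attributes. It assumes the Lex (V1) Lambda event and response format; the `lastUtterance` attribute name and the lowercase/trim step are hypothetical placeholders for whatever processing you actually need.

```python
def get_user_utterance(event):
    """Return the text Lex recognized for this turn, or "" if the field is absent."""
    return event.get("inputTranscript", "")


def lambda_handler(event, context):
    utterance = get_user_utterance(event)

    # Hypothetical processing step: trim whitespace and lowercase the text.
    processed = utterance.strip().lower()

    # Copy existing session attributes and add the processed utterance.
    # "lastUtterance" is an illustrative name, not something Lex defines.
    session = dict(event.get("sessionAttributes") or {})
    session["lastUtterance"] = processed

    # Hand control back to Lex so the bot continues its normal dialog.
    return {
        "sessionAttributes": session,
        "dialogAction": {
            "type": "Delegate",
            "slots": event["currentIntent"]["slots"],
        },
    }
```

A Delegate response like this is meant for the dialog code hook; a fulfillment hook would return a Close response instead.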

Thanks

AWS
Answered 2 years ago
  • Maxi, do you want to do the additional input text processing on the client side (inside of Sumerian using JavaScript) or on the server side? If you want the processing to happen on the server side, then the answer swapandeepataws provided is good. However, if your goal is to do some processing on the client side directly in Sumerian, that's possible, too. Let me know if that's your objective. I'd be happy to provide some sample code.

  • Also, a clarification on swapandeepataws's answer. Amazon Sumerian doesn't offer direct integration of Amazon Transcribe as it does for Lex and Polly. If you want to integrate Transcribe with Sumerian you can, but it requires linking the AWS JavaScript SDK into your Sumerian project and writing custom JavaScript code to interact with that SDK.

  • Thanks Kris, saw your comments too late, but as you can see it is solved.

  • Hi! I added the Lambda and used "inputTranscript" as you suggested, and it works fine.

    JFYI, as I always want to have a transcript on the Sumerian side, I created a bot with a "fake" intent that has only one utterance, which is never matched. I added an AMAZON.FallbackIntent that calls my Lambda function. The Lambda simply retrieves "inputTranscript" and returns it in a properly formatted Lex reply (https://docs.aws.amazon.com/lex/latest/dg/lambda-input-response-format.html).

    The result is that, whatever the user says, the fallback intent is always activated. It calls the Lambda, which returns the user utterance as the message. I can then take the message and process it in any way I want.

    Thanks for the hint.
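The fallback-intent setup described above could look roughly like the following. This is only a sketch under the assumption that the Lambda is the fulfillment hook of the fallback intent and that the bot uses the Lex (V1) response format from the linked page; the placeholder text for an empty transcript is an illustrative choice, not part of the original solution.

```python
def lambda_handler(event, context):
    # Text of what the user said, as recognized by Lex.
    transcript = event.get("inputTranscript", "")

    # Assumption: send a placeholder rather than an empty message
    # if no transcript is present in the event.
    content = transcript if transcript else "(no transcript)"

    # Close the dialog and send the transcript back as the bot's message,
    # so the client receives the user's own words as the reply text.
    return {
        "sessionAttributes": event.get("sessionAttributes") or {},
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": "Fulfilled",
            "message": {
                "contentType": "PlainText",
                "content": content,
            },
        },
    }
```

On the Sumerian side, the message of the Lex response then carries the user's utterance and can be processed in a script.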
