Is it possible to connect Amazon Connect audio streams to other systems such as multimodal LLMs?

Many voice APIs, such as Twilio, give you a WebSocket-style interface to the audio streams. This makes it easy to connect them to real-time audio language models.

Currently with Connect you only seem to be able to stream audio into Kinesis for storage or send it to Lex for simplistic intent-based logic. I can forward the text from Lex to an LLM via the fallback approach, but the latency is very noticeable and produces a bad user experience.

Is what I am trying to achieve possible now or is this a capability coming soon?

Thanks,

asked a month ago · 64 views
1 Answer

Currently, Amazon Connect does not provide a native WebSocket-style interface for real-time audio streaming that would allow easy connection to multimodal LLMs or other external systems in the way you describe.

Amazon Connect primarily handles audio in the following ways:

  1. Live Media Streaming: This allows you to capture customer audio during interactions and send it to Amazon Kinesis Video Streams. However, this is more for capturing and storing the audio than for real-time processing.

  2. Audio Recording: Connect can record conversations and store them in Amazon S3, but again, this is not for real-time processing.

  3. Integration with Amazon Lex: As you mentioned, Connect can send audio to Lex for intent-based logic, but this has limitations in terms of latency and complexity of processing.

  4. Contact Lens: This feature provides real-time and post-call analytics, but it's not designed for connecting to external LLMs or audio processing systems.

To get a low-latency, real-time connection between the audio streams and external systems such as multimodal LLMs, you would likely need to use an external service or build a custom solution. This might involve using a service like the Amazon Chime SDK or another programmable voice service as an intermediary between the caller and Amazon Connect.

Such an external service could act as a voice gateway, allowing you to manipulate and process the audio in real-time before forwarding it to Amazon Connect. This approach would let you integrate with more advanced audio processing systems or LLMs while still leveraging Amazon Connect's contact center capabilities.
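As a rough illustration of one small piece such a gateway would need, the sketch below re-slices incoming audio chunks of arbitrary size into fixed-size frames before handing them to a downstream consumer. Everything in it is an assumption for illustration: the audio format (8 kHz, 16-bit mono PCM), the 20 ms frame size, and the names `FrameChunker`, `relay`, and `send_frame` are hypothetical and not part of any Amazon Connect or Amazon Chime SDK API. In a real gateway, `send_frame` would be whatever writes to your model's streaming connection.

```python
from typing import Callable, Iterable, Iterator

# Assumed audio format; adjust to whatever your voice service actually delivers.
SAMPLE_RATE_HZ = 8000   # assumption: narrowband telephony audio
BYTES_PER_SAMPLE = 2    # assumption: 16-bit mono PCM
FRAME_MS = 20           # assumption: 20 ms frames
FRAME_BYTES = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * FRAME_MS // 1000


class FrameChunker:
    """Buffers arbitrarily sized audio chunks and emits fixed-size frames."""

    def __init__(self, frame_bytes: int = FRAME_BYTES) -> None:
        if frame_bytes <= 0:
            raise ValueError("frame_bytes must be positive")
        self.frame_bytes = frame_bytes
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> Iterator[bytes]:
        """Add a chunk and yield every complete frame now available."""
        self._buffer.extend(chunk)
        while len(self._buffer) >= self.frame_bytes:
            frame = bytes(self._buffer[:self.frame_bytes])
            del self._buffer[:self.frame_bytes]
            yield frame

    def flush(self) -> bytes:
        """Return any leftover partial frame and clear the buffer."""
        rest = bytes(self._buffer)
        self._buffer.clear()
        return rest


def relay(
    chunks: Iterable[bytes],
    send_frame: Callable[[bytes], None],
    frame_bytes: int = FRAME_BYTES,
) -> int:
    """Forward audio to send_frame in fixed-size frames; return the frame count."""
    chunker = FrameChunker(frame_bytes)
    sent = 0
    for chunk in chunks:
        for frame in chunker.feed(chunk):
            send_frame(frame)
            sent += 1
    tail = chunker.flush()
    if tail:
        # Pad the final partial frame with zero bytes (silence in signed PCM).
        send_frame(tail + b"\x00" * (frame_bytes - len(tail)))
        sent += 1
    return sent
```

For example, `relay(incoming_chunks, received.append)` collects the frames into a list called `received`. Keeping frames small and of constant size is one common way to keep added latency bounded, though the right frame size depends on the model you stream to.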

However, this type of inline audio processing is not currently a native feature of Amazon Connect, and there hasn't been any public announcement about such a capability coming soon. If this functionality is crucial for your use case, you might need to explore hybrid solutions or alternative contact center platforms that offer more flexible audio stream handling.
Sources
Ability to listen an audio in Amazon Connect | AWS re:Post
Best approach to adding an INLINE audio processing block between a user call and AWS Connect Agent? | AWS re:Post
Data handled by Amazon Connect - Amazon Connect

answered a month ago
