Skip to content

AWS Bedrock Agents Support Streaming Responses?

2

Hi, I am exploring the use of AWS Bedrock agents for a use case that requires real-time (streaming) responses from a large language model. My questions are:

Is it possible to configure a Bedrock agent to return responses in streaming mode, similar to other solutions that provide partial fragments while the model continues generating text? If yes, is there any example configuration or documentation available that explains how to enable this feature? If this is not currently supported, are there plans to include this functionality in the future?

1 Answer
2
Accepted Answer

Currently, AWS Bedrock Agents do not directly support streaming responses in the way you're describing. However, Amazon Bedrock does offer streaming capabilities for certain API operations and models, which might be relevant to your use case.

For example, Amazon Bedrock provides the InvokeModelWithResponseStream API operation, which allows you to invoke a specified Amazon Bedrock model and receive the response in a stream. This is similar to the real-time, partial fragment responses you're looking for.

Additionally, Amazon Bedrock Knowledge Bases recently introduced support for the RetrieveAndGenerateStream API. This new streaming API enables customers to receive responses as they are being generated by the Large Language Model (LLM), rather than waiting for the complete response. This feature is designed to reduce latency for the first response, providing users with a more seamless and responsive experience when interacting with Bedrock Knowledge Bases.

While these streaming capabilities are not specifically tied to Bedrock Agents, they demonstrate that Amazon Bedrock is moving in the direction of supporting real-time, streaming responses for various use cases.

For developers looking to implement streaming responses, Amazon Bedrock offers the ConverseStream API, which provides a consistent interface for interacting with different models that support streaming. This API allows you to send messages to specified Amazon Bedrock models and receive responses in a stream.

To use these streaming capabilities, you would need to use the appropriate SDK for your programming language. Examples are available for languages such as .NET, Java, JavaScript, and Python, showing how to send text messages to models like Cohere Command or Mistral and process the response stream in real-time.

While Bedrock Agents themselves don't currently support streaming responses, you might be able to leverage these other streaming capabilities within Amazon Bedrock to achieve similar functionality for your use case. As for future plans, Amazon Web Services regularly updates and expands its offerings, so it's possible that streaming support for Bedrock Agents could be added in the future. However, no specific plans have been announced at this time.
Sources
InvokeModelWithResponseStream - Amazon Bedrock
ConverseStream - Amazon Bedrock
Amazon Bedrock Knowledge Bases now supports streaming responses with RetrieveAndGenerateStream API - AWS
Invoke Cohere Command on Amazon Bedrock using Bedrock's Converse API with a response stream - AWS SDK Code Examples
Invoke Mistral on Amazon Bedrock using Bedrock's Converse API with a response stream - AWS SDK Code Examples

answered a year ago
AWS
EXPERT
reviewed a year ago
EXPERT
reviewed a year ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.