
Streaming response slow on Amplify


I'm hosting a Nuxt app on AWS Amplify. The app includes an API route for querying an AI agent, which uses the BedrockRuntimeClient, but I didn't set up the AI routes through an Amplify backend. The route is a "normal" server route that should also be able to run on a non-Amplify server.

The problem is that the AI response is fast when working locally (about 2 seconds) but very slow on the AWS hosting (about 12 seconds). It seems like something within Amplify is slowing my response down.

I saw a post in the OpenAI forum from someone who mentioned that nginx could be responsible for a delay in streaming. I also found an article in the Amplify docs that describes how to handle streaming the Amplify way, but we don't want to do that.

Does anyone here know where the delay in my response could be coming from? If it is related to some sort of caching / server setup, then is there a way to whitelist specific routes, so that I don't have to do it the Amplify way?
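To illustrate the kind of delay I mean: if a proxy buffers the stream, the time to the first byte collapses into the total generation time. A toy Node sketch (hypothetical timings, no Bedrock involved) of the difference between a streamed and a buffered consumer:

```javascript
// Toy stand-in for a model token stream: 5 chunks, ~50 ms apart.
async function* tokenStream() {
  for (let i = 0; i < 5; i++) {
    await new Promise((r) => setTimeout(r, 50));
    yield `token-${i} `;
  }
}

// Streamed: the client sees the first chunk after ~50 ms.
async function consumeStreamed(stream) {
  const start = Date.now();
  let firstByteMs = null;
  let text = '';
  for await (const chunk of stream) {
    if (firstByteMs === null) firstByteMs = Date.now() - start;
    text += chunk;
  }
  return { firstByteMs, totalMs: Date.now() - start, text };
}

// Buffered: a proxy that collects the whole body before forwarding
// pushes the first byte out to the full generation time (~250 ms here).
async function consumeBuffered(stream) {
  const start = Date.now();
  let text = '';
  for await (const chunk of stream) text += chunk;
  return { firstByteMs: Date.now() - start, totalMs: Date.now() - start, text };
}

(async () => {
  const streamed = await consumeStreamed(tokenStream());
  const buffered = await consumeBuffered(tokenStream());
  console.log('streamed first byte:', streamed.firstByteMs, 'ms');
  console.log('buffered first byte:', buffered.firstByteMs, 'ms');
})();
```

The total time is the same in both cases; only the perceived latency differs, which matches what I'm seeing.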

asked a year ago · 308 views
1 Answer

The slow response you're experiencing with your AI agent query on AWS Amplify hosting compared to your local environment could be due to several factors. While Amplify is not specifically designed for AI workloads or streaming responses, there are a few things you can consider to improve the performance:

  1. Region and latency: Ensure that your Amplify app and the Bedrock service are in the same AWS region to minimize latency. The difference in response times could be due to network latency between your Amplify app and the Bedrock service.

  2. Lambda function configuration: If you're using AWS Lambda functions as part of your API route, check the memory and timeout settings. Increasing the memory allocation can sometimes improve performance for AI-related tasks.

  3. Caching and server setup: As you mentioned, there could be caching or server configuration issues. Amplify uses Amazon CloudFront as a CDN, which might be caching responses. You may need to adjust the caching behavior for your AI route.

  4. Response streaming: The Bedrock API supports response streaming, which can help reduce the perceived latency. However, implementing this with Amplify might require some additional configuration.

  5. Custom server or API Gateway: Consider using a custom server setup or API Gateway instead of Amplify's built-in hosting for your AI routes. This would give you more control over the server configuration and potentially allow for better optimization of AI workloads.

  6. Amplify configuration: Review your Amplify app's settings, for example custom headers and rewrite/redirect rules, for anything that could be affecting response times on your AI route.
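On points 3 and 6: Amplify Hosting lets you attach custom headers per path through a customHttp.yml file in the project root. A sketch that asks CloudFront not to cache the AI route (the `/api/*` pattern is an assumption about your route layout; adjust it to your actual path):

```yaml
customHeaders:
  - pattern: '/api/*'
    headers:
      - key: 'Cache-Control'
        value: 'no-store'
```

This only addresses caching; whether it also changes buffering behavior is something you would need to verify empirically.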

To whitelist specific routes or bypass Amplify's default handling for your AI endpoints, you might need to use custom server middleware or explore using API Gateway in conjunction with your Amplify app. This would allow you to have more granular control over how your AI routes are handled.
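On the middleware idea: one low-effort thing to try before moving to API Gateway is to send response headers that ask intermediate proxies not to cache or buffer the route. A hypothetical helper (the header names are standard HTTP and nginx conventions, but whether Amplify's proxy layer honors them is something you'd have to test):

```javascript
// Set headers that discourage caching and buffering on a streaming route.
// `res` is any Node-style response object that exposes setHeader().
function markAsStream(res) {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache, no-transform');
  res.setHeader('X-Accel-Buffering', 'no'); // asks nginx to disable proxy buffering
  return res;
}

// Quick check against a stub response object:
const headers = {};
const stub = { setHeader: (key, value) => { headers[key] = value; } };
markAsStream(stub);
console.log(headers);
```

In a Nuxt server route you would apply the same headers through the framework's own response helpers instead of a raw `res` object.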

If these suggestions don't resolve the issue, you may want to consider using a different hosting solution that's more optimized for AI workloads and streaming responses, especially if this is a critical part of your application's functionality.
Sources
Issue streaming response from bedrock agent | AWS re:Post
Response streaming for Lambda functions - AWS Lambda
Invoke an agent from your application - Amazon Bedrock

answered a year ago
