Lambda response streaming is currently not supported in all runtimes, according to the documentation:
Currently, Lambda supports response streaming only on Node.js 14.x, Node.js 16.x, and Node.js 18.x managed runtimes. You can also use a custom runtime with a custom Runtime API integration to stream responses or use the Lambda Web Adapter. You can stream responses through Lambda Function URLs, the AWS SDK, or using the Lambda InvokeWithResponseStream API.
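To illustrate the AWS SDK route mentioned in the quote, here is a minimal Python sketch of a caller consuming a streamed response through the `InvokeWithResponseStream` API. It assumes a boto3 Lambda client is passed in and that the target function is already deployed on a runtime that can stream; the helper name `iter_stream_chunks` is made up for this example.

```python
from typing import Any, Iterator


def iter_stream_chunks(
    lambda_client: Any, function_name: str, payload: bytes = b"{}"
) -> Iterator[bytes]:
    """Yield payload chunks from a streaming invocation as they arrive.

    `lambda_client` is assumed to be a boto3 Lambda client,
    e.g. boto3.client("lambda").
    """
    response = lambda_client.invoke_with_response_stream(
        FunctionName=function_name,
        Payload=payload,
    )
    # The event stream interleaves PayloadChunk events with a final
    # InvokeComplete event that carries error information, if any.
    for event in response["EventStream"]:
        if "PayloadChunk" in event:
            yield event["PayloadChunk"]["Payload"]
        elif "InvokeComplete" in event:
            error_code = event["InvokeComplete"].get("ErrorCode")
            if error_code:
                raise RuntimeError(f"Streaming invocation failed: {error_code}")
```

A caller could loop over `iter_stream_chunks(boto3.client("lambda"), "my-streaming-function")` and print each chunk as it arrives, where the function name is a placeholder. Note that this only covers the client side; the function itself still has to produce a streamed response.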
Also, with API Gateway, you'll need specific configurations. Details are described in the following blog:
Neither API Gateway nor Lambda’s target integration with Application Load Balancer support chunked transfer encoding. It therefore does not support faster TTFB for streamed responses. You can, however, use response streaming with API Gateway to return larger payload responses, up to API Gateway’s 10 MB limit. To implement this, you must configure an HTTP_PROXY integration between your API Gateway and a Lambda function URL, instead of using the LAMBDA_PROXY integration.
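As a rough sketch of the HTTP_PROXY setup the blog describes, the following Python helper points an existing REST API method at a Lambda function URL. It assumes a boto3 API Gateway client (`boto3.client("apigateway")`), an existing REST API, resource, and method, and a function URL you have already created; the helper name and all IDs are placeholders, not values from the blog.

```python
from typing import Any


def point_method_at_function_url(
    apigw_client: Any,
    rest_api_id: str,
    resource_id: str,
    function_url: str,
    http_method: str = "GET",
) -> Any:
    """Attach an HTTP_PROXY integration that forwards to a function URL.

    This replaces the usual LAMBDA_PROXY (AWS_PROXY) integration, so API
    Gateway treats the function URL as a plain HTTP backend.
    """
    return apigw_client.put_integration(
        restApiId=rest_api_id,
        resourceId=resource_id,
        httpMethod=http_method,
        type="HTTP_PROXY",
        integrationHttpMethod=http_method,
        uri=function_url,  # the Lambda function URL, assumed to exist already
    )
```

The API still has to be redeployed afterwards, and access control on the function URL needs separate thought, since requests no longer go through the Lambda service integration.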
That said, assuming Node.js is not an option, the most useful examples for you are probably those covering custom runtimes and HTTP_PROXY configurations. The following sources might help:
- Blog "Introducing AWS Lambda response streaming" (uses Node.js, but contains background information about limitations and alternatives)
- "Lambda Response Streaming" in Serverlessland (also includes pattern and SAM templates)
- As an alternative to a custom runtime, you can use the aws-lambda-web-adapter
In general, I also recommend looking into asynchronous invocations of AWS Lambda. Since the problem you are solving is long-running tasks in the LLM, you may not need response streaming at all, just an asynchronous invocation model. That setup could also be more aligned with your current architecture, especially regarding the runtimes used. Serverlessland maintains an end-to-end example for this pattern on GitHub.
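For reference, an asynchronous invocation from Python can look roughly like the sketch below. It assumes a boto3 Lambda client is passed in; the helper name `invoke_async` and the payload shape are made up for illustration, and how the result gets back to the caller (for example a queue or a database) is left out.

```python
import json
from typing import Any


def invoke_async(lambda_client: Any, function_name: str, payload: dict) -> bool:
    """Queue an asynchronous invocation and report whether it was accepted.

    With InvocationType="Event", Lambda queues the event and returns
    immediately instead of waiting for the function to finish.
    """
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="Event",
        Payload=json.dumps(payload).encode("utf-8"),
    )
    # Lambda answers 202 (Accepted) for asynchronous invocations.
    return response["StatusCode"] == 202
```

The caller gets no function output back here, so the long-running task has to write its result somewhere the client can pick it up later.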
As mentioned in the previous answer, aws-lambda-web-adapter can do this. Here is a response streaming example for Spring Boot. The demo streams the content of a file, but you can stream any text as well.