When invoking Amazon Bedrock models with large prompts (>10K context length) using the AWS SDK for Java, customers may experience high latencies (45 seconds or more) or timeouts with an "HTTP request: Read timed out" error after 2 minutes. This article explains the cause and provides a resolution to help optimize performance.
Problem and Symptoms
When using the AWS Java SDK to invoke Amazon Bedrock models, you may encounter the following issues:
- High latencies: Requests taking 45 seconds or longer to complete
- Timeouts: "HTTP request: Read timed out" error after 2 minutes
- These issues typically occur with large or complex prompts (>10K context length), large outputs (>2K tokens), or both
However, when investigating the issue in CloudWatch, you may notice a discrepancy:
- The InvocationLatency metric for Bedrock indicates that the invocation took between 30 and 40 seconds, which is shorter than the latency your application actually experienced.
- Additionally, you may see 2 or 3 additional invocation attempts for each affected request.
A typical stack trace for this issue may resemble the following:
Error: Unable to execute HTTP request: Read timed out
software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Read timed out
at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111)
at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:47)
at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper2.setLastException(RetryableStageHelper2.java:226)
at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage2.execute(RetryableStage2.java:65)
:
:
at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53)
at software.amazon.awssdk.services.bedrockruntime.DefaultBedrockRuntimeClient.invokeModel(DefaultBedrockRuntimeClient.java:329)
at com.example.BedrockClaudeExample.main(BedrockClaudeExample.java:51)
Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 1 failure: Unable to execute HTTP request: Read timed out
Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 2 failure: Unable to execute HTTP request: Read timed out
Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 3 failure: Unable to execute HTTP request: Read timed out
Caused by: java.net.SocketTimeoutException: Read timed out
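The same pattern can be checked from application code by inspecting the exception you catch. The following is a minimal sketch and not SDK functionality: the class and method names are hypothetical, and it assumes the "Request attempt N failure" message prefix shown in the trace above, which may differ between SDK versions.

```java
import java.net.SocketTimeoutException;

class ReadTimeoutDiagnostics {
    /** Returns true if any exception in the cause chain is a SocketTimeoutException. */
    static boolean causedBySocketTimeout(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            if (t instanceof SocketTimeoutException) {
                return true;
            }
        }
        return false;
    }

    /**
     * Counts suppressed exceptions that record an earlier failed attempt.
     * Assumption: the message starts with "Request attempt", as in the trace above.
     */
    static int countFailedEarlierAttempts(Throwable error) {
        int count = 0;
        for (Throwable suppressed : error.getSuppressed()) {
            String message = suppressed.getMessage();
            if (message != null && message.startsWith("Request attempt")) {
                count++;
            }
        }
        return count;
    }
}
```

Calling both methods in the catch block around invokeModel tells you whether the failure was a read timeout and how many earlier attempts also failed, which you can then compare with the extra invocations visible in CloudWatch.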
Cause
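The root cause of the high latency and "HTTP request: Read timed out" errors is the default socketTimeout value of 30 seconds in the AWS SDK for Java. This value is defined in the SDK's configuration options. Because InvokeModel returns its response only after the model has finished generating, any invocation that takes longer than 30 seconds causes the client to stop waiting on the socket while Bedrock is still processing the request. This is why CloudWatch can report an InvocationLatency of 30 to 40 seconds for a request that the client has already abandoned.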
When the SDK encounters a socket error, it is configured to retry the request up to three times by default. This retry mechanism is part of the SDK's built-in retry strategies, which are designed to handle transient errors such as socket timeouts, service-side throttling, and concurrency failures.
As a result, when the SDK encounters a timeout, it will retry the request multiple times, leading to one of two possible outcomes:
- High latency: The request is retried multiple times, with each retry taking up to 30 seconds. If the request is eventually successful, the end-to-end latency will be the sum of the individual retry times, resulting in a high overall latency. For example: Request > first invocation (30s timed out) > first retry (30s timed out) > second retry (14s, response returned) > End to end latency (74s)
- Read Timed Out error: The request is retried three times after the first invocation, with each retry timing out after 30 seconds. If all retries fail, the SDK will throw an "Unable to execute HTTP request: Read timed out" error after 2 minutes.
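The arithmetic behind both outcomes can be sketched as follows. This is an illustrative calculation only: the class and method names are hypothetical, and it assumes that any backoff delay between retries is negligible, which matches the 74-second and 2-minute figures above.

```java
class RetryLatencyEstimate {
    /** Worst case: the first attempt and every retry all run into the socket timeout. */
    static long worstCaseSeconds(long socketTimeoutSeconds, int maxRetries) {
        return socketTimeoutSeconds * (maxRetries + 1L);
    }

    /** Some attempts time out, then a final attempt returns a response. */
    static long endToEndSeconds(long socketTimeoutSeconds, int timedOutAttempts,
                                long finalAttemptSeconds) {
        return socketTimeoutSeconds * timedOutAttempts + finalAttemptSeconds;
    }

    public static void main(String[] args) {
        // Two attempts time out at 30s each, the third returns after 14s.
        System.out.println(endToEndSeconds(30, 2, 14)); // prints 74
        // First invocation plus three retries, all timing out at 30s.
        System.out.println(worstCaseSeconds(30, 3));    // prints 120
    }
}
```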
Solution
To resolve the high latency and "HTTP request: Read timed out" errors, increase socketTimeout to a value that accommodates the maximum InvocationLatency reported in CloudWatch metrics for your test runs. A typical value is 480 seconds.
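One way to turn observed latencies into a timeout is to take the largest InvocationLatency sample and add headroom. The sketch below is an assumption-laden illustration, not an AWS recommendation: the class, the method, and the headroom factor are hypothetical, and the samples are assumed to be in milliseconds, the unit CloudWatch uses for this metric.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.List;

class SocketTimeoutChooser {
    /**
     * Picks a socket timeout from observed InvocationLatency samples (milliseconds).
     * headroomFactor is a hypothetical safety margin, for example 1.5 for 50% extra.
     * The result is rounded up to a whole number of seconds.
     */
    static Duration chooseSocketTimeout(List<Long> invocationLatenciesMillis,
                                        double headroomFactor) {
        if (invocationLatenciesMillis.isEmpty()) {
            throw new IllegalArgumentException("At least one latency sample is required");
        }
        if (headroomFactor < 1.0) {
            throw new IllegalArgumentException("headroomFactor must be at least 1.0");
        }
        long maxMillis = Collections.max(invocationLatenciesMillis);
        long seconds = (long) Math.ceil(maxMillis * headroomFactor / 1000.0);
        return Duration.ofSeconds(seconds);
    }
}
```

For example, samples of 32,000 ms, 38,500 ms, and 40,000 ms with a factor of 1.5 give a 60-second timeout. Test runs rarely cover the longest prompts seen in production, so a generous fixed value such as the 480 seconds mentioned above may still be the safer choice.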
To implement this solution, follow these steps in your Java project:
- Add the Apache Client dependency: In your pom.xml file, add the following dependency under the <dependencies> section:
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>apache-client</artifactId>
    <version>2.27.21</version>
</dependency>
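- Configure the Apache HttpClient: In your Java code, import the ApacheHttpClient class and modify the BedrockRuntimeClient builder to set the socketTimeout value:
import java.time.Duration;
import software.amazon.awssdk.auth.credentials.DefaultCredentialsProvider;
import software.amazon.awssdk.http.apache.ApacheHttpClient;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.bedrockruntime.BedrockRuntimeClient;

// Raise the socket (read) timeout from the 30-second default to 480 seconds.
try (BedrockRuntimeClient bedrockClient = BedrockRuntimeClient.builder()
        .region(Region.US_EAST_1)
        .credentialsProvider(DefaultCredentialsProvider.create())
        .httpClientBuilder(ApacheHttpClient.builder()
                .socketTimeout(Duration.ofSeconds(480)))
        .build()) {
    // Call bedrockClient.invokeModel(...) here; the client is closed automatically.
}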
Refer to the AWS SDK for Java documentation for more information on configuring the Apache HttpClient.