Solving 'Read Timed Out' Error and High Latencies in Amazon Bedrock with AWS Java SDK Client

4 minute read
Content level: Advanced

When invoking Amazon Bedrock models with large prompts (>10K context length) using the AWS SDK for Java, customers can experience high latencies (45 seconds or more) or timeouts with an "HTTP request: Read timed out" error after 2 minutes. This article explains the cause and provides a resolution to help optimize performance.

Problem and Symptoms

When using the AWS Java SDK to invoke Amazon Bedrock models, you may encounter the following issues:

  • High latencies: Requests taking 45 seconds or longer to complete
  • Timeouts: "HTTP request: Read timed out" error after 2 minutes
  • These issues typically occur with larger or complex prompts (>10K context length) and/or larger output tokens (>2K)

However, when investigating the issue in CloudWatch, you may notice a discrepancy:

  • The InvocationLatency metric for Bedrock indicates that the invocation took between 30 and 40 seconds, which is shorter than the end-to-end latency observed by the client.
  • Additionally, you may see 2 or 3 additional invocation attempts for each affected request.

A typical stack trace for this issue may resemble the following:

Error: Unable to execute HTTP request: Read timed out
software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Read timed out
        at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111)
        at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:47)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper2.setLastException(RetryableStageHelper2.java:226)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage2.execute(RetryableStage2.java:65)
        :
        :
        at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53)
        at software.amazon.awssdk.services.bedrockruntime.DefaultBedrockRuntimeClient.invokeModel(DefaultBedrockRuntimeClient.java:329)
        at com.example.BedrockClaudeExample.main(BedrockClaudeExample.java:51)
        Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 1 failure: Unable to execute HTTP request: Read timed out
        Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 2 failure: Unable to execute HTTP request: Read timed out
        Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 3 failure: Unable to execute HTTP request: Read timed out
Caused by: java.net.SocketTimeoutException: Read timed out
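The shape of this trace (a SocketTimeoutException as the cause, with earlier attempts attached as suppressed exceptions) can be checked in code when logging failures. The following is a minimal sketch using only the JDK. The class and method names are hypothetical, and a plain RuntimeException stands in for SdkClientException so that the example runs without the SDK on the classpath.

```java
import java.net.SocketTimeoutException;

// Hypothetical helper; these names are not part of the AWS SDK.
final class TimeoutDiagnostics {

    // Returns true if any exception in the cause chain is a SocketTimeoutException.
    // The depth limit guards against unusually long or cyclic cause chains.
    static boolean isReadTimeout(Throwable error) {
        int depth = 0;
        for (Throwable t = error; t != null && depth < 32; t = t.getCause(), depth++) {
            if (t instanceof SocketTimeoutException) {
                return true;
            }
        }
        return false;
    }

    // Counts the earlier failed attempts recorded as suppressed exceptions.
    static int failedEarlierAttempts(Throwable error) {
        return error.getSuppressed().length;
    }

    public static void main(String[] args) {
        // Stand-in for SdkClientException, built to mirror the trace above.
        RuntimeException failure = new RuntimeException(
                "Unable to execute HTTP request: Read timed out",
                new SocketTimeoutException("Read timed out"));
        for (int attempt = 1; attempt <= 3; attempt++) {
            failure.addSuppressed(new RuntimeException(
                    "Request attempt " + attempt + " failure: Read timed out"));
        }
        System.out.println("read timeout: " + isReadTimeout(failure));                    // true
        System.out.println("earlier failed attempts: " + failedEarlierAttempts(failure)); // 3
    }
}
```

In application code, the same checks could be applied to the exception caught around the invokeModel call to distinguish read timeouts from other client errors.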

Cause

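The root cause of the high latency and "HTTP request: Read timed out" errors is the default socketTimeout of 30 seconds in the AWS SDK for Java, defined in the SDK's configuration options. Because the affected invocations take 30 to 40 seconds on the service side, the client stops waiting on the socket before the response arrives.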

When the SDK encounters a socket error, it is configured to retry the request up to three times by default. This retry mechanism is part of the SDK's built-in retry strategies, which are designed to handle transient errors such as socket timeouts, service-side throttling, and concurrency failures.

As a result, when the SDK encounters a timeout, it will retry the request multiple times, leading to one of two possible outcomes:

  1. High latency: The request is retried multiple times, with each retry taking up to 30 seconds. If the request is eventually successful, the end-to-end latency will be the sum of the individual retry times, resulting in a high overall latency. For example: Request > first invocation (30s timed out) > first retry (30s timed out) > second retry (14s, response returned) > End to end latency (74s)
  2. Read Timed Out error: The request is retried three times after the first invocation, with each retry timing out after 30 seconds. If all retries fail, the SDK will throw an "Unable to execute HTTP request: Read timed out" error after 2 minutes.
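The arithmetic behind both outcomes can be written down as a small model. This is only a sketch: the class and method names are hypothetical, and it ignores the backoff delay the SDK adds between attempts, so real end-to-end times will be slightly longer.

```java
import java.time.Duration;

// Hypothetical model of the retry timing described above; not an SDK class.
final class RetryLatencyModel {

    // Outcome 1: some attempts hit the socket timeout, then a final attempt succeeds.
    static Duration endToEnd(Duration socketTimeout, int timedOutAttempts, Duration finalAttempt) {
        return socketTimeout.multipliedBy(timedOutAttempts).plus(finalAttempt);
    }

    // Outcome 2: the first invocation and every retry hit the socket timeout.
    static Duration worstCase(Duration socketTimeout, int maxRetries) {
        int totalAttempts = 1 + maxRetries;
        return socketTimeout.multipliedBy(totalAttempts);
    }

    public static void main(String[] args) {
        Duration socketTimeout = Duration.ofSeconds(30);
        // Two timed-out attempts followed by a 14-second success.
        System.out.println(endToEnd(socketTimeout, 2, Duration.ofSeconds(14)).getSeconds()); // 74
        // First invocation plus three retries, all timing out.
        System.out.println(worstCase(socketTimeout, 3).getSeconds()); // 120
    }
}
```

The two printed values match the 74-second example and the 2-minute failure described in the list above.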

Solution

To resolve the high latency and "HTTP request: Read timed out" errors, increase socketTimeout to a value that accommodates the maximum InvocationLatency reported in CloudWatch metrics for your test runs. A typical value is 480 seconds.
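One way to derive a candidate value from your own test runs is to take the largest observed InvocationLatency and add headroom. The sketch below assumes you have already collected the latencies in milliseconds from CloudWatch. The class name, the sample latencies, and the headroom factor are illustrative assumptions, not values recommended by the SDK or by Bedrock.

```java
import java.time.Duration;
import java.util.List;

// Hypothetical helper for picking a socket timeout from observed latencies.
final class SocketTimeoutChooser {

    // Returns the maximum observed latency multiplied by headroomFactor,
    // rounded up to whole seconds.
    static Duration choose(List<Long> invocationLatenciesMillis, double headroomFactor) {
        if (invocationLatenciesMillis.isEmpty()) {
            throw new IllegalArgumentException("at least one latency sample is required");
        }
        long maxMillis = invocationLatenciesMillis.stream()
                .mapToLong(Long::longValue)
                .max()
                .getAsLong();
        long seconds = (long) Math.ceil(maxMillis * headroomFactor / 1000.0);
        return Duration.ofSeconds(seconds);
    }

    public static void main(String[] args) {
        // Made-up samples in the 30 to 40 second range described in this article.
        List<Long> samples = List.of(31_200L, 38_500L, 35_900L);
        // A factor of 2.0 is an arbitrary assumption for illustration.
        System.out.println(choose(samples, 2.0).getSeconds()); // 77
    }
}
```

A value computed this way is a lower bound for your workload. The 480 seconds suggested above is more generous and leaves room for larger prompts and outputs.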

To implement this solution, follow these steps in your Java project:

  1. Add the Apache Client dependency: In your pom.xml file, add the following dependency under the <dependencies> section:
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>apache-client</artifactId>
    <version>2.27.21</version>
</dependency>
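  2. Configure the Apache HttpClient: In your Java code, import the ApacheHttpClient class and modify the BedrockRuntimeClient builder to set the socketTimeout value:
import java.time.Duration;
import software.amazon.awssdk.auth.credentials.DefaultCredentialsProvider;
import software.amazon.awssdk.http.apache.ApacheHttpClient;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.bedrockruntime.BedrockRuntimeClient;

// Raise the socket (read) timeout above the 30-second default
try (BedrockRuntimeClient bedrockClient = BedrockRuntimeClient.builder()
    .region(Region.US_EAST_1)
    .credentialsProvider(DefaultCredentialsProvider.create())
    .httpClientBuilder(ApacheHttpClient.builder()
        .socketTimeout(Duration.ofSeconds(480))
    )
    .build()) {
    // Invoke the model here, for example with bedrockClient.invokeModel(...)
}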

Refer to the AWS SDK for Java documentation for more information on configuring the Apache HttpClient.

2 Comments

I would like to suggest that the timeout and retry configuration is not the root cause of the problem, but it is how some underlying problem may be initially detected. I am seeing the same problem using the Python boto3 client, bedrock-runtime. Using that SDK, the timeout is set to 60 seconds by default at the http connection level, and the outcome is very similar -- you must wait a lot longer than usual to get a successful response.

The real problem is that the initial response from Bedrock is either being throttled or completely lost. The reason this must be true is because the exact same request often finishes in under ten seconds. Sometimes it times out over a minute, but usually it finishes very quickly. Both cases can be reproduced using the exact same input tokens. This would point to an intermittent problem on the Bedrock server side.

replied a year ago

Thanks, Joe, for your additions. Yes, that could be another case, but in this case the requests were neither throttled nor lost. The latency behavior was consistent, and I also tested it in the playground to rule out client-side settings.

AWS
EXPERT
replied a year ago