Lambda: Failed while publishing some or all AWS SDK client-side metrics to CloudWatch.

I see a lot of these warnings in my Lambda logs:

Failed while publishing some or all AWS SDK client-side metrics to CloudWatch

With the nested cause:

software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Acquire operation took longer than the configured maximum time. This indicates that a request cannot get a connection from the pool within the specified maximum time. This can be due to high request rate.\nConsider taking any of the following actions to mitigate the issue: increase max connections, increase acquire timeout, or slowing the request rate.\nIncreasing the max connections can increase client throughput (unless the network interface is already fully utilized), but can eventually start to hit operation system limitations on the number of file descriptors used by the process. If you already are fully utilizing your network interface or cannot further increase your connection count, increasing the acquire timeout gives extra time for requests to acquire a connection before timing out. If the connections doesn't free up, the subsequent requests will still timeout.\nIf the above mechanisms are not able to fix the issue, try smoothing out your requests so that large traffic bursts cannot overload the client, being more efficient with the number of times you need to call AWS, or by increasing the number of hosts sending requests.

Does anybody have any idea what can be the cause of this?

My project is configured according to this documentation: https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/metrics.html

Full exception details:

{
  "timestamp": "2024-01-12T12:04:12.706+0000UTC",
  "instant": {
    "epochSecond": 1705061052,
    "nanoOfSecond": 706000000
  },
  "thread": "sdk-async-response-1-3",
  "level": "WARN",
  "loggerName": "software.amazon.awssdk.metrics.publishers.cloudwatch",
  "message": "Failed while publishing some or all AWS SDK client-side metrics to CloudWatch.",
  "thrown": {
    "message": "software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Acquire operation took longer than the configured maximum time. This indicates that a request cannot get a connection from the pool within the specified maximum time. This can be due to high request rate.\nConsider taking any of the following actions to mitigate the issue: increase max connections, increase acquire timeout, or slowing the request rate.\nIncreasing the max connections can increase client throughput (unless the network interface is already fully utilized), but can eventually start to hit operation system limitations on the number of file descriptors used by the process. If you already are fully utilizing your network interface or cannot further increase your connection count, increasing the acquire timeout gives extra time for requests to acquire a connection before timing out. If the connections doesn't free up, the subsequent requests will still timeout.\nIf the above mechanisms are not able to fix the issue, try smoothing out your requests so that large traffic bursts cannot overload the client, being more efficient with the number of times you need to call AWS, or by increasing the number of hosts sending requests.",
    "name": "java.util.concurrent.CompletionException",
    "extendedStackTrace": [
      {
        "class": "software.amazon.awssdk.utils.CompletableFutureUtils",
        "method": "errorAsCompletionException",
        "file": "CompletableFutureUtils.java",
        "line": 65
      },
      {
        "class": "software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncExecutionFailureExceptionReportingStage",
        "method": "lambda$execute$0",
        "file": "AsyncExecutionFailureExceptionReportingStage.java",
        "line": 51
      },
      {
        "class": "java.util.concurrent.CompletableFuture",
        "method": "uniHandle",
        "file": null,
        "line": -1
      },
      {
        "class": "java.util.concurrent.CompletableFuture$UniHandle",
        "method": "tryFire",
        "file": null,
        "line": -1
      },
      {
        "class": "java.util.concurrent.CompletableFuture",
        "method": "postComplete",
        "file": null,
        "line": -1
      },
      {
        "class": "java.util.concurrent.CompletableFuture",
        "method": "completeExceptionally",
        "file": null,
        "line": -1
      },
      {
        "class": "software.amazon.awssdk.utils.CompletableFutureUtils",
        "method": "lambda$forwardExceptionTo$0",
        "file": "CompletableFutureUtils.java",
        "line": 79
      },
      {
        "class": "java.util.concurrent.CompletableFuture",
        "method": "uniWhenComplete",
        "file": null,
        "line": -1
      },
      {
        "class": "java.util.concurrent.CompletableFuture$UniWhenComplete",
        "method": "tryFire",
        "file": null,
        "line": -1
      },
      {
        "class": "java.util.concurrent.CompletableFuture",
        "method": "postComplete",
        "file": null,
        "line": -1
      },
      {
        "class": "java.util.concurrent.CompletableFuture",
        "method": "completeExceptionally",
        "file": null,
        "line": -1
      },
      {
        "class": "software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor",
        "method": "maybeAttemptExecute",
        "file": "AsyncRetryableStage.java",
        "line": 103
      },
      {
        "class": "software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor",
        "method": "maybeRetryExecute",
        "file": "AsyncRetryableStage.java",
        "line": 184
      },
      {
        "class": "software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor",
        "method": "lambda$attemptExecute$1",
        "file": "AsyncRetryableStage.java",
        "line": 159
      },
      {
        "class": "java.util.concurrent.CompletableFuture",
        "method": "uniWhenComplete",
        "file": null,
        "line": -1
      },
      {
        "class": "java.util.concurrent.CompletableFuture$UniWhenComplete",
        "method": "tryFire",
        "file": null,
        "line": -1
      },
      {
        "class": "java.util.concurrent.CompletableFuture",
        "method": "postComplete",
        "file": null,
        "line": -1
      },
      {
        "class": "java.util.concurrent.CompletableFuture",
        "method": "completeExceptionally",
        "file": null,
        "line": -1
      },
      {
        "class": "software.amazon.awssdk.utils.CompletableFutureUtils",
        "method": "lambda$forwardExceptionTo$0",
        "file": "CompletableFutureUtils.java",
        "line": 79
      },
      {
        "class": "java.util.concurrent.CompletableFuture",
        "method": "uniWhenComplete",
        "file": null,
        "line": -1
      },
      {
        "class": "java.util.concurrent.CompletableFuture$UniWhenComplete",
        "method": "tryFire",
        "file": null,
        "line": -1
      },
      {
        "class": "java.util.concurrent.CompletableFuture",
        "method": "postComplete",
        "file": null,
        "line": -1
      },
      {
        "class": "java.util.concurrent.CompletableFuture",
        "method": "completeExceptionally",
        "file": null,
        "line": -1
      },
      {
        "class": "software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage",
        "method": "lambda$execute$0",
        "file": "MakeAsyncHttpRequestStage.java",
        "line": 103
      },
      {
        "class": "java.util.concurrent.CompletableFuture",
        "method": "uniWhenComplete",
        "file": null,
        "line": -1
      },
      {
        "class": "java.util.concurrent.CompletableFuture$UniWhenComplete",
        "method": "tryFire",
        "file": null,
        "line": -1
      },
      {
        "class": "java.util.concurrent.CompletableFuture",
        "method": "postComplete",
        "file": null,
        "line": -1
      },
      {
        "class": "java.util.concurrent.CompletableFuture",
        "method": "completeExceptionally",
        "file": null,
        "line": -1
      },
      {
        "class": "software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage",
        "method": "completeResponseFuture",
        "file": "MakeAsyncHttpRequestStage.java",
        "line": 240
      },
      {
        "class": "software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage",
        "method": "lambda$executeHttpRequest$3",
        "file": "MakeAsyncHttpRequestStage.java",
        "line": 163
      },
      {
        "class": "java.util.concurrent.CompletableFuture",
        "method": "uniHandle",
        "file": null,
        "line": -1
      },
      {
        "class": "java.util.concurrent.CompletableFuture$UniHandle",
        "method": "tryFire",
        "file": null,
        "line": -1
      },
      {
        "class": "java.util.concurrent.CompletableFuture$Completion",
        "method": "run",
        "file": null,
        "line": -1
      },
      {
        "class": "java.util.concurrent.ThreadPoolExecutor",
        "method": "runWorker",
        "file": null,
        "line": -1
      },
      {
        "class": "java.util.concurrent.ThreadPoolExecutor$Worker",
        "method": "run",
        "file": null,
        "line": -1
      },
      {
        "class": "java.lang.Thread",
        "method": "run",
        "file": null,
        "line": -1
      }
    ]
  },
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
  "threadId": 41,
  "threadPriority": 5
}
1 Answer

Hi,

This is usually because you are issuing too many requests at the same time to one or more downstream AWS services, probably CloudWatch in your case, so the client cannot obtain enough HTTP connections from its pool to serve them.
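
To make the mechanism concrete, here is a minimal plain-JDK sketch of what "acquire operation took longer than the configured maximum time" means. It is not the SDK's actual pool implementation: the class `AcquireTimeoutDemo` and its methods are hypothetical names used only for illustration. A fixed number of "connections" is modelled as semaphore permits, and a caller that cannot get one within the timeout gives up.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy connection pool (illustrative only): permits stand in for HTTP connections. */
class AcquireTimeoutDemo {
    private final Semaphore permits;
    private final long acquireTimeoutMillis;

    AcquireTimeoutDemo(int maxConnections, long acquireTimeoutMillis) {
        this.permits = new Semaphore(maxConnections);
        this.acquireTimeoutMillis = acquireTimeoutMillis;
    }

    /** Returns true if a connection was obtained within the acquire timeout. */
    boolean acquire() {
        try {
            return permits.tryAcquire(acquireTimeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report that no connection was obtained.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Hands a connection back to the pool. */
    void release() {
        permits.release();
    }

    public static void main(String[] args) {
        AcquireTimeoutDemo pool = new AcquireTimeoutDemo(2, 100);
        System.out.println(pool.acquire()); // true: first connection
        System.out.println(pool.acquire()); // true: second connection
        System.out.println(pool.acquire()); // false: pool exhausted, gives up after 100 ms
        pool.release();                     // a request finishes and frees its connection
        System.out.println(pool.acquire()); // true: a connection is available again
    }
}
```

The third call fails not because anything is broken but because both connections are still held. That is why the mitigations in the message are to raise the pool size, raise the acquire timeout, or send fewer concurrent requests. In the SDK for Java 2.x the corresponding settings are, as far as I know, `maxConcurrency` and `connectionAcquisitionTimeout` on `NettyNioAsyncHttpClient.builder()`; check the SDK documentation for your version before relying on that.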

So, are you running at high Lambda throughput, and does each invocation make a lot of requests to other AWS services such as CloudWatch?

To better understand, you may want to read this blog post: https://aws.amazon.com/blogs/developer/tuning-the-aws-sdk-for-java-to-improve-resiliency/

This Stack Overflow question may also help: https://stackoverflow.com/questions/68377235/unable-to-execute-http-request-acquire-operation-took-longer-than-the-configure

Best,

Didier

AWS EXPERT, answered 4 months ago
  • No, the Lambda actually has pretty low throughput, about 6 per minute. I use the default settings for CloudWatch. Could SnapStart cause some issues?

  • I don't think so: the error you are getting indicates an overload of HTTP requests. AFAIK, SnapStart doesn't interfere here: the HTTP timeout only starts counting after the Lambda has started (so SnapStart is over). I'd really analyze all the sources of HTTP requests in your case: try reducing the amount of logs pushed to CloudWatch to see if the exception disappears, and then increase it again once you know.
