How do I troubleshoot high latency on my Application Load Balancer in Elastic Load Balancing?

I want to troubleshoot high latency on my Application Load Balancer in Elastic Load Balancing (ELB).

Short description

The following are causes of high latency on an Application Load Balancer:

  • Network connectivity or congestion issues
  • High memory (RAM) usage on backend instances
  • High CPU usage on backend instances
  • Incorrect web server configurations on backend instances
  • Issues with web application dependencies that run on backend instances
  • Large geographical distance between clients or on-premises targets and the Application Load Balancer

Resolution

To troubleshoot high latency on your Application Load Balancer, take the following actions:

  • Check for network connectivity issues. For more information, see Troubleshoot your Application Load Balancers.

  • Measure the first byte response and check for slow DNS resolution that might cause latency:

    curl -kso /dev/null -w "\n===============\n
    | DNS lookup: %{time_namelookup}\n
    | Connect: %{time_connect}\n
    | App connect: %{time_appconnect}\n
    | Pre-transfer: %{time_pretransfer}\n
    | Start transfer: %{time_starttransfer}\n
    | Total: %{time_total}\n
    | HTTP Code: %{http_code}\n===============\n" https://example.com/

    Example output:

    ===============
    | DNS lookup: 0.002346
    | Connect: 0.003080
    | App connect: 0.008422
    | Pre-transfer: 0.008587
    | Start transfer: 0.050238
    | Total: 0.057486
    | HTTP Code: 200
    ===============

    Note: To isolate the cause of the latency, complete the preceding step through your Application Load Balancer, and then repeat it while bypassing the Application Load Balancer. For more information, see the curl man page on the curl website.

  • Check the Average statistic of the Amazon CloudWatch TargetResponseTime metric for your Application Load Balancer. If the value is high, then you have an issue on the backend instances or the application dependency servers. For more information, see How do I troubleshoot an increase in the TargetResponseTime metric for an Application Load Balancer?
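
You can also pull this metric from the AWS CLI instead of the console. A sketch, assuming configured credentials; the load balancer dimension value and the time window below are placeholders that you must replace with your own:

```shell
# Average TargetResponseTime per minute for one Application Load Balancer.
# The dimension value is the portion of the load balancer ARN after
# "loadbalancer/" (placeholder shown here).
aws cloudwatch get-metric-statistics \
  --namespace AWS/ApplicationELB \
  --metric-name TargetResponseTime \
  --dimensions Name=LoadBalancer,Value=app/example-loadbalancer/50dc6c495c0c9188 \
  --start-time 2024-04-01T00:00:00Z \
  --end-time 2024-04-01T01:00:00Z \
  --period 60 \
  --statistics Average
```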

  • To identify the backend instances that cause latency, check the access log entries of your Application Load Balancer. Look for target_processing_time values that are longer than normal.

  • To confirm issues with the Application Load Balancer itself, review the request_processing_time and response_processing_time fields for longer than normal values.

    Example log entry:

    http 2024-04-01T22:23:00.765170Z app/example-loadbalancer/50dc6c495c0c9188
    192.168.131.39:2817 10.0.0.1:80 0.001 12.401 0.001 200 200 34 366
    "GET http://www.example.com:80/ HTTP/1.1" "curl/7.46.0" - -
    arn:aws:elasticloadbalancing:us-east-2:123456789012:targetgroup/example-targets/73e2d6bc24d8a067
    "Root=1-58337262-36d228ad5d99923122bbe354" "-" "-"
    0 2024-04-01T22:22:48.364000Z "forward" "-" "-" "10.0.0.1:80" "200" "-" "-"

    Note: The preceding example log entry shows that the request_processing_time is 0.001, the target_processing_time is 12.401, and the response_processing_time is 0.001. The target_processing_time value shows a longer than normal time period and that the Application Load Balancer target caused latency. For more information, see Syntax.
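
You can extract the processing-time fields from raw log entries with standard tools. A minimal sketch, assuming space-delimited fields in the documented order (real entries also contain quoted fields, but the three time fields appear before any of them):

```shell
# Extract request_processing_time ($6), target_processing_time ($7), and
# response_processing_time ($8) from an ALB access log entry.
entry='http 2024-04-01T22:23:00.765170Z app/example-loadbalancer/50dc6c495c0c9188 192.168.131.39:2817 10.0.0.1:80 0.001 12.401 0.001 200 200 34 366'
echo "$entry" | awk '{ printf "request=%s target=%s response=%s\n", $6, $7, $8 }'
# Prints: request=0.001 target=12.401 response=0.001
```

Run the same awk filter over a full log file, with a condition such as `$7 > 5`, to list only the slow requests.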

  • To identify high CPU utilization or spikes in CPU utilization, check the CloudWatch CPUUtilization metric of your backend instances. To resolve high CPU utilization, upgrade your instance to a larger instance type.

  • To review the Apache processes on your backend and check for memory issues, run the following command:

    watch -n 1 "echo -n 'Apache Processes: ' && ps -C apache2 --no-headers | wc -l && free -m"

    Example output:

    Every 1.0s: echo -n 'Apache Processes: ' && ps -C apache2 --no-headers | wc -l && free -m
    Apache Processes: 27
              total     used     free     shared     buffers     cached
    Mem:      8204      7445     758      0          385         4567
    -/+ buffers/cache:  2402     5801
    Swap:     16383     189      16194
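
    On older kernels, the free column in this output understates what applications can actually use, because buffers and cache are reclaimable. A quick sanity check using the sample numbers above (an illustrative calculation, not a live measurement):

```shell
# free + buffers + cached approximates free-plus-reclaimable memory
# (the "-/+ buffers/cache" line); newer versions of free report this
# directly in an "available" column.
awk 'BEGIN { free=758; buffers=385; cached=4567; printf "%d MB effectively available\n", free+buffers+cached }'
# Prints: 5710 MB effectively available
```
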
  • To check how many simultaneous requests your instance can serve, view the MaxClients setting for the web servers on your backend instances. If your instance has sufficient memory and acceptable CPU utilization but still experiences high latency, then increase the MaxClients value.

  • Compare the number of processes that Apache generates (httpd) with the MaxClients value. If the number of Apache processes frequently reaches the MaxClients value, then increase it.

    Example command:

    [root@ip-192.0.2.0 conf]# ps aux | grep httpd | wc -l
    15

    Example configuration:

    <IfModule prefork.c>
    StartServers         10
    MinSpareServers      5
    MaxSpareServers      10
    ServerLimit          15
    MaxClients           15
    MaxRequestsPerChild  4000
    </IfModule>
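
    When you raise MaxClients, keep it within what memory allows. One rough sizing heuristic (an assumption for illustration, not an Apache-documented rule) divides the memory you can dedicate to Apache by the average per-process resident size:

```shell
avail_mb=5800    # illustrative: memory you can dedicate to Apache, in MB
per_proc_mb=85   # illustrative: average resident size of one httpd process, in MB
awk -v a="$avail_mb" -v p="$per_proc_mb" 'BEGIN { printf "Suggested MaxClients: %d\n", a / p }'
# Prints: Suggested MaxClients: 68
```

    Setting MaxClients higher than memory supports can push the instance into swap, which itself causes latency.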

Check backend dependencies

Check for dependencies on your backend instances that might cause latency issues. Example dependencies include Amazon Simple Storage Service (Amazon S3) buckets, network address translation (NAT) instances or proxy servers, and remote web services.

Use Linux tools to identify performance bottlenecks

To identify performance bottlenecks on the server, use the following Linux commands:

  • Run the uptime command to check for high system load averages caused by resource contention. The output shows system load averages, which represent the number of tasks that are waiting to run or are blocked on I/O.
  • The mpstat -P ALL 1 command shows you a breakdown of CPU usage for each core. Run the command to determine imbalanced usage, such as a single core that's handling most of the work in a single-threaded application.
  • To identify resource-heavy processes and patterns over time, run the pidstat 1 command. The output shows real-time CPU usage by process.
  • Run the dmesg | tail command to identify recent system-level events that might affect performance. The output shows the last 10 system messages.
  • To identify high read or write operations or slow disk performance, run the iostat -xz 1 command. The output shows disk I/O metrics and usage. 
  • Run the free -m command to view available system memory. If available memory is low, then the system might rely on swap that increases disk I/O and latency.
  • To identify bandwidth saturation, run the sar -n DEV 1 command. The output shows network interface throughput, including received (rxkB/s) and transmitted (txkB/s) traffic.
  • Run sar -n TCP,ETCP 1 to view the following key TCP metrics that help you determine connection issues:
      - active/s: the number of locally initiated TCP connections per second.
      - passive/s: the number of remotely initiated TCP connections per second.
      - retrans/s: the number of TCP retransmissions per second. When this metric is high, you might be experiencing packet loss or congestion.
  • To identify what's using the most bandwidth on the network, run the iftop command. The output shows active bandwidth usage for each connection in real time.
  • The niftop command is a variant of iftop that might be available through third-party repositories on Red Hat Enterprise Linux (RHEL) and Debian-based distributions. If iftop isn't preinstalled on your system, then use niftop.
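
A high retrans/s reading from sar is worth flagging automatically. A minimal sketch, with an assumed threshold of 1.0 retransmissions per second (tune it against your own baseline):

```shell
retrans=9.5   # illustrative value read from 'sar -n TCP,ETCP 1' output
awk -v r="$retrans" 'BEGIN { if (r > 1.0) print "WARN: possible packet loss or congestion"; else print "OK" }'
# Prints: WARN: possible packet loss or congestion
```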

Note: Depending on your Linux distribution, you might need to manually install some of the preceding commands.

AWS OFFICIAL. Updated 8 months ago.