How do I troubleshoot Classic Load Balancer capacity issues in ELB?
My Amazon CloudWatch metric SurgeQueueLength for my Classic Load Balancer has an increased maximum statistic. Clients also receive HTTP 503 Service Unavailable or HTTP 504 Gateway Timeout errors when they try to connect to my Classic Load Balancer. How do I troubleshoot these Elastic Load Balancing capacity issues?
The Classic Load Balancer metric SurgeQueueLength measures the total number of requests queued by your Classic Load Balancer. An increased maximum statistic for SurgeQueueLength indicates that backend systems can't process incoming requests as fast as the requests are received. Possible reasons for a high SurgeQueueLength metric include:
Overloaded Amazon Elastic Compute Cloud (Amazon EC2) instances behind the Classic Load Balancer that are unable to process all incoming requests
Application dependencies that are slowed by poorly performing external resources
Instances that have reached their maximum allowable connection limits
The maximum SurgeQueueLength is 1,024 requests. When the queue is full, the load balancer rejects additional requests, and the SpilloverCount metric counts those rejected requests.
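To check whether the surge queue filled up and requests were rejected, you can compare the Maximum statistic of SurgeQueueLength against the 1,024 limit and add up the Sum statistic of SpilloverCount. The following is a minimal Python sketch of that check, not an official AWS tool. It assumes the boto3 library is installed with credentials configured, and a one-hour window with 60-second periods; the helper names metric_request, assess_capacity, and fetch_and_assess are hypothetical and chosen for illustration only.

```python
from datetime import datetime, timedelta, timezone

SURGE_QUEUE_MAX = 1024  # maximum SurgeQueueLength stated in this article


def metric_request(load_balancer_name, metric_name, statistic, hours=1, period=60):
    """Build keyword arguments for the CloudWatch get_metric_statistics call."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/ELB",  # namespace for Classic Load Balancer metrics
        "MetricName": metric_name,
        "Dimensions": [{"Name": "LoadBalancerName", "Value": load_balancer_name}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period,  # seconds per datapoint (assumed value)
        "Statistics": [statistic],
    }


def assess_capacity(surge_datapoints, spillover_datapoints):
    """Summarize queue pressure from lists of CloudWatch datapoints."""
    peak_queue = max((dp["Maximum"] for dp in surge_datapoints), default=0)
    rejected = sum(dp["Sum"] for dp in spillover_datapoints)
    return {
        "peak_queue": peak_queue,
        "queue_full": peak_queue >= SURGE_QUEUE_MAX,
        "rejected_requests": rejected,
    }


def fetch_and_assess(load_balancer_name):
    """Query CloudWatch for both metrics and summarize them."""
    import boto3  # imported here so the helpers above work without AWS access

    cloudwatch = boto3.client("cloudwatch")
    surge = cloudwatch.get_metric_statistics(
        **metric_request(load_balancer_name, "SurgeQueueLength", "Maximum")
    )
    spill = cloudwatch.get_metric_statistics(
        **metric_request(load_balancer_name, "SpilloverCount", "Sum")
    )
    return assess_capacity(surge["Datapoints"], spill["Datapoints"])
```

For example, calling fetch_and_assess("my-load-balancer"), where the name is a placeholder for your own Classic Load Balancer, returns a dictionary with the peak queue length, whether the queue reached its limit, and the number of rejected requests in the window. A nonzero rejected_requests value suggests that clients received errors because the queue was full.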