Here is what I found:
- We are using Tomcat to serve our web application.
- It is configured to use the HTTP/1.1 protocol.
- It doesn't support HTTP pipelining.
- The target group is configured with the HTTP/1.1 protocol.
When a request arrives at the Application Load Balancer, the ALB first has to establish a TCP connection with the underlying server before it can send the request to it (TCP/IP model). At that moment it has two options:
- Make use of an existing keep-alive TCP connection and send the request over it.
- Establish a new TCP connection and send the request over it.
Since the underlying server doesn't support HTTP pipelining, the ALB can't send another request over a single TCP connection while a request that has already been sent is still waiting for its response. In other words, only one request can be in flight on a TCP connection at any moment. Keep-alive connections allow us to send multiple requests over a single TCP connection, but under HTTP/1.1 those requests must be sent in sequence, so only one request is being served on the connection at a time; to send another request, the ALB must wait for the response to the previously sent one.
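To make the sequencing concrete, here is a minimal Java sketch (the endpoint URL is a hypothetical placeholder, and the JDK client simply stands in for any HTTP/1.1 client such as the ALB): with keep-alive and no pipelining, the second request can only go out on the reused connection after the first response has been fully received.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class KeepAliveSequentialDemo {
    public static void main(String[] args) throws Exception {
        // HTTP/1.1 client; the JDK pools and reuses keep-alive connections by default.
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1)
                .build();

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://backend.example.internal:8080/health")) // placeholder URL
                .GET()
                .build();

        // Without pipelining, the flow on one connection is strictly
        // request -> response -> request: the second send() cannot reuse the
        // connection until the first response has been read completely.
        HttpResponse<String> first = client.send(request, HttpResponse.BodyHandlers.ofString());
        HttpResponse<String> second = client.send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println(first.statusCode() + " then " + second.statusCode());
    }
}
```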
Now we know that for the ALB to use an existing keep-alive TCP connection, there must NOT be any outstanding request on it (outstanding request: a request for which the ALB is still waiting for the server's response). If it finds such a keep-alive connection with no outstanding request, it uses that connection to send the request to the server and waits for the response before sending another request over it.
A TCP connection can't live forever. We have configured our Tomcat server to close a connection after a certain time period (20 seconds) has elapsed since it was opened, or once it has served a certain number of requests (200).
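Assuming those limits correspond to Tomcat's keepAliveTimeout and maxKeepAliveRequests connector attributes (normally set on the Connector in server.xml), the following embedded-Tomcat sketch shows roughly equivalent settings; the port and exact values are illustrative, not our production configuration.

```java
import org.apache.catalina.connector.Connector;
import org.apache.catalina.startup.Tomcat;

public class KeepAliveConnectorSketch {
    public static void main(String[] args) throws Exception {
        Tomcat tomcat = new Tomcat();

        // HTTP/1.1 connector, matching the protocol the target group uses.
        Connector connector = new Connector("HTTP/1.1");
        connector.setPort(8080);
        // Wait at most 20 seconds for the next request on a keep-alive connection
        // before closing it (the attribute value is in milliseconds).
        connector.setProperty("keepAliveTimeout", "20000");
        // Close the connection once 200 requests have been served over it.
        connector.setProperty("maxKeepAliveRequests", "200");

        tomcat.setConnector(connector);
        tomcat.start();
        tomcat.getServer().await();
    }
}
```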
In the second case, where the ALB doesn't find an existing keep-alive TCP connection with no outstanding request, it immediately tries to establish a new TCP connection with the underlying server (this happens within 10 seconds of the ALB receiving the HTTP request). Once the connection is established, it sends the request over it.
There is a limit on the number of active TCP connections our Tomcat server can have at any moment. If a new TCP connection request arrives beyond that limit, the server refuses the connection from the client trying to establish it (here, the ALB). For such errors, the ALB responds to its client with a 504 Gateway Timeout error.
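On the Tomcat side, that ceiling is governed by the connector's maxConnections attribute (with acceptCount controlling the OS accept queue). The small probe below is a hedged illustration of what the ALB experiences from the outside: the address is a placeholder, and the 10-second budget mirrors the connect window described above.

```java
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class BackendConnectProbe {
    public static void main(String[] args) {
        // Placeholder address; substitute the instance's private IP and connector port.
        InetSocketAddress target = new InetSocketAddress("10.0.1.23", 8080);

        try (Socket socket = new Socket()) {
            // Roughly mirrors the window the ALB allows for the back-end TCP connect.
            socket.connect(target, 10_000);
            System.out.println("Connected: the server is still accepting new TCP connections.");
        } catch (ConnectException e) {
            // The server refused the connection (for example, its connection limit is exhausted).
            // When the ALB runs into this, the client ultimately sees a 504 Gateway Timeout.
            System.out.println("Connection refused: " + e.getMessage());
        } catch (SocketTimeoutException e) {
            System.out.println("Connect attempt timed out: " + e.getMessage());
        } catch (Exception e) {
            System.out.println("Connect failed: " + e.getMessage());
        }
    }
}
```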
Hello,
You're absolutely right about the two connections the ALB maintains:
- Front-end connection: connects the ALB to the client making the request.
- Back-end connection: connects the ALB to the EC2 instance in the target group.
ALB establishes a single back-end connection with each EC2 instance in the target group, irrespective of the number of clients.
In your example, even with two clients sending requests, ALB will maintain only one back-end connection to the EC2 instance. This connection acts as a channel for forwarding requests from multiple clients to the EC2 instance.
Reason for 504 Gateway Timeout Errors:
Your assumption about the single back-end connection being the culprit for the 504 errors is likely correct. Here's why:
Tomcat Server Limit: With a limit of 200 requests per connection, exceeding this limit while the connection is still open will result in new requests being rejected.
Multiple Clients, Single Connection: With a single back-end connection, if the first 200 requests from both clients fill the connection, subsequent requests will be rejected until the existing requests are processed and the connection frees up. This delay can lead to timeouts.
Understanding ALB's Decision for New Connections:
While ALB uses a single connection per EC2 instance, it does consider certain factors before reusing an existing connection:
Healthy Target: The EC2 instance needs to be healthy in the target group to receive new requests.
Connection Idle Timeout: The ALB maintains an idle timeout period for its connections. If a connection remains inactive for the configured period, it may be closed and a new connection established for the next request (a sketch of adjusting this attribute follows this list).
Least Outstanding Requests: In some configurations, ALB might prefer back-end connections with fewer outstanding requests to minimize queuing. (This behavior can be disabled)
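As a hedged sketch of adjusting the configurable idle timeout mentioned above, the following uses the AWS SDK for Java v2 to set the load balancer's idle_timeout.timeout_seconds attribute; the ARN and the 60-second value are placeholders.

```java
import software.amazon.awssdk.services.elasticloadbalancingv2.ElasticLoadBalancingV2Client;
import software.amazon.awssdk.services.elasticloadbalancingv2.model.LoadBalancerAttribute;
import software.amazon.awssdk.services.elasticloadbalancingv2.model.ModifyLoadBalancerAttributesRequest;
import software.amazon.awssdk.services.elasticloadbalancingv2.model.ModifyLoadBalancerAttributesResponse;

public class AlbIdleTimeoutUpdate {
    public static void main(String[] args) {
        // Placeholder ARN; replace with your load balancer's ARN.
        String albArn = "arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/my-alb/0123456789abcdef";

        try (ElasticLoadBalancingV2Client elbv2 = ElasticLoadBalancingV2Client.create()) {
            ModifyLoadBalancerAttributesResponse response = elbv2.modifyLoadBalancerAttributes(
                    ModifyLoadBalancerAttributesRequest.builder()
                            .loadBalancerArn(albArn)
                            .attributes(LoadBalancerAttribute.builder()
                                    .key("idle_timeout.timeout_seconds") // idle timeout attribute
                                    .value("60")                         // illustrative value, in seconds
                                    .build())
                            .build());

            // Print the attributes as confirmed by the API.
            response.attributes().forEach(a -> System.out.println(a.key() + " = " + a.value()));
        }
    }
}
```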
ALB Connection Management: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/introduction.html
ALB Target Group Health: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/introduction.html
Tomcat Connection Management: https://tomcat.apache.org/tomcat-9.0-doc/config/index.html
Tomcat Connection Pooling: https://tomcat.apache.org/tomcat-7.0-doc/jdbc-pool.html
ALB Metrics:
"ALB establishes a single back-end connection with each EC2 instance in the target group, irrespective of the number of clients." - This is not correct - we maintain a pool of connections to each target - someone from ELB team will create a correct/complete response for this shortly.