AWS Application Load Balancer and HTTP/2 Persistent Connections ("keep alive")


I have some questions about the "AWS Application Load Balancer" in regard to http2 persistent connections:

Does the "AWS Application Load Balancer" itself maintain its own internal http2-connection-pool? (or nah?)

If the load balancer does indeed maintain its own http2-connection-pool for persistent http2 connections I have these follow-up questions:

  1. I can't find anything in the AWS docs explaining how the size of the http2-connection-pools maintained by the ALB is configured (if at all). Can it keep, for example, 2 million http2 connections open at the same time (for the sake of ultra-low latency)? At what cost (are there scaling costs)? Any links that elaborate on these aspects?

  2. Does the ALB, by default, maintain a fixed-size http2-connection-pool between itself and the browsers (clients), or are these connection-pools dynamically sized? If they are fixed-size, how big are they by default? If they are dynamic, what rules govern their expansion/contraction, and what is the maximum number of persistent http2-connections they can hold? 30k? 40k? 5 million?

  3. Let's assume we have 20k http2-clients that run single-page applications (SPAs) with sessions lasting up to 30 minutes. These clients need ultra-low latency for their semi-frequent http2-requests through the AWS ALB (say 1 request per 4 seconds, which translates to about 5k requests/second landing on the ALB). Does it make sense to configure the ALB with a hefty http2-connection-pool so that all 20k http2-connections from our clients are kept alive throughout the lifetime of the client session? Reasoning: this way no http2-connection is closed and reopened, which should lower jitter, because establishing a new http2-connection involves some extra latency. That's my intuition at least, and I'd be happy to stand corrected if I'm missing something.

1 Answer
Accepted Answer

Hi Dominick, thanks for reaching out!

So, when it comes to the concurrent connection limits of an Application Load Balancer, there is no upper limit on the amount of traffic it can serve; it scales automatically to meet the vast majority of traffic workloads.

An ALB will scale up aggressively as traffic increases, and scale down conservatively as traffic decreases. As it scales up, new higher capacity nodes will be added and registered with DNS, and previous nodes will be removed. This effectively gives an ALB a dynamic connection pool to work with.

When working with the client behavior you have described, the main attribute you'll want to look at when configuring your ALB will be the Connection Idle Timeout setting. By default, this is set to 60 seconds, but can be set to a value of up to 4000 seconds. In your situation, you can set a value that will meet your need to maintain long-term connections of up to 30 minutes without the connection being terminated, in conjunction with utilizing HTTP keep-alive options within your application.
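As a rough illustration of that setting, here is a minimal Python sketch of raising the idle timeout to 30 minutes. It assumes you use boto3 and pass in a client created with `boto3.client("elbv2")`; the helper names and the load balancer ARN are hypothetical, and the lower bound of 1 second is taken from the ELB documentation rather than from the text above.

```python
def idle_timeout_attribute(seconds: int) -> list:
    """Build the attribute list for an ALB idle timeout, validating the range."""
    # Allowed range: 1 to 4000 seconds (the default is 60).
    if not 1 <= seconds <= 4000:
        raise ValueError("ALB idle timeout must be between 1 and 4000 seconds")
    return [{"Key": "idle_timeout.timeout_seconds", "Value": str(seconds)}]


def apply_idle_timeout(elbv2_client, load_balancer_arn: str, seconds: int) -> None:
    """Apply the timeout. elbv2_client is assumed to be boto3.client("elbv2")."""
    elbv2_client.modify_load_balancer_attributes(
        LoadBalancerArn=load_balancer_arn,
        Attributes=idle_timeout_attribute(seconds),
    )
```

With a real client you would call something like `apply_idle_timeout(client, "<your-alb-arn>", 1800)`. Note that the timeout only covers idle periods, so a client sending a request every 4 seconds would stay well within it either way.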

As you might expect, an ALB will start with an initial capacity that may not immediately meet your workload. But as stated above, the ALB will scale up aggressively, and scale down conservatively, scaling up in minutes, and down in hours, based on the traffic received. I highly recommend checking out our best practices for ELB evaluation page to learn more about scaling and how you can test your application to better understand how an ALB will behave based on your traffic load. I will highlight from this page that depending on how quickly traffic increases, the ALB may return an HTTP 503 error if it has not yet fully scaled to meet traffic demand, but will ultimately scale to the necessary capacity. When load testing, we recommend that traffic be increased at no more than 50 percent over a five minute interval.
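To make the ramp-up guidance concrete, here is a small sketch that plans a load-test schedule growing by at most 50 percent per five-minute step. The function name and the starting rate of 100 requests/second are assumptions for illustration only; the 5,000 requests/second target comes from the question.

```python
def ramp_schedule(start_rps, target_rps, max_growth=0.5, step_minutes=5):
    """Return (minute, rps) steps that grow by at most max_growth per step."""
    if start_rps <= 0 or target_rps < start_rps:
        raise ValueError("need 0 < start_rps <= target_rps")
    steps = [(0, start_rps)]
    minute, rate = 0, start_rps
    while rate < target_rps:
        minute += step_minutes
        # Grow by the allowed factor, but never overshoot the target.
        rate = min(rate * (1 + max_growth), target_rps)
        steps.append((minute, rate))
    return steps


if __name__ == "__main__":
    # Hypothetical starting point of 100 rps, ramping to the 5k rps in the question.
    for minute, rps in ramp_schedule(100, 5000):
        print(f"t+{minute:>3} min: {rps:,.0f} requests/second")
```

Under these assumptions the ramp takes ten steps, so the full 5k requests/second is reached about 50 minutes after the test starts.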

When it comes to pricing, ALBs are charged for each hour that the ALB is running, plus the number of Load Balancer Capacity Units (LCUs) used per hour. LCUs are measured across four dimensions of processed traffic: new connections, active connections, processed bytes, and rule evaluations. You are charged only for the dimension with the highest usage in a particular hour.

As an example using the [ELB Pricing Calculator], assuming the ~20,000 connections are ramped up by 10 connections per second, with an average connection duration of 30 minutes (1800 seconds) and sending 1 request every 4 seconds for a total of 1GB of processed data per hour, you could expect a rough cost output of:

1 GB per hour / 1 GB processed bytes per hour per LCU for EC2 instances and IP addresses as targets = 1 processed bytes LCUs for EC2 instances and IP addresses as targets
10 new connections per second / 25 new connections per second per LCU = 0.40 new connections LCUs
10 new connections per second x 1,800 seconds = 18,000 active connections
18,000 active connections / 3000 connections per LCU = 6 active connections LCUs
1 rule per request - 10 free rules = -9 paid rules per request after 10 free rules
Max (-9, 0) = 0 paid rules per request
Max (1 processed bytes LCUs, 0.4 new connections LCUs, 6 active connections LCUs, 0 rule evaluation LCUs) = 6 maximum LCUs
1 load balancers x 6 LCUs x 0.008 LCU price per hour x 730 hours per month = 35.04 USD
Application Load Balancer LCU usage charges (monthly): 35.04 USD
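
The same arithmetic can be reproduced with a short Python sketch, which makes it easier to try other traffic assumptions. The function names are hypothetical. The per-LCU thresholds for new connections, active connections, and processed bytes, as well as the 0.008 USD price and 730 hours per month, are the ones used in the calculation above. The threshold of 1,000 rule evaluations per second per LCU is an assumption taken from the published LCU definition, not from the calculator output, and prices vary by region.

```python
def alb_lcu_dimensions(new_conn_per_sec, avg_conn_seconds, gb_per_hour,
                       requests_per_sec, rules_per_request):
    """Return the LCU usage for each of the four billing dimensions."""
    paid_rules = max(rules_per_request - 10, 0)  # first 10 rules are free
    return {
        "new_connections": new_conn_per_sec / 25,
        "active_connections": (new_conn_per_sec * avg_conn_seconds) / 3000,
        "processed_bytes": gb_per_hour / 1.0,  # 1 GB/hour per LCU (EC2/IP targets)
        # Assumed threshold: 1,000 rule evaluations per second per LCU.
        "rule_evaluations": (requests_per_sec * paid_rules) / 1000,
    }


def monthly_lcu_cost(dimensions, lcu_price_per_hour=0.008, hours_per_month=730):
    """Bill only the highest dimension, as described above."""
    return max(dimensions.values()) * lcu_price_per_hour * hours_per_month


if __name__ == "__main__":
    dims = alb_lcu_dimensions(
        new_conn_per_sec=10, avg_conn_seconds=1800, gb_per_hour=1,
        requests_per_sec=5000, rules_per_request=1,
    )
    print(dims)
    print(f"{monthly_lcu_cost(dims):.2f} USD per month (LCU charges only)")
```

This covers only the LCU usage charge; the fixed hourly charge for running the load balancer comes on top.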
answered 3 years ago
reviewed 3 years ago
