
ALB with Host-Based Routing returns 503

0

Hi,

We have configured several ALBs with numerous rules to route our traffic based on host headers. In an attempt to avoid inter-AZ data transfer costs as much as possible, we tried to limit EC2 instance creation to a single AZ (eu-west-1b, configured on our target groups), as high availability is not our biggest concern.

Everything seems to be working fine:

  • All machines are available and running our application
  • The target groups' health checks show only healthy instances
  • The Route 53 records are correct and point to the desired ALB

But when we load test that ALB, about 50% of the responses are 200s from our app and about 50% are 503s from the ALB. The access logs show that nothing is forwarded to our app after a "waf,forward" action. We reproduce this behavior every time we launch our test.

The 50-50 split made us think it was an AZ problem (we only use one AZ, but the ALB can't have fewer than two). So we changed our configuration to launch instances in both of the ALB's AZs to test whether that was the problem. Our next test was successful: only 200 status responses.

That solution raised a few questions:

  • Is a "one only" AZ configuration possible with host-based routing? (We didn't find any articles stating that kind of limitation.)
  • Even if multiple AZs are configured on the ALB and on the target group, how can we be sure that an AZ failure won't bring back the 50% 200 / 50% 503 scenario?

I hope everything is clear enough.

Thanks.

3 Answers
0

It sounds to me like one of your end hosts isn't configured correctly.

To find out which EC2 instance it is, do one of the following:

  1. Take one of the EC2 instances out of the target group to see whether that fixes the issue or makes it worse. Then add it back and remove the other.
  2. Review the ALB logs to see which status codes are returned by which EC2 instance
  3. Check the logs on the EC2s
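
The log review suggested in step 2 can be sketched in Python. This is a minimal sketch, assuming the standard space-delimited ALB access log layout in which the target, the ELB status code and the target status code are the fifth, ninth and tenth fields; the sample entries, the load balancer name and the host name are made up for illustration. Real log files are gzip-compressed in S3, so they would need to be decompressed before being fed in line by line.

```python
import shlex
from collections import Counter

# Zero-based positions of the fields used here in an ALB access log entry
# (space-delimited, with quoted fields such as the request line).
TARGET, ELB_STATUS, TARGET_STATUS = 4, 8, 9


def tally_responses(lines):
    """Count (target, elb_status_code, target_status_code) combinations."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = shlex.split(line)  # keeps quoted fields together
        if len(fields) <= TARGET_STATUS:
            continue  # skip malformed or truncated entries
        counts[(fields[TARGET], fields[ELB_STATUS], fields[TARGET_STATUS])] += 1
    return counts


# Made-up sample entries, for illustration only.
SAMPLE = [
    'http 2024-01-01T00:00:00.000000Z app/example-alb/0123456789abcdef '
    '192.0.2.10:40000 10.0.102.33:80 0.002 0.891 0.000 200 200 142 512 '
    '"GET http://app.example.com:80/ HTTP/1.1" "curl/8.0"',
    'http 2024-01-01T00:00:01.000000Z app/example-alb/0123456789abcdef '
    '192.0.2.10:40001 - -1 -1 -1 503 - 142 337 '
    '"GET http://app.example.com:80/ HTTP/1.1" "curl/8.0"',
]

if __name__ == "__main__":
    for (target, elb_status, target_status), n in sorted(tally_responses(SAMPLE).items()):
        # A target of "-" means the request never reached an instance.
        origin = "ALB itself" if target == "-" else target
        print(f"{n:6d}  elb={elb_status} target={target_status}  from {origin}")
```

If every 503 shows "-" as the target, the errors are generated by the load balancer rather than by a misconfigured instance.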
EXPERT
answered 20 days ago
  • Hi Gary,

    Thank you for taking the time to reply.

    The problem occurs only when a single instance is running in the target group. When a 503 occurs, the ALB log shows a 503 response coming from the ALB itself: 2.12.000.000:00000 - -1 -1 -1 503 - 142. Our application log doesn't show anything either.

    But when we get a 200 response, we see that our application is indeed handling the request: 2.12.000.000:00000 10.0.102.33:80 0.002 0.891 0.000 200 200 142

    With one running instance in each AZ in the target group, everything works fine.

    I hope these explanations make things clearer.

    Thank you.

  • It sounds like you have cross-zone load balancing disabled. This would cause your issue.

    Make sure you enable it.

    See cross zone load balancing here https://docs.aws.amazon.com/elasticloadbalancing/latest/userguide/how-elastic-load-balancing-works.html
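
One way to check this suggestion is to read the target group's attributes. The sketch below assumes boto3 and working AWS credentials for the live call; `target_group_arn` is a placeholder you would supply, and the helper names are hypothetical. The attribute key `load_balancing.cross_zone.enabled` takes the values `true`, `false` or `use_load_balancer_configuration`.

```python
CROSS_ZONE_KEY = "load_balancing.cross_zone.enabled"


def cross_zone_setting(attributes):
    """Return the cross-zone value from a list of {'Key', 'Value'} dicts, or None."""
    for attr in attributes:
        if attr.get("Key") == CROSS_ZONE_KEY:
            return attr.get("Value")
    return None


def is_cross_zone_off(attributes):
    """True only when the target group explicitly turns cross-zone off."""
    return cross_zone_setting(attributes) == "false"


def check_target_group(target_group_arn):
    """Fetch the attributes with boto3 (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without it

    client = boto3.client("elbv2")
    response = client.describe_target_group_attributes(TargetGroupArn=target_group_arn)
    return is_cross_zone_off(response["Attributes"])
```

If the check returns True, the setting can be changed with the same client's `modify_target_group_attributes` call by passing the key above with the value `true`.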

0

It looks like you're running into an issue with traffic distribution when using a single Availability Zone (AZ) for your ALB-backed EC2 instances. Let's break down the problem and address your questions.

1. Is a "one only" AZ configuration possible with Host-Based routing?

Yes, it is possible to configure an Application Load Balancer (ALB) to route traffic to instances in one AZ. There's no inherent limitation preventing this when you're using host-based routing. However, there are a few things to consider:

  • ALB and AZ Distribution: ALBs themselves are multi-AZ by design, and incoming traffic is spread across the load balancer nodes in every enabled AZ. If you only have EC2 instances in one AZ, the nodes in the other AZs still receive requests, which can fail if those nodes cannot reach a healthy instance (for example, when cross-zone load balancing is turned off for the target group).
  • If you want to limit EC2 instances to a single AZ, you need to ensure your ALB target group is pointing only to instances in that AZ. You can achieve this by either selecting specific instances in the target group or using IP-based targets that reside only in the desired AZ.

2. How can we ensure that in case of an AZ failure, we won’t return to a 50 200 - 50 503 scenario?

If you're using only one AZ and that AZ experiences a failure, the ALB won't be able to route traffic to any healthy instances, because there are no instances in the other AZs to handle the traffic. A similar mechanism likely explains the 50% 503 responses you're seeing now: requests that arrive through the ALB's other AZ find no healthy instance to forward to.

To address this, you have a couple of options:

  • Multi-AZ Setup: Ideally, the ALB should route traffic to instances across at least two AZs. This would ensure high availability, so in the event of an AZ failure, the ALB can still forward traffic to instances in the healthy AZ.
  • Failover Considerations: If high availability is not a primary concern but you still want to avoid 503 errors during AZ failures, you could consider implementing failover routing at the application level or use Route 53 health checks to manage the failover process.
  • Scaling and Load Balancing: Even if you're limiting EC2 instances to one AZ for cost reasons, it's good practice to still configure auto-scaling across multiple AZs. This ensures that if one AZ becomes unhealthy, traffic can be routed to healthy instances in the other AZs, and your application remains highly available.

What to Do:

  • For Host-Based Routing with One AZ: You can use instance targeting within the specific AZ or IP-based targeting to ensure traffic is only routed to the instances in your chosen AZ.
  • For AZ Failover Protection: Configure your ALB to work across multiple AZs. This not only improves high availability but also helps avoid the 503 errors if one AZ fails.
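
To see which zones would leave the ALB without a backend, the recommendations above can be turned into a small coverage check. This is only a sketch: the function name and the input shapes are hypothetical, and the zone `eu-west-1a` in the example is assumed for illustration (the question only names eu-west-1b).

```python
def azs_without_healthy_targets(alb_azs, targets):
    """Return the ALB zones that contain no healthy target, sorted by name.

    alb_azs: iterable of AZ names enabled on the load balancer.
    targets: iterable of (az_name, is_healthy) pairs, one per registered target.
    """
    covered = {az for az, healthy in targets if healthy}
    return sorted(set(alb_azs) - covered)


if __name__ == "__main__":
    # Assumed example: ALB enabled in two zones, one healthy instance in eu-west-1b.
    alb_zones = ["eu-west-1a", "eu-west-1b"]
    registered = [("eu-west-1b", True)]
    print(azs_without_healthy_targets(alb_zones, registered))
```

An empty result means every enabled zone has at least one healthy target, which is the state the multi-AZ advice aims for.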

Conclusion:

  • A single AZ configuration is possible with host-based routing, but it introduces risks for traffic routing failures in the event of AZ issues.
  • To avoid 50% 503 errors, consider multi-AZ deployment for redundancy, even if high availability isn't your primary concern, as it protects against AZ failures and helps ensure smoother traffic routing.

If you need more specific help, you might want to review the AWS documentation on ALB and Route 53 health checks for deeper insights into optimizing your setup.

Let me know if you need further clarification!

answered 19 days ago
  • Hi Saurabh,

    Thanks for your detailed answer.

    I assumed that an auto-scaling group configured to only run instances in eu-west-1b (by configuring only one private subnet in that zone) would have made it a "one AZ target group" for the ALB. This works fine if we target this group as the default forward action on the ALB by the way.

    The instant we introduce a host header rule (without changing anything on the ASG or on the target group), 503s begin to appear.

    Between what Gary and you pointed out, we will most certainly configure everything in multi-AZ from now on, though.

    Thanks.

0

The behavior you're experiencing is related to how Application Load Balancers (ALBs) are designed to work across multiple Availability Zones (AZs) for high availability and fault tolerance. Here's an explanation of what's happening and answers to your questions:

  1. One AZ Configuration: While it's technically possible to configure an ALB with instances in only one AZ, it's not recommended and can lead to the behavior you're seeing. ALBs are designed to distribute traffic across multiple AZs for better performance and reliability. When you limit instances to a single AZ, you're essentially defeating one of the primary purposes of using an ALB.

  2. 50% 200 / 50% 503 Responses: This split in responses is likely occurring because the ALB is attempting to route traffic to both AZs, even though you only have instances in one. When requests are routed to the AZ with no healthy targets, the ALB returns a 503 Service Unavailable error.

  3. Multiple AZ Configuration: By adding instances to both AZs used by the ALB, you resolved the issue because the ALB now has healthy targets in both zones to route traffic to.

  4. AZ Failure Scenario: In case of an AZ failure, if you have instances in both AZs, the ALB will automatically route all traffic to the healthy AZ. This is part of the ALB's built-in fault tolerance. However, if you only have instances in one AZ and that AZ fails, you would indeed return to a scenario where all requests result in 503 errors.
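
The arithmetic behind points 2 to 4 can be illustrated with a deliberately simplified model. It assumes that requests are spread evenly over one ALB node per enabled AZ and that, with cross-zone load balancing off, a node can only use healthy targets in its own AZ. The thread does not confirm that this is the actual root cause, so treat it as an illustration rather than a diagnosis; the function name and zone labels are hypothetical.

```python
def expected_503_fraction(alb_azs, healthy_targets_per_az, cross_zone):
    """Fraction of requests expected to fail with 503 under a simplified model.

    alb_azs: AZ names enabled on the load balancer.
    healthy_targets_per_az: dict mapping AZ name to its healthy target count.
    cross_zone: whether nodes may forward to targets in other AZs.
    """
    alb_azs = list(alb_azs)
    if not alb_azs:
        raise ValueError("the load balancer needs at least one enabled AZ")
    total_healthy = sum(healthy_targets_per_az.get(az, 0) for az in alb_azs)
    if cross_zone:
        # Any node can reach any healthy target.
        return 0.0 if total_healthy > 0 else 1.0
    # Each node is limited to its own AZ, so nodes in empty AZs always fail.
    empty = sum(1 for az in alb_azs if healthy_targets_per_az.get(az, 0) == 0)
    return empty / len(alb_azs)


if __name__ == "__main__":
    zones = ["az-1", "az-2"]
    print(expected_503_fraction(zones, {"az-2": 1}, cross_zone=False))
    print(expected_503_fraction(zones, {"az-1": 1, "az-2": 1}, cross_zone=False))
    print(expected_503_fraction(zones, {"az-2": 1}, cross_zone=True))
```

Under these assumptions, two enabled zones with instances in only one of them give a failure fraction of one half, which matches the split reported in the question.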

To ensure optimal performance and reliability:

  1. Always configure your ALB to use at least two AZs.
  2. Distribute your EC2 instances across these AZs in your target groups.
  3. If cost is a concern, consider using Auto Scaling groups to maintain a minimum number of instances across AZs while allowing for scale-out when needed.

By following these practices, you can maintain high availability and fault tolerance while optimizing for cost. Remember, the small additional cost of running instances in multiple AZs is often outweighed by the benefits of improved reliability and performance.


answered 20 days ago
EXPERT
reviewed 20 days ago
