Faster failover time

I have an app deployed across two regions:

Route53 ----> ALB (Region 1) ----> ECS services
        \---> ALB (Region 2) ----> ECS services 

I set up Route 53 with Traffic Flow using the Latency policy, so incoming requests are sent to the region with the lowest latency. It works.

If one of my regions loses connection, traffic is automatically redirected to the surviving region. However, this takes a few minutes, and my users get 503 errors in the meantime. Is there anything I can do to make the failover faster?

asked 3 years ago · 699 views
1 Answer
Accepted Answer

Hi,

You can create a Route 53 health check for each region to validate application health, and configure the request interval and failure threshold according to your use case requirements.

If a health check fails, Route 53 removes that region's record from routing and continues to answer DNS queries using your latency-based policy across the remaining healthy regions.
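As a rough illustration of the health check settings mentioned above, here is a minimal Python sketch using boto3. The domain `region1.example.com`, the `/health` path, and the helper names are assumptions made for the example, not values from the question. The detection-time estimate (request interval multiplied by failure threshold) is only an approximation and does not include DNS caching.

```python
import uuid


def build_health_check_config(domain, path="/health", interval=10, threshold=2):
    """Return a HealthCheckConfig dict for an HTTPS endpoint check.

    The domain and path are placeholders; use your own endpoint.
    """
    if interval not in (10, 30):
        raise ValueError("Route 53 accepts a request interval of 10 or 30 seconds")
    if not 1 <= threshold <= 10:
        raise ValueError("failure threshold must be between 1 and 10")
    return {
        "Type": "HTTPS",
        "FullyQualifiedDomainName": domain,
        "Port": 443,
        "ResourcePath": path,
        "RequestInterval": interval,
        "FailureThreshold": threshold,
    }


def approx_detection_seconds(interval, threshold):
    """Rough time before an endpoint is marked unhealthy (excludes DNS TTL)."""
    return interval * threshold


def create_health_check(config):
    """Create the health check; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above work without it

    client = boto3.client("route53")
    response = client.create_health_check(
        CallerReference=str(uuid.uuid4()),  # must be unique per request
        HealthCheckConfig=config,
    )
    return response["HealthCheck"]["Id"]
```

You would create one health check per region and attach each to the matching endpoint in the Traffic Flow policy. A 10-second interval with a low failure threshold detects failures sooner than the 30-second interval, at the cost of more sensitivity to brief blips.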

EXPERT
answered 3 years ago
  • Thanks. This improves the response time somewhat, but I still get 503 errors for about a minute when one region is down. Is there a way to re-architect my application so that it detects which regions are responding and only then directs traffic to them according to latency?

  • What TTL have you configured for the Route 53 record? To respond quickly to changes in health status, AWS recommends specifying a TTL of 60 seconds or less when a health check is associated with a latency record. Given your requirements, you could reduce this value and combine it with a retry policy on the client side.

  • Thanks. I set the TTL to 60 seconds in the Traffic Flow policy and it's failing over faster now.
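The client-side retry policy suggested in the comments could look something like the following Python sketch. It retries on 5xx responses and connection errors with exponential backoff. The function names, attempt count, and delays are illustrative assumptions. Whether a retry actually reaches the healthy region depends on the client re-resolving DNS once the cached record's TTL has expired.

```python
import time
import urllib.error
import urllib.request


def fetch_once(url, timeout=5.0):
    """Perform one GET and return the body; raises on HTTP errors such as 503."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_retry(url, attempts=4, base_delay=1.0,
                     fetch=fetch_once, sleep=time.sleep):
    """Retry on 5xx responses and connection errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fetch(url)
        except urllib.error.HTTPError as err:
            # Client errors (4xx) will not be fixed by retrying.
            if err.code < 500 or attempt == attempts - 1:
                raise
        except urllib.error.URLError:
            # Connection-level failure, e.g. the region is unreachable.
            if attempt == attempts - 1:
                raise
        # Wait 1s, 2s, 4s, ... before the next attempt.
        sleep(base_delay * 2 ** attempt)
```

With the defaults, a call such as `fetch_with_retry("https://app.example.com/")` (a placeholder URL) waits up to 1 + 2 + 4 = 7 seconds in total between attempts, so the attempt count and delays would need tuning against the TTL and health check settings in use.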
