
Does Route 53 latency-based routing perform failover when one region performance degrades?


Hi,

I have the following deployment:

  1. Region1: API Gateway (apigw1) backed by a Lambda function (myFunc), exposed via a custom domain name (say, api.mydomain.com).
  2. Region2: API Gateway (apigw2) backed by the same Lambda function implementation (myFunc), exposed via the same custom domain name (i.e. api.mydomain.com).
  3. Global: Two "A" records for "api.mydomain.com" in the Route 53 hosted zone (mydomain.com), both using the "Latency" routing policy, pointing to the apigw1 endpoint in region1 and the apigw2 endpoint in region2 respectively.

My questions are:

  1. If there is a problem in region1 (e.g. the myFunc Lambda function in region1 has degraded performance) resulting in significantly higher API response times, will the clients connected to the apigw1 endpoint in region1 eventually (say, in 30 minutes, 1 hour, or after some other reasonable time period) be diverted to apigw2 in region2?
  2. If the answer to the above question is "no", what changes to the routing policy or other components do I need to make so that Route 53 can fail over to a healthy regional endpoint when one region has issues?

Please advise.

2 Answers
Accepted Answer

To address your questions about Route 53 latency-based routing and failover:

  1. Latency-based routing alone does not automatically fail over or divert traffic based on performance degradation. Route 53 uses historical latency data to make routing decisions, but it doesn't actively monitor real-time performance issues such as slow Lambda function responses. This means that clients connected to the region1 endpoint (apigw1) would not automatically be diverted to region2 (apigw2) if performance degrades in region1.

  2. To ensure failover to a healthy region endpoint in case of issues, you should implement a combination of health checks and failover routing along with your latency-based routing. Here's how you can modify your setup:

    a. Create health checks for each regional endpoint (apigw1 and apigw2).

    b. Instead of using just latency-based routing, implement a combination of failover and latency routing:

    • Create two failover records (primary and secondary) for api.mydomain.com.
    • For each failover record, create latency-based alias records pointing to your regional API Gateway endpoints.
    • Associate the health checks with the corresponding latency records.

    c. Configure the failover records:

    • Set one region as primary and the other as secondary.
    • Route 53 will use the primary region when it's healthy, and automatically switch to the secondary if the primary fails the health check.

    d. Within each failover record, the latency-based routing will direct traffic to the lowest latency endpoint within that group.
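The record setup described in these steps can be sketched in Python. This is a minimal sketch of the simplest variant only: the two latency alias records each carry their own health check, without the extra failover layer. It builds the `ChangeBatch` payload that the Route 53 `ChangeResourceRecordSets` API accepts. The region names, alias DNS names, alias hosted zone IDs, and health check IDs below are placeholders (assumptions, not values from the question), and `EvaluateTargetHealth` is set to `False` on the assumption that the attached health check is the only health signal.

```python
# Sketch: build a ChangeBatch with two latency alias records for the same name,
# each tied to its own health check. All endpoint values are hypothetical.

RECORD_NAME = "api.mydomain.com."

# Placeholder endpoint details; substitute the values of your own
# API Gateway regional custom domain names and health checks.
ENDPOINTS = [
    {
        "region": "us-east-1",                        # assumed for region1
        "set_id": "apigw1-region1",
        "alias_dns_name": "d-example1.execute-api.us-east-1.amazonaws.com.",
        "alias_zone_id": "ZONEID-REGION1",            # placeholder
        "health_check_id": "hc-region1-placeholder",  # placeholder
    },
    {
        "region": "eu-west-1",                        # assumed for region2
        "set_id": "apigw2-region2",
        "alias_dns_name": "d-example2.execute-api.eu-west-1.amazonaws.com.",
        "alias_zone_id": "ZONEID-REGION2",            # placeholder
        "health_check_id": "hc-region2-placeholder",  # placeholder
    },
]


def latency_record(endpoint):
    """Return one UPSERT change for a latency alias record with a health check."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": RECORD_NAME,
            "Type": "A",
            "SetIdentifier": endpoint["set_id"],  # must be unique per record
            "Region": endpoint["region"],         # makes this a latency record
            "AliasTarget": {
                "HostedZoneId": endpoint["alias_zone_id"],
                "DNSName": endpoint["alias_dns_name"],
                "EvaluateTargetHealth": False,    # assumption of this sketch
            },
            "HealthCheckId": endpoint["health_check_id"],
        },
    }


def build_change_batch(endpoints):
    """Return the ChangeBatch dict covering all regional endpoints."""
    return {
        "Comment": "Latency records with health checks (sketch)",
        "Changes": [latency_record(e) for e in endpoints],
    }


# With boto3 installed, the batch could be applied roughly like this:
#   boto3.client("route53").change_resource_record_sets(
#       HostedZoneId="<your hosted zone id>",
#       ChangeBatch=build_change_batch(ENDPOINTS),
#   )
```

Building the payload as plain data keeps the sketch inspectable before anything is sent to Route 53.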

This setup allows you to maintain the benefits of latency-based routing while also implementing automatic failover in case of regional issues. Route 53 will detect when an endpoint is unhealthy based on the health checks and stop including it when responding to queries.
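As a rough illustration of the behaviour described above, the following toy model picks the lowest-latency region among the records that are currently healthy. It is a conceptual sketch, not Route 53's actual algorithm; the function name `pick_endpoint` and the latency figures are made up for the example. The fallback when every record is unhealthy mirrors the documented Route 53 behaviour of treating all records as healthy in that case.

```python
def pick_endpoint(latency_ms, healthy):
    """Toy model: choose the lowest-latency region among healthy records.

    latency_ms: dict of region name -> estimated latency for this client
    healthy:    dict of region name -> bool health check status
    """
    candidates = {r: ms for r, ms in latency_ms.items() if healthy.get(r, False)}
    if not candidates:
        # If nothing is healthy, fall back to considering every record.
        candidates = dict(latency_ms)
    return min(candidates, key=candidates.get)


# A client that is normally closest to region1:
latencies = {"region1": 20, "region2": 90}

normal = pick_endpoint(latencies, {"region1": True, "region2": True})
degraded = pick_endpoint(latencies, {"region1": False, "region2": True})
```

In this model `normal` is `"region1"` and `degraded` is `"region2"`: the switch happens only because the health check status changed, not because region1 became slower.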

Remember to configure appropriate health check parameters that accurately reflect the performance and availability of your API Gateway and Lambda function. This ensures that Route 53 can detect issues and initiate failover when necessary.
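One possible shape for such a health check is sketched below as the request payload for the Route 53 `CreateHealthCheck` API. The hostname `region1-api.mydomain.com` and the `/health` path are hypothetical; the idea is that the check targets a region-specific hostname (not the shared api.mydomain.com name) and a path that exercises the Lambda function, so a failing backend shows up as a failed check. The interval and threshold values are example choices, not recommendations.

```python
import uuid


def build_health_check_request(fqdn, path="/health"):
    """Return keyword arguments for an HTTPS endpoint health check (sketch)."""
    return {
        # CallerReference must be unique per creation request.
        "CallerReference": str(uuid.uuid4()),
        "HealthCheckConfig": {
            "Type": "HTTPS",
            "FullyQualifiedDomainName": fqdn,  # region-specific hostname
            "Port": 443,
            "ResourcePath": path,              # hypothetical health route
            "RequestInterval": 30,             # seconds between checks
            "FailureThreshold": 3,             # consecutive failures before unhealthy
            "EnableSNI": True,
        },
    }


# Hypothetical hostname for the region1 endpoint.
region1_check = build_health_check_request("region1-api.mydomain.com")

# With boto3 installed, this could be sent roughly as:
#   boto3.client("route53").create_health_check(**region1_check)
```

A lower `RequestInterval` or `FailureThreshold` shortens detection time at the cost of more sensitivity to transient errors.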

By implementing this combination of failover and latency-based routing with health checks, you can achieve both low-latency routing under normal conditions and automatic failover to a healthy region when issues occur in one region.
Sources
Latency-based routing - Amazon Route 53
Active-active and active-passive failover - Amazon Route 53
Cross-Region DNS-based load balancing and failover - Real-Time Communication on AWS

answered 17 days ago
EXPERT
reviewed 16 days ago
  • Thank you, Adeleke. This is very helpful.


Route 53 Latency-Based Routing (LBR) does not inherently perform failover when the performance of one region degrades. LBR is designed to route users to the endpoint that has the lowest latency based on historical measurements, not current performance metrics.

To get failover, add health checks and use the Failover Routing Policy alongside Latency-Based Routing. Also consider monitoring and automation with CloudWatch and Lambda to proactively manage endpoint health.
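One way to connect CloudWatch monitoring to routing, sketched here as an assumption rather than something this answer spells out, is a Route 53 health check of type `CLOUDWATCH_METRIC` that follows the state of a CloudWatch alarm. If the alarm watches API Gateway latency in one region, slow responses can mark that region's record unhealthy. The alarm name and region below are hypothetical, and the alarm itself is assumed to exist already.

```python
import uuid


def build_alarm_health_check_request(alarm_name, alarm_region):
    """Return keyword arguments for a health check that tracks a CloudWatch alarm."""
    return {
        "CallerReference": str(uuid.uuid4()),
        "HealthCheckConfig": {
            "Type": "CLOUDWATCH_METRIC",
            "AlarmIdentifier": {
                "Region": alarm_region,  # region where the alarm lives
                "Name": alarm_name,      # hypothetical alarm name
            },
            # What to report when the alarm has insufficient data.
            "InsufficientDataHealthStatus": "LastKnownStatus",
        },
    }


# Hypothetical alarm on API Gateway latency for apigw1 in region1.
latency_alarm_check = build_alarm_health_check_request(
    alarm_name="apigw1-high-latency",
    alarm_region="us-east-1",
)

# With boto3 installed, this could be sent roughly as:
#   boto3.client("route53").create_health_check(**latency_alarm_check)
```

The resulting health check ID would then be attached to the region1 record in the same way as an endpoint health check.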

EXPERT
answered 16 days ago
