API Gateway REST API, NLB integration intermittent 500 error

Hi, I have the following setup: API Gateway REST API -> VPC Link -> Network Load Balancer -> ECS

The Network Load Balancer and the ECS service are in public subnets; I do not have any private subnets. The VPC Link, Network Load Balancer, and ECS service all use the same three public subnets, each in a separate AZ. The Network Load Balancer is internal. The VPC Link, Network Load Balancer, and ECS service are in the same region. The API works correctly when the ECS task is deployed in AZ A, but I get intermittent 500 errors from API Gateway when the ECS task is deployed in any AZ other than A. There is only one ECS task instance.

What might be the cause of the problem?

1 Answer

The intermittent 500 errors you're experiencing with your API Gateway REST API to ECS setup could be caused by several factors:

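  1. Cross-AZ connectivity issues: Since your setup works when the ECS task is in AZ A but fails intermittently when it runs in another AZ, there might be connectivity problems between your VPC Link and the NLB targets in different Availability Zones. One thing to check is cross-zone load balancing, which is disabled by default on Network Load Balancers: with it off, an NLB node in an AZ that has no registered target cannot forward the request to the task in another AZ.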

  2. NLB target health: Check whether your NLB targets show as healthy in all AZs. The intermittent errors could indicate that the NLB is having trouble establishing or maintaining connections with targets in AZs other than A. Look at CloudWatch metrics such as TCP_Target_Reset_Count, HealthyHostCount, and UnHealthyHostCount, which can point to connection or health problems.

  3. Security group or network configuration: Verify that security groups and network configurations allow proper traffic flow between all components across all AZs. Even though everything is in public subnets, there might be specific routing or security group rules affecting cross-AZ traffic.

  4. API Gateway configuration error: The 500 error message "Execution failed due to configuration error: There was an internal error while executing your request" often appears when there's a misconfiguration in how API Gateway connects to the VPC Link or how the VPC Link connects to the NLB.

  5. Health check configuration: Ensure your NLB health checks are properly configured for all AZs. Improper health check settings might cause the NLB to incorrectly mark targets as unhealthy in certain AZs.
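For item 1, one concrete thing to rule out is the NLB's cross-zone load balancing attribute, which is off by default for Network Load Balancers. The following is a minimal sketch, assuming Python with boto3 and working AWS credentials; the function names and the ARN you pass in are illustrative, not part of the original setup. The parsing helper is kept separate so it can be tried on a saved API response without calling AWS.

```python
from typing import Dict, List


def cross_zone_enabled(attributes: List[Dict[str, str]]) -> bool:
    """Return True if the attribute list reports cross-zone load balancing as enabled.

    `attributes` is the "Attributes" list returned by the ELBv2
    DescribeLoadBalancerAttributes API: [{"Key": ..., "Value": ...}, ...].
    """
    for attr in attributes:
        if attr.get("Key") == "load_balancing.cross_zone.enabled":
            return attr.get("Value", "false").lower() == "true"
    # Key absent: treat as disabled, which is the NLB default.
    return False


def fetch_cross_zone_enabled(load_balancer_arn: str) -> bool:
    """Query the live NLB (hypothetical helper; needs boto3 and credentials)."""
    import boto3  # imported lazily so the parsing helper works without it

    elbv2 = boto3.client("elbv2")
    response = elbv2.describe_load_balancer_attributes(
        LoadBalancerArn=load_balancer_arn
    )
    return cross_zone_enabled(response["Attributes"])
```

If this returns False while the single ECS task sits in only one of the three AZs, that would be consistent with the intermittent failures described in the question, though it does not by itself prove the cause.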

To troubleshoot:

  • Check NLB target health status across all AZs
  • Review VPC Flow Logs to identify any dropped or rejected connections
  • Verify security groups and network ACLs are consistent across all AZs
  • Enable CloudWatch execution logging for your API Gateway stage and review the NLB's CloudWatch metrics
  • Check for any routing issues between components in different AZs
  • Verify that the NLB listener and target group configurations are correct
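The first check in the list above (target health) can be scripted. This is a rough sketch, assuming Python with boto3 and AWS credentials; the function names and the target group ARN are placeholders chosen for illustration. It summarizes the response of the ELBv2 DescribeTargetHealth API and lists any target that is not in the healthy state, along with the reason code the load balancer reports.

```python
from collections import Counter
from typing import Dict, List, Tuple


def summarize_target_health(descriptions: List[dict]) -> Tuple[Dict[str, int], List[dict]]:
    """Count targets per health state and collect the ones that are not healthy.

    `descriptions` is the "TargetHealthDescriptions" list returned by the
    ELBv2 DescribeTargetHealth API.
    """
    state_counts: Counter = Counter()
    problems: List[dict] = []
    for item in descriptions:
        target = item.get("Target", {})
        health = item.get("TargetHealth", {})
        state = health.get("State", "unknown")
        state_counts[state] += 1
        if state != "healthy":
            problems.append({
                "id": target.get("Id"),
                "port": target.get("Port"),
                "state": state,
                "reason": health.get("Reason"),
            })
    return dict(state_counts), problems


def fetch_target_health(target_group_arn: str) -> Tuple[Dict[str, int], List[dict]]:
    """Query the live target group (hypothetical helper; needs boto3 and credentials)."""
    import boto3  # imported lazily so the summary helper works without it

    elbv2 = boto3.client("elbv2")
    response = elbv2.describe_target_health(TargetGroupArn=target_group_arn)
    return summarize_target_health(response["TargetHealthDescriptions"])
```

Running it once with the task in AZ A and once with the task in another AZ, then comparing the two results, may show whether the target itself is flapping or whether it stays healthy while requests still fail, which would point back at routing between AZs rather than at the health check.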

If after these steps you're still experiencing issues, you might need to contact AWS Support for more in-depth troubleshooting as there could be account-specific configurations or limitations at play.
Sources
Configuration of NLB Targets and API Integration Request when connect Rest API to VPC endpoint using VPC Link, NLB | AWS re:Post
NLB Connectivity Error | AWS re:Post
How to debug 502 errors on ELB | AWS re:Post

answered a year ago
EXPERT
reviewed a year ago
