Intermittent 504 Errors with NLB in AWS Architecture - Seeking Insights

Hello AWS Community,

I'm facing a challenge with intermittent 504 errors in my AWS architecture and am looking for insights and suggestions.

Architecture Overview:

  • API Gateway: Acts as the entry point, connected via VPC link to the NLB.
  • Network Load Balancer (NLB): Forwards traffic to a target group, which contains the IP address of an ECS task running in an ECS service.
  • TLS Termination: Initially set up at both the API Gateway and NLB (Listener at TLS:1030 forwarding to Target Group TCP:1030).
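
To make the listener setup easier to compare before and after a change, the configuration above can be dumped with a short script. This is only a sketch: it assumes boto3 is installed and credentials are configured, the function names are hypothetical, and pagination of the `describe_listeners` response is ignored.

```python
from typing import Any


def summarize_listeners(response: dict[str, Any]) -> list[str]:
    """Return one 'PROTOCOL:PORT -> target group ARN(s)' line per listener.

    `response` is expected to have the shape returned by the ELBv2
    `describe_listeners` call (a dict with a "Listeners" list).
    """
    lines = []
    for listener in response.get("Listeners", []):
        # Only simple forward actions are considered in this sketch.
        targets = [
            action.get("TargetGroupArn", "?")
            for action in listener.get("DefaultActions", [])
            if action.get("Type") == "forward"
        ]
        lines.append(
            f"{listener['Protocol']}:{listener['Port']} -> {', '.join(targets)}"
        )
    return lines


def fetch_listeners(load_balancer_arn: str) -> dict[str, Any]:
    """Fetch the listeners of one load balancer (first page only)."""
    import boto3  # imported lazily so summarize_listeners works without AWS access

    client = boto3.client("elbv2")
    return client.describe_listeners(LoadBalancerArn=load_balancer_arn)
```

Calling `summarize_listeners(fetch_listeners(arn))` with the NLB's ARN should show at a glance which listeners use TLS and which use TCP, and which target group each one forwards to.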

Issue: After adding more services and listeners to the NLB, I started seeing 504 errors, with only about one-third of requests succeeding. The NLB metrics showed Target Group Reset Counts varying between 8 and 100.
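
One way to put a number on the failure rate is to send a batch of requests and count the status codes. The sketch below is an assumption about how that could be done, not the method used for the figures above; the function names and the 35-second timeout are illustrative, and connection-level failures (`URLError`) are deliberately left uncaught.

```python
from collections import Counter
from typing import Callable
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 35.0) -> int:
    """Return the HTTP status code of one GET, including 4xx/5xx responses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is still what we want.
        return err.code


def status_histogram(get_status: Callable[[], int], attempts: int) -> Counter:
    """Call `get_status` `attempts` times and count each status code."""
    return Counter(get_status() for _ in range(attempts))


def success_ratio(hist: Counter) -> float:
    """Fraction of responses with a 2xx status (0.0 for an empty histogram)."""
    total = sum(hist.values())
    ok = sum(count for code, count in hist.items() if 200 <= code < 300)
    return ok / total if total else 0.0
```

For example, `status_histogram(lambda: fetch_status(url), 100)` against the API Gateway endpoint gives a per-status count that can be compared before and after each configuration change.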

Troubleshooting Steps:

1. Changing NLB Listener from TLS to TCP: Switched the NLB listener from TLS:1030 to TCP:1030, so that TLS is terminated only at the API Gateway. This resolved the 504 errors, and the services became consistently available. However, I don't understand why this fixed the problem.

2. Distributing Load to Dedicated Load Balancers/VPC Links: I also tried separating new services onto dedicated load balancers with dedicated VPC links. This approach also resolved the 504 errors, but the Target Group Reset counts persisted. This leads me to believe that while distributing the load alleviated the immediate issue, it might not be addressing the root cause.
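
Since the reset counts persisted after step 2, it may help to compare the three NLB reset metrics in CloudWatch (`TCP_Client_Reset_Count`, `TCP_ELB_Reset_Count`, `TCP_Target_Reset_Count` in the `AWS/NetworkELB` namespace) to see which side is sending the resets. The sketch below assumes the figure quoted earlier corresponds to `TCP_Target_Reset_Count`, that boto3 and credentials are available, and uses hypothetical function names.

```python
from datetime import datetime, timedelta, timezone
from typing import Any

# Reset counters published for Network Load Balancers.
RESET_METRICS = (
    "TCP_Client_Reset_Count",
    "TCP_ELB_Reset_Count",
    "TCP_Target_Reset_Count",
)


def reset_metric_query(
    metric_name: str, load_balancer_dim: str, end: datetime, minutes: int = 60
) -> dict[str, Any]:
    """Build keyword arguments for CloudWatch `get_metric_statistics`.

    `load_balancer_dim` is the LoadBalancer dimension value, i.e. the
    trailing part of the NLB ARN such as "net/<name>/<id>".
    """
    return {
        "Namespace": "AWS/NetworkELB",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "LoadBalancer", "Value": load_balancer_dim}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,  # one datapoint per minute
        "Statistics": ["Sum"],
    }


def fetch_reset_sums(load_balancer_dim: str) -> dict[str, float]:
    """Sum each reset metric over the last hour for one NLB."""
    import boto3  # imported lazily so the query builder works without AWS access

    cloudwatch = boto3.client("cloudwatch")
    now = datetime.now(timezone.utc)
    totals = {}
    for name in RESET_METRICS:
        resp = cloudwatch.get_metric_statistics(
            **reset_metric_query(name, load_balancer_dim, now)
        )
        totals[name] = sum(point["Sum"] for point in resp["Datapoints"])
    return totals
```

If the target-side counter dominates, the resets originate from the ECS task rather than from the load balancer or the API Gateway side, which would narrow down where to look next.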

Seeking Help: While the issue seems mitigated, I'm still looking to understand the root cause. Specifically, I'm interested in insights on:

  • Why did TLS termination at both the API Gateway and NLB level lead to 504 errors and high Target Group Reset counts?

  • Why did switching the NLB listener from TLS to TCP, and separately moving services to dedicated load balancers, each resolve the 504 errors?

  • Could there be underlying issues related to load distribution or configuration that are not immediately apparent?

Any advice or theories from the community would be greatly appreciated. Thanks in advance for your help!
