Getting 504 response timeouts via ALB but not directly.

Hello, our production environment contains two ALBs: a public-facing one and a private one. Both ALBs support HTTP/2.

I have a target group, using HTTP/1.1, that contains an ECS service. The strange behavior I'm observing is this:

  1. When requests are made to this service via either of the ALBs, approximately 1 out of 5 requests fail with a 504 gateway timeout.
  2. When I make requests to the IP address of the service directly (via an EC2 instance in the same VPC), I don't get any such timeouts.
  3. An older version of the same application works without 504s via any of the ALBs.
  4. The timeout on the ALBs is set to 30s. In the application it is set to 60s (nginx) and the proxied service also has the same value.
  5. I've compared the response headers in both servers, but they are identical.

My question here is, what should I be looking at as the potential culprit? I know the keep-alive caveats are a huge problem, but again, two different versions of the same application behave differently and I find there is very little to help me debug this.

Thanks.

Asked 1 year ago · Viewed 528 times
1 Answer

Thank you for the detailed description.


To diagnose the problem, first confirm that the 504 error is generated by the ALB itself rather than by the target. You can do this by checking whether the Server: awselb/2.0 header is present in the failing response, or by reviewing the HTTPCode_ELB_504_Count metric in CloudWatch.
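As a rough illustration of the header check, the following sketch sends repeated requests to a URL and tallies which responses look like ALB-generated 504s. The function names `is_alb_generated` and `probe`, the attempt count, and the 35-second client timeout (chosen only to sit above the 30s ALB timeout mentioned in the question) are assumptions for this example, not part of any AWS tooling.

```python
import urllib.error
import urllib.request
from collections import Counter


def is_alb_generated(status, headers):
    """Return True when a 504 carries a Server header starting with 'awselb/'."""
    server = ""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lowercase.
        if name.lower() == "server":
            server = value
    return status == 504 and server.lower().startswith("awselb/")


def probe(url, attempts=20, timeout=35.0):
    """Send GET requests to url and tally (status, looks_alb_generated) pairs.

    Connection errors and client-side timeouts are not caught here and
    will propagate to the caller.
    """
    tally = Counter()
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                status, headers = resp.status, dict(resp.headers.items())
        except urllib.error.HTTPError as err:
            # urllib raises on 4xx/5xx; the error object still carries
            # the status code and response headers.
            status, headers = err.code, dict(err.headers.items())
        tally[(status, is_alb_generated(status, headers))] += 1
    return tally
```

Running `probe` once against the ALB DNS name and once against the task IP from an instance in the same VPC would give two tallies to compare. A key such as `(504, True)` appearing only in the ALB run would suggest the load balancer, not the application, is producing the timeout.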

If the 504 error is indeed being generated by the ALB, we can refer to this AWS document, which lists all possible causes of the error and provides guidance on how to resolve them. Additionally, this Knowledge Center article provides further guidance on fixing the issue. For instance, we should verify that all ALB nodes can connect to the targets, which may reside in different subnets with specific security groups and subnet ACLs.
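One quick, partial signal for the connectivity point is the target group's health status, since health checks are sent by the load balancer nodes. The sketch below assumes the third-party boto3 library is installed and credentials are configured; `unhealthy_targets` and `fetch_target_health` are hypothetical helper names. Healthy targets do not prove that every request path works, so treat this as a first check only.

```python
def unhealthy_targets(response):
    """Extract targets whose state is not 'healthy' from a
    describe_target_health response dictionary."""
    results = []
    for desc in response.get("TargetHealthDescriptions", []):
        health = desc.get("TargetHealth", {})
        if health.get("State") != "healthy":
            target = desc.get("Target", {})
            results.append(
                (
                    target.get("Id"),
                    target.get("Port"),
                    health.get("State"),
                    health.get("Reason"),
                )
            )
    return results


def fetch_target_health(target_group_arn):
    """Call the ELBv2 DescribeTargetHealth API for one target group.

    boto3 is imported lazily so the parsing helper above can be used
    without it. The ARN must be supplied by the caller.
    """
    import boto3  # third-party dependency, assumed to be installed

    client = boto3.client("elbv2")
    return client.describe_target_health(TargetGroupArn=target_group_arn)
```

A typical use would be `unhealthy_targets(fetch_target_health(arn))`, where `arn` is the ARN of the target group behind both ALBs. Targets that are unhealthy or flapping between states would be worth correlating with the times at which the 504s occur.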


As a side note, it may be helpful to temporarily enable ALB access logs, which can provide more information about the 504 requests for deeper analysis.
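Once access logs are enabled, a small script can group the 504 entries by target to show whether the failures cluster on particular tasks. This is a minimal sketch that assumes the documented space-delimited ALB log layout and reads only the leading fields; the function names are hypothetical, and `shlex` may fail on lines whose quoted fields contain unbalanced quotes.

```python
import shlex

# Leading fields of an ALB access log entry, in order. Later fields
# (TLS details, target group ARN, trace ID, and so on) are ignored here.
FIELDS = [
    "type", "time", "elb", "client", "target",
    "request_processing_time", "target_processing_time",
    "response_processing_time", "elb_status_code", "target_status_code",
    "received_bytes", "sent_bytes", "request", "user_agent",
]


def parse_alb_log_line(line):
    """Split one log line, honoring double-quoted fields, and map it to FIELDS."""
    return dict(zip(FIELDS, shlex.split(line)))


def summarize_504s(lines):
    """Count entries with elb_status_code 504, grouped by target ip:port."""
    counts = {}
    for line in lines:
        entry = parse_alb_log_line(line)
        if entry.get("elb_status_code") != "504":
            continue
        target = entry.get("target", "-")
        counts[target] = counts.get(target, 0) + 1
    return counts
```

The log files are delivered to S3 as gzip archives, so in practice the lines would come from something like `gzip.open(path, "rt")`. For the matching entries, the target_processing_time and target_status_code fields are also worth reading: if I recall the log documentation correctly, a target_processing_time of -1 indicates that the target did not respond before the timeout elapsed or that the request could not be dispatched.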

AWS
weidi
Answered 1 year ago
