TCP health check with NLB


Hi all, I have a situation where service A (a Spring Cloud application on Tomcat) is deployed on EC2 (in an Auto Scaling group) using Chef (deployment time ~10 minutes). The servers are behind an NLB (cross-zone load balancing enabled, sticky sessions disabled).

The issue is that as soon as a server is brought in service, it passes the NLB health check before the Chef build completes. The target becomes healthy while the deployment of service A is still in progress. This causes problems for service B (running on different EC2 instances in a different ASG), which connects to service A: the connection fails whenever the NLB routes the request to the service A instance in question.

Since a TCP health check has no option to set a health check path, one root cause I can think of is that as soon as Tomcat is deployed and listening, the NLB gets a health check response. That is enough to mark the target healthy, even though the service is still being deployed on Tomcat.
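To illustrate the suspected cause, here is a minimal Python sketch of what a TCP-style check amounts to: it only confirms that something accepts a connection on the port. The function name `tcp_health_check` is hypothetical, and this is a simplified stand-in, not the NLB's actual implementation.

```python
import socket


def tcp_health_check(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# A bare listener with no application behind it is enough to pass.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

print(tcp_health_check("127.0.0.1", port))
listener.close()
```

The check succeeds even though nothing ever reads from or answers on the socket, which matches the behavior described above: a listening Tomcat connector is sufficient, regardless of whether the web application has finished deploying.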

Is there a way to handle this situation, other than replacing the NLB with an ALB?

(PS: the application uses Spring Cloud Netflix patterns: Eureka, Config, Zuul, etc.)

2 Answers

Is there a way to delay registration until the build/deployment has completed? Alternately, is there a way to avoid starting Tomcat until the build/deployment has completed?

AWS
EXPERT
answered 2 years ago
  • Yes, I'll give it a try. The issue is that we use some cookbooks that are managed centrally by the Chef engineering team. Still, as I said, let me try this approach and see if it works. Thanks.


Hello,

You'll want to delay the instances in ASG-A from being registered to the NLB until the application has finished installing. You can do this by adding a Lifecycle Hook (LCH) to the ASG. The last step of your Chef cookbook should be to complete the lifecycle action with the CONTINUE result, so that Auto Scaling registers the instance with the NLB's target group and moves the instance to InService.
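As a rough sketch of that last step, assuming Python with boto3 is available on the instance (the cookbook could equally call the AWS CLI), the script below waits for the application to respond and then completes the hook. The names `app_is_ready`, `finish_launch_hook`, and `ready_check`, and the retry values, are hypothetical; `complete_lifecycle_action` is the boto3 Auto Scaling client call.

```python
import time
import urllib.error
import urllib.request


def app_is_ready(url, timeout=2.0):
    """Return True if the application answers the given URL with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def finish_launch_hook(asg_client, instance_id, asg_name, hook_name,
                       ready_check, attempts=60, delay=10.0,
                       sleep=time.sleep):
    """Poll ready_check, then complete the launch lifecycle action.

    Sends CONTINUE if the check passes within `attempts` tries,
    otherwise ABANDON so that Auto Scaling terminates the instance.
    """
    result = "ABANDON"
    for _ in range(attempts):
        if ready_check():
            result = "CONTINUE"
            break
        sleep(delay)
    asg_client.complete_lifecycle_action(
        LifecycleHookName=hook_name,
        AutoScalingGroupName=asg_name,
        InstanceId=instance_id,
        LifecycleActionResult=result,
    )
    return result
```

It could be invoked at the end of the Chef run with `boto3.client("autoscaling")` as `asg_client` and something like `lambda: app_is_ready("http://localhost:8080/actuator/health")` as `ready_check`; that URL is an assumption about the application, not something stated in the question. The hook's heartbeat timeout needs to be longer than the ~10 minute deployment, otherwise the hook's default result is applied before the script runs.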

Also, since you mentioned it takes ~10 minutes, you may want to look into using a Warm Pool on the ASG if there aren't frequent changes to the application/data to be loaded on the instance. A warm pool lets you pre-launch instances into an ASG and configure them (via an LCH), after which they're stopped. When the desired capacity goes up, the pre-configured instances in the warm pool are started, saving you a lot of time. Keep in mind that the launching LCH will run again when the instance is moved to InService, so you'll need to either run the UserData again to apply any new updates and complete the hook, or trigger another process (like a Lambda function) to complete the LCH. Otherwise the instance will sit in Pending:Wait and you'll lose all the time savings.
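A minimal sketch of the Lambda option, assuming the function is triggered by the EventBridge event that Auto Scaling emits for the launch lifecycle action, and assuming that event's `detail` carries `Origin` and `Destination` fields that identify an instance leaving the warm pool. The helper name `handle_launch_event` is hypothetical, and the event field names should be checked against the current Auto Scaling documentation.

```python
def handle_launch_event(event, asg_client):
    """Complete the launch hook only for instances leaving the warm pool.

    Returns True if the lifecycle action was completed, False if the
    event was for a fresh launch that the Chef run should complete itself.
    """
    detail = event.get("detail", {})
    from_warm_pool = (
        detail.get("Origin") == "WarmPool"
        and detail.get("Destination") == "AutoScalingGroup"
    )
    if not from_warm_pool:
        return False
    asg_client.complete_lifecycle_action(
        LifecycleHookName=detail["LifecycleHookName"],
        AutoScalingGroupName=detail["AutoScalingGroupName"],
        LifecycleActionToken=detail["LifecycleActionToken"],
        InstanceId=detail["EC2InstanceId"],
        LifecycleActionResult="CONTINUE",
    )
    return True


def lambda_handler(event, context):
    # boto3 is imported lazily so the helper above can be tested without it.
    import boto3
    return handle_launch_event(event, boto3.client("autoscaling"))
```

This version completes the hook immediately, which only makes sense if the pre-configured instance starts the application on boot without needing a fresh Chef run. If updates must be applied first, completing the hook from the instance after the application is up is the safer choice.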

AWS
answered 2 years ago
  • Thanks, this helps. I'll try a warm pool as well. The Chef client runs automatically every 2 hours to pull the latest changes from user data. What approach do I need to take to complete the LCH?
