Working ECS service config starts killing tasks because it cannot reach healthcheck

Last week, we started having problems after an external service we depend on crashed. Our ECS service was still not working after the external service had recovered, and we struggled to find out why it was having trouble keeping its tasks up and running. A simple "forced" redeployment with a single configuration change, raising the number of tasks from 2 to 5, fixed the problem.

The next day, we tried to figure out what had happened. It seems the ECS service was continuously killing tasks because their healthcheck endpoint could not be reached. This healthcheck returns a simple 200 response and does not depend on any external service. In CloudWatch, we can see the process manager's logs for starting the application and the server's logs showing that it started listening.
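For reference, a dependency-free healthcheck of the kind described above can be sketched as follows. This is only an illustration using Python's standard library: the `/health` path, port 8080, and the names `HealthHandler` and `make_server` are assumptions, not the actual application code.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answers GET /health with 200 without touching any external service."""

    def do_GET(self):
        if self.path == "/health":  # hypothetical path, adjust to your target group
            body = b"OK"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Keep healthcheck polling out of the logs.
        pass


def make_server(port=8080):
    # Bind on all interfaces so the load balancer can reach the container port.
    return HTTPServer(("0.0.0.0", port), HealthHandler)


if __name__ == "__main__":
    make_server().serve_forever()
```

Note that this server handles one request at a time. A handler like this can still appear unreachable if the process is blocked elsewhere, for example while waiting on a hung call to an external service.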

The problem lasted from May 8th at 8:30 PM to May 9th at 6:45 PM. Before the incident, the ECS service had been running correctly for several days, and the load was no different from usual. Performing a "forced redeployment" without changing the container image or any configuration other than the number of tasks fixed the issue: there have been no healthcheck problems since. So we don't understand why the service was unstable before.

Do you have any hints about what could have caused this behavior?

asked a year ago · 267 views
1 Answer
Perhaps the container ran out of a resource, such as disk space (for example, due to excessive logging) or memory. The application in your container may not be able to recover from an external issue. If the ALB health check does fail, ECS will stop the task and replace it.

Your container may have had internal issues. I believe you should investigate how your container behaves when the external service becomes unavailable.
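As a starting point for that investigation, you could look at why ECS stopped the tasks. The sketch below assumes boto3 is installed and credentials are configured. The function names are hypothetical, while `list_tasks`, `describe_tasks`, and the `stoppedReason` field belong to the ECS API. ECS only keeps stopped tasks visible for a limited time, so this has to be run soon after the failures.

```python
from collections import Counter


def summarize_stop_reasons(describe_tasks_response):
    """Count how often each stoppedReason appears in a describe_tasks response."""
    reasons = Counter()
    for task in describe_tasks_response.get("tasks", []):
        reasons[task.get("stoppedReason", "unknown")] += 1
    return reasons


def fetch_stopped_tasks(cluster, service):
    """Fetch recently stopped tasks of a service (first page only)."""
    import boto3  # imported here so the summary helper works without boto3

    ecs = boto3.client("ecs")
    arns = ecs.list_tasks(
        cluster=cluster, serviceName=service, desiredStatus="STOPPED"
    )["taskArns"]
    if not arns:
        return {"tasks": []}
    # describe_tasks accepts at most 100 task ARNs per call.
    return ecs.describe_tasks(cluster=cluster, tasks=arns[:100])


if __name__ == "__main__":
    # "my-cluster" and "my-service" are placeholders.
    response = fetch_stopped_tasks("my-cluster", "my-service")
    for reason, count in summarize_stop_reasons(response).most_common():
        print(count, reason)
```

If most tasks were stopped for failing load balancer health checks rather than because a container exited, that points at the application not answering in time (or not being reachable) rather than crashing.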

EXPERT
answered a year ago
