
ecs-service-connect consistently returns "The container health check has failed", killing whole service

I'm running a container on ECS on EC2. The container itself has not shown any problems and has no health checks configured; however, the service uses ECS Service Connect and runs in awsvpc network mode. While my container continues to function fine, the ecs-service-connect container reports that it has failed its health check, and as a result the entire service is brought down. From then on, every time the service stops those tasks and starts new ones, the new tasks also become unhealthy. My current workaround when it enters this state is to drain the EC2 instance hosting the service. Once a new instance comes up, the services start on it and run fine for a variable period, anywhere from several days to several weeks. Because this container is a sidecar and not a service instance, I have been unable to find it in CloudWatch or Cloud Map, so I can't find any logs indicating why the sidecar is unhealthy.

asked 9 months ago · 209 views
1 Answer
The issue you're experiencing with the ECS Service Connect container consistently failing its health check and bringing down your entire service is a complex problem that can have multiple causes. Here are some potential reasons and solutions to consider:

  1. Health Check Configuration: Even though you haven't configured health checks for your main container, ECS Service Connect automatically adds its own health check for the sidecar container. This health check might be too strict or not properly configured for your specific use case.

  2. Resource Constraints: The EC2 instance hosting your tasks might be running low on resources, causing the Service Connect container to fail. This could explain why draining the instance and starting on a new one temporarily resolves the issue.

  3. Network Configuration: Since you're using awsvpc network mode, there might be network-related issues affecting the Service Connect container's ability to communicate properly.

  4. Service Connect Configuration: There could be misconfigurations in your Service Connect setup, such as incorrect port mappings or namespace settings.
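One way to narrow down these causes is to look at the stopped tasks and see which container was marked unhealthy and on which container instance. The following is a minimal sketch, not a definitive tool: it assumes you already have the response dictionary from the ECS `describe_tasks` API (for example via boto3), and the function name `unhealthy_containers` is made up for this illustration.

```python
from typing import Any


def unhealthy_containers(describe_tasks_response: dict[str, Any]) -> list[dict[str, Any]]:
    """Collect containers reported as UNHEALTHY in a describe_tasks response.

    Hypothetical helper: it only reads fields that the ECS DescribeTasks
    response documents (tasks, containers, healthStatus, and so on).
    """
    findings = []
    for task in describe_tasks_response.get("tasks", []):
        for container in task.get("containers", []):
            if container.get("healthStatus") == "UNHEALTHY":
                findings.append({
                    "taskArn": task.get("taskArn"),
                    # Lets you check whether failures cluster on one EC2 instance.
                    "containerInstanceArn": task.get("containerInstanceArn"),
                    "container": container.get("name"),
                    "lastStatus": container.get("lastStatus"),
                    "exitCode": container.get("exitCode"),
                    "reason": container.get("reason"),
                    "stoppedReason": task.get("stoppedReason"),
                })
    return findings


# Possible usage, assuming boto3 is installed and "my-cluster" is your cluster name:
#   import boto3
#   ecs = boto3.client("ecs")
#   arns = ecs.list_tasks(cluster="my-cluster", desiredStatus="STOPPED")["taskArns"]
#   if arns:
#       response = ecs.describe_tasks(cluster="my-cluster", tasks=arns)
#       for finding in unhealthy_containers(response):
#           print(finding)
```

If every finding points at the same `containerInstanceArn`, that would be consistent with the instance-level explanation (resources or networking on that host) rather than a Service Connect misconfiguration.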

To troubleshoot and resolve this issue:

  1. Review your Service Connect configuration, ensuring that port mappings, namespaces, and other settings are correct.

  2. Implement custom health checks for your main container. This can help you better understand the overall health of your task and potentially prevent unnecessary restarts.

  3. Monitor the resource utilization of your EC2 instances. Consider using larger instance types or adjusting your auto-scaling settings to ensure sufficient resources are available.

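  4. Enable logging for the Service Connect container. Because ECS adds this sidecar itself, its logs are not covered by the log settings in your task definition; set a logConfiguration in the service's Service Connect configuration so the proxy's output is sent to a CloudWatch log group.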

  5. Use the ECS Exec feature to connect to your tasks and investigate the Service Connect container's status in real-time.

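  6. Note that the Service Connect container is injected and managed by ECS and is not part of your task definition, so its health check interval and timeout are not settings you can tune there. Instead, make sure the task reserves enough CPU and memory for the proxy in addition to what your own container needs.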

  7. Verify that your VPC, security groups, and network ACLs are properly configured to allow the necessary traffic for Service Connect to function correctly.
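For the logging step above, the sketch below shows roughly what adding a log configuration to a Service Connect configuration could look like. It is an illustration under assumptions: the helper name `with_proxy_logging`, the log group name, and the stream prefix are invented for this example, and `existing_config` stands for whatever Service Connect configuration your service already uses. The existing configuration is copied rather than rebuilt, since passing a new Service Connect configuration on update replaces the previous one.

```python
import copy
from typing import Any


def with_proxy_logging(
    existing_config: dict[str, Any],
    log_group: str,
    region: str,
    stream_prefix: str = "service-connect",
) -> dict[str, Any]:
    """Return a copy of a Service Connect configuration with awslogs logging added.

    Hypothetical helper; the input dictionary is left unmodified.
    """
    updated = copy.deepcopy(existing_config)
    updated["logConfiguration"] = {
        "logDriver": "awslogs",
        "options": {
            # Assumed names: the log group should already exist, and the task
            # execution role needs permission to write to it.
            "awslogs-group": log_group,
            "awslogs-region": region,
            "awslogs-stream-prefix": stream_prefix,
        },
    }
    return updated


# Possible usage, assuming boto3 and placeholder cluster/service names:
#   import boto3
#   ecs = boto3.client("ecs")
#   ecs.update_service(
#       cluster="my-cluster",
#       service="my-service",
#       serviceConnectConfiguration=with_proxy_logging(
#           current_config, "/ecs/my-service/service-connect", "us-east-1"
#       ),
#   )
```

Once the proxy's logs are reaching CloudWatch, the next time the sidecar is marked unhealthy there should be output to inspect instead of only the "container health check has failed" message.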

If the issue persists after trying these steps, you may need to reach out to AWS Support for more in-depth assistance, as they can provide specific guidance based on your account and configuration details.

Sources
ECS connect container unhealthy during new deployments to ECS EC2 | AWS re:Post
Troubleshoot service connect issues in Amazon ECS | AWS re:Post
Troubleshoot Amazon ECS service that won't reach steady state | AWS re:Post
Failure management - Container Build Lens

answered 9 months ago
