ECS Monitoring of services not tasks
I currently have monitoring in place for the running task count of my service. What I'm wondering is what the best way is to monitor the service itself rather than its tasks: I'm trying to build an alarm that tells me when the entire service is down, not just when the task count drops. Then again, if the service is down there would be no tasks running for it, so I may have already solved this? Just looking for feedback and any insight people have.
The best way, IMHO, is to enable Container Insights, which publishes metrics in the ECS/ContainerInsights namespace, including the number of running, pending, and desired tasks per service. I'd recommend combining these values in a single alarm (a metric math alarm, or a composite alarm built on top of individual ones), e.g. compare running + pending against desired. If you haven't done so already, I also highly recommend setting a HealthCheck on your containers, to avoid tasks that are running but not actually working. You can equally publish custom metrics, probably business related (e.g. sales count, users, etc.). These metrics only get published while the service is healthy, so you can set an alarm to go into the ALARM state when there are no data points, for example.
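To make the "running + pending against desired" idea concrete, here is a minimal sketch of a CloudWatch metric math alarm built from the Container Insights task-count metrics. It is one possible reading of the suggestion above, not the answerer's actual setup: the alarm name, the 60-second period, the three evaluation periods, and the helper names `build_service_down_alarm` and `create_alarm` are all assumptions for illustration. Treating missing data as breaching is also an assumption, chosen so the alarm fires if the metrics stop arriving altogether.

```python
def build_service_down_alarm(cluster_name, service_name):
    """Build keyword arguments for CloudWatch put_metric_alarm.

    The alarm fires when desired - (running + pending) stays above zero,
    i.e. the service has fewer tasks running or starting than it wants.
    """
    dimensions = [
        {"Name": "ClusterName", "Value": cluster_name},
        {"Name": "ServiceName", "Value": service_name},
    ]

    def task_metric(metric_id, metric_name):
        # One input metric from the Container Insights namespace.
        return {
            "Id": metric_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "ECS/ContainerInsights",
                    "MetricName": metric_name,
                    "Dimensions": dimensions,
                },
                "Period": 60,  # assumed period, in seconds
                "Stat": "Average",
            },
            "ReturnData": False,  # only the expression is evaluated
        }

    return {
        # Hypothetical naming scheme, adjust to your own conventions.
        "AlarmName": f"{cluster_name}-{service_name}-tasks-missing",
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": 0,
        "EvaluationPeriods": 3,   # assumed: three consecutive minutes
        "DatapointsToAlarm": 3,
        # Assumption: no data at all is treated as a problem.
        "TreatMissingData": "breaching",
        "Metrics": [
            task_metric("desired", "DesiredTaskCount"),
            task_metric("running", "RunningTaskCount"),
            task_metric("pending", "PendingTaskCount"),
            {
                "Id": "missing",
                "Expression": "desired - (running + pending)",
                "Label": "Tasks missing",
                "ReturnData": True,
            },
        ],
    }


def create_alarm(cluster_name, service_name):
    """Create the alarm; needs boto3 and AWS credentials at call time."""
    import boto3  # imported lazily so the builder works without boto3

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(
        **build_service_down_alarm(cluster_name, service_name)
    )
```

Calling `create_alarm("my-cluster", "my-service")` (both names are placeholders) would register the alarm. Requiring several consecutive breaching datapoints may help with flakiness during deployments, when running tasks briefly lag behind the desired count.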
I have enabled Container Insights, and I really like the idea of health checks on the containers. Right now I have an alarm set for when the desired task count is lower than what we want, which in my experience is kind of flaky for some reason. I'll look into setting health checks for the containers themselves. Thank you very much for the answer!
Hey @JohnPreston, quick question: are you doing health checks in the task definitions, or are you leaning on Route 53 to do those checks?
@nbates I meant a docker-compose-style healthcheck defined on the container (not a Dockerfile HEALTHCHECK), as described here: https://docs.docker.com/compose/compose-file/compose-file-v3/#healthcheck
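In an ECS task definition, the equivalent of that docker-compose block is the `healthCheck` field of a container definition. The sketch below builds such a definition as a plain Python dict; the helper name `build_container_definition`, the `curl` command, and the timing values are assumptions for illustration, and the command only works if `curl` is present in the container image.

```python
def build_container_definition(name, image, health_url):
    """Build an ECS container definition with a container health check.

    The resulting dict can be used as one entry of containerDefinitions
    when registering a task definition.
    """
    return {
        "name": name,
        "image": image,
        "essential": True,
        "healthCheck": {
            # CMD-SHELL runs the string with the container's default shell.
            # Assumes curl is installed in the image.
            "command": ["CMD-SHELL", f"curl -f {health_url} || exit 1"],
            "interval": 30,     # seconds between checks (assumed value)
            "timeout": 5,       # seconds before a check counts as failed
            "retries": 3,       # failures before the container is unhealthy
            "startPeriod": 60,  # grace period while the app boots
        },
    }
```

When an essential container is reported unhealthy, ECS marks the task unhealthy and the service scheduler replaces it, so the task counts reflect working tasks rather than merely started ones.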
ELBv2 healthchecks should definitely be used as well, where applicable.
@JohnPreston OK, I see what you're saying. I just got the ELB and ALB health checks in place as well, matching on our status codes, so we're not just monitoring that the service is up but also that it's responding correctly. Thank you for the docker-compose healthcheck docs, they will definitely be helpful!