ECS Monitoring of services not tasks


I currently have monitoring in place for the running task count of my service. What I'm wondering is how best to monitor the service itself rather than its tasks: I'm trying to build an alarm that tells me when the entire service is down, not just when the task count changes. Then again, if the service is down there would be no tasks running for it, so I may have already solved this? I'm looking for feedback and any insight others might have.

nbates
asked 2 years ago · 666 views
1 Answer
Accepted Answer

The best way, IMHO, is to enable Container Insights, which publishes metrics in the ECS/ContainerInsights namespace, including the number of running, pending, and desired tasks for each service. I strongly recommend combining these values in a single alarm (a composite alarm, or a metric math expression), e.g. comparing running + pending against desired. If you haven't done so already, I also highly recommend configuring a HealthCheck on your containers, to catch tasks that are running but not actually working. You can also publish custom metrics, ideally business related (e.g. sales count, users, etc.). These metrics only get published while the service is healthy, so you can set an alarm to go into ALARM state when there are no data points, for example.
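To make the two alarm ideas above concrete, here is a minimal sketch in Python. It only builds the keyword arguments for CloudWatch's `put_metric_alarm` call, so it runs without AWS credentials. It assumes the Container Insights metrics `RunningTaskCount`, `PendingTaskCount`, and `DesiredTaskCount` with the `ClusterName` and `ServiceName` dimensions. The function names, alarm names, periods, and the choice to treat missing data as breaching are illustrative assumptions, not something taken from this thread.

```python
from typing import Any, Dict

NAMESPACE = "ECS/ContainerInsights"


def _task_metric(query_id: str, metric_name: str, cluster: str,
                 service: str, period: int) -> Dict[str, Any]:
    """Build one metric query that reads a per-service task-count metric."""
    return {
        "Id": query_id,
        "MetricStat": {
            "Metric": {
                "Namespace": NAMESPACE,
                "MetricName": metric_name,
                "Dimensions": [
                    {"Name": "ClusterName", "Value": cluster},
                    {"Name": "ServiceName", "Value": service},
                ],
            },
            "Period": period,
            "Stat": "Average",
        },
        "ReturnData": False,  # only the expression below drives the alarm
    }


def task_gap_alarm(cluster: str, service: str, period: int = 60,
                   evaluation_periods: int = 3) -> Dict[str, Any]:
    """Alarm when running + pending stays below desired (hypothetical helper)."""
    return {
        "AlarmName": f"{cluster}-{service}-task-gap",
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": 0.0,
        "EvaluationPeriods": evaluation_periods,
        # Assumption: no task metrics at all is also treated as a problem.
        "TreatMissingData": "breaching",
        "Metrics": [
            _task_metric("desired", "DesiredTaskCount", cluster, service, period),
            _task_metric("running", "RunningTaskCount", cluster, service, period),
            _task_metric("pending", "PendingTaskCount", cluster, service, period),
            {
                "Id": "gap",
                "Expression": "desired - (running + pending)",
                "Label": "Tasks missing from the service",
                "ReturnData": True,
            },
        ],
    }


def missing_data_alarm(namespace: str, metric_name: str,
                       period: int = 300) -> Dict[str, Any]:
    """Alarm when a custom metric stops reporting (hypothetical helper)."""
    return {
        "AlarmName": f"{metric_name}-no-data",
        "Namespace": namespace,
        "MetricName": metric_name,
        "Statistic": "SampleCount",
        "Period": period,
        "EvaluationPeriods": 1,
        "Threshold": 1.0,
        "ComparisonOperator": "LessThanThreshold",
        # Missing data points count as breaching, so silence raises the alarm.
        "TreatMissingData": "breaching",
    }
```

With boto3 installed and credentials configured, either dictionary could be passed along as `boto3.client("cloudwatch").put_metric_alarm(**task_gap_alarm("my-cluster", "my-service"))`, where the cluster and service names are placeholders.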

answered 2 years ago
  • I have enabled Container Insights. I really like the idea of the HealthChecks on the containers. Right now I have an alarm set for when the desired task count is lower than what we want, which in my experience is kind of flaky for some reason. I'll look into setting healthchecks for the containers themselves. Thank you very much for the answer!

  • Hey @JohnPreston, quick question: are you doing healthchecks in the task definitions, or are you leaning on Route53 to do those checks?

  • @nbates I meant a docker-compose-style healthcheck (not the Dockerfile kind), as described here: https://docs.docker.com/compose/compose-file/compose-file-v3/#healthcheck

    ELBv2 healthchecks should definitely be used as well, where applicable.

  • @JohnPreston OK, I see what you're saying. I just got the ELB and ALB healthchecks in place as well, checking our status codes, so we're not just monitoring that the service is up but also that it's responding correctly. Thank you for the docker-compose healthcheck docs, this will definitely be helpful!
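
For reference, the container-level healthcheck discussed in these comments could look roughly like the sketch below, written as the Python dictionary one might place under `healthCheck` in an ECS container definition. The field names (`command`, `interval`, `timeout`, `retries`, `startPeriod`) are the ones ECS accepts. The helper name, the URL, the timings, and the assumption that `curl` exists in the image are all illustrative.

```python
from typing import Any, Dict


def container_health_check(url: str = "http://localhost:8080/health") -> Dict[str, Any]:
    """Build a healthCheck block for an ECS container definition.

    Hypothetical helper: the default URL and timings are placeholders,
    and the command assumes curl is installed in the container image.
    """
    return {
        # CMD-SHELL runs the string through the container's default shell.
        "command": ["CMD-SHELL", f"curl -f {url} || exit 1"],
        "interval": 30,     # seconds between checks
        "timeout": 5,       # seconds before a single check counts as failed
        "retries": 3,       # consecutive failures before marking unhealthy
        "startPeriod": 60,  # grace period in seconds while the app boots
    }
```

This covers the container itself; the ELB/ALB healthchecks mentioned above are configured separately on the target group.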
