ECS Monitoring of services not tasks

I currently monitor the running task count for my service. What is the best way to monitor the service itself rather than its tasks? I'm trying to build an alarm that tells me when the entire service is down, not just when the task count drops. Then again, if the service is down there would be no tasks running for it, so I may have already solved this. I'd appreciate any insight or feedback.

nbates
asked 2 years ago, 724 views
1 Answer
Accepted Answer

The best way, IMHO, is to enable Container Insights, which publishes metrics in the ECS/ContainerInsights namespace, including the number of running, pending, and desired tasks per service. I strongly recommend combining these values in a single alarm (for example with a metric math expression), i.e. comparing running + pending against desired. If you haven't done so already, I also highly recommend setting a HealthCheck on your containers, so you catch tasks that are running but not actually working. You can also publish custom metrics, probably business related (e.g. sales count, users, etc.). These metrics only get published while the service is healthy, so you can set an alarm to go into ALARM state when there are no data points.
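To make the "running + pending against desired" idea concrete, here is a minimal sketch that builds the parameters for a CloudWatch metric math alarm. It assumes Container Insights is enabled and that the metric names `RunningTaskCount`, `PendingTaskCount`, and `DesiredTaskCount` are available with `ClusterName` and `ServiceName` dimensions. The alarm name, period, and evaluation settings are illustrative assumptions, not values from this thread.

```python
from typing import Any, Dict

NAMESPACE = "ECS/ContainerInsights"


def _metric_query(query_id: str, metric_name: str, cluster: str,
                  service: str, period: int = 60) -> Dict[str, Any]:
    """One input metric for the expression; not returned as alarm data."""
    return {
        "Id": query_id,
        "MetricStat": {
            "Metric": {
                "Namespace": NAMESPACE,
                "MetricName": metric_name,
                "Dimensions": [
                    {"Name": "ClusterName", "Value": cluster},
                    {"Name": "ServiceName", "Value": service},
                ],
            },
            "Period": period,
            "Stat": "Average",
        },
        "ReturnData": False,
    }


def build_task_gap_alarm(cluster: str, service: str,
                         evaluation_periods: int = 3) -> Dict[str, Any]:
    """Build put_metric_alarm parameters: alarm when desired > running + pending."""
    return {
        "AlarmName": f"{cluster}-{service}-task-gap",  # hypothetical naming scheme
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": 0.0,
        "EvaluationPeriods": evaluation_periods,
        # Assumption: treat missing data as a problem, so a service that
        # stops reporting entirely also trips the alarm.
        "TreatMissingData": "breaching",
        "Metrics": [
            _metric_query("running", "RunningTaskCount", cluster, service),
            _metric_query("pending", "PendingTaskCount", cluster, service),
            _metric_query("desired", "DesiredTaskCount", cluster, service),
            {
                "Id": "gap",
                "Expression": "desired - (running + pending)",
                "Label": "Tasks missing from service",
                "ReturnData": True,  # the alarm evaluates only this series
            },
        ],
    }


def create_alarm(cluster: str, service: str) -> None:
    """Send the alarm to CloudWatch (requires boto3 and AWS credentials)."""
    import boto3  # imported here so the builder works without boto3 installed

    boto3.client("cloudwatch").put_metric_alarm(
        **build_task_gap_alarm(cluster, service)
    )
```

Calling `create_alarm("my-cluster", "my-service")` (both names are placeholders) would create the alarm. Setting `TreatMissingData` to `breaching` is one way to cover the "entire service is down" case, since a service that stops emitting metrics then alarms as well.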

answered 2 years ago
  • I have enabled Container Insights. I really like the idea of health checks on the containers. Right now I have an alarm set for when the desired task count is lower than what we want, which in my experience is somewhat flaky, though I'm not sure why. I'll look into setting health checks for the containers themselves. Thank you very much for the answer!

  • Hey @JohnPreston, quick question: are you doing health checks in the task definitions, or are you leaning on Route 53 to do those checks?

  • @nbates I meant docker-compose-style healthchecks (not Dockerfile ones), as described here: https://docs.docker.com/compose/compose-file/compose-file-v3/#healthcheck

    ELBv2 healthchecks should definitely be used as well, where applicable.

  • @JohnPreston OK, I see what you're saying. I just got the ELB and ALB health checks in place as well, using our status codes, so we're monitoring not just that the service is up but also that it's responding correctly. Thank you for the docker-compose healthcheck docs, this will definitely be helpful!
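
For reference, the docker-compose-style health check discussed in the comments maps to the `healthCheck` block of a container definition in an ECS task definition. The following is a minimal sketch of building that block. The helper name, the example command, and the `/health` endpoint are hypothetical, and the value ranges reflect my understanding of the ECS limits, so check them against the current documentation.

```python
from typing import Any, Dict


def build_health_check(command: str, interval: int = 30, timeout: int = 5,
                       retries: int = 3, start_period: int = 0) -> Dict[str, Any]:
    """Build a container-level healthCheck block for an ECS task definition."""
    # Assumed ranges (seconds, except retries); verify against the ECS docs.
    limits = {
        "interval": (interval, 5, 300),
        "timeout": (timeout, 2, 60),
        "retries": (retries, 1, 10),
        "start_period": (start_period, 0, 300),
    }
    for name, (value, low, high) in limits.items():
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low}..{high}")
    return {
        # CMD-SHELL runs the command through the container's default shell.
        "command": ["CMD-SHELL", command],
        "interval": interval,
        "timeout": timeout,
        "retries": retries,
        "startPeriod": start_period,
    }


# Hypothetical example: the port and /health path are placeholders.
health_check = build_health_check(
    "curl -f http://localhost:8080/health || exit 1",
    start_period=30,
)
```

The resulting dictionary would go under the `healthCheck` key of one entry in `containerDefinitions`. A container that fails this check is marked unhealthy, which covers the "running but not actually working" case from the accepted answer.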
