CloudWatch memory usage alert triggers, but metrics show no corresponding event

0

We set up an alert to trigger when our ECS containers exceed 95% memory usage. On one instance, this alert now triggers multiple times a day, even though the metrics show a fairly stable utilization of around 73%, with no spikes and no missing data.

Here is the data from today (April 24, 2024), with alerts around 17:30 but no corresponding spike in the metric above: Alarm overview 2024-04-24

And here is the view in the metrics, showing all relevant items (memory reserved, memory utilized, and the percentage calculation), but no events around 17:30: Metrics overview 2024-04-24

We need guidance on what is going on here. This alert is currently absolutely useless as it triggers without an actual problem.

Thanks

Asked 25 days ago, 131 views
2 Answers
6

I would like to suggest some steps to resolve this issue:

1. Verify Alert Configuration: Check that the alert threshold is correctly set at 95% and targets the right ECS containers.

2. Confirm Metric Accuracy: Double-check the metric data in CloudWatch Metrics to ensure it accurately reflects memory usage.

3. Review ECS Setup: Check ECS container configurations and investigate for any memory-intensive tasks or issues.

4. Monitor Alarm State Changes: Look into CloudWatch Alarm History for patterns in alarm triggers.

5. Adjust Alert Actions: Review and adjust alert actions if necessary, ensuring they are appropriate.

See the documentation: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/AlarmThatSendsEmail.html#common-features-of-alarms
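For step 4, the alarm history can also be pulled programmatically. The following is a minimal sketch, not a definitive implementation: it assumes boto3 and AWS credentials are available, and it assumes the `HistoryData` JSON carries a `newState` object with `stateValue`, `stateReason`, and `stateReasonData` (including `recentDatapoints` and `threshold`). The function names `summarize_state_changes` and `fetch_state_updates` are hypothetical helpers, and pagination is not handled.

```python
import json
from datetime import datetime, timedelta, timezone


def summarize_state_changes(history_items):
    """Return one dict per transition into ALARM, with the reason CloudWatch recorded."""
    rows = []
    for item in history_items:
        # HistoryData is a JSON string; its inner layout is an assumption here,
        # so every field is read defensively with .get().
        try:
            data = json.loads(item.get("HistoryData", "{}"))
        except (json.JSONDecodeError, TypeError):
            continue
        if not isinstance(data, dict):
            continue
        new_state = data.get("newState", {})
        if new_state.get("stateValue") != "ALARM":
            continue
        reason_data = new_state.get("stateReasonData", {})
        rows.append({
            "timestamp": item.get("Timestamp"),
            "reason": new_state.get("stateReason"),
            "datapoints": reason_data.get("recentDatapoints"),
            "threshold": reason_data.get("threshold"),
        })
    return rows


def fetch_state_updates(alarm_name, hours=24):
    """Fetch recent StateUpdate history items for one alarm (needs AWS credentials)."""
    import boto3  # imported lazily so the parser above works without boto3 installed

    cloudwatch = boto3.client("cloudwatch")
    end = datetime.now(timezone.utc)
    response = cloudwatch.describe_alarm_history(
        AlarmName=alarm_name,
        HistoryItemType="StateUpdate",
        StartDate=end - timedelta(hours=hours),
        EndDate=end,
    )
    return response["AlarmHistoryItems"]
```

Calling `summarize_state_changes(fetch_state_updates("my-memory-alarm"))` (the alarm name is a placeholder) would list the datapoints the alarm actually evaluated, which can then be compared with what the metrics graph shows for the same timestamps.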

Answered 25 days ago
  • I double- and triple-checked the whole setup and it all looks correct. We have the same setup for a number of other ECS instances, and we have this problem only for this one. It might have been temporary, because it hasn't happened since.

1
Accepted Answer

Hello.

How is the "Treat missing data" setting of the CloudWatch alarm configured?
Depending on this setting, the alarm can trigger even when the metrics look normal.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/AlarmThatSendsEmail.html#alarms-and-missing-data

You may also be able to find the reason for the alarm in the CloudWatch alarm history.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/AlarmThatSendsEmail.html#common-features-of-alarms
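To see these settings for a given alarm, something like the following could help. This is only a sketch under assumptions: it relies on boto3 and AWS credentials, the helper names `review_alarm_settings` and `fetch_alarm` are hypothetical, and the notes it prints are hints for comparing the alarm with the metrics graph, not a diagnosis.

```python
def review_alarm_settings(alarm):
    """Return notes on alarm settings that affect missing or sparse data."""
    notes = []

    # "missing" is the documented default when the setting is omitted.
    treat_missing = alarm.get("TreatMissingData", "missing")
    notes.append(f"TreatMissingData={treat_missing}")
    if treat_missing == "breaching":
        notes.append("Missing datapoints count as breaching, so gaps alone can raise the alarm.")

    # Percentile alarms use ExtendedStatistic (for example "p99") instead of Statistic.
    stat = alarm.get("ExtendedStatistic") or alarm.get("Statistic")
    notes.append(f"Statistic={stat}, Period={alarm.get('Period')}s")
    if alarm.get("ExtendedStatistic"):
        low_sample = alarm.get("EvaluateLowSampleCountPercentile", "evaluate")
        notes.append(f"EvaluateLowSampleCountPercentile={low_sample}")
        if low_sample != "ignore":
            notes.append("Percentiles are evaluated even when a period holds few samples.")

    notes.append(
        f"DatapointsToAlarm={alarm.get('DatapointsToAlarm')} "
        f"of EvaluationPeriods={alarm.get('EvaluationPeriods')}"
    )
    return notes


def fetch_alarm(alarm_name):
    """Fetch one metric alarm definition (needs AWS credentials)."""
    import boto3  # imported lazily so the review function works without boto3

    cloudwatch = boto3.client("cloudwatch")
    alarms = cloudwatch.describe_alarms(AlarmNames=[alarm_name])["MetricAlarms"]
    return alarms[0] if alarms else None
```

One thing worth checking with the output is whether the statistic and period of the alarm match the ones selected in the metrics graph. A graph showing a 5-minute average can look flat while the alarm evaluates a different statistic over a shorter period.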

Expert
Answered 25 days ago
  • It is set to treat missing data as missing, but we are evaluating percentiles with low sample counts, so I will take a look at that setting.
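On the low-sample point: as far as I understand the CloudWatch documentation, a percentile is considered "low sample" when a period holds fewer than 10/(1 - p) datapoints for percentiles from 0.5 upward, or fewer than 10/p below 0.5. The sketch below just does that arithmetic. The rule itself is my reading of the docs and should be verified there, and the function names are hypothetical.

```python
from fractions import Fraction


def min_samples_for_percentile(percentile):
    """Samples needed per period before a percentile is no longer 'low sample'.

    `percentile` is given in percent, e.g. 99 for p99 or "99.9" for p99.9.
    Assumed rule: 10 / (1 - p) for p >= 0.5, else 10 / p.
    """
    # Fraction(str(...)) avoids float rounding such as 10 / (1 - 0.99) != 1000.
    p = Fraction(str(percentile)) / 100
    if not 0 < p < 1:
        raise ValueError("percentile must be strictly between 0 and 100")
    if p >= Fraction(1, 2):
        return 10 / (1 - p)
    return 10 / p


def is_low_sample(percentile, sample_count):
    """True if a period with `sample_count` datapoints counts as low sample."""
    return sample_count < min_samples_for_percentile(percentile)
```

For example, `min_samples_for_percentile(99)` gives 1000, so a p99 alarm on a metric reported once a minute would be low sample for any realistic period. In that situation, setting `EvaluateLowSampleCountPercentile` to `ignore` on the alarm is the option to look at.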
