AWS CloudFormation CloudWatch Alarm Issue

I'm creating a CloudFormation (CF) stack for a CloudWatch (CW) alarm on mem_percent_used disk utilization metrics. The alarm worked when I created it manually, but it doesn't work when I create it with CF. I noticed that the AMI ID, device, and fstype are missing from my CF script. How do I supply these values? Also, if I have 50 instances, can I have a single alarm to monitor the disk usage metrics?

asked 2 years ago, 564 views
1 Answer
ami id, device, and fstype are missing in my CF script

Those values are the 'dimensions' of the metric: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch_concepts.html#Dimension

A metric is a unique combination of:

  • Namespace
  • MetricName
  • Dimension(s) (optional)
  • Unit (optional)

"Optional" here means that not every metric has them, but the alarm configuration has to match the metric configuration exactly. So if a metric has no unit and 3 dimensions, the alarm must also have no unit and the exact same 3 dimensions.
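As a rough illustration of supplying the dimensions in a CloudFormation template, here is a minimal sketch of an `AWS::CloudWatch::Alarm` resource. It assumes the metric is the CloudWatch agent's `disk_used_percent` in the `CWAgent` namespace. Every dimension value (instance ID, AMI ID, path, device, fstype), the resource name, and the threshold are placeholders, not values from the question. The safest approach is to copy the exact dimension names and values from the metric as it appears in the CloudWatch console, including any extra ones such as InstanceType if the metric carries them.

```json
{
  "Resources": {
    "DiskUsedPercentAlarm": {
      "Type": "AWS::CloudWatch::Alarm",
      "Properties": {
        "AlarmDescription": "Illustrative sketch: every dimension value below is a placeholder",
        "Namespace": "CWAgent",
        "MetricName": "disk_used_percent",
        "Dimensions": [
          { "Name": "InstanceId", "Value": "i-0123456789abcdef0" },
          { "Name": "ImageId", "Value": "ami-0123456789abcdef0" },
          { "Name": "path", "Value": "/" },
          { "Name": "device", "Value": "nvme0n1p1" },
          { "Name": "fstype", "Value": "xfs" }
        ],
        "Statistic": "Average",
        "Period": 300,
        "EvaluationPeriods": 1,
        "Threshold": 80,
        "ComparisonOperator": "GreaterThanThreshold"
      }
    }
  }
}
```

`Unit` is deliberately left out of this sketch. If you do set it, it has to be the unit the metric is actually published with. If any dimension is missing or has a different value, the alarm points at a metric that does not exist and stays in the insufficient-data state.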

I have 50 instances can I have single alarm to monitor the disk usage metrics?

You can't create a single alarm that tracks all 50 instances individually, but you can add aggregation_dimensions to the CWAgent config. With that in place you'll have 51 metrics, and each instance will push 2 datapoints at a time:

  1. To its unique instance metric (where its instanceID is one of the dimensions)
  2. To a shared metric with the aggregation_dimensions

If these instances are in an ASG you can add something like this to the 'metrics' block of the config file:

        "aggregation_dimensions": [
            [
                "AutoScalingGroupName"
            ]
        ],

https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-common-scenarios.html#CloudWatch-Agent-aggregating-metrics
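For context, here is a sketch of how that fragment might sit inside a complete 'metrics' block. It assumes the agent collects `used_percent` for the root filesystem only and that the instances belong to an Auto Scaling group. The `append_dimensions` entry is what attaches AutoScalingGroupName to each datapoint so that the agent can aggregate on it. Adjust the measurements and resources to whatever your existing config already collects.

```json
{
  "metrics": {
    "append_dimensions": {
      "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
      "InstanceId": "${aws:InstanceId}"
    },
    "aggregation_dimensions": [
      ["AutoScalingGroupName"]
    ],
    "metrics_collected": {
      "disk": {
        "measurement": ["used_percent"],
        "resources": ["/"]
      }
    }
  }
}
```

With a config along these lines, each instance keeps publishing its own per-instance metric and also contributes to one metric whose only dimension is AutoScalingGroupName.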

You can then make an alarm based on the 'min' or 'max' statistic, so that if any individual instance's datapoint within that shared metric is above/below your threshold, the alarm will trigger. There isn't any way to disaggregate the datapoints at that point to see which instance caused it, so you can use something like a search expression or the metrics explorer to search through the 50 individual metrics and figure out which one is high.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/search-expression-syntax.html
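An alarm on the shared metric could look roughly like the sketch below. It assumes the aggregated metric is `disk_used_percent` in the `CWAgent` namespace with AutoScalingGroupName as its only dimension. The group name `my-example-asg`, the resource name, and the threshold are placeholders.

```json
{
  "Resources": {
    "AsgDiskUsedPercentAlarm": {
      "Type": "AWS::CloudWatch::Alarm",
      "Properties": {
        "AlarmDescription": "Illustrative sketch: fires if any instance in the group reports high disk usage",
        "Namespace": "CWAgent",
        "MetricName": "disk_used_percent",
        "Dimensions": [
          { "Name": "AutoScalingGroupName", "Value": "my-example-asg" }
        ],
        "Statistic": "Maximum",
        "Period": 300,
        "EvaluationPeriods": 1,
        "Threshold": 80,
        "ComparisonOperator": "GreaterThanThreshold"
      }
    }
  }
}
```

When this alarm fires, a search expression such as `SEARCH('{CWAgent,InstanceId,device,fstype,path} MetricName="disk_used_percent"', 'Maximum', 300)` can graph the per-instance metrics side by side. The dimension names inside the braces are an assumption here and must list exactly the dimension names your per-instance metrics carry.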

AWS
answered 2 years ago
