Trying to build out alarm automation and running into a snag on storage alarms.

I have the CloudWatch agent installed on an EC2 instance for testing. Here is my config:

{
    "agent": {
        "metrics_collection_interval": 60,
        "run_as_user": "cwagent"
    },
    "metrics": {
        "append_dimensions": {
            "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
            "ImageId": "${aws:ImageId}",
            "InstanceId": "${aws:InstanceId}",
            "InstanceType": "${aws:InstanceType}"
        },
        "aggregation_dimensions" : [["InstanceId","path"]],
        "metrics_collected": {
            "disk": {
                "measurement": [
                    "used_percent"
                ],
                "metrics_collection_interval": 60,
                "resources": [
                    "*"
                ],
                "ignore_file_system_types": [
                    "sysfs", "devtmpfs", "tmpfs", "overlay", "debugfs", "squashfs", "iso9660", "proc", "autofs", "tracefs"
                ],
                "drop_device": true
            },
            "mem": {
                "measurement": [
                    "mem_used_percent"
                ],
                "metrics_collection_interval": 60
            }
        }
    }
}

I'm trying to get the CloudWatch agent to send disk stats (which it does), but when I create an alarm with my Lambda function, the resulting alarms don't receive any disk data.

My function project is written in TypeScript. The function that creates the storage alarm depends on two other functions: one filters through tags to set alarm property values, and the other creates or updates alarms. I won't include the tag-filtering function because it works as expected and does not create the alarm. I will, however, provide the storage alarm function (manageStorageAlarmForInstance) and the function that takes the parameters passed from it and actually creates the alarms.

// Function to create or update alarms:
async function createOrUpdateAlarm(
  alarmName: string,
  instanceId: string,
  props: AlarmProps
) {
  try {
    await cloudWatchClient.send(
      new PutMetricAlarmCommand({
        AlarmName: alarmName,
        ComparisonOperator: 'GreaterThanThreshold',
        EvaluationPeriods: props.evaluationPeriods,
        MetricName: props.metricName,
        Namespace: props.namespace,
        Period: props.period,
        Statistic: 'Average',
        Threshold: props.threshold,
        ActionsEnabled: false,
        Dimensions: props.dimensions,
      })
    );
    log
      .info()
      .str('alarmName', alarmName)
      .str('instanceId', instanceId)
      .num('threshold', props.threshold)
      .num('period', props.period)
      .num('evaluationPeriods', props.evaluationPeriods)
      .msg('Alarm configured');
  } catch (e) {
    log
      .error()
      .err(e)
      .str('alarmName', alarmName)
      .str('instanceId', instanceId)
      .msg('Failed to create or update alarm due to an error');
  }
}

// Function to create storage monitoring alarms:
async function manageStorageAlarmForInstance(
  instanceId: string,
  instanceType: string,
  imageId: string,
  tags: Tag,
  type: AlarmClassification
): Promise<void> {
  const baseAlarmName = `autoAlarm-EC2-${instanceId}-${type}StorageUtilization`;
  const thresholdKey = `autoalarm:storage-free-percent-${type.toLowerCase()}`;
  const durationTimeKey = 'autoalarm:storage-percent-duration-time';
  const durationPeriodsKey = 'autoalarm:storage-percent-duration-periods';
  const defaultThreshold = type === 'Critical' ? 10 : 20;

  const alarmProps: AlarmProps = {
    threshold: defaultThreshold,
    period: 60,
    namespace: 'disk',
    evaluationPeriods: 5,
    metricName: 'used_percent',
    dimensions: [
      {Name: 'InstanceId', Value: instanceId},
      {Name: 'ImageId', Value: imageId},
      {Name: 'InstanceType', Value: instanceType},
      {Name: 'Path', Value: '/'},
    ],
  };

  try {
    configureAlarmPropsFromTags(
      alarmProps,
      tags,
      thresholdKey,
      durationTimeKey,
      durationPeriodsKey
    );
  } catch (e) {
    log.error().err(e).msg('Error configuring alarm props from tags');
    throw new Error('Error configuring alarm props from tags');
  }
  // Check whether the alarm already exists.
  const alarmExists = await doesAlarmExist(baseAlarmName);
  // needsUpdate compares the alarm props against the current alarm values.
  if (
    !alarmExists ||
    (alarmExists && (await needsUpdate(baseAlarmName, alarmProps)))
  ) {
    await createOrUpdateAlarm(baseAlarmName, instanceId, alarmProps);
    log
      .info()
      .str('alarmName', baseAlarmName)
      .str('instanceId', instanceId)
      .msg('Storage usage alarm configured or updated.');
  } else {
    log
      .info()
      .str('alarmName', baseAlarmName)
      .str('instanceId', instanceId)
      .msg('Storage usage alarm is already up-to-date');
  }
}

Any ideas on what's wrong with the way I'm creating my storage alarms? Why won't those alarms receive data from the CloudWatch agent?

Thanks in advance for the assist.

asked a month ago · 153 views
2 Answers

Hello.

I think you should first check whether the metrics themselves are being published, before looking at the alarm.
Are the target CloudWatch metrics being output?
If the metrics exist, there may be a problem with your Lambda code.
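
One way to narrow this down: CloudWatch treats every distinct combination of namespace, metric name, and dimension set as a separate metric, and dimension names and values are case-sensitive. An alarm whose dimensions are only a subset of a published set, or that differ in case, points at a metric with no data. The sketch below is a minimal, hypothetical helper (`dimensionKey` and `matchesPublishedMetric` are illustration names, not part of any AWS SDK) that checks an alarm's dimensions against dimension sets you have already looked up, for example in the console or with `ListMetricsCommand` from `@aws-sdk/client-cloudwatch`.

```typescript
// Hypothetical helper type; only the {Name, Value} shape mirrors the CloudWatch API.
interface Dimension {
  Name: string;
  Value: string;
}

// Build an order-independent key for a dimension set.
// Names and values are compared exactly, including case.
function dimensionKey(dims: Dimension[]): string {
  return dims
    .map((d) => `${d.Name}=${d.Value}`)
    .sort()
    .join('|');
}

// True only if the alarm's dimension set is exactly one of the published sets.
function matchesPublishedMetric(
  alarmDims: Dimension[],
  publishedSets: Dimension[][]
): boolean {
  const wanted = dimensionKey(alarmDims);
  return publishedSets.some((set) => dimensionKey(set) === wanted);
}

// Example with made-up values: one published set, as an aggregation on
// InstanceId and path would produce.
const examplePublished: Dimension[][] = [
  [
    {Name: 'InstanceId', Value: 'i-0123456789abcdef0'},
    {Name: 'path', Value: '/'},
  ],
];

// 'Path' (capital P) is a different dimension name from 'path'.
const exampleAlarmDims: Dimension[] = [
  {Name: 'InstanceId', Value: 'i-0123456789abcdef0'},
  {Name: 'Path', Value: '/'},
];

const exampleResult = matchesPublishedMetric(exampleAlarmDims, examplePublished);
```

If a check like this fails for the dimensions your Lambda passes to the alarm, the alarm is watching a metric that the agent never publishes.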

EXPERT
answered a month ago
EXPERT
reviewed a month ago
  • You are correct, the metrics do exist. If I go into CloudWatch and start creating an alarm from scratch, I can see every mount point for the EC2 instance with the CloudWatch agent installed, each reporting its own storage stats. I'm just trying to figure out how to configure the function in my Lambda correctly to do the same.
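
For comparison, here is a hedged sketch of alarm properties that line up with the agent config in the question. It rests on these points: the config sets no `namespace`, so the agent publishes to its default namespace `CWAgent`; the disk measurement `used_percent` is published under the metric name `disk_used_percent`; the disk dimension is `path` in lowercase; and `aggregation_dimensions` of `[["InstanceId","path"]]` produces a rollup metric with exactly those two dimensions. The `AlarmProps` shape is inferred from the fields used in the question, and `buildStorageAlarmProps` is a hypothetical name. Confirm the exact names against what the console shows for your instance before relying on this.

```typescript
// Shapes inferred from the question's code; treat them as assumptions.
interface Dimension {
  Name: string;
  Value: string;
}

interface AlarmProps {
  threshold: number;
  period: number;
  namespace: string;
  evaluationPeriods: number;
  metricName: string;
  dimensions: Dimension[];
}

// Hypothetical builder: targets the rollup metric created by
// "aggregation_dimensions": [["InstanceId", "path"]].
function buildStorageAlarmProps(
  instanceId: string,
  path: string,
  threshold: number
): AlarmProps {
  return {
    threshold,
    period: 60,
    namespace: 'CWAgent', // agent default when the config sets no namespace
    evaluationPeriods: 5,
    metricName: 'disk_used_percent', // published name of "used_percent"
    dimensions: [
      {Name: 'InstanceId', Value: instanceId},
      {Name: 'path', Value: path}, // lowercase, as the agent emits it
    ],
  };
}
```

Targeting the two-dimension rollup avoids having to reproduce every appended dimension (ImageId, InstanceType, fstype, and so on) exactly. Also note that the metric measures used space, so a tag expressing a free-space percentage would likely need converting (for example, 10% free corresponds to 90% used) before it is used with `GreaterThanThreshold`.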

Have you looked at this blog post? It describes a ready-made solution; just read the instructions very carefully: https://aws.amazon.com/blogs/mt/use-tags-to-create-and-maintain-amazon-cloudwatch-alarms-for-amazon-ec2-instances-part-1/

njoylif
answered 23 days ago
