CloudWatch Alarm configuration


SCENARIO: I have a CloudWatch alarm whose action publishes to an SNS topic. The alarm's metric comes from a metric filter that matches CRITICAL events in a Lambda log group. The Lambda (invoked every 15 minutes) checks for CloudFormation stacks in 'error' states and logs a CRITICAL event for each stack in an error state.

      Logs::MetricFilter
      FilterPattern: '{$.level="CRITICAL"}'
      MetricValue: 1

      CloudWatch::Alarm
      AlarmActions: Send to SNS Topic
      Period: 600
      TreatMissingData: notBreaching
      ComparisonOperator: GreaterThanOrEqualToThreshold
      Threshold: 1
      EvaluationPeriods: 1
      Statistic: Maximum

The CloudWatch alarm works as expected when one stack is in the error state:

  • Picks the CRITICAL event
  • ALARM changes state to 'In Alarm'
  • SNS Topic triggered

CHALLENGE: If another stack goes into error (say, 15 minutes later) while the initial stack is still in error, the alarm doesn't act on it, i.e. it doesn't trigger the SNS topic. I understand from research that this is normal behavior because "If your metric value is still in breach of your threshold, the alarm will remain in the ALARM state until it no longer breaches the threshold."
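The behavior described above can be illustrated with a small simulation. This is a deliberately simplified model, not CloudWatch's actual evaluation logic: it assumes one datapoint per period, a threshold of 1 with GreaterThanOrEqualToThreshold, and that the action fires only on a transition into ALARM. The function name `simulate_alarm` is hypothetical.

```python
def simulate_alarm(datapoints, threshold=1):
    """Return the period indexes at which the alarm action would fire.

    Simplified model: the action fires only when the state changes
    from something else into ALARM, never while it stays in ALARM.
    """
    state = "OK"
    fired = []
    for index, value in enumerate(datapoints):
        new_state = "ALARM" if value >= threshold else "OK"
        if new_state == "ALARM" and state != "ALARM":
            fired.append(index)
        state = new_state
    return fired


# Stack A fails at period 1 and stays failed; stack B fails at period 2.
# Only the first breach is a state transition, so only one action fires.
print(simulate_alarm([0, 1, 1, 1]))        # [1]
# A non-breaching period in between makes the next breach a new transition.
print(simulate_alarm([0, 1, 1, 0, 1]))     # [1, 4]
```

In this model, a second stack failing while the first is still failing adds no new transition, which matches the observation that the SNS topic is not triggered again.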

I have also tested and confirmed this: I used boto3's set_alarm_state to set the alarm back to OK and invoked the Lambda manually; the alarm state changed back to 'In Alarm' and the SNS topic was triggered.
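For reference, the manual reset used in that test might look like the sketch below. The helper name `reset_alarm_to_ok` and the alarm name in the usage comment are hypothetical; `set_alarm_state` is the boto3 CloudWatch client call, which AWS documents as a temporary override intended for testing, so this is a test aid rather than a recommended fix.

```python
def reset_alarm_to_ok(cloudwatch, alarm_name):
    """Force the alarm to OK so the next breaching datapoint is a new transition.

    `cloudwatch` is expected to be a boto3 CloudWatch client.
    """
    cloudwatch.set_alarm_state(
        AlarmName=alarm_name,
        StateValue="OK",
        StateReason="Manual reset so the next CRITICAL event re-triggers the action",
    )


# Usage (assumes boto3 is installed and AWS credentials are configured;
# "stack-error-alarm" is a placeholder alarm name):
#   import boto3
#   reset_alarm_to_ok(boto3.client("cloudwatch"), "stack-error-alarm")
```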

QUESTION: Is there any other suitable configuration or logic I can use to trigger the SNS topic for every stack in the error state?

1 Answer
Accepted Answer

You could replace Lambda and CloudWatch with CloudFormation notifications to EventBridge. See: Using CloudFormation events to build custom workflows for post provisioning management.
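A minimal sketch of that approach with boto3, under stated assumptions: the rule name, target ID, and the list of statuses treated as "error" are placeholders to adjust, and the helper `create_stack_error_rule` is hypothetical. The event pattern follows the "CloudFormation Stack Status Change" event shape (source `aws.cloudformation`, status under `detail.status-details.status`). Because each stack's status change arrives as its own event, each failing stack would produce its own notification.

```python
import json

# Assumed set of statuses treated as "error"; adjust to your own definition.
ERROR_STATUSES = [
    "CREATE_FAILED",
    "ROLLBACK_FAILED",
    "UPDATE_FAILED",
    "UPDATE_ROLLBACK_FAILED",
    "DELETE_FAILED",
]

# Matches one event per stack status change into any of the statuses above.
EVENT_PATTERN = {
    "source": ["aws.cloudformation"],
    "detail-type": ["CloudFormation Stack Status Change"],
    "detail": {"status-details": {"status": ERROR_STATUSES}},
}


def create_stack_error_rule(events, sns_topic_arn, rule_name="cfn-stack-error-rule"):
    """Create an EventBridge rule that forwards stack error events to SNS.

    `events` is expected to be a boto3 EventBridge ("events") client.
    """
    events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(EVENT_PATTERN),
        State="ENABLED",
    )
    events.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "sns-topic", "Arn": sns_topic_arn}],
    )
```

The SNS topic's access policy would also need to allow `events.amazonaws.com` to publish to it; that step is not shown here.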

kentrad (AWS, Expert)
answered a year ago
  • This looks like a very viable solution. Thank you.
