Unexpected Delay in Alarm State Transition for LoadBalancer HealthyStateRouting

I have an alarm monitoring the LoadBalancer HealthyStateRouting metric. The alarm uses a 1-minute period and is configured to go into the ALARM state on 1 out of 1 datapoints. However, as shown in the screenshot, the state change takes a few minutes instead of occurring immediately.
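For reference, the configuration described above might look roughly like the following set of parameters for boto3's `put_metric_alarm`. This is only a sketch: the alarm name, namespace, dimensions, statistic, threshold value, and comparison operator are not given in the question and are assumptions chosen so that a datapoint of 0 breaches the threshold.

```python
# Hypothetical reconstruction of the alarm described in the question.
# Only the metric name, the 1-minute period, and "1 out of 1" come from
# the question; everything else is an assumption for illustration.
alarm_params = {
    "AlarmName": "healthy-state-routing-alarm",   # hypothetical name
    "Namespace": "AWS/ApplicationELB",            # assumption: Application Load Balancer
    "MetricName": "HealthyStateRouting",
    "Dimensions": [                               # placeholder values
        {"Name": "LoadBalancer", "Value": "<load-balancer-dimension>"},
        {"Name": "TargetGroup", "Value": "<target-group-dimension>"},
    ],
    "Statistic": "Minimum",                       # assumption
    "Period": 60,                                 # 1-minute period
    "EvaluationPeriods": 1,                       # the "out of 1"
    "DatapointsToAlarm": 1,                       # the "1 out of"
    "Threshold": 1.0,                             # assumption: a value of 0 breaches
    "ComparisonOperator": "LessThanThreshold",    # assumption
}

# With credentials configured, the parameters could be applied like this:
# import boto3
# boto3.client("cloudwatch").put_metric_alarm(**alarm_params)
```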

Even though the datapoints consistently report 0 while the system is unhealthy, it takes about 3 minutes for the alarm to transition to the ALARM state. This delay corresponds to three consecutive datapoints of 0 for HealthyStateRouting. By the way, the same delay occurs when transitioning back from ALARM to OK.

I expected the alarm to transition as soon as the first datapoint of 0 is recorded, but the observed delay shows that this is not what happens.

1 Answer
The observed 3-minute delay suggests that CloudWatch might be waiting until at least three valid datapoints are present within its internal evaluation window, even though the alarm is set to "1 out of 1." This may be a safeguard against transient metric reporting issues. In practice, it means that with a 1-minute period the transition is not strictly immediate on the very first bad datapoint; it happens only after a short window confirms that the bad data persists.
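To make the difference concrete, here is a toy model in Python. It is not CloudWatch's actual evaluation algorithm; it only illustrates the hypothesis above by requiring a hypothetical number of consecutive datapoints (`confirm`) before the state changes. The function name, the sample series, and the threshold of 1 are all assumptions.

```python
from typing import List, Sequence


def states_with_confirmation(
    datapoints: Sequence[float], threshold: float, confirm: int
) -> List[str]:
    """Toy model: change state only after `confirm` consecutive datapoints
    disagree with the current state. A datapoint below `threshold` counts
    as breaching."""
    state = "OK"
    run = 0  # consecutive datapoints that disagree with the current state
    result = []
    for value in datapoints:
        candidate = "ALARM" if value < threshold else "OK"
        if candidate != state:
            run += 1
            if run >= confirm:
                state = candidate
                run = 0
        else:
            run = 0
        result.append(state)
    return result


# One datapoint per minute: healthy, then four unhealthy minutes, then healthy.
series = [1, 1, 0, 0, 0, 0, 1, 1, 1]

# What the asker expected from "1 out of 1": react to the first bad datapoint.
expected = states_with_confirmation(series, threshold=1, confirm=1)

# What was observed: the change only after three consecutive datapoints.
observed = states_with_confirmation(series, threshold=1, confirm=3)

print(expected)
print(observed)
```

With `confirm=1` the state flips on the first datapoint of 0, while with `confirm=3` it flips on the third, both when entering ALARM and when returning to OK, which matches the symmetric delay described in the question.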

answered 10 months ago
