Received "ALARM: "FreeSpace90GBAlarm" in US East (Ohio)" but can't tell which RDS instance

Hi re:Post,

This morning we received an alarm that we set up months ago for our RDS PostgreSQL instances: "ALARM: "FreeSpace90GBAlarm" in US East (Ohio)". (By the way, there is no tag available below for "RDS PostgreSQL".)

However, after looking through the email text, the CloudWatch alarm console, and the RDS console, we cannot identify which RDS instance the alarm refers to.

Please advise.

Thank you for your time and help!

Best Regards,

Donald

Email Alarm text:

You are receiving this email because your Amazon CloudWatch Alarm "FreeSpace90GBAlarm" in the US East (Ohio) region has entered the ALARM state, because "Threshold Crossed: 1 out of the last 1 datapoints [2.0710719488E10 (13/10/24 04:52:00)] was less than or equal to the threshold (9.663676416E10) (minimum 1 datapoint for OK -> ALARM transition)." at "Sunday 13 October, 2024 04:53:52 UTC".

View this alarm in the AWS Management Console:
https://us-east-2.console.aws.amazon.com/cloudwatch/deeplink.js?region=us-east-2#alarmsV2:alarm/FreeSpace90GBAlarm

Alarm Details:
- Name:                       FreeSpace90GBAlarm
- Description:                FreeSpace90GB Alarm test to dc email
- State Change:               OK -> ALARM
- Reason for State Change:    Threshold Crossed: 1 out of the last 1 datapoints [2.0710719488E10 (13/10/24 04:52:00)] was less than or equal to the threshold (9.663676416E10) (minimum 1 datapoint for OK -> ALARM transition).
- Timestamp:                  Sunday 13 October, 2024 04:53:52 UTC
- AWS Account:                910286192445
- Alarm Arn:                  arn:aws:cloudwatch:us-east-2:910286192445:alarm:FreeSpace90GBAlarm

Threshold:
- The alarm is in the ALARM state when the metric is LessThanOrEqualToThreshold 9.663676416E10 for at least 1 of the last 1 period(s) of 60 seconds.

Monitored Metric:
- MetricNamespace:                     AWS/RDS
- MetricName:                          FreeStorageSpace
- Dimensions:                         
- Period:                              60 seconds
- Statistic:                           Minimum
- Unit:                                not specified
- TreatMissingData:                    breaching

2 Answers
Accepted Answer

This response was not generated by gen AI. Ha ha.

I have these comments:

  1. To find the metric in the RDS console:
     - Navigate to your instance.
     - Select the Monitoring tab.
     - Check that "Monitoring" is selected in the dropdown.
     - Search for FreeStorageSpace.
     - You should see a graph of FreeStorageSpace.
  2. It looks like you set up the alarm from the CloudWatch console. However, the alarm email says:
- Dimensions:    (none)

The alarm fired, so it is receiving data, but with no Dimensions set it cannot tell you which DB instance crossed the threshold. The dimension of each alarm should refer to one DB instance. Try this:

CloudWatch
All alarms
Create alarm
Select metric
Enter FreeStorageSpace
You should see multiple choices, such as:
RDS > DBInstanceIdentifier
RDS > Across All Databases
RDS > DatabaseClass
RDS > EngineName
TrustedAdvisor > Check Metrics

Click on RDS > DBInstanceIdentifier
Select the DBInstanceIdentifier that you want
Click Select metric
Under the threshold condition, select Lower
Enter 96636764160 (90 GiB expressed in bytes)
Click Next
Select your notification
Click Next
Enter a name
Click Next
Press Create Alarm
  3. Some metrics, including FreeStorageSpace (reported in bytes), are absolute values, not percentages. In that case, the threshold needs to be different for each DB instance. Configuring alarms for a large number of DB instances in the console is impractical because it is tedious and error-prone, and it does not account for new instances. You could automate the configuration with the Python boto3 library: get the list of DB instances and the size of each by calling the RDS describe_db_instances operation.

You could make a call to cloudwatch such as:

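import boto3

cloudwatch = boto3.client('cloudwatch', region_name='us-east-2')

def put_free_space_alarm(db_instance_identifier, threshold):
    cloudwatch.put_metric_alarm(
        # Alarm names must be unique: reusing one name for every instance
        # would overwrite the previous alarm on each call.
        AlarmName=f'FreeSpaceAlarm-{db_instance_identifier}',
        AlarmDescription='FreeSpace90GB Alarm test to dc email',
        MetricName='FreeStorageSpace',
        Namespace='AWS/RDS',
        # The dimension ties the alarm to exactly one DB instance.
        Dimensions=[
            {
                'Name': 'DBInstanceIdentifier',
                'Value': db_instance_identifier
            }
        ],
        Statistic='Minimum',
        Period=60,
        EvaluationPeriods=2,
        Threshold=threshold,  # in bytes, like FreeStorageSpace itself
        ComparisonOperator='LessThanThreshold',
        AlarmActions=[
            'arn:aws:sns:us-east-2:910286192445:some-topic'
        ]
    )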

Here, db_instance_identifier comes from the loop over the DB instances, and threshold is a value you compute from each DB instance's AllocatedStorage.

I implemented this a few years back, but deleted that code on complexity grounds.
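
The loop described above could be sketched roughly as follows. This is a minimal sketch, not the code that was deleted: the helper names (threshold_bytes, alarm_kwargs, create_alarms), the 10% free-space target, and the alarm naming scheme are all assumptions made for illustration. It relies on AllocatedStorage being reported in GiB and FreeStorageSpace in bytes. The boto3 clients are passed in by the caller, so the helpers can be tried out without AWS credentials.

```python
GIB = 1024 ** 3  # AllocatedStorage is in GiB; FreeStorageSpace is in bytes


def threshold_bytes(allocated_storage_gib, free_percent=10):
    """Return the alarm threshold in bytes: free_percent of allocated storage."""
    return allocated_storage_gib * GIB * free_percent // 100


def alarm_kwargs(db_instance_identifier, allocated_storage_gib, topic_arn,
                 free_percent=10):
    """Build put_metric_alarm arguments for one DB instance."""
    return {
        # Hypothetical naming scheme: one uniquely named alarm per instance.
        "AlarmName": f"FreeStorageSpace-{db_instance_identifier}",
        "AlarmDescription": (
            f"Free storage below {free_percent}% on {db_instance_identifier}"
        ),
        "MetricName": "FreeStorageSpace",
        "Namespace": "AWS/RDS",
        "Dimensions": [
            {"Name": "DBInstanceIdentifier", "Value": db_instance_identifier}
        ],
        "Statistic": "Minimum",
        "Period": 60,
        "EvaluationPeriods": 2,
        "Threshold": float(threshold_bytes(allocated_storage_gib, free_percent)),
        "ComparisonOperator": "LessThanThreshold",
        "AlarmActions": [topic_arn],
    }


def create_alarms(rds, cloudwatch, topic_arn, free_percent=10):
    """Create or update one alarm per DB instance; return the alarm names."""
    names = []
    paginator = rds.get_paginator("describe_db_instances")
    for page in paginator.paginate():
        for db in page["DBInstances"]:
            kwargs = alarm_kwargs(
                db["DBInstanceIdentifier"],
                db["AllocatedStorage"],
                topic_arn,
                free_percent,
            )
            cloudwatch.put_metric_alarm(**kwargs)
            names.append(kwargs["AlarmName"])
    return names
```

With boto3 installed and credentials configured, this might be called as create_alarms(boto3.client('rds'), boto3.client('cloudwatch'), topic_arn). Because put_metric_alarm creates or updates an alarm by name, re-running it after new instances appear would add their alarms without duplicating the existing ones.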

  4. RDS has percent-based events instead. You can subscribe to them in the RDS console. The RDS event thresholds are not configurable. The storage threshold starts at 10% remaining, and you get an alert at each 1% step down to 0%.
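
Subscribing to those RDS events can also be scripted rather than done in the console. The following is a minimal sketch, assuming the boto3 RDS client's create_event_subscription call and the "low storage" event category for DB instances. The helper names and the subscription name are hypothetical, and the topic ARN is whichever SNS topic you already use for notifications.

```python
def low_storage_subscription_kwargs(topic_arn, instance_ids=None,
                                    name="rds-low-storage"):
    """Build create_event_subscription arguments for low-storage events."""
    kwargs = {
        "SubscriptionName": name,  # hypothetical subscription name
        "SnsTopicArn": topic_arn,
        "SourceType": "db-instance",
        "EventCategories": ["low storage"],
        "Enabled": True,
    }
    # Leaving out SourceIds subscribes to all DB instances in the Region.
    if instance_ids:
        kwargs["SourceIds"] = list(instance_ids)
    return kwargs


def subscribe_low_storage(rds, topic_arn, instance_ids=None):
    """Create the subscription using a boto3 RDS client supplied by the caller."""
    kwargs = low_storage_subscription_kwargs(topic_arn, instance_ids)
    return rds.create_event_subscription(**kwargs)
```

A call such as subscribe_low_storage(boto3.client('rds'), topic_arn) would cover every instance, including ones created later, which is the main advantage over per-instance alarms.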
answered 2 months ago
  • Thank you very much, Brian, for your in-depth and thorough reply! It makes sense that we can't tell which instance it is, because we did not specify one in the alarm we set up, duh! :) I guess we thought the alarm applied to any instance, but that is not the case. Regarding the Dimensions in your code example (Dimensions=[{'Name': 'DBInstanceIdentifier', 'Value': db_instance_identifier}]): that would only work for one DB instance, correct? Oops, I missed "The dimension of each alarm should refer to one DB instance", so only one, got it. Is there a way to set one alarm that would apply to any DB instance that meets the alarm's thresholds, à la "Across All Databases", or do we have to set up an alarm for each DB instance? (No problem if that is the case.) Thanks again, Donald

I understand your frustration with not being able to identify which specific RDS PostgreSQL instance triggered the "FreeSpace90GBAlarm" in the US East (Ohio) region. This is a common issue when a CloudWatch alarm is set up without a dimension that identifies the instance. Here are some steps you can take to identify the affected instance and improve your alarm setup:

  1. Check the CloudWatch alarm details: In the CloudWatch console, locate the "FreeSpace90GBAlarm" and check its configuration. Look for any dimensions or tags that might help identify the specific RDS instance.

  2. Review your RDS instances: In the RDS console, list all your PostgreSQL instances in the US East (Ohio) region. Look for instances with low free storage space that matches the alarm threshold.

  3. Use CloudWatch Metrics: In the CloudWatch console, go to the Metrics section and find the AWS/RDS namespace. Look for the FreeStorageSpace metric and check which instance has a value close to or below the alarm threshold (96,636,764,160 bytes in this case, which is 90 GiB or about 96.64 GB).

  4. Improve your alarm configuration: To avoid this issue in the future, modify your CloudWatch alarm to include the DBInstanceIdentifier dimension. This will allow you to easily identify which specific RDS instance triggered the alarm.

  5. Use meaningful alarm names: Consider renaming your alarms to include the instance identifier, e.g., "FreeSpace90GB-[InstanceName]Alarm". This will make it immediately clear which instance is affected when you receive an alarm notification.

  6. Leverage tags: If you're not already doing so, use tags on your RDS instances to categorize and identify them easily. You can then include these tags in your CloudWatch alarms for better identification.

  7. Set up CloudWatch Logs: Enable CloudWatch Logs for your RDS PostgreSQL instances. This will allow you to correlate log data with the alarm, potentially providing more context about the issue.
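
The per-instance check in step 3 can also be done programmatically. The sketch below is one possible approach under stated assumptions: it uses the boto3 get_metric_statistics and describe_db_instances calls, the function names and the 15-minute lookback window are hypothetical, and the clients are passed in by the caller.

```python
from datetime import datetime, timedelta, timezone


def latest_free_space(cloudwatch, db_instance_identifier, now=None,
                      lookback_minutes=15):
    """Return the newest Minimum FreeStorageSpace datapoint in bytes, or None."""
    now = now or datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/RDS",
        MetricName="FreeStorageSpace",
        Dimensions=[
            {"Name": "DBInstanceIdentifier", "Value": db_instance_identifier}
        ],
        StartTime=now - timedelta(minutes=lookback_minutes),
        EndTime=now,
        Period=60,
        Statistics=["Minimum"],
    )
    datapoints = response["Datapoints"]
    if not datapoints:
        return None
    # Datapoints are not guaranteed to be ordered, so pick the newest explicitly.
    return max(datapoints, key=lambda d: d["Timestamp"])["Minimum"]


def instances_at_or_below(rds, cloudwatch, threshold, now=None):
    """Return (identifier, free_bytes) pairs at or below threshold, lowest first."""
    hits = []
    for page in rds.get_paginator("describe_db_instances").paginate():
        for db in page["DBInstances"]:
            identifier = db["DBInstanceIdentifier"]
            free = latest_free_space(cloudwatch, identifier, now)
            if free is not None and free <= threshold:
                hits.append((identifier, free))
    return sorted(hits, key=lambda pair: pair[1])
```

Calling instances_at_or_below(boto3.client('rds'), boto3.client('cloudwatch'), 96636764160) would list the instances that currently sit at or under the 90 GiB threshold from the alarm email.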

By implementing these suggestions, you'll be able to more easily identify which RDS instance is triggering alarms in the future, and you'll have a more robust monitoring setup overall.
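
To check whether other alarms have the same gap (steps 1 and 4), existing alarms can be audited for missing dimensions. This is a minimal sketch, assuming the boto3 CloudWatch describe_alarms paginator; the function name is hypothetical and the client is supplied by the caller.

```python
def rds_alarms_without_dimensions(cloudwatch):
    """Return the names of AWS/RDS metric alarms that have no Dimensions set."""
    names = []
    paginator = cloudwatch.get_paginator("describe_alarms")
    for page in paginator.paginate(AlarmTypes=["MetricAlarm"]):
        for alarm in page["MetricAlarms"]:
            # Metric-math alarms carry no top-level Namespace, hence .get().
            is_rds = alarm.get("Namespace") == "AWS/RDS"
            if is_rds and not alarm.get("Dimensions"):
                names.append(alarm["AlarmName"])
    return names
```

Any alarm it returns is watching an aggregate rather than a single DB instance, so its notifications will not name the instance involved.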
Sources
Introducing Amazon CloudWatch Alarm Recommendations | AWS Cloud Operations & Migrations Blog
Security Hub controls for Amazon RDS - AWS Security Hub

answered 2 months ago
  • Thank you for your quick reply. Regarding #2: "Review your RDS instances: In the RDS console, list all your PostgreSQL instances in the US East (Ohio) region. Look for instances with low free storage space that matches the alarm threshold." We cannot find any metric there in the RDS console which lists any storage variable, amount, or any indication that any instance has low free storage space. Where is this metric located in the RDS console? Thanks! Best Regards, Donald
