
EC2 Namespace, EBS/disk metrics stopped reporting on CloudWatch | June 21, 2025 9:30 PM GMT


EBSByteBalance%, IO balance %, and disk read/write IO & bytes dropped to 0%, and all AWS/EC2 EBS metrics stopped at around 9:30 PM GMT on June 21 for a t2.micro EC2 instance (Namespace: AWS/EC2 on CloudWatch). This happened without any significant process running - it started with EBS byte balance % steadily depleting to 0 (which had never happened before), leading to the instance crashing and becoming unresponsive to SSH login; upon reboot these metrics disappeared altogether. The EBS mounts are attached and working perfectly - no functional loss whatsoever. I have tried fetching the EC2<>EBS metrics via the CloudWatch API/CLI/Dashboard - they are not visible anywhere, even after rebooting the instance multiple times. The metrics are still not visible as of 23 June EOD. Please confirm whether an AWS backend operation/scheduled maintenance affected this, or whether these metrics are deprecated for t2.micro instances? If not, how do I fix this for the instance? Please help!

1 Answer

Hello @vibster-tradex

[Step 1]: Check AWS Service Health Dashboard

The first step is to check the AWS Service Health Dashboard for any reported outages or maintenance activities that might have affected CloudWatch metrics collection. You can do this from the AWS Management Console by navigating to the Health Dashboard. Look for any events around June 21, 2025, 9:30 PM GMT related to CloudWatch, EC2, or EBS in the region where the EC2 instance is located. A known issue there would explain the missing metrics.
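The Health Dashboard can also be queried programmatically via `aws health describe-events` (note this API requires a Business or Enterprise Support plan). As a minimal stdlib-only sketch, the following builds the `--filter` JSON for the incident window; the region `us-east-1` and the exact time window are illustrative assumptions - substitute your instance's region:

```python
import json
from datetime import datetime, timezone

def build_health_filter(services, start, end, regions):
    """Build the --filter JSON for `aws health describe-events`,
    looking for events affecting the given services in the given
    time window. All concrete values passed in are illustrative."""
    return {
        "services": services,
        "regions": regions,
        "startTimes": [{
            "from": start.isoformat(),
            "to": end.isoformat(),
        }],
    }

# Incident window reported above: June 21, 2025, ~9:30 PM GMT.
start = datetime(2025, 6, 21, 20, 0, tzinfo=timezone.utc)
end = datetime(2025, 6, 21, 23, 0, tzinfo=timezone.utc)
flt = build_health_filter(["CLOUDWATCH", "EC2", "EBS"], start, end, ["us-east-1"])
print(json.dumps(flt))
# Usage: aws health describe-events --filter "$(python3 build_filter.py)"
```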

[Step 2]: Verify CloudWatch Agent Configuration

Connect to the EC2 instance via SSH (if possible) and check the status of the CloudWatch agent.

Check Agent Status:

```shell
sudo systemctl status amazon-cloudwatch-agent
```

If the agent is not running, start it:

```shell
sudo systemctl start amazon-cloudwatch-agent
```

If the agent fails to start, proceed to the next steps to examine the configuration and logs.
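One caveat worth noting: metrics in the AWS/EC2 namespace are published by AWS itself (the CloudWatch agent publishes custom metrics, to the CWAgent namespace by default), and metrics with no new datapoints for roughly two weeks stop appearing in the console and in `list-metrics` even though older datapoints are retained (up to 15 months) and remain retrievable via `get-metric-data`. So it is worth checking whether the historical datapoints still exist. A stdlib-only sketch that builds a MetricDataQueries payload for `aws cloudwatch get-metric-data` - the instance ID is a hypothetical placeholder:

```python
import json

def ebs_read_bytes_query(instance_id):
    """Build a MetricDataQueries entry for the EBSReadBytes metric
    of one instance, matching the AWS/EC2 namespace reported in the
    question. The query id and instance ID are illustrative."""
    return [{
        "Id": "ebs_read_bytes",
        "MetricStat": {
            "Metric": {
                "Namespace": "AWS/EC2",
                "MetricName": "EBSReadBytes",
                "Dimensions": [
                    {"Name": "InstanceId", "Value": instance_id},
                ],
            },
            "Period": 300,
            "Stat": "Sum",
        },
    }]

queries = ebs_read_bytes_query("i-0123456789abcdef0")  # hypothetical ID
print(json.dumps(queries))
# Usage (write the output to queries.json first):
# aws cloudwatch get-metric-data --metric-data-queries file://queries.json \
#   --start-time 2025-06-21T00:00:00Z --end-time 2025-06-22T00:00:00Z
```

If this returns datapoints up to the crash, the data was collected but publication stopped; if it returns nothing at all for earlier dates, the problem is on the query side (region, dimensions, or permissions).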

Check Agent Configuration File:

The CloudWatch agent configuration file is typically located at /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json. Examine this file to ensure it is correctly configured to collect the desired metrics. Pay close attention to the metrics_collected section and ensure that disk I/O metrics are included. A minimal configuration might look like this:

```json
{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "cwagent"
  },
  "metrics": {
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}"
    },
    "metrics_collected": {
      "disk": {
        "measurement": ["disk_free", "disk_total", "disk_used", "disk_used_percent"],
        "metrics_collection_interval": 60,
        "resources": ["*"]
      },
      "diskio": {
        "measurement": ["io_time", "read_bytes", "write_bytes", "reads", "writes"],
        "metrics_collection_interval": 60,
        "resources": ["*"]
      }
    }
  }
}
```

Reload Agent Configuration:

After modifying the configuration file, reload the agent:

```shell
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json
```

[Step 3]: Check IAM Role Permissions

Verify that the IAM role associated with the EC2 instance has the necessary permissions to write metrics to CloudWatch. The role should include the CloudWatchAgentServerPolicy or a custom policy with equivalent permissions. Specifically, the policy should allow the cloudwatch:PutMetricData action.

Example IAM Policy:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "cloudwatch:PutMetricData",
        "cloudwatch:GetMetricData",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:ListMetrics",
        "cloudwatch:DescribeAlarms"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeTags"
      ],
      "Resource": "*"
    }
  ]
}
```

[Step 4]: Review CloudWatch Agent Logs

Examine the CloudWatch agent logs for any errors or warnings. The logs are typically located at /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log. Look for any messages indicating problems with metrics collection or delivery. Common issues include permission errors, configuration errors, or network connectivity problems.
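To surface problems quickly, the snippet below filters log lines at error or warning level. The agent's log lines carry a level marker such as I!, W!, or E! after the timestamp; the sample lines here are fabricated for illustration - in practice, read /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log:

```python
def agent_log_problems(lines):
    """Return the log lines flagged at error (E!) or warning (W!)
    level, based on the level marker the agent writes after the
    timestamp on each line."""
    return [ln for ln in lines if " E! " in ln or " W! " in ln]

# Fabricated sample lines, for illustration only:
sample = [
    "2025-06-22T10:00:01Z I! Starting AmazonCloudWatchAgent",
    "2025-06-22T10:00:02Z E! [outputs.cloudwatch] AccessDenied when calling PutMetricData",
]
problems = agent_log_problems(sample)
print("\n".join(problems))
```

An AccessDenied line like the sample one would point back to the IAM role check in Step 3.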

If this answer is helpful, please click Accept Answer & UPVOTE - this can benefit other community members.

answered 7 months ago

