
How to extend EC2 CloudWatch custom metrics and dashboards to an Auto Scaling Group so new instances automatically install the agent, push CPU/memory/disk metrics, and update dashboards?

1

I have created a Terraform script that launches a new EC2 instance, installs the CloudWatch agent, pushes CPU, memory, and disk metrics, and also creates a custom dashboard for these metrics. Now I would like to extend this setup to the Auto Scaling Group (ASG) level. Specifically, whenever the load increases on the current instance, the ASG should automatically launch a new instance. Every newly launched instance should have the CloudWatch agent installed automatically so it pushes the custom metrics, and the custom dashboard should include it as well.

Asked 2 months ago · 52 views
3 answers
0

  1. Bake the CloudWatch Agent into the Launch Template

An ASG launches every instance from its Launch Template (or legacy Launch Configuration), and the template’s user data runs on each new instance’s first boot. That makes the Launch Template the place to install the agent.

In Terraform, update your launch template’s user_data to include:

#!/bin/bash
yum install -y amazon-cloudwatch-agent
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
  -a fetch-config -m ec2 \
  -c ssm:AmazonCloudWatch-linux \
  -s

The -c ssm:AmazonCloudWatch-linux option pulls the config from Systems Manager (SSM) Parameter Store, so you don’t hardcode JSON in user data. That way, all instances in the ASG stay consistent.

  2. Store the CloudWatch Agent Config in SSM

Put your amazon-cloudwatch-agent.json into SSM Parameter Store. The agent’s configuration wizard can also save its output there, by default under the name AmazonCloudWatch-linux.

Terraform example:

resource "aws_ssm_parameter" "cw_agent_config" {
  name  = "AmazonCloudWatch-linux"
  type  = "String"
  value = file("cw-agent-config.json")
}

This way, when the ASG spins up new instances, each pulls the same config at boot.

  3. IAM Role for Instances

Your ASG instances must have an Instance Profile whose role has these two managed policies attached, so they can fetch the config and push metrics: CloudWatchAgentServerPolicy and AmazonSSMManagedInstanceCore.

  4. Dashboards

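Dashboards don’t auto-replicate per instance. Instead, design them around the ASG name rather than a specific instance ID, either by referencing the AutoScalingGroupName dimension directly or by using a SEARCH expression that matches every instance in the group.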

Example:


{
  "metrics": [
    [ "CWAgent", "mem_used_percent", "AutoScalingGroupName", "my-asg" ]
  ]
}

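That way, when new instances join the group, their data lands in the same ASG-level metric and the dashboard picks it up without changes. This only works if the agent config attaches that dimension, for example through append_dimensions with ${aws:AutoScalingGroupName}; metrics that carry extra dimensions (such as cpu or disk) additionally need aggregation_dimensions set to [["AutoScalingGroupName"]].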
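If you want one line per instance instead of a single ASG-level series, a SEARCH expression is one way to do it. The following is a minimal sketch, not a tested configuration: it assumes the agent config appends both AutoScalingGroupName and InstanceId as dimensions, and the dashboard name, region, and widget layout are placeholders chosen for illustration.

```hcl
# Sketch only. Assumes each instance publishes CWAgent metrics with exactly
# the dimensions AutoScalingGroupName and InstanceId.
resource "aws_cloudwatch_dashboard" "asg_per_instance" {
  dashboard_name = "asg-per-instance" # hypothetical name

  dashboard_body = jsonencode({
    widgets = [
      {
        type   = "metric"
        x      = 0
        y      = 0
        width  = 12
        height = 6
        properties = {
          title  = "Memory used (%) per instance"
          region = "us-east-1" # assumption: replace with your region
          view   = "timeSeries"
          metrics = [
            [
              {
                id = "e1"
                # Matches every instance in the group, including ones
                # launched after the dashboard was created.
                expression = "SEARCH('{CWAgent,AutoScalingGroupName,InstanceId} MetricName=\"mem_used_percent\" AutoScalingGroupName=\"${aws_autoscaling_group.example.name}\"', 'Average', 300)"
              }
            ]
          ]
        }
      }
    ]
  })
}
```

The search is evaluated each time the dashboard loads, so instances that scale in or out appear and disappear without any Terraform change.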

  5. Terraform Wiring

Launch Template + ASG:

resource "aws_launch_template" "example" {
  name_prefix   = "asg-cw-"
  image_id      = data.aws_ami.amazon_linux.id
  instance_type = "t3.micro"
  user_data     = base64encode(file("user_data.sh"))
  iam_instance_profile {
    name = aws_iam_instance_profile.asg_profile.name
  }
}

resource "aws_autoscaling_group" "example" {
  desired_capacity     = 2
  max_size             = 5
  min_size             = 1
  launch_template {
    id      = aws_launch_template.example.id
    version = "$Latest"
  }
  vpc_zone_identifier = ["subnet-123333", "subnet-67687688"] # placeholder subnet IDs
}

Dashboard: use Terraform aws_cloudwatch_dashboard with ASG-based metrics.

Result: Every new instance launched by the ASG auto-installs the CloudWatch agent, pushes CPU/memory/disk metrics, and your dashboards keep showing them without you having to update anything manually.

Answered 2 months ago
  • Tried it, but it’s not working. Can you share an email ID so I can send you the code file?

  • Tried it; not working.

0

Sure, please share the module snippet here.

Answered 2 months ago
  • resource "aws_autoscaling_group" "web" {
      name                = "web-asg"
      vpc_zone_identifier = ["subnet"]
      min_size            = 1
      max_size            = 3
      desired_capacity    = 1
      launch_template {
        id      = aws_launch_template.web.id
        version = "$Latest"
      }
    }

    resource "aws_autoscaling_policy" "up" {
      name                   = "scale-up"
      scaling_adjustment     = 1
      adjustment_type        = "ChangeInCapacity"
      cooldown               = 300
      autoscaling_group_name = aws_autoscaling_group.web.name
    }

    resource "aws_autoscaling_policy" "down" {
      name                   = "scale-down"
      scaling_adjustment     = -1
      adjustment_type        = "ChangeInCapacity"
      cooldown               = 300
      autoscaling_group_name = aws_autoscaling_group.web.name
    }

    resource "aws_cloudwatch_metric_alarm" "cpu_high" {
      alarm_name          = "HighCPU"
      metric_name         = "CPUUtilization"
      namespace           = "AWS/EC2"
      comparison_operator = "GreaterThanThreshold"
      threshold           = 70
      period              = 60
      statistic           = "Average"
      evaluation_periods  = 2
      alarm_actions       = [aws_autoscaling_policy.up.arn]
      dimensions          = { AutoScalingGroupName = aws_autoscaling_group.web.name }
    }

    resource "aws_cloudwatch_metric_alarm" "cpu_low" {
      alarm_name          = "LowCPU"
      metric_name         = "CPUUtilization"
      namespace           = "AWS/EC2"
      comparison_operator = "LessThanThreshold"
      threshold           = 20
      period              = 60
      statistic           = "Average"
      evaluation_periods  = 2
      alarm_actions       = [aws_autoscaling_policy.down.arn]
      dimensions          = { AutoScalingGroupName = aws_autoscaling_group.web.name }
    }

    resource "aws_cloudwatch_dashboard" "main" {
      dashboard_name = "asg-dash"
      dashboard_body = jsonencode({ widgets=[{ type="metric",properties={ metrics=[["AWS/EC2","CPUUtilization","AutoScalingGroupName

  • Not working. Kindly give me a proper solution.

0

How to Extend Properly

  • Create an SSM Parameter with the CloudWatch Agent config (includes CPU, memory, disk, with ASG dimension).
  • Attach CloudWatchAgentServerPolicy + AmazonSSMManagedInstanceCore to the ASG’s EC2 role.
  • Update Launch Template with user_data that installs and starts the agent on boot, pulling config from SSM.
  • Update Dashboard to point to CWAgent namespace metrics (cpu_usage_active, mem_used_percent, disk_used_percent) with AutoScalingGroupName.
  • (Optional) Update Autoscaling Policies: instead of scaling only on raw EC2 CPU, you can scale on memory or disk, because you’ll now have those metrics.
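The optional last step could look roughly like the sketch below, which scales on memory with a target tracking policy. It assumes an ASG resource named aws_autoscaling_group.web and that mem_used_percent is published to the CWAgent namespace with AutoScalingGroupName as its only dimension; the policy name and the 60% target are illustrative values, not recommendations.

```hcl
# Sketch only: target tracking on the agent's memory metric.
resource "aws_autoscaling_policy" "mem_target" {
  name                   = "mem-target-tracking" # hypothetical name
  autoscaling_group_name = aws_autoscaling_group.web.name
  policy_type            = "TargetTrackingScaling"

  target_tracking_configuration {
    target_value = 60 # assumption: keep average memory use near 60%

    customized_metric_specification {
      namespace   = "CWAgent"
      metric_name = "mem_used_percent"
      statistic   = "Average"

      # Must match the metric's dimensions exactly, otherwise the policy
      # sees no data and never scales.
      metric_dimension {
        name  = "AutoScalingGroupName"
        value = aws_autoscaling_group.web.name
      }
    }
  }
}
```

With target tracking, Auto Scaling creates and manages the underlying CloudWatch alarms itself, so separate scale-up and scale-down alarms are not needed for this metric.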

=======================================================================

SSM Parameter for CW Agent config

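resource "aws_ssm_parameter" "cw_agent_config" {
  name = "/my-asg/cwagent-config"
  type = "String"

  # $${...} escapes Terraform interpolation, so the agent receives the literal
  # ${aws:AutoScalingGroupName} placeholder and resolves it on the instance.
  # aggregation_dimensions is added here so that cpu and disk metrics, which
  # carry extra dimensions, are also published at the ASG level.
  value = <<EOT
{
  "metrics": {
    "append_dimensions": {
      "AutoScalingGroupName": "$${aws:AutoScalingGroupName}"
    },
    "aggregation_dimensions": [["AutoScalingGroupName"]],
    "metrics_collected": {
      "cpu":  { "measurement": ["cpu_usage_active"] },
      "mem":  { "measurement": ["mem_used_percent"] },
      "disk": { "measurement": ["used_percent"], "resources": ["*"] }
    }
  }
}
EOT
}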

IAM Role + Policies

resource "aws_iam_role" "ec2_role" {
  name = "asg-ec2-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17",
    Statement = [{
      Effect    = "Allow",
      Action    = "sts:AssumeRole",
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "cw_agent" {
  role       = aws_iam_role.ec2_role.name
  policy_arn = "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy"
}

resource "aws_iam_role_policy_attachment" "ssm" {
  role       = aws_iam_role.ec2_role.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

resource "aws_iam_instance_profile" "ec2_profile" {
  name = "asg-ec2-profile"
  role = aws_iam_role.ec2_role.name
}

Launch Template with UserData to install CW Agent

resource "aws_launch_template" "web" {
  name_prefix   = "web-lt"
  image_id      = "ami-33333" # placeholder AMI ID
  instance_type = "t3.micro"

  iam_instance_profile {
    name = aws_iam_instance_profile.ec2_profile.name
  }

  # Runs on the first boot of every instance the ASG launches.
  # The trailing backslash continues the agent command onto the next line.
  user_data = base64encode(<<-EOF
  #!/bin/bash
  yum install -y amazon-cloudwatch-agent
  /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
  -a fetch-config -m ec2 -c ssm:${aws_ssm_parameter.cw_agent_config.name} -s
  EOF
  )
}

Auto Scaling Group

resource "aws_autoscaling_group" "web" {
  name                = "web-asg"
  vpc_zone_identifier = ["subnet-123333"]
  min_size            = 1
  max_size            = 3
  desired_capacity    = 1

  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }
}

Sample dashboard

# These widgets match metrics whose only dimension is AutoScalingGroupName.
# The agent publishes the disk measurement used_percent as disk_used_percent.
resource "aws_cloudwatch_dashboard" "asg_dashboard" {
  dashboard_name = "asg-dash"
  dashboard_body = jsonencode({
    widgets = [
      {
        type = "metric",
        properties = {
          metrics = [["CWAgent", "cpu_usage_active", "AutoScalingGroupName", aws_autoscaling_group.web.name]],
          region  = "us-east-1", # assumption: replace with your region
          title   = "ASG CPU Usage (%)"
        }
      },
      {
        type = "metric",
        properties = {
          metrics = [["CWAgent", "mem_used_percent", "AutoScalingGroupName", aws_autoscaling_group.web.name]],
          region  = "us-east-1",
          title   = "ASG Memory Usage (%)"
        }
      },
      {
        type = "metric",
        properties = {
          metrics = [["CWAgent", "disk_used_percent", "AutoScalingGroupName", aws_autoscaling_group.web.name]],
          region  = "us-east-1",
          title   = "ASG Disk Usage (%)"
        }
      }
    ]
  })
}

Answered 1 month ago
