
Terraform Auto Scaling Group with create_before_destroy - timeout error


My Terraform configuration with an auto scaling group looks like the code below.

My goal: zero-downtime application updates with only 1 EC2 instance.

Health check shows that the new instance is healthy: everything works as expected when updating the AMI: the launch template is updated, a new auto scaling group is created, the old one is deleted, and the EC2 instance is replaced in the target groups.

Health check shows that the new instance is unhealthy: I wanted to see what happens if the new AMI is broken and the health check reports the EC2 instance as unhealthy. Result: Terraform aborts after a timeout with this error:

timeout while waiting for state to become 'ok' (last state: 'want at least 1 healthy instance(s) registered to Load Balancer, have 0', timeout: 5m0s)

This is still acceptable, but worst of all it leaves behind the newly created ASG and EC2 instance, which I then have to clean up manually. Is there any way, while Terraform is running, to detect that the new EC2 instance is still unhealthy after some time, abort its creation, and destroy it?

resource "aws_launch_template" "foo" {
  name_prefix   = "demo-app-${data.aws_ami.debian.id}"
  ebs_optimized = true
  image_id      = data.aws_ami.debian.id
  instance_type = "t3.micro"

  iam_instance_profile {
    name = aws_iam_instance_profile.test_profile.name
  }

  monitoring {
    enabled = true
  }

  vpc_security_group_ids = [module.sg.ec2_security_group_id]

  tag_specifications {
    resource_type = "instance"

    tags = {
      Name = "test"
    }
  }
  user_data = filebase64("setup_app.sh")

  lifecycle {
    create_before_destroy = true
  }
}


resource "aws_autoscaling_group" "worker" {
  name = "${aws_launch_template.foo.name}-asg-test3"

  min_size                  = 1
  desired_capacity          = 1
  max_size                  = 1
  min_elb_capacity          = 1
  wait_for_capacity_timeout = "5m"
  health_check_type         = "EC2"
  force_delete              = true
  vpc_zone_identifier       = module.vpc.private_subnets
  target_group_arns         = [aws_lb_target_group.main_tg.arn]

  launch_template {
    id      = aws_launch_template.foo.id
    version = "$Latest"
  }

  lifecycle {
    create_before_destroy = true
    prevent_destroy       = false
  }

}

I tried something like this inside the aws_autoscaling_group resource, but it doesn't work:

  provisioner "local-exec" {
    when    = create
    command = "timeout 2m terraform destroy -target=aws_autoscaling_group.worker -auto-approve -force"
  }
Asked 1 year ago · 1306 views
1 Answer

Option 1: You can add timeouts for the create and delete operations to a specific resource in Terraform, like this. Add this block to your auto scaling group and check:

  timeouts {
    create = "60m"
    delete = "2h"
  }
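For clarity, the timeouts block goes inside the resource body itself; a minimal sketch against the asker's configuration (the values are illustrative, and which operations a given resource supports varies by provider version, so check the AWS provider docs):

```hcl
resource "aws_autoscaling_group" "worker" {
  # ... existing arguments unchanged ...

  # Resource-level operation timeouts (illustrative values).
  timeouts {
    create = "60m"
    delete = "2h"
  }
}
```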

Option 2: If Option 1 doesn't work, you could have a Lambda function triggered by the ASG's termination lifecycle hook to clean up the old ASG. The old ASG can be identified from its metadata (for example, its Name tag).

See this tutorial for the steps: https://docs.aws.amazon.com/autoscaling/ec2/userguide/tutorial-lifecycle-hook-lambda.html
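The termination lifecycle hook itself can be declared alongside the ASG in Terraform; a hedged sketch (the hook name and heartbeat value are assumptions, and the Lambda function plus the EventBridge rule from the tutorial above are not shown):

```hcl
# Sketch only: emit a lifecycle event when an instance terminates, which an
# EventBridge rule can route to a cleanup Lambda (per the linked tutorial).
resource "aws_autoscaling_lifecycle_hook" "cleanup" {
  name                   = "cleanup-old-asg"  # assumed name
  autoscaling_group_name = aws_autoscaling_group.worker.name
  lifecycle_transition   = "autoscaling:EC2_INSTANCE_TERMINATING"
  heartbeat_timeout      = 300                # seconds; assumed value
  default_result         = "CONTINUE"         # proceed even if no response
}
```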

Answered 1 year ago
