ECS agent never register instance with loop in userdata script

0

Hi,

I'm trying to launch an EC2 based ECS cluster with the docker rexray plugin installed using a cloudformation adapted from https://aws.amazon.com/blogs/compute/amazon-ecs-and-docker-volume-drivers-amazon-ebs/ .

In the blog post, the user-data script waits for the ecs service to become responseive by curling in loop ecs metada url (http://localhost:51678/v1/metadata) and in my cloudformation script I'm doing exactly the same.

... 
Properties:
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash
          yum install -y aws-cfn-bootstrap
          yum update -y
          /opt/aws/bin/cfn-init -v --stack ${AWS::StackName} --resource ECSInstanceConfiguration --region ${AWS::Region}
          
          /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource ECSScalingGroup --region ${AWS::Region}
          exec 2>>/var/log/ecs-agent-install.log
          set -x
          until curl -s http://localhost:51678/v1/metadata; do sleep 1; done
          # service ecs stop
          # docker plugin install rexray/ebs REXRAY_PREEMPT=true EBS_REGION=${AWS::Region} --grant-all permissions
          # service docker restart
          # service ecs start
      AssociatePublicIpAddress: true
      EbsOptimized: 'true'

When I launch a new stack with my user-data without the loop "until curl" part the ecs agent starts pretty quickly and register the instance with my ecs cluster properly. However by just adding the loop part, the ecs service never becomes responsive and template goes one until it fails by timeout.

I am using an ECS optimized ami (ami-04e333c875fae9d77 region sa-east-1).

What may be preventing the ecs service from starting properly? How should I adapt my script in order to install the docker plugin to add extra EBS for my volumes?

the entire LaunchConfiguration declaration can be seen below.

ECSInstanceConfiguration:
    Type: AWS::AutoScaling::LaunchConfiguration
    Metadata:
      AWS::CloudFormation::Init:
        config:
          packages:
            yum:
              jq: []
          commands:
            01_enable_ecs_cluster:
              command: !Sub | 
                cat <<EOF >> /etc/ecs/ecs.config
                ECS_CLUSTER=${ECSCluster}
                ECS_ENABLE_TASK_IAM_ROLE=true 
                ECS_ENABLE_CONTAINER_METADATA=true 
                ECS_CONTAINER_INSTANCE_PROPAGATE_TAGS_FROM=ec2_instance
                EOF
          files:
            /etc/cfn/cfn-hup.conf:
              content: !Sub |
                [main]
                stack=${AWS::StackId}
                region=${AWS::Region}
              mode: 00400
              owner: root
              group: root
            /etc/cfn/hooks.d/cfn-auto-reloader.conf:
              content: !Sub |
                [cfn-auto-reloader-hook]
                triggers=post.update
                path=Resources.ECSInstanceConfiguration.Metadata.AWS::CloudFormation::Init
                action=/opt/aws/bin/cfn-init -v --stack ${AWS::StackName} --resource ECSInstanceConfiguration --region ${AWS::Region}
                runas=root
          services:
            sysvinit:
              cfn-hup:
                enabled: 'true'
                ensureRunning: 'true'
                files:
                - "/etc/cfn/cfn-hup.conf"
                - "/etc/cfn/hooks.d/cfn-auto-reloader.conf"     
    Properties:
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash
          yum install -y aws-cfn-bootstrap
          yum update -y
          /opt/aws/bin/cfn-init -v --stack ${AWS::StackName} --resource ECSInstanceConfiguration --region ${AWS::Region}
          
          /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource ECSScalingGroup --region ${AWS::Region}
          RET=0
          exec 2>>/var/log/ecs-agent-install.log
          set -x
          until curl -s http://localhost:51678/v1/metadata; do sleep 1; done
          # #docker plugin install --alias cloudstor:aws --grant-all-permissions docker4x/cloudstor:18.03.0-ce-aws1 CLOUD_PLATFORM=AWS AWS_REGION=${AWS::Region} EFS_SUPPORTED=0 DEBUG=1
          # service ecs stop
          # docker plugin install rexray/ebs REXRAY_PREEMPT=true EBS_REGION=${AWS::Region} --grant-all-permissions
          # service docker restart
          # service ecs start
          
      AssociatePublicIpAddress: true
      EbsOptimized: 'true'
      ImageId: !Ref ClusterInstanceImageIdParameter
      InstanceType: !Ref ClusterInstanceTypeParameter
      IamInstanceProfile: !Ref ECSInstanceProfile
      KeyName: !Ref KeyNameParameter
      SecurityGroups:
      - !Ref ECSInstanceSecurityGroup
asked 5 years ago722 views
1 Answer
0

Ok, the issue was that ecs service initialization depends on the finalization of user-data scripts. In order to change this behavior the proper user-data script is for ECS optimized Linux AMI 2 is:

 #!/bin/bash
          yum install -y aws-cfn-bootstrap
          yum update -y
          /opt/aws/bin/cfn-init -v --stack ${AWS::StackName} --resource ECSInstanceConfiguration --region ${AWS::Region}
          
          sed -i '/After=cloud-final.service/d' /usr/lib/systemd/system/ecs.service
          systemctl daemon-reload
          exec 2>>/var/log/ecs-agent-reload.log
          set -x
          until curl -s http://localhost:51678/v1/metadata; do sleep 1; done
          docker plugin install rexray/ebs REXRAY_PREEMPT=true EBS_REGION=${AWS::Region} --grant-all-permissions
          systemctl restart docker
          systemctl restart ecs
          /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource ECSScalingGroup --region ${AWS::Region}
answered 5 years ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions