
Windows EC2 instance doesn't auto-register to ECS cluster

Hi, I am trying to spin up an ECS cluster to run a Docker image hosted on AWS ECR. There is no apparent error other than the service event "service <xyz> was unable to place a task because no container instance met all of its requirements. Reason: No Container Instances were found in your cluster."

I also ran the AWSSupport-TroubleshootECSContainerInstance runbook in AWS Systems Manager, which did not find any issue.

A few things that I double-checked:

  • The AMI being used is an ECS-optimized AMI (ami-05b458f59b6df9a7a & ami-01fed81ccabd6c52a)
  • The subnet has access to the internet
  • The EC2 instance role has the required managed policy, AmazonEC2ContainerServiceforEC2Role, to communicate with the ECS service
  • The EC2 user data has the required code to register (see YAML below) and is populated with the correct cluster name when the instance is Running.

Results of deploying the YAML:

  • The cluster and its service get created
  • The ECS service is in the ACTIVE state and the deployment stays in the 'in progress' status
  • The Auto Scaling group gets created
  • The EC2 instance is created and in the Running state (no ECS-related entries in the system log, though)

A few oddities I noticed that might hint at what is wrong with the setup:

  • The stack gets stuck at ECSService CONSISTENCY_CHECK and takes at least an hour to finish (possibly stuck in a loop, probably retrying the task again and again).
  • The ECS service has the event "service <xyz> was unable to place a task because no container instance met all of its requirements. Reason: No Container Instances were found in your cluster."
  • The ECS service health and metrics tab shows "x Failed task / 0 Completed tasks", where x increases every few minutes
  • The ECS service deployment status stays 'in progress'
  • ECS service: the container and capacity tables are empty
  • RDP to the EC2 instance: there is no ProgramData folder on the C drive, and no service that looks like ECS is running.

YAML

AWSTemplateFormatVersion: '2010-09-09'
Description: Deploys Docker image on ECS EC2 (Windows Server 2022).
Parameters:
  ProjectName:
    Description: Name prefix for shared resources.
    Type: String
    Default: test-project-1
    AllowedPattern: ^[^\s]+$
  KeyName:
    Description: EC2 KeyPair
    Type: AWS::EC2::KeyPair::KeyName
  ImageUrl:
    Description: Docker image URL
    Type: String
  AWSRegion:
    Description: AWS Region
    Type: String
    AllowedValues:
      - eu-west-2
      - eu-west-3
  VpcId:
    Description: VPC ID
    Type: AWS::EC2::VPC::Id
  SubnetIds:
    Description: Subnet IDs (comma-separated)
    Type: List<AWS::EC2::Subnet::Id>
  RdsSgId:
    Description: Security Group ID for RDS
    Type: AWS::EC2::SecurityGroup::Id
Resources:
  ECSCluster:
    Type: AWS::ECS::Cluster
  ECSInstanceRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: ec2.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role
        - arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
        - arn:aws:iam::aws:policy/AmazonRDSFullAccess
        - arn:aws:iam::aws:policy/CloudWatchLogsFullAccess
      Policies:
        - PolicyName: AllowECRPull
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - ecr:GetAuthorizationToken
                  - ecr:BatchCheckLayerAvailability
                  - ecr:GetDownloadUrlForLayer
                  - ecr:BatchGetImage
                Resource: '*'
  ECSInstanceProfile:
    Type: AWS::IAM::InstanceProfile
    Properties:
      Roles:
        - !Ref ECSInstanceRole
  ECSInstanceSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Security group for ECS EC2 instances
      VpcId: !Ref VpcId
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: 0.0.0.0/0
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: 0.0.0.0/0
  RDSIngressRule:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !Ref RdsSgId
      IpProtocol: tcp
      FromPort: 1433
      ToPort: 1433
      SourceSecurityGroupId: !Ref ECSInstanceSecurityGroup
  ECSLaunchTemplate:
    Type: AWS::EC2::LaunchTemplate
    Properties:
      LaunchTemplateData:
        ImageId: ami-05b458f59b6df9a7a
        InstanceType: t3.medium
        KeyName: !Ref KeyName
        IamInstanceProfile:
          Arn: !GetAtt ECSInstanceProfile.Arn
        SecurityGroupIds:
          - !Ref ECSInstanceSecurityGroup
        UserData: !Base64
          Fn::Sub: |-
            <powershell>
              echo "ECS_CLUSTER=${ECSCluster}" > C:\ProgramData\Amazon\ECS\ecs.config
              Start-Service -Name AmazonECS
            </powershell>
  ECSAutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      VPCZoneIdentifier: !Ref SubnetIds
      LaunchTemplate:
        LaunchTemplateId: !Ref ECSLaunchTemplate
        Version: !GetAtt ECSLaunchTemplate.LatestVersionNumber
      MinSize: 1
      DesiredCapacity: 1
      MaxSize: 1
  ECSTaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      RequiresCompatibilities:
        - EC2
      Cpu: 512
      Memory: 1024
      NetworkMode: bridge
      ContainerDefinitions:
        - Name: test
          Image: !Ref ImageUrl
          Essential: true
          Memory: 512
          PortMappings:
            - ContainerPort: 80
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Sub /ecs/${ProjectName}/test
              awslogs-region: !Ref AWSRegion
              awslogs-stream-prefix: ecs
  ECSService:
    Type: AWS::ECS::Service
    Properties:
      Cluster: !Ref ECSCluster
      TaskDefinition: !Ref ECSTaskDefinition
      DesiredCount: 1
      LaunchType: EC2
Outputs:
  ClusterName:
    Description: ECS Cluster Name
    Value: !Ref ECSCluster
  ServiceName:
    Description: ECS Service Name
    Value: !Ref ECSService
2 Answers

Based on your user data, I suspect this is a Windows AMI. The security group is incorrect, as you are opening port 22 (the SSH port); however, that is not causing the error. (Also, make sure the AMI is ECS-optimized.)

For the error, can you check your EC2 security group and make sure it has an outbound rule for port 443 to 0.0.0.0/0? Based on your CloudFormation template it should use the default rule, but that rule can be overwritten or removed by other automation.

To troubleshoot, log in to the EC2 instance, then run your user data commands and see whether they work or fail.
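
As a sketch of the two security-group points above, the fragment below swaps the SSH rule for RDP (TCP 3389) and declares the outbound HTTPS rule explicitly instead of relying on the default. It reuses the ECSInstanceSecurityGroup, VpcId and RdsSgId names from the question's template; the 203.0.113.0/24 admin range is a placeholder assumption, not a value from the question.

```yaml
# Sketch only: a possible replacement for ECSInstanceSecurityGroup in the question's template
Resources:
  ECSInstanceSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Security group for ECS EC2 instances (Windows)
      VpcId: !Ref VpcId
      SecurityGroupIngress:
        # RDP instead of SSH; 203.0.113.0/24 is a placeholder admin range (assumption)
        - IpProtocol: tcp
          FromPort: 3389
          ToPort: 3389
          CidrIp: 203.0.113.0/24
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: 0.0.0.0/0
      SecurityGroupEgress:
        # HTTPS out, so the instance can reach the ECS and ECR endpoints
        - IpProtocol: tcp
          FromPort: 443
          ToPort: 443
          CidrIp: 0.0.0.0/0
        # SQL Server traffic to the RDS security group passed in as RdsSgId
        - IpProtocol: tcp
          FromPort: 1433
          ToPort: 1433
          DestinationSecurityGroupId: !Ref RdsSgId
```

Declaring any SecurityGroupEgress entry replaces the default allow-all outbound rule, so any other outbound traffic the instance or its containers need would have to be listed as well.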

Go to the instance → Connect with RDP.
Open Services → check that AmazonECS is running.
Open PowerShell → run Get-Content C:\ProgramData\Amazon\ECS\log\ecs-agent.log

Also, your AWSRegion parameter allows multiple regions. Make sure related resources, such as the ECR repository, are in the same region. Confirm that the ECR repository exists and contains the image.
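
One way to rule out a region mismatch inside the template itself, offered as a suggestion rather than a required fix, is to drop the hand-entered AWSRegion parameter and use the AWS::Region pseudo parameter, which always resolves to the region the stack is deployed in. The fragment below shows only the logging options from the question's container definition.

```yaml
# Fragment of the container definition in the question's ECSTaskDefinition
LogConfiguration:
  LogDriver: awslogs
  Options:
    awslogs-group: !Sub /ecs/${ProjectName}/test
    # AWS::Region is a CloudFormation pseudo parameter (the stack's own region)
    awslogs-region: !Ref AWS::Region
    awslogs-stream-prefix: ecs
```

The ImageUrl parameter is still entered by hand, so the region embedded in the ECR repository URL needs to be checked separately.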

EXPERT
answered a year ago
    • I got the AMI using this command, and the AMI has Docker pre-installed: aws ssm get-parameters --names "/aws/service/ami-windows-latest/Windows_Server-2022-English-Full-ECS_Optimized" --region eu-west-2
    • Re: troubleshooting. That concerns me, as the EC2 instance has neither a ProgramData directory (so no logs) nor an AmazonECS service (I tried starting it but got a "not found" error). I wonder why? As an ECS-optimized AMI, it should have the ECS agent.
    • Yes, everything is in eu-west-2.
    • I did get an ECR-related issue but resolved it by adding the required permissions.
    • 443 outbound rule: I read that EC2 allows all outbound traffic by default. I will verify anyway.

    Yeah, that port 22 is a copy-paste issue.

The issue you're experiencing with your Windows EC2 instance not registering to your ECS cluster is likely related to the ECS agent configuration or initialization on your Windows instance. Here are several potential causes and solutions:

  1. Windows ECS Agent Configuration: The UserData script in your template might not be correctly configuring the ECS agent. For Windows instances, the PowerShell commands need to ensure the ECS agent is properly configured and started.

  2. Windows-specific ECS Configuration: Your UserData script is creating the ecs.config file, but you mentioned there's no ProgramData folder in the C drive. This suggests the ECS agent might not be initializing properly on your Windows instance.

  3. Network Connectivity: Even though your subnet has access to the internet, there might be security group rules, network ACLs, or routing issues preventing the EC2 instance from communicating with the ECS service endpoints.

  4. Windows ECS-Optimized AMI: Ensure you're using the correct Windows ECS-optimized AMI for your region. The AMIs you mentioned (ami-05b458f59b6df9a7a & ami-01fed81ccabd6c52a) should be verified as the latest Windows ECS-optimized AMIs.

  5. IAM Role Permissions: While you've attached the AmazonEC2ContainerServiceforEC2Role policy, there might be additional permissions needed for Windows instances.
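
For the AMI check in point 4, one option is to let CloudFormation resolve the AMI ID from the public SSM parameter at deploy time instead of hard-coding it. The sketch below assumes the parameter path quoted in the asker's comment with an /image_id suffix, which to my knowledge is the sub-parameter that returns just the AMI ID; LatestECSOptimizedAMI is a hypothetical parameter name, not one from the question's template.

```yaml
Parameters:
  # Hypothetical parameter name; the value is looked up in SSM when the stack is deployed
  LatestECSOptimizedAMI:
    Description: ECS-optimized Windows Server 2022 AMI, resolved from SSM
    Type: AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>
    Default: /aws/service/ami-windows-latest/Windows_Server-2022-English-Full-ECS_Optimized/image_id
Resources:
  ECSLaunchTemplate:
    Type: AWS::EC2::LaunchTemplate
    Properties:
      LaunchTemplateData:
        # Replaces the hard-coded ami-05b458f59b6df9a7a
        ImageId: !Ref LatestECSOptimizedAMI
        # Remaining LaunchTemplateData properties stay as in the question
```

Because the lookup happens in the stack's own region, this also removes the risk of pasting an AMI ID that belongs to a different region.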

Recommended actions:

  1. Modify your UserData script to include more robust ECS agent initialization:
<powershell>
# Create the ECS config directory if it doesn't exist
New-Item -Path "C:\ProgramData\Amazon\ECS" -ItemType Directory -Force
# Set the cluster name
[System.IO.File]::WriteAllText("C:\ProgramData\Amazon\ECS\ecs.config", "ECS_CLUSTER=${ECSCluster}")
# Restart the ECS service
Restart-Service AmazonECS -Force
</powershell>
  2. Check the Windows Event Logs on the instance for any ECS agent errors.

  3. Verify network connectivity from the instance to the ECS service endpoints.

  4. Consider adding a VPC endpoint for ECS if your instances are in a private subnet.

  5. If possible, try using a Linux-based ECS instance first to verify your overall cluster configuration is working correctly.
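
An alternative to the user data in step 1, given here as a hedged sketch and not a confirmed fix for this stack: AWS's bootstrapping guidance for the ECS-optimized Windows AMIs starts the agent through the ECSTools PowerShell module instead of writing ecs.config and starting a service by hand. That could explain why no AmazonECS service was found on the instance. Separately, C:\ProgramData is a hidden folder by default in File Explorer, so it may exist even though it is not visible. The fragment reuses the ECSCluster logical name from the question's template.

```yaml
# Fragment of LaunchTemplateData in the question's ECSLaunchTemplate
UserData: !Base64
  Fn::Sub: |-
    <powershell>
    # ECSTools is the PowerShell module shipped with the ECS-optimized Windows AMIs
    Import-Module ECSTools
    # Configures the agent and starts it against the cluster created by this stack
    Initialize-ECSAgent -Cluster '${ECSCluster}' -EnableTaskIAMRole -LoggingDrivers '["json-file","awslogs"]'
    </powershell>
```

Fn::Sub replaces ${ECSCluster} with the cluster name before the script reaches the instance. The -LoggingDrivers list includes awslogs because the question's task definition uses that log driver.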

answered a year ago