How do I create an SSM Agent to run as a sidecar in ECS Fargate for FIS test

0

Hi, I have an ECS Fargate resource that I want to test using Fault Injection Simulator (FIS). I want to run the CPU stress test but get this error: "At least one ECS Task is not registered as a SSM Managed Instance. SSM Agent must be running as a sidecar in the task, and the task must be registered within Systems Manager as a Managed Instance."

I have the following set up: my IAM role for ECS task execution has permission policies for ECS Exec and for SSM, and I have a task definition with 2 containers. So I am unsure what the issue is. I am also unsure how else to set up the SSM Agent as a sidecar, or how to register the task with Systems Manager as a managed instance, since this is Fargate and there are no instances in my setup. Any thoughts?

asked 7 months ago, 1068 views
4 Answers
1

Hello

I understand that currently when attempting to run a CPU stress test the following error appears -> "At least one ECS Task is not registered as a SSM Managed Instance. SSM Agent must be running as a sidecar in the task, and the task must be registered within Systems Manager as a Managed Instance". Please do correct me if I have misunderstood in any way.

I replicated an environment that uses the additional SSM Agent container. These are the steps I used:

1. Create an ECS cluster
2. Create an Amazon ECS task execution IAM role and add the AmazonECSTaskExecutionRolePolicy managed policy.
3. Create and add the following permissions to the AWS FIS experiment role:
    ssm:SendCommand
    ssm:ListCommands
    ssm:CancelCommand
    (In my environment, the FIS role only had the managed policy "AWSFaultInjectionSimulatorECSAccess" attached)
4. Create and add the following permissions to the Amazon ECS task IAM role:
    ssm:CreateActivation
    ssm:AddTagsToResource
    iam:PassRole
5. Create and add the following permissions to the managed instance role which would be attached to tasks registered as managed instances:
    ssm:DeleteActivation
    ssm:DeregisterManagedInstance
    (In my environment, the managed instance role had the managed policy "AmazonSSMManagedInstanceCore" attached in addition to the permissions listed above)
6. Create an ECS task definition which makes use of the additional SSM agent container. Below is the SSM agent container definition which was used within the ECS task definition in my environment: 

===================SSM agent container JSON===================
{
    "name": "amazon-ssm-agent",
    "image": "public.ecr.aws/amazon-ssm-agent/amazon-ssm-agent:latest",
    "cpu": 0,
    "links": [],
    "portMappings": [],
    "essential": false,
    "entryPoint": [],
    "command": [
        "/bin/bash",
        "-c",
        "set -e; yum upgrade -y; yum install jq procps awscli -y; term_handler() { echo \"Deleting SSM activation $ACTIVATION_ID\"; if ! aws ssm delete-activation --activation-id $ACTIVATION_ID --region $ECS_TASK_REGION; then echo \"SSM activation $ACTIVATION_ID failed to be deleted\" 1>&2; fi; MANAGED_INSTANCE_ID=$(jq -e -r .ManagedInstanceID /var/lib/amazon/ssm/registration); echo \"Deregistering SSM Managed Instance $MANAGED_INSTANCE_ID\"; if ! aws ssm deregister-managed-instance --instance-id $MANAGED_INSTANCE_ID --region $ECS_TASK_REGION; then echo \"SSM Managed Instance $MANAGED_INSTANCE_ID failed to be deregistered\" 1>&2; fi; kill -SIGTERM $SSM_AGENT_PID; }; trap term_handler SIGTERM SIGINT; if [[ -z $MANAGED_INSTANCE_ROLE_NAME ]]; then echo \"Environment variable MANAGED_INSTANCE_ROLE_NAME not set, exiting\" 1>&2; exit 1; fi; if ! ps ax | grep amazon-ssm-agent | grep -v grep > /dev/null; then if [[ -n $ECS_CONTAINER_METADATA_URI_V4 ]] ; then echo \"Found ECS Container Metadata, running activation with metadata\"; TASK_METADATA=$(curl \"${ECS_CONTAINER_METADATA_URI_V4}/task\"); ECS_TASK_AVAILABILITY_ZONE=$(echo $TASK_METADATA | jq -e -r '.AvailabilityZone'); ECS_TASK_ARN=$(echo $TASK_METADATA | jq -e -r '.TaskARN'); ECS_TASK_REGION=$(echo $ECS_TASK_AVAILABILITY_ZONE | sed 's/.$//'); ECS_TASK_AVAILABILITY_ZONE_REGEX='^(af|ap|ca|cn|eu|me|sa|us|us-gov)-(central|north|(north(east|west))|south|south(east|west)|east|west)-[0-9]{1}[a-z]{1}$'; if ! [[ $ECS_TASK_AVAILABILITY_ZONE =~ $ECS_TASK_AVAILABILITY_ZONE_REGEX ]]; then echo \"Error extracting Availability Zone from ECS Container Metadata, exiting\" 1>&2; exit 1; fi; ECS_TASK_ARN_REGEX='^arn:(aws|aws-cn|aws-us-gov):ecs:[a-z0-9-]+:[0-9]{12}:task/[a-zA-Z0-9_-]+/[a-zA-Z0-9]+$'; if ! [[ $ECS_TASK_ARN =~ $ECS_TASK_ARN_REGEX ]]; then echo \"Error extracting Task ARN from ECS Container Metadata, exiting\" 1>&2; exit 1; fi; CREATE_ACTIVATION_OUTPUT=$(aws ssm create-activation --iam-role $MANAGED_INSTANCE_ROLE_NAME --tags Key=ECS_TASK_AVAILABILITY_ZONE,Value=$ECS_TASK_AVAILABILITY_ZONE Key=ECS_TASK_ARN,Value=$ECS_TASK_ARN Key=FAULT_INJECTION_SIDECAR,Value=true --region $ECS_TASK_REGION); ACTIVATION_CODE=$(echo $CREATE_ACTIVATION_OUTPUT | jq -e -r .ActivationCode); ACTIVATION_ID=$(echo $CREATE_ACTIVATION_OUTPUT | jq -e -r .ActivationId); if ! amazon-ssm-agent -register -code $ACTIVATION_CODE -id $ACTIVATION_ID -region $ECS_TASK_REGION; then echo \"Failed to register with AWS Systems Manager (SSM), exiting\" 1>&2; exit 1; fi; amazon-ssm-agent & SSM_AGENT_PID=$!; wait $SSM_AGENT_PID; else echo \"ECS Container Metadata not found, exiting\" 1>&2; exit 1; fi; else echo \"SSM agent is already running, exiting\" 1>&2; exit 1; fi"
    ],
    "environment": [
        {
            "name": "MANAGED_INSTANCE_ROLE_NAME",
            "value": "<SSMManagedInstanceRole>"
        }
    ],
    "environmentFiles": [],
    "mountPoints": [],
    "volumesFrom": [],
    "secrets": [],
    "dnsServers": [],
    "dnsSearchDomains": [],
    "extraHosts": [],
    "dockerSecurityOptions": [],
    "dockerLabels": {},
    "ulimits": [],
    "systemControls": []
}
===================End SSM agent container JSON===================

7. Ensure that the SSM Agent container definition is configured as above, replacing "<SSMManagedInstanceRole>" with the name of the managed instance role from step 5.

8. Deploy a service which makes use of the relevant task definition. 
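
To make the role split in steps 3 to 5 easier to see, here is a minimal Python sketch that builds the corresponding IAM policy documents as JSON. The action names come from the steps listed here; everything else is an assumption for illustration: the role ARN is a hypothetical placeholder, `"Resource": "*"` is used only for brevity and would normally be narrowed, and the trust policy reflects my understanding that the role passed to `aws ssm create-activation` has to be assumable by `ssm.amazonaws.com`. As listed, the managed instance role appears to be a separate role from the task execution role in step 2.

```python
import json

# Hypothetical placeholder ARN; replace with your own managed instance role.
MANAGED_INSTANCE_ROLE_ARN = "arn:aws:iam::111122223333:role/SSMManagedInstanceRole"


def statement(actions, resource="*"):
    """One Allow statement. Resource "*" is an assumption made for brevity."""
    return {"Effect": "Allow", "Action": list(actions), "Resource": resource}


def policy(*statements):
    """Wrap one or more statements in an IAM policy document."""
    return {"Version": "2012-10-17", "Statement": list(statements)}


# Step 3: permissions for the AWS FIS experiment role.
fis_experiment_policy = policy(
    statement(["ssm:SendCommand", "ssm:ListCommands", "ssm:CancelCommand"])
)

# Step 4: permissions for the ECS task IAM role, which the sidecar uses
# to create the activation and pass the managed instance role.
task_role_policy = policy(
    statement(["ssm:CreateActivation", "ssm:AddTagsToResource"]),
    statement(["iam:PassRole"], MANAGED_INSTANCE_ROLE_ARN),
)

# Step 5: extra permissions for the managed instance role, on top of the
# AmazonSSMManagedInstanceCore managed policy.
managed_instance_policy = policy(
    statement(["ssm:DeleteActivation", "ssm:DeregisterManagedInstance"])
)

# Assumption: trust policy that lets Systems Manager assume the
# managed instance role.
managed_instance_trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "ssm.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

if __name__ == "__main__":
    documents = {
        "FIS experiment role": fis_experiment_policy,
        "ECS task role": task_role_policy,
        "Managed instance role": managed_instance_policy,
        "Managed instance role trust policy": managed_instance_trust_policy,
    }
    for name, doc in documents.items():
        print(f"--- {name} ---")
        print(json.dumps(doc, indent=2))
```

The printed JSON can be pasted into the IAM console or saved to files for the AWS CLI after the placeholder ARN and the resource scopes have been adjusted.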

Additionally, the following documentation lists the requirements for using the AWS FIS aws:ecs:task actions:

https://docs.aws.amazon.com/fis/latest/userguide/ecs-task-actions.html

If adding the above permissions as well as the SSM agent container configuration does not solve the problem, we would require details that are non-public information. Please open a support case with AWS using the following link: https://support.console.aws.amazon.com/support/home#/case/create
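
Before opening a case, it may help to find out which check in the sidecar command from step 6 is failing, since several of its branches print a message and exit with status 1. The following is a rough Python sketch, not part of the official setup, that mirrors only the input-validation branches of that command. The two regular expressions and the messages are copied from the command; the function names are hypothetical. It does not cover network failures (for example `yum` or `curl`), permission errors from `aws ssm create-activation`, or registration failures, so the container logs remain the primary source of truth.

```python
import re

# Patterns copied from the sidecar command and used with Python's re module.
AZ_REGEX = re.compile(
    r"^(af|ap|ca|cn|eu|me|sa|us|us-gov)-"
    r"(central|north|(north(east|west))|south|south(east|west)|east|west)-"
    r"[0-9]{1}[a-z]{1}$"
)
TASK_ARN_REGEX = re.compile(
    r"^arn:(aws|aws-cn|aws-us-gov):ecs:[a-z0-9-]+:[0-9]{12}:"
    r"task/[a-zA-Z0-9_-]+/[a-zA-Z0-9]+$"
)


def region_from_az(az):
    """Mirror `sed 's/.$//'`: drop the trailing zone letter from the AZ name."""
    return az[:-1]


def first_validation_error(role_name, metadata_uri, az, task_arn):
    """Return the message the command would print before `exit 1`, or None.

    Checks run in the same order as in the command: role name, metadata URI,
    Availability Zone format, then task ARN format.
    """
    if not role_name:
        return "Environment variable MANAGED_INSTANCE_ROLE_NAME not set, exiting"
    if not metadata_uri:
        return "ECS Container Metadata not found, exiting"
    if not AZ_REGEX.match(az):
        return "Error extracting Availability Zone from ECS Container Metadata, exiting"
    if not TASK_ARN_REGEX.match(task_arn):
        return "Error extracting Task ARN from ECS Container Metadata, exiting"
    return None
```

For example, `first_validation_error("MyRole", "http://example/v4/abc", "il-central-1a", "...")` returns the Availability Zone message, because `il` is not in the region prefix list of the copied pattern. A task in such a Region would therefore stop at that check regardless of IAM configuration.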

I trust that this information will be of use to you. If you have any other questions or concerns, please feel free to reach out.

Have a great day further!

AWS
SUPPORT ENGINEER
answered 7 months ago
0

Can you explain a little more about your VPC setup and your Fargate service configuration from a network point of view?

The SSM Agent needs to be able to connect to the Systems Manager endpoints, either over the internet or through private (VPC) endpoints. A lack of connectivity could be the cause.
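
One rough way to test this is a small Python sketch that tries to open a TCP connection on port 443 to the regional endpoints the agent uses. The three service names (`ssm`, `ssmmessages`, `ec2messages`) are the ones I would expect the agent to need; the hostname pattern assumes the standard `aws` partition (China Regions use a different domain suffix), and the function names are hypothetical. To be meaningful it has to run from inside the same subnets and security groups as the task, for example from a throwaway container in the same service.

```python
import socket


def ssm_endpoints(region):
    """Hostnames of the regional endpoints used by the SSM Agent.

    Assumes the standard "aws" partition (amazonaws.com suffix).
    """
    services = ("ssm", "ssmmessages", "ec2messages")
    return [f"{service}.{region}.amazonaws.com" for service in services]


def can_connect(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_region(region):
    """Map each endpoint hostname to whether it is reachable."""
    return {host: can_connect(host) for host in ssm_endpoints(region)}
```

Calling `check_region("us-east-1")` (with your own Region) returns a dictionary of hostname to `True` or `False`. A `False` entry suggests a missing NAT or internet route, a missing VPC interface endpoint, or a security group that blocks outbound HTTPS, although a successful TCP connection alone does not prove that the agent can authenticate.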

EXPERT
answered 7 months ago
0

Hi, thank you for the answer! Yes, I have set this up and have tried this previously. To confirm: step 5 ("Create and add the following permissions to the managed instance role which would be attached to tasks registered as managed instances: ssm:DeleteActivation, ssm:DeregisterManagedInstance") and the instruction to change "<SSMManagedInstanceRole>" to the managed instance role both refer to the task execution role that will have ECS Exec permissions. Correct?

If so, then I have done all of this and still get this error: "At least one ECS Task is not registered as a SSM Managed Instance. SSM Agent must be running as a sidecar in the task, and the task must be registered within Systems Manager as a Managed Instance."

Any thoughts on what else could be causing it? When I check the sidecar container, it is not running and instead shows: Stopped | Exit code: 1. Thanks

answered 7 months ago
0

Hi, with regard to the VPC setup: I have a DNS record that is mapped to the VPC endpoint. The VPC endpoint routes requests through an NLB and then an ALB, which distributes them to ECS on Fargate.

answered 7 months ago
