
How do I troubleshoot errors that I receive when I use ECS Exec on my Fargate tasks?


I want to troubleshoot errors that I receive when I use Amazon Elastic Container Service (Amazon ECS) Exec on my AWS Fargate tasks.

Short description

When you use ECS Exec on Fargate tasks, you might receive one of the following error messages:

  • "An error occurred (InvalidParameterException) when calling the ExecuteCommand operation: The execute command failed because execute command was not enabled when the task was run or the execute command agent isn't running. Wait and try again or run a new task with execute command enabled and try again."
  • "An error occurred (TargetNotConnectedException) when calling the ExecuteCommand operation: The execute command failed due to an internal error. Try again later."

Resolution

Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.

To resolve common errors that occur when you use ECS Exec on Fargate tasks, it's a best practice to use AWS CloudShell. CloudShell comes preinstalled with the AWS Systems Manager Session Manager plugin and the AWS CLI.
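
If you run ECS Exec from your own workstation instead of CloudShell, then a quick check such as the following sketch can confirm that the client-side tools are on your PATH. The check_tool function name is hypothetical and isn't part of the AWS CLI. The sketch assumes that the Session Manager plugin is installed as the session-manager-plugin binary.

```shell
# Hypothetical helper: report whether a required client-side tool is on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: OK ($(command -v "$1"))"
  else
    echo "$1: MISSING"
    return 1
  fi
}

# Usage (run in the shell where you plan to call ECS Exec):
#   check_tool aws
#   check_tool session-manager-plugin
```

The function only checks that a tool exists. It doesn't check the tool's version.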

InvalidParameterException error

If the ExecuteCommand option for your Fargate task is deactivated, then you receive the InvalidParameterException error.

To resolve this issue, complete the following steps:

  1. Run the describe-tasks command to check whether the enableExecuteCommand parameter is set to true or false:
    aws ecs describe-tasks --cluster example-cluster-name --tasks example-task-id | grep enableExecuteCommand
    Note: Replace example-cluster-name with your cluster and example-task-id with your task ID.
  2. If the enableExecuteCommand parameter is false, then run the following update-service command to update the parameter to true:
    aws ecs update-service --cluster example-cluster-name --service example-service --region example-region --enable-execute-command --force-new-deployment
    Note: Replace example-cluster-name with your cluster, example-service with your service, and example-region with your AWS Region. The force-new-deployment option creates a new deployment that starts new tasks and stops earlier tasks based on the service's deployment configuration. If your services use blue/green deployment through AWS CodeDeploy, then instead of force-new-deployment, initiate a CODE_DEPLOY deployment. You can't use force-new-deployment for blue/green deployment because this option launches a rolling update.
  3. Run the following describe-tasks command to check the status of ExecuteCommandAgent:
    aws ecs describe-tasks --cluster example-cluster-name --tasks example-task-id | grep -A 6 managedAgents
    Note: Replace example-cluster-name with your cluster and example-task-id with your task ID.
  4. In the command's output, check the lastStatus of ExecuteCommandAgent. If the lastStatus isn't RUNNING, then check the ExecuteCommandAgent logs to identify the root cause. To generate the logs, follow the steps in the Generate logs for ECS Exec to identify issues section.
    If ExecuteCommandAgent can't retrieve credentials because you configured a proxy in the container, then add the following NO_PROXY option to your container instance configuration files:
    env no_proxy=169.254.169.254,169.254.170.2
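
As an alternative to grep, the checks in steps 1 and 3 can be sketched with the AWS CLI --query option, which filters the describe-tasks response with a JMESPath expression. The function names exec_enabled and exec_agent_status are hypothetical. The sketch assumes that you pass a single task ID and that your default Region is already configured.

```shell
# Hypothetical helpers that wrap describe-tasks with --query (JMESPath)
# so that the result doesn't depend on grep matching formatted JSON.

# Prints True or False for the task's enableExecuteCommand setting.
# $1 = cluster name, $2 = task ID
exec_enabled() {
  aws ecs describe-tasks --cluster "$1" --tasks "$2" \
    --query 'tasks[0].enableExecuteCommand' --output text
}

# Prints one line per container: the container name, then the lastStatus
# of its ExecuteCommandAgent managed agent.
# $1 = cluster name, $2 = task ID
exec_agent_status() {
  aws ecs describe-tasks --cluster "$1" --tasks "$2" \
    --query "tasks[0].containers[].[name, managedAgents[?name=='ExecuteCommandAgent'].lastStatus | [0]]" \
    --output text
}

# Usage:
#   exec_enabled example-cluster-name example-task-id
#   exec_agent_status example-cluster-name example-task-id
```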

TargetNotConnectedException error

To resolve a TargetNotConnectedException error, take the following actions.

Add the required permissions and confirm that the networking configuration is correct

Complete the following steps:

  1. Add the required permissions to your Amazon ECS task AWS Identity and Access Management (IAM) role. If the task IAM role already has the required permissions, then check whether any service control policies (SCPs) block the task's connection to SSM Agent.
  2. If you use Amazon Virtual Private Cloud (Amazon VPC) interface endpoints with Amazon ECS, then create the following endpoints:
    ec2messages.region.amazonaws.com
    ssm.region.amazonaws.com
    ssmmessages.region.amazonaws.com
    Note: Replace region with your Region.
  3. To confirm that your AWS CLI environment and Amazon ECS cluster or task are ready for ECS Exec, run the check-ecs-exec.sh script. For information about prerequisites and usage, see Amazon ECS Exec Checker on the GitHub website.
    The output of the check-ecs-exec.sh script shows what you must resolve before you use ECS Exec. Example output:
    Prerequisites for check-ecs-exec.sh v0.7
    -------------------------------------------------------------
      jq      | OK (/usr/bin/jq)
      AWS CLI | OK (/usr/local/bin/aws)
    
    -------------------------------------------------------------
    Prerequisites for the AWS CLI to use ECS Exec
    -------------------------------------------------------------
      AWS CLI Version        | OK (aws-cli/2.11.0 Python/3.11.2 Linux/4.14.255-291-231.527.amzn2.x86_64 exec-env/CloudShell exe/x86_64.amzn.2 prompt/off)
      Session Manager Plugin | OK (1.2.398.0)
    
    -------------------------------------------------------------
    Checks on ECS task and other resources
    -------------------------------------------------------------
    Region : us-east-1
    Cluster: Fargate-Testing
    Task   : ca27e41ea3f54fd1804ca00feffa178d
    -------------------------------------------------------------
      Cluster Configuration  | Audit Logging Not Configured
      Can I ExecuteCommand?  | arn:aws:iam::12345678:role/Admin
         ecs:ExecuteCommand: allowed
         ssm:StartSession denied?: allowed
      Task Status            | RUNNING
      Launch Type            | Fargate
      Platform Version       | 1.4.0
      Exec Enabled for Task  | NO
      Container-Level Checks | 
        ----------
          Managed Agent Status - SKIPPED
        ----------
        ----------
          Init Process Enabled (Exec-check:2)
        ----------
             1. Disabled - "nginx"
        ----------
          Read-Only Root Filesystem (Exec-check:2)
        ----------
             1. Disabled - "nginx"
      Task Role Permissions  | arn:aws:iam::12345678:role/L3-session
         ssmmessages:CreateControlChannel: implicitDeny
         ssmmessages:CreateDataChannel: implicitDeny
         ssmmessages:OpenControlChannel: implicitDeny
         ssmmessages:OpenDataChannel: implicitDeny
      VPC Endpoints          | SKIPPED (vpc-abcd - No additional VPC endpoints required)
      Environment Variables  | (Exec-check:2)
           1. container "nginx"
           - AWS_ACCESS_KEY: not defined
           - AWS_ACCESS_KEY_ID: not defined
           - AWS_SECRET_ACCESS_KEY: not defined
    The preceding output shows that ECS Exec is turned off for the task and that the task role doesn't have the required Systems Manager permissions.
    Note: To run ECS Exec, you must set the readonlyRootFilesystem parameter to false in the task definition. If readonlyRootFilesystem is true, then SSM Agent can't create the required directories.
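
The missing task role permissions in the example output can be added with an inline policy. The following is a minimal sketch, not the only way to grant the permissions. The function name add_exec_permissions and the policy name ecs-exec-ssmmessages are assumptions. The four actions are the ones that the checker output lists under Task Role Permissions.

```shell
# Hypothetical helper: attach an inline policy with the four ssmmessages
# actions to the task role. $1 = task role name (not the ARN).
# The policy name ecs-exec-ssmmessages is an assumption.
add_exec_permissions() {
  role_name="$1"
  policy_file="$(mktemp)"
  cat > "$policy_file" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssmmessages:CreateControlChannel",
        "ssmmessages:CreateDataChannel",
        "ssmmessages:OpenControlChannel",
        "ssmmessages:OpenDataChannel"
      ],
      "Resource": "*"
    }
  ]
}
EOF
  aws iam put-role-policy \
    --role-name "$role_name" \
    --policy-name ecs-exec-ssmmessages \
    --policy-document "file://$policy_file"
  status=$?
  rm -f "$policy_file"   # Remove the temporary policy file.
  return "$status"
}

# Usage:
#   add_exec_permissions example-task-role
```

Tasks that are already running might need to be replaced before the agent picks up the new permissions.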

Check whether you configured IAM user credentials at the container level, such as an access key or secret access key. SSM Agent authenticates through the AWS SDK, which reads credentials from environment variables before it uses the task role. If you set the access key or secret access key as environment variables in the container, then they override the task-level permissions. To use ECS Exec, the IAM credentials at the container level must provide the required permissions for SSM Agent.
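
To look for static credentials in a task definition, a query such as the following sketch can help. The function name find_static_credentials is hypothetical. The sketch inspects only the environment list of each container definition, so it doesn't detect variables that come from the secrets parameter or that an entrypoint script exports at runtime.

```shell
# Hypothetical helper: list static-credential environment variables that a
# task definition sets on any of its containers.
# $1 = task definition family, family:revision, or ARN
find_static_credentials() {
  aws ecs describe-task-definition --task-definition "$1" \
    --query "taskDefinition.containerDefinitions[].environment[] | [?name=='AWS_ACCESS_KEY' || name=='AWS_ACCESS_KEY_ID' || name=='AWS_SECRET_ACCESS_KEY'].name" \
    --output text
}

# Usage:
#   find_static_credentials example-task-definition
```

Any variable names in the output are set in the task definition and take precedence over the task role.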

Use ECS Exec to get into the container with the correct shell

Different base images can include different shells. If you use a shell that doesn't exist in the image, then you receive errors. Make sure that you use the correct shell for your application image.

To use ECS Exec to get into the container, run the execute-command command:

aws ecs execute-command --region example-region --cluster example-cluster --container example-container --task example-task --command "example_shell" --interactive

Note: Replace example-region with your Region, example-cluster with your cluster name, example-container with your container name, example-task with your task ID, and example_shell with the shell that your image provides, such as /bin/sh or /bin/bash.
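
If you run the command often, then a small wrapper such as the following sketch can make the shell an explicit, optional argument. The function name ecs_shell is hypothetical, and the /bin/sh default is an assumption: many images include it, but not all do. The sketch omits the --region option and relies on your default Region configuration.

```shell
# Hypothetical wrapper around execute-command.
# $1 = cluster, $2 = task ID, $3 = container name, $4 = shell (optional)
# The shell defaults to /bin/sh, which is an assumption about the image.
ecs_shell() {
  aws ecs execute-command \
    --cluster "$1" \
    --task "$2" \
    --container "$3" \
    --command "${4:-/bin/sh}" \
    --interactive
}

# Usage:
#   ecs_shell example-cluster example-task example-container
#   ecs_shell example-cluster example-task example-container /bin/bash
```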

Generate logs for ECS Exec to identify issues

To determine why ECS Exec isn't working, set the following command in the Environment section of the container definition to generate SSM Agent logs:

Console:

/bin/bash,-c,sleep 2m && cat /var/log/amazon/ssm/amazon-ssm-agent.log

JSON:

"/bin/bash","-c","sleep 2m && cat /var/log/amazon/ssm/amazon-ssm-agent.log"

Note: Different applications have different shells and editors. Modify the preceding command parameters for your application's requirements.

If you use the awslogs log driver, then the preceding commands generate SSM Agent logs and transfer them to your Amazon CloudWatch Logs log group. If you use other log drivers or logging endpoints, then the SSM Agent logs transfer to those locations.

JSON example:

"entryPoint": [],
"portMappings": [],
"command": [
  "/bin/bash",
  "-c",
  "sleep 2m && cat /var/log/amazon/ssm/amazon-ssm-agent.log"
],
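
After the task runs with the preceding command, you can read the SSM Agent log lines from CloudWatch Logs. The following sketch assumes AWS CLI version 2, which provides the logs tail command, and the awslogs log driver. The function name tail_exec_logs is hypothetical, and the log group name comes from the logConfiguration of your task definition.

```shell
# Hypothetical helper: print recent log events from the task's log group.
# $1 = log group name from the task definition's logConfiguration
# $2 = look-back window, for example 15m or 1h (defaults to 15m)
tail_exec_logs() {
  aws logs tail "$1" --since "${2:-15m}" --format short
}

# Usage:
#   tail_exec_logs /ecs/example-log-group
#   tail_exec_logs /ecs/example-log-group 1h
```

Because the container command sleeps for 2 minutes before it prints the log file, the agent log lines appear only after that delay.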

Related information

Using ECS Exec

AWS OFFICIAL
Updated 6 months ago
2 Comments

TargetNotConnectedException can also occur in an additional context.

If you add the following folders as bind mounts so that you can use ECS Exec when readonlyRootFilesystem is enabled,

/managed-agents 
/var/lib/amazon/ssm
/var/log/amazon/ssm

then make sure that, if the task has multiple containers, you don't use the same host volumes for more than one container. Sharing volumes between containers for this purpose causes the managed agent to fail and eventually leads to the TargetNotConnectedException error too.

replied a year ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
MODERATOR
replied a year ago