How do I troubleshoot errors I receive when I use ECS Exec on my Fargate tasks?


I want to troubleshoot errors that I receive when I use Amazon Elastic Container Service (Amazon ECS) Exec on my AWS Fargate tasks.

Short description

When you use Amazon ECS Exec on Fargate tasks, you might receive the following error messages:

  • An error occurred (InvalidParameterException) when calling the ExecuteCommand operation: The execute command failed because execute command was not enabled when the task was run or the execute command agent isn't running. Wait and try again or run a new task with execute command enabled and try again.
  • An error occurred (TargetNotConnectedException) when calling the ExecuteCommand operation: The execute command failed due to an internal error. Try again later.

Resolution

To resolve the preceding common errors, it's a best practice to use AWS CloudShell. CloudShell comes preinstalled with the AWS Command Line Interface (AWS CLI) and the Session Manager plugin. Session Manager is a capability of AWS Systems Manager.

InvalidParameterException error

If the ExecuteCommand option for your Fargate task is turned off, then you receive the InvalidParameterException error.

To resolve this issue, complete the following steps:

  1. Run the describe-tasks command to check whether the enableExecuteCommand parameter is set to true or false:

    aws ecs describe-tasks --cluster example-cluster-name --tasks example-task-id | grep enableExecuteCommand
  2. If the enableExecuteCommand parameter is false, then run the update-service command to update the parameter to true:

    aws ecs update-service --cluster example-cluster-name --service example-service --region example-region --enable-execute-command --force-new-deployment

    Note: The force-new-deployment option creates a new deployment that starts new tasks and stops old tasks based on the service's deployment configuration. For more information, see Deploy Amazon ECS services by replacing tasks.
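The check in step 1 can also be scripted. The following is a minimal Python sketch, assuming that you saved the describe-tasks output as JSON. The function name exec_status and the sample response are hypothetical. The field names enableExecuteCommand, managedAgents, and lastStatus are taken from the describe-tasks response shape.

```python
import json


def exec_status(response):
    """Summarize ECS Exec readiness for each task in a describe-tasks response."""
    summary = []
    for task in response.get("tasks", []):
        agents = {}
        for container in task.get("containers", []):
            for agent in container.get("managedAgents", []):
                # ECS Exec relies on the managed agent named ExecuteCommandAgent.
                if agent.get("name") == "ExecuteCommandAgent":
                    agents[container.get("name")] = agent.get("lastStatus")
        summary.append({
            "taskArn": task.get("taskArn"),
            "enabled": bool(task.get("enableExecuteCommand", False)),
            "agents": agents,
        })
    return summary


# Hypothetical sample shaped like the output of "aws ecs describe-tasks --output json".
sample = json.loads("""
{
  "tasks": [
    {
      "taskArn": "arn:aws:ecs:us-east-1:111122223333:task/example-cluster-name/example-task-id",
      "enableExecuteCommand": true,
      "containers": [
        {
          "name": "nginx",
          "managedAgents": [
            {"name": "ExecuteCommandAgent", "lastStatus": "PENDING"}
          ]
        }
      ]
    }
  ]
}
""")

for entry in exec_status(sample):
    print(entry)
```

If enabled is true but the agent status for a container isn't RUNNING, then the error likely comes from the "execute command agent isn't running" part of the message. In that case, wait and try again before you redeploy.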

TargetNotConnectedException error

To resolve a TargetNotConnectedException error, take the following actions:

Add the required permissions and confirm that the networking configuration is correct

Complete the following steps:

  1. Use the following policy to add the required Systems Manager permissions for your Amazon ECS task IAM role:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "ssmmessages:CreateControlChannel",
            "ssmmessages:CreateDataChannel",
            "ssmmessages:OpenControlChannel",
            "ssmmessages:OpenDataChannel"
          ],
          "Resource": "*"
        }
      ]
    }
  2. If you use interface Amazon Virtual Private Cloud (Amazon VPC) endpoints with Amazon ECS, then create the following endpoints for Session Manager:
    ec2messages.region.amazonaws.com
    ssm.region.amazonaws.com
    ssmmessages.region.amazonaws.com

  3. To confirm that your AWS CLI environment and Amazon ECS cluster or task are ready for Amazon ECS Exec, run the check-ecs-exec.sh script. Make sure that you meet the prerequisites. For more information, see Amazon ECS Exec Checker on the GitHub website.
    Note: The output of the check-ecs-exec.sh script shows what you must resolve before you use ECS Exec.
    Example output:

    Prerequisites for check-ecs-exec.sh v0.7
    -------------------------------------------------------------
      jq      | OK (/usr/bin/jq)
      AWS CLI | OK (/usr/local/bin/aws)
    
    -------------------------------------------------------------
    Prerequisites for the AWS CLI to use ECS Exec
    -------------------------------------------------------------
      AWS CLI Version        | OK (aws-cli/2.11.0 Python/3.11.2 Linux/4.14.255-291-231.527.amzn2.x86_64 exec-env/CloudShell exe/x86_64.amzn.2 prompt/off)
      Session Manager Plugin | OK (1.2.398.0)
    
    -------------------------------------------------------------
    Checks on ECS task and other resources
    -------------------------------------------------------------
    Region : us-east-1
    Cluster: Fargate-Testing
    Task   : ca27e41ea3f54fd1804ca00feffa178d
    -------------------------------------------------------------
      Cluster Configuration  | Audit Logging Not Configured
      Can I ExecuteCommand?  | arn:aws:iam::12345678:role/Admin
         ecs:ExecuteCommand: allowed
         ssm:StartSession denied?: allowed
      Task Status            | RUNNING
      Launch Type            | Fargate
      Platform Version       | 1.4.0
      Exec Enabled for Task  | NO
      Container-Level Checks | 
        ----------
          Managed Agent Status - SKIPPED
        ----------
        ----------
          Init Process Enabled (Exec-check:2)
        ----------
             1. Disabled - "nginx"
        ----------
          Read-Only Root Filesystem (Exec-check:2)
        ----------
             1. Disabled - "nginx"
      Task Role Permissions  | arn:aws:iam::12345678:role/L3-session
         ssmmessages:CreateControlChannel: implicitDeny
         ssmmessages:CreateDataChannel: implicitDeny
         ssmmessages:OpenControlChannel: implicitDeny
         ssmmessages:OpenDataChannel: implicitDeny
      VPC Endpoints          | SKIPPED (vpc-abcd - No additional VPC endpoints required)
      Environment Variables  | (Exec-check:2)
           1. container "nginx"
           - AWS_ACCESS_KEY: not defined
           - AWS_ACCESS_KEY_ID: not defined
           - AWS_SECRET_ACCESS_KEY: not defined

    The preceding output shows that ECS Exec is turned off for the task and that the task role doesn't have the required Systems Manager permissions.

  4. Check if you configured IAM user credentials at the container level, such as an access key or secret access key specification. When you configure IAM user credentials at the container level, you override the permissions at the task level and this causes an error.
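Two of the preceding checks, the task role permissions and the container-level credentials, can be approximated offline. The following is a rough Python sketch, assuming that you have the task role policy document and the container definition as parsed JSON. The function names and sample inputs are hypothetical. The policy check only looks at Allow statements and action names. It doesn't evaluate Deny statements, conditions, or resource restrictions, so it isn't a substitute for the check-ecs-exec.sh script.

```python
from fnmatch import fnmatchcase

# Actions from the policy in step 1.
REQUIRED_ACTIONS = (
    "ssmmessages:CreateControlChannel",
    "ssmmessages:CreateDataChannel",
    "ssmmessages:OpenControlChannel",
    "ssmmessages:OpenDataChannel",
)

# Variable names that the example script output reports on.
STATIC_CREDENTIAL_VARS = (
    "AWS_ACCESS_KEY",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
)


def _as_list(value):
    """IAM allows a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def missing_exec_actions(policy):
    """Return the required actions that no Allow statement names or matches."""
    allowed_patterns = []
    for statement in _as_list(policy.get("Statement")):
        if statement.get("Effect") == "Allow":
            allowed_patterns.extend(_as_list(statement.get("Action")))
    return [
        action
        for action in REQUIRED_ACTIONS
        if not any(
            fnmatchcase(action.lower(), str(pattern).lower())
            for pattern in allowed_patterns
        )
    ]


def static_credential_vars(container_definition):
    """Return credential variables set in the container's environment list."""
    names = {item.get("name") for item in container_definition.get("environment", [])}
    return [name for name in STATIC_CREDENTIAL_VARS if name in names]


# Hypothetical inputs for illustration.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["ssmmessages:CreateControlChannel"], "Resource": "*"}
    ],
}
container = {
    "name": "nginx",
    "environment": [{"name": "AWS_ACCESS_KEY_ID", "value": "example"}],
}

print(missing_exec_actions(policy))
print(static_credential_vars(container))
```

An empty list from both functions suggests that these two causes are unlikely. Any names that are returned are the first things to fix.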

Use ECS Exec to get into the container with the correct shell

Different base images include different shells. If you specify a shell that doesn't exist in the image, then you might receive errors. Make sure that you use a shell that your application image includes, such as /bin/sh or /bin/bash.

To use ECS Exec to get into the container, run the execute-command command:

aws ecs execute-command --region example-region --cluster example-cluster --container example-container --task example-task --command "example_shell" --interactive
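If you aren't sure which shell the image includes, you can prepare the same command for more than one candidate. The following Python sketch only builds the argument list shown above and prints it. It doesn't call AWS. The function name build_exec_command, the default shell, and the list of candidate shells are assumptions for illustration.

```python
import shlex


def build_exec_command(region, cluster, task, container, shell="/bin/sh"):
    """Build the argument list for the execute-command call shown above."""
    return [
        "aws", "ecs", "execute-command",
        "--region", region,
        "--cluster", cluster,
        "--container", container,
        "--task", task,
        "--command", shell,
        "--interactive",
    ]


# Candidate shells are an assumption; use the ones that your image provides.
for candidate in ("/bin/bash", "/bin/sh"):
    args = build_exec_command(
        "example-region", "example-cluster", "example-task", "example-container",
        shell=candidate,
    )
    # shlex.join quotes each argument so that the line is safe to paste into a shell.
    print(shlex.join(args))
```

You can pass the returned list to subprocess.run without shell=True, which avoids quoting problems when the command value contains spaces.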

Generate logs for ECS Exec to identify issues

To determine why ECS Exec isn't working within your Fargate task, generate AWS Systems Manager Agent (SSM Agent) logs. Set the following command in the environment section of the container definition so that the container waits two minutes and then prints the SSM Agent log:

Console:

/bin/bash,-c,sleep 2m && cat /var/log/amazon/ssm/amazon-ssm-agent.log

JSON:

"/bin/bash","-c","sleep 2m && cat /var/log/amazon/ssm/amazon-ssm-agent.log"

If you use the awslogs log driver, then the preceding commands generate SSM Agent logs and transfer them to the Amazon CloudWatch log group. If you use other log drivers or logging endpoints, then the SSM Agent logs transfer to those locations.

JSON Example:

      "entryPoint": [],
      "portMappings": [],
      "command": [
        "/bin/bash",
        "-c",
        "sleep 2m && cat /var/log/amazon/ssm/amazon-ssm-agent.log"
      ],

Note: Different applications have different shells and editors. Review and modify command parameters for your application's requirements.
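If you manage task definitions as JSON files, then you can apply the logging command programmatically before you register a new revision. The following is a minimal Python sketch, assuming a task definition that's already parsed into a dictionary with a containerDefinitions list. The function name with_ssm_log_command and the sample task definition are hypothetical.

```python
import copy

# The command from the JSON form above.
SSM_LOG_COMMAND = [
    "/bin/bash",
    "-c",
    "sleep 2m && cat /var/log/amazon/ssm/amazon-ssm-agent.log",
]


def with_ssm_log_command(task_definition, container_name):
    """Return a copy of the task definition with the log command set on one container."""
    updated = copy.deepcopy(task_definition)
    for container in updated.get("containerDefinitions", []):
        if container.get("name") == container_name:
            container["command"] = list(SSM_LOG_COMMAND)
            return updated
    raise ValueError(f"container {container_name!r} not found in task definition")


# Hypothetical task definition fragment for illustration.
task_definition = {
    "family": "example-family",
    "containerDefinitions": [
        {"name": "nginx", "image": "nginx", "entryPoint": [], "portMappings": []}
    ],
}

debug_definition = with_ssm_log_command(task_definition, "nginx")
print(debug_definition["containerDefinitions"][0]["command"])
```

The function copies the input so that the original definition stays unchanged. This makes it easier to revert to the application's normal command after you collect the logs.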

Related information

Using ECS Exec

AWS OFFICIAL
Updated a month ago
2 Comments

TargetNotConnectedException can also occur in an additional context.

If you add the following folders as bind mounts so that you can use ECS Exec when readonlyRootFilesystem is enabled:

/managed-agents
/var/lib/amazon/ssm
/var/log/amazon/ssm

then, if the task has multiple containers, make sure that the containers don't share the same host volumes for these mounts. Sharing volumes between containers for this purpose causes the managed agent to fail, which eventually results in the TargetNotConnectedException error too.

replied 3 months ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
MODERATOR
replied 3 months ago