How do I troubleshoot Service Connect issues in Amazon ECS?

My Amazon Elastic Container Service (Amazon ECS) services can't connect to another service.

Resolution

Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.

Misconfiguration or networking issues cause service-to-service communication issues. To resolve these issues, take the following troubleshooting actions.

Check your Service Connect configuration

Verify that you activated Service Connect

Prerequisites: To use Amazon ECS Service Connect, make sure that you meet the Service Connect requirements.

To check whether you activated Service Connect for your Amazon ECS services, run the following describe-services AWS CLI command:

aws ecs describe-services --cluster cluster-name --services service-name

Note: Replace cluster-name with your cluster name and service-name with your service name.

Example output:

"serviceConnectConfiguration": {
    "enabled": true,

Note: If the Amazon ECS service must allow network traffic from other services, then make sure to set the Service Connect configuration as a client-server service. To check for this configuration, review the command output for "services" under "serviceConnectConfiguration".
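
To script this check instead of reading the full JSON output, you can use the AWS CLI --query option. The following is a minimal sketch, not an official tool. The function name is_service_connect_enabled is hypothetical, and the sketch assumes that the Service Connect settings appear under the service's PRIMARY deployment in the describe-services output.

```shell
#!/bin/sh
# Hypothetical helper (not part of the AWS CLI): succeeds when the primary
# deployment of a service reports Service Connect as enabled.
is_service_connect_enabled() {
  cluster="$1"
  service="$2"
  enabled=$(aws ecs describe-services \
    --cluster "$cluster" \
    --services "$service" \
    --query "services[0].deployments[?status=='PRIMARY'] | [0].serviceConnectConfiguration.enabled" \
    --output text)
  # With --output text, a boolean prints as True or False and a missing value as None.
  case "$enabled" in
    True|true) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage (cluster-name and service-name are placeholders):
# if is_service_connect_enabled cluster-name service-name; then
#   echo "Service Connect is enabled"
# else
#   echo "Service Connect is not enabled"
# fi
```

Note: Replace cluster-name with your cluster name and service-name with your service name.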

Check the namespace of your services

Services can reach each other through Service Connect only when you configure them in the same namespace, so make sure that your client service and client-server service use the same namespace. To check the namespace of your services, run the following describe-services command:

aws ecs describe-services --cluster cluster-name --services service-name | grep namespace

Note: Replace cluster-name with your cluster name and service-name with your service name.

In the output, check the value for namespace. To update the namespace for a service, use the Amazon ECS console to update the Service Connect configuration setting. Or, run the following update-service command:

aws ecs update-service --cluster cluster-name --service service-name --service-connect-configuration enabled=true,namespace=namespace-name --force-new-deployment

Note: Replace cluster-name with your cluster name, service-name with your service name, and namespace-name with your namespace.
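
To compare the namespaces of a client service and a client-server service in one step, you can wrap the describe-services call in a small script. The following sketch uses hypothetical function names (service_namespace and same_namespace) and assumes that both services run in the same cluster. It compares the namespace values as plain strings, so it reports a mismatch if one service was configured with a namespace name and the other with a namespace ARN.

```shell
#!/bin/sh
# Hypothetical helper: prints the Service Connect namespace of one service.
service_namespace() {
  aws ecs describe-services --cluster "$1" --services "$2" \
    --query "services[0].deployments[?status=='PRIMARY'] | [0].serviceConnectConfiguration.namespace" \
    --output text
}

# Hypothetical helper: succeeds when both services report the same namespace.
same_namespace() {
  cluster="$1"
  client_ns=$(service_namespace "$cluster" "$2")
  server_ns=$(service_namespace "$cluster" "$3")
  echo "client: $client_ns"
  echo "client-server: $server_ns"
  # None means that the service has no Service Connect namespace configured.
  [ -n "$client_ns" ] && [ "$client_ns" != "None" ] && [ "$client_ns" = "$server_ns" ]
}

# Example usage (all three names are placeholders):
# same_namespace cluster-name client-service-name server-service-name \
#   || echo "The services are not in the same namespace"
```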

If the client service can't resolve the client-server service DNS, then you receive one of the following error messages:

  • "server can't find DNS: NXDOMAIN"
  • "server can't find example.core.staging.local: NXDOMAIN"

To resolve this issue, run the following get-namespace command to verify that you registered the namespace in AWS Cloud Map:

aws servicediscovery get-namespace --id namespace

Note: Replace namespace with your namespace ID.

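Check the command's output to confirm that the namespace exists in your AWS account and AWS Region. If the command returns a NamespaceNotFound error, then the namespace isn't registered in AWS Cloud Map.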

To confirm that your tasks are registered as instances in the namespace, complete the following steps:

  1. To get the namespace ID, run the following list-namespaces command:

    aws servicediscovery list-namespaces

  2. Use the AWS Cloud Map console or the AWS CLI to list the services in the namespace.

  3. To view the service's registered instances, run the following list-instances command:

    aws servicediscovery list-instances --service-id srv-serviceID

    Note: Replace serviceID with your service ID.

  4. If the instance isn't registered, then run the following update-service command to redeploy the tasks:

    aws ecs update-service --cluster cluster-name --service service-name --region region-name --force-new-deployment

    Note: Replace cluster-name with your cluster name, service-name with your service name, and region-name with your Region.
    Or, use the Amazon ECS console to update the service and choose Force new deployment.
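
Steps 2 and 3 above can be combined into one script. The following sketch is only an illustration. The function name list_namespace_instances is hypothetical, and the namespace ID that you pass in is the one that list-namespaces returns in step 1.

```shell
#!/bin/sh
# Hypothetical helper: for every AWS Cloud Map service in a namespace,
# print the service ID followed by the IDs of its registered instances.
list_namespace_instances() {
  namespace_id="$1"
  # List the IDs of the services that belong to the namespace.
  service_ids=$(aws servicediscovery list-services \
    --filters "Name=NAMESPACE_ID,Values=$namespace_id,Condition=EQ" \
    --query 'Services[].Id' \
    --output text)
  # Text output separates the IDs with whitespace, so word splitting is intended here.
  for service_id in $service_ids; do
    echo "Service $service_id:"
    aws servicediscovery list-instances \
      --service-id "$service_id" \
      --query 'Instances[].Id' \
      --output text
  done
}

# Example usage (ns-namespaceID is a placeholder):
# list_namespace_instances ns-namespaceID
```

A service that prints no instance IDs has no registered tasks, which points to the redeployment in step 4.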

Check your port mapping names

If you don't set your port mapping name in the task definition, then you receive the following error message:

"No port aliases found. Select a different task definition family and revision that has port mappings configured to use client and server mode."

To resolve this issue, update the task definition, and add a value for the name parameter under portMappings.

Example task definition:

"portMappings": [
    {
        "name": "portmappingnameexample",
        "containerPort": 3000,
        "hostPort": 3000,
        "protocol": "tcp"
    }
]
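
To find port mappings that are missing a name before you update the service, you can inspect the registered task definition. The following sketch is one possible approach. The function name unnamed_ports is hypothetical, and task-definition-name is a placeholder for your task definition family or family:revision.

```shell
#!/bin/sh
# Hypothetical helper: prints the container port of every port mapping
# in a task definition that has no name set.
unnamed_ports() {
  aws ecs describe-task-definition \
    --task-definition "$1" \
    --query 'taskDefinition.containerDefinitions[].portMappings[].[containerPort, name]' \
    --output text |
  # Each output row is "<containerPort> <name>"; a missing name prints as None.
  awk '$2 == "None" || $2 == "" { print $1 }'
}

# Example usage:
# unnamed_ports task-definition-name
```

If the function prints nothing, then every port mapping in the task definition has a name.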

Check your network ACL and security group settings

Make sure that your network access control list (network ACL) and security groups allow traffic between your services' tasks on the ports that the services use.

Check the connectivity between your service tasks

Complete the following steps:

  1. If you run tasks on AWS Fargate, then activate ECS Exec. If you run tasks on Amazon Elastic Compute Cloud (Amazon EC2), then proceed to step 3.

  2. Run the following execute-command command to remotely connect to the container:

    aws ecs execute-command --cluster cluster-name \
    --task task-id \
    --container container-name \
    --interactive \
    --command "/bin/sh"

    Note: Replace cluster-name with your cluster name, task-id with your task ID, and container-name with your container name.

  3. To make sure that your connection uses the Service Connect proxy, run the following command:

    curl -I http://IPaddress:portnumber/healthcheck

    Note: Replace IPaddress with the task's private IP address, portnumber with the container port, and healthcheck with the container health check path.

    Check the command's output for the server: envoy header to confirm that the connection uses the proxy. If your connection doesn't use the proxy, then make sure that your Service Connect configuration is correct.

  4. To open the /etc/hosts file, run the following command:

    cat /etc/hosts

    Check the command's output to make sure that you can see the endpoints of other services. If you don't see the service endpoints, then make sure that your Service Connect configuration is correct.

  5. If you change the Service Connect configuration, then run the following update-service command to redeploy the tasks:

    aws ecs update-service --cluster cluster-name --service service-name --region region-name --force-new-deployment

    Note: Replace cluster-name with your cluster name, service-name with your service name, and region-name with your Region.
    Or, use the Amazon ECS console to update the service and choose Force new deployment.
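
The header check in step 3 can also be scripted from inside the container. The following sketch assumes that curl, tr, and grep are available in the container image. The function name uses_envoy is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the response headers for the given URL
# include "server: envoy", which indicates that the proxy handled the request.
uses_envoy() {
  # -s hides progress output and -I requests the headers only.
  # tr removes the carriage returns that end HTTP header lines.
  curl -sI "$1" | tr -d '\r' | grep -qi '^server: *envoy'
}

# Example usage (IPaddress, portnumber, and healthcheck are placeholders):
# if uses_envoy "http://IPaddress:portnumber/healthcheck"; then
#   echo "The connection uses the Service Connect proxy"
# else
#   echo "No envoy header found"
# fi
```

The function also fails when the request can't connect, so a failure doesn't by itself show that the proxy is bypassed.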

Redeploy your client service tasks

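If you receive the "Could not resolve host" error message, then your tasks can't resolve the service endpoints. You might also receive the "ping: bad address 'DNS'" error message.

To resolve these issues, redeploy the existing client service tasks. Tasks receive Service Connect endpoints only when they start, so tasks that were already running before you deployed the client-server service can't resolve the new endpoint.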

Review your application logs

Check your application logs for connectivity or runtime errors. Amazon ECS exports logs to different destinations based on your log driver. If you use the awslogs driver, then Amazon ECS exports the logs to Amazon CloudWatch.
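
If you use the awslogs driver, then you can search the log group from the AWS CLI. The following sketch is only an illustration. The function name recent_errors is hypothetical, the one-hour window is arbitrary, and the sketch assumes that your application writes the term ERROR in its error messages.

```shell
#!/bin/sh
# Hypothetical helper: prints log messages from the last hour that contain
# the term ERROR in the given CloudWatch Logs log group.
recent_errors() {
  log_group="$1"
  # filter-log-events expects the start time in milliseconds since the epoch.
  start_ms=$(( ($(date +%s) - 3600) * 1000 ))
  aws logs filter-log-events \
    --log-group-name "$log_group" \
    --start-time "$start_ms" \
    --filter-pattern "ERROR" \
    --query 'events[].message' \
    --output text
}

# Example usage (log-group-name is a placeholder):
# recent_errors log-group-name
```

Note: Replace log-group-name with the log group that's set in the awslogs-group option of your task definition.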

Related information

Amazon ECS Service Connect components

AWS OFFICIAL
Updated a month ago