How do I get my Amazon ECS tasks that use the Amazon EC2 launch type to pass the Application Load Balancer health check?

I want to troubleshoot and resolve issues with Application Load Balancer health checks for Amazon Elastic Container Service (Amazon ECS) tasks that run on my Amazon Elastic Compute Cloud (Amazon EC2) instances.

Short description

When your Amazon ECS task fails the load balancer health check, you receive one of the following errors from your Amazon ECS service event message:

  • "(service AWS-service) (port 8080) is unhealthy in (target-group arn:aws:elasticloadbalancing:us-east-1:111111111111:targetgroup/aws-targetgroup/123456789) due to (reason Health checks failed with these codes: [502 or 504]) or (request timeout)"
  • "(service AWS-Service) (port 8080) is unhealthy in target-group tf-20190411170 due to (reason Health checks failed)"
  • "(service AWS-Service) (instance i-1234567890abcdefg) (port 443) is unhealthy in (target-group arn:aws:elasticloadbalancing:us-east-1:111111111111:targetgroup/aws-targetgroup/123456789) due to (reason Health checks failed)"

You might receive the following error from your Amazon ECS task console:

"Task failed ELB health checks in (target-group arn:aws:elasticloadbalancing:us-east-1:111111111111:targetgroup/aws-targetgroup/123456789)"

For failed container health checks, see How do I troubleshoot container health check failures for Amazon ECS tasks?

To determine why your Amazon ECS task stopped, see Viewing Amazon ECS stopped task errors and Why is my Amazon ECS task stopped?

Resolution

Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.

Configure different security groups

It's a best practice to configure different security groups to allow all traffic between your load balancers and your container instances or task elastic network interfaces. You can also configure your container instances to accept traffic on the port that's specified for the task.

In your configuration, check the following settings:

  • The security group that's associated with your load balancer allows outbound traffic to your container instances or task network interface on the registered port. Also, allow outbound traffic to your container instances on the health check port.
  • The security group that's associated with your container instances or task network interface allows inbound traffic on the task host port range from the security group that's associated with your load balancer.
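The inbound rule can be added from the AWS CLI. The following is a minimal sketch, not a definitive configuration: the function name allow_alb_to_instances is hypothetical, the security group IDs are placeholders, and the default port range 32768-65535 is an assumption that's meant to cover dynamic host port mapping. Narrow the range to the ports that your tasks actually use.

```shell
# Sketch: allow the load balancer's security group to reach the container
# instances (or task network interfaces) on the task host port range.
# The function name and the default port range are assumptions.
allow_alb_to_instances() {
  local alb_sg="$1"                      # security group of the load balancer
  local instance_sg="$2"                 # security group of the instances or tasks
  local port_range="${3:-32768-65535}"   # assumed dynamic host port range

  aws ec2 authorize-security-group-ingress \
    --group-id "$instance_sg" \
    --protocol tcp \
    --port "$port_range" \
    --source-group "$alb_sg"
}

# Example with placeholder IDs:
# allow_alb_to_instances sg-0123456789abcdef0 sg-0fedcba9876543210 8080
```

Note: Referencing the load balancer's security group as the source, instead of a CIDR range, keeps the rule valid when the load balancer nodes change IP addresses.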

Turn on the Availability Zone for your load balancer

When you configure an Availability Zone for your load balancer, Elastic Load Balancing creates a load balancer node in the Availability Zone. If you register targets in the Availability Zone but don't turn on the Availability Zone, then the registered targets don't receive traffic. For more information, see Availability Zones and load balancer nodes.

To determine the Availability Zones that your load balancer is configured for, complete the following steps:

  1. Open the Amazon EC2 console.
  2. In the navigation pane, under Load Balancing, choose Load balancers.
  3. Select the load balancer that you're using for your Amazon ECS service.
  4. On the Description tab, you can view the Availability Zones.

Or, run the describe-load-balancers AWS CLI command:

aws elbv2 describe-load-balancers --load-balancer-arns EXAMPLE-ALB-ARN --query 'LoadBalancers[*].AvailabilityZones[].{Subnet:SubnetId}'

Note: Replace EXAMPLE-ALB-ARN with your Application Load Balancer's ARN.

To determine the Availability Zones that your container instances are configured for, complete the following steps:

  1. Open the Amazon EC2 console.
  2. In the navigation pane, under Auto Scaling, choose Auto Scaling groups.
  3. Select the container instance Auto Scaling group that's associated with your cluster.
  4. On the Details tab, under Network, verify that the Availability Zones match the Availability Zones for your load balancer.

Or, run the describe-auto-scaling-groups AWS CLI command:

aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names EXAMPLE-ASG-NAME --query 'AutoScalingGroups[*].{Subnets:VPCZoneIdentifier}' --output text

Note: Replace EXAMPLE-ASG-NAME with your Auto Scaling group's name.
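To compare the two sides by Availability Zone name instead of by subnet ID, you can combine both commands. The following sketch assumes a hypothetical helper named zones_missing_from_alb. It prints each Availability Zone that the Auto Scaling group uses but that isn't turned on for the load balancer. Empty output suggests that the zones match.

```shell
# Sketch: list Availability Zones used by the Auto Scaling group that are
# not turned on for the load balancer. The helper name is an assumption.
zones_missing_from_alb() {
  local alb_arn="$1" asg_name="$2" alb_zones asg_zones zone

  # Text output is tab-separated, so convert tabs and newlines to spaces.
  alb_zones=$(aws elbv2 describe-load-balancers \
    --load-balancer-arns "$alb_arn" \
    --query 'LoadBalancers[0].AvailabilityZones[].ZoneName' \
    --output text | tr '\t\n' '  ')
  asg_zones=$(aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$asg_name" \
    --query 'AutoScalingGroups[0].AvailabilityZones[]' \
    --output text | tr '\t\n' '  ')

  # Print every Auto Scaling group zone that the load balancer lacks.
  for zone in $asg_zones; do
    case " $alb_zones " in
      *" $zone "*) ;;
      *) printf '%s\n' "$zone" ;;
    esac
  done
}

# Example with placeholder values:
# zones_missing_from_alb EXAMPLE-ALB-ARN EXAMPLE-ASG-NAME
```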

To modify your cluster's Availability Zones, complete the following steps:

  1. Open the AWS CloudFormation console.
  2. Select your cluster's CloudFormation stack.
  3. Update the stack.
  4. On the Specify stack details page, update your Subnet IDs configuration.

To determine the Availability Zones that your task is configured for, complete the following steps:

  1. Open the Amazon ECS console.

  2. In the navigation pane, choose Clusters, and then select the cluster that contains your service.

  3. On the Services tab of your cluster's page, in the Service Name column, select the service that you want to check.

  4. Choose the Configuration and Networking tab.

  5. Under Network configuration, view the configured subnets.

  6. Open the Amazon Virtual Private Cloud (Amazon VPC) console to view additional information that isn't available in the ECS console.

  7. Run the describe-services command to verify that your subnets' Availability Zones match your load balancer's Availability Zones:

    aws ecs describe-services --cluster EXAMPLE-CLUSTER-NAME --services EXAMPLE-SERVICE-NAME --query 'services[*].deployments[].networkConfiguration[].awsvpcConfiguration.{Subnets:subnets}'

    Note: Replace EXAMPLE-CLUSTER-NAME with your cluster's name and EXAMPLE-SERVICE-NAME with your service's name.

You can't use the Amazon ECS console to change the subnet configuration of an Amazon ECS service. Instead, run the AWS CLI update-service command.
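The following is a minimal sketch of an update-service call that changes the subnets. It assumes that the service uses the awsvpc network mode, because only awsvpc tasks have a service-level network configuration. The function name update_service_subnets is hypothetical, and the subnet and security group IDs are placeholders.

```shell
# Sketch: replace the subnets and security groups of an awsvpc service.
# Pass the subnets and the security groups as comma-separated lists.
# The function name is an assumption.
update_service_subnets() {
  local cluster="$1" service="$2" subnets="$3" security_groups="$4"

  aws ecs update-service \
    --cluster "$cluster" \
    --service "$service" \
    --network-configuration \
    "awsvpcConfiguration={subnets=[$subnets],securityGroups=[$security_groups]}"
}

# Example with placeholder values:
# update_service_subnets EXAMPLE-CLUSTER-NAME EXAMPLE-SERVICE-NAME \
#   subnet-0aaa1111,subnet-0bbb2222 sg-0123456789abcdef0
```

Note: The network configuration that you pass replaces the existing one, so include every subnet and security group that the service must keep.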

Configure your Network ACL to allow traffic between subnets

The subnets for your load balancer and your container instance or task network interface might be different.

To allow traffic between the subnets, use the following network access control list (network ACL) configurations:

  • The network ACL that's associated with your load balancer's subnets must allow inbound traffic on the listener port and on the ephemeral ports (1024-65535).
  • The same network ACL must allow outbound traffic on the health check port and on the ephemeral ports.
  • The network ACL that's associated with the subnets for your container instance, or for your task network interface in the awsvpc network mode, must allow inbound traffic on the health check port.
  • That network ACL must also allow outbound traffic on the ephemeral ports.

For more information about network ACLs, see Control subnet traffic with network access control lists.
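To review the rules that apply to a subnet, you can list the entries of its network ACL. The following sketch assumes a hypothetical helper named show_subnet_nacl_entries and a placeholder subnet ID. Run it once for a load balancer subnet and once for a container instance or task subnet, and then compare the inbound (Egress is False) and outbound (Egress is True) entries against the preceding list.

```shell
# Sketch: show the network ACL entries that apply to one subnet.
# The helper name is an assumption.
show_subnet_nacl_entries() {
  local subnet_id="$1"

  aws ec2 describe-network-acls \
    --filters "Name=association.subnet-id,Values=$subnet_id" \
    --query 'NetworkAcls[].Entries[].{Rule:RuleNumber,Egress:Egress,Action:RuleAction,Protocol:Protocol,Cidr:CidrBlock,From:PortRange.From,To:PortRange.To}' \
    --output table
}

# Example with a placeholder subnet ID:
# show_subnet_nacl_entries subnet-0123456789abcdef0
```

Note: Network ACLs are stateless, so return traffic needs its own allow rule. A rule with protocol -1 applies to all protocols and ports.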

Check the health check settings of your target group

To check that you correctly configured the health check settings for your target group, complete the following steps:

  1. Open the Amazon EC2 console.
  2. In the navigation pane, under Load Balancing, choose Target groups.
  3. Select your target group.
    Important: Use a new target group. Because Amazon ECS automatically registers and deregisters the ECS task with the target group, don't manually add targets to the target group.
  4. On the Health checks tab, take the following actions:
    Check that you correctly configured the Port and Path fields.
    Note: If you don't correctly configure the fields, then Amazon ECS might ask your load balancer to deregister the task because of failing health checks.
    For Port, choose traffic port.
    Note: If you choose Override, then confirm that the port matches the task host port.
    For Timeout, make sure that the response timeout value is correct.
    Note: If the value is lower than the amount of time required for a response, then the health check fails.
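You can also read the same settings from the AWS CLI. The following sketch assumes two hypothetical helpers: show_health_check_settings prints the health check configuration of a target group, and show_target_health prints the current state and failure reason of each registered target.

```shell
# Sketch: print the health check settings of a target group.
# The helper names are assumptions.
show_health_check_settings() {
  local target_group_arn="$1"

  aws elbv2 describe-target-groups \
    --target-group-arns "$target_group_arn" \
    --query 'TargetGroups[0].{Protocol:HealthCheckProtocol,Port:HealthCheckPort,Path:HealthCheckPath,IntervalSeconds:HealthCheckIntervalSeconds,TimeoutSeconds:HealthCheckTimeoutSeconds,HealthyThreshold:HealthyThresholdCount,UnhealthyThreshold:UnhealthyThresholdCount,Matcher:Matcher.HttpCode}' \
    --output table
}

# Sketch: print the state of each registered target and why it's unhealthy.
show_target_health() {
  local target_group_arn="$1"

  aws elbv2 describe-target-health \
    --target-group-arn "$target_group_arn" \
    --query 'TargetHealthDescriptions[].{Target:Target.Id,Port:Target.Port,State:TargetHealth.State,Reason:TargetHealth.Reason,Description:TargetHealth.Description}' \
    --output table
}

# Example with a placeholder ARN:
# show_health_check_settings EXAMPLE-TARGET-GROUP-ARN
# show_target_health EXAMPLE-TARGET-GROUP-ARN
```

Note: A Port value of traffic-port means that health checks use the port that each target is registered on.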

Check the status and configuration of the application in your ECS container

Confirm that the application responds to your load balancer health check

Take the following actions:

  • Check that you correctly configured the ping port and the health check path for your target group.
  • Monitor the CPU and memory utilization metrics for the Amazon ECS service. If your application is slow or times out, then increase the task resource quotas, scale out the service, optimize your application, or use a larger instance type.
  • Set a minimum health check grace period so that the service scheduler ignores the health checks for a predefined time period after you initiate a task.
    Note: Your Amazon ECS task might require a longer health check grace period to register with the Application Load Balancer.
  • Check your application logs for application errors. For more information, see Send Amazon ECS logs to CloudWatch.
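The health check grace period can be set with the update-service command. The following is a minimal sketch. The function name set_health_check_grace_period is hypothetical, and the 120 seconds in the example is an assumption. Choose a value that's longer than the time that your application needs to start.

```shell
# Sketch: set the health check grace period of a service, in seconds.
# The function name is an assumption.
set_health_check_grace_period() {
  local cluster="$1" service="$2" seconds="$3"

  # Reject anything that isn't a non-negative whole number.
  case "$seconds" in
    ''|*[!0-9]*)
      echo "grace period must be a non-negative integer" >&2
      return 2
      ;;
  esac

  aws ecs update-service \
    --cluster "$cluster" \
    --service "$service" \
    --health-check-grace-period-seconds "$seconds"
}

# Example with placeholder values:
# set_health_check_grace_period EXAMPLE-CLUSTER-NAME EXAMPLE-SERVICE-NAME 120
```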

Confirm that the application returns the correct status code

When the load balancer sends an HTTP GET request to the health check path, the application in your ECS container must return the status code that the target group expects. By default, the expected status code is 200 OK. If you receive a non-HTTP error message, then your application isn't listening for HTTP traffic. If you receive an HTTP status code that's different from what you specified in the Matcher setting, then your application is listening for HTTP traffic but isn't returning a status code for a healthy target.

Note: If you use an Application Load Balancer, then you can update the Matcher setting to a status code other than 200. For more information, see Health checks for Application Load Balancer target groups.

To confirm that the application in your ECS container returns the correct status code, complete the following steps:

  1. Use SSH, Session Manager, a capability of AWS Systems Manager, or EC2 Instance Connect to connect to your container instance.

  2. (Optional) Run the following command for your operating system (OS) to install curl.
    Amazon Linux and other RPM-based distributions:

    sudo yum -y install curl

    Debian-based systems, such as Ubuntu:

    sudo apt-get install curl
  3. Run the following command to get the container ID:

    docker ps

    Note: The port for the local listener is displayed in the command's output under PORTS at the end of the sequence.

  4. If you use the BRIDGE network mode, then run the docker inspect command to get the container's IP address:

    IPADDR=$(docker inspect --format='{{.NetworkSettings.IPAddress}}' 112233445566)

    Note: The container's IP address is saved in IPADDR. Replace 112233445566 with the Container ID number from the docker ps command's output. If you use awsvpc, then use the task IP address that's assigned to the task network interface. If you use the HOST network mode, then use the IP address of the host container instance that the task is exposed through.

  5. To get the status code, run a curl command that includes IPADDR and the local listener's port:

    curl -I http://${IPADDR}:8080/health

    Note: In the preceding example command, replace 8080 with your listener's port and /health with your health check path.
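To compare the returned status code with the Matcher setting in one step, you can wrap the curl call in a small script. The following is a sketch under stated assumptions: code_matches and check_health_endpoint are hypothetical helper names, the 5-second limit is an arbitrary stand-in for your health check timeout, and the matcher is expected to be a single code, a comma-separated list, or a range such as 200-299.

```shell
# Sketch: return 0 when status code $1 satisfies matcher $2.
# The helper names are assumptions.
code_matches() {
  local code="$1" matcher="$2" part low high old_ifs="$IFS"

  # Split the matcher on commas into the positional parameters.
  IFS=','
  set -- $matcher
  IFS="$old_ifs"

  for part in "$@"; do
    case "$part" in
      *-*)
        low="${part%-*}"
        high="${part#*-}"
        if [ "$code" -ge "$low" ] && [ "$code" -le "$high" ]; then
          return 0
        fi
        ;;
      *)
        if [ "$code" = "$part" ]; then
          return 0
        fi
        ;;
    esac
  done
  return 1
}

# Sketch: request the health check URL and compare the status code.
check_health_endpoint() {
  local url="$1" matcher="${2:-200}" code

  # -w prints only the status code; curl reports 000 when no HTTP
  # response arrives before the time limit.
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")
  if code_matches "$code" "$matcher"; then
    echo "healthy: HTTP $code"
  else
    echo "unhealthy: HTTP $code (expected $matcher)"
    return 1
  fi
}

# Example that reuses IPADDR from the preceding step:
# check_health_endpoint "http://${IPADDR}:8080/health" 200-299
```

Note: A result of HTTP 000 usually means that nothing answered over HTTP on that address and port within the time limit.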

Check the status of your container instance

If you get the following event message from your Amazon ECS service event, then check the status of your container instance:

"(service AWS-Service) (instance i-1234567890abcdefg) (port 443) is unhealthy in (target-group arn:aws:elasticloadbalancing:us-east-1:111111111111:targetgroup/aws-targetgroup/123456789) due to (reason Health checks failed)"

Check the status of your container instance on the Amazon EC2 console. If your instance fails the system status checks, then stop and start your instance.
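The same status checks are available from the AWS CLI. The following sketch assumes a hypothetical helper named show_instance_status and a placeholder instance ID. A SystemStatus value of impaired indicates a failed system status check.

```shell
# Sketch: show the system and instance status checks for one instance.
# The helper name is an assumption.
show_instance_status() {
  local instance_id="$1"

  # --include-all-instances also reports instances that aren't running.
  aws ec2 describe-instance-status \
    --instance-ids "$instance_id" \
    --include-all-instances \
    --query 'InstanceStatuses[].{Instance:InstanceId,State:InstanceState.Name,SystemStatus:SystemStatus.Status,InstanceStatus:InstanceStatus.Status}' \
    --output table
}

# Example with a placeholder instance ID:
# show_instance_status i-0123456789abcdef0
```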

Temporarily activate the Application Load Balancer access logs

Temporarily activate the Application Load Balancer access logs to complete the following checks:

  • Determine whether the Application Load Balancer is sending health checks to the correct path or port and whether the targets are responding correctly.
  • Analyze the HTTP status codes that the targets return to identify application-level issues, such as misconfigured routes or server-side errors.
  • Check that the health checks reached the target to determine whether there are network-related issues.
  • Determine whether the response time exceeds the configured health check timeout.
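Access logs are controlled through load balancer attributes. The following sketch assumes two hypothetical helpers, enable_alb_access_logs and disable_alb_access_logs, and a placeholder Amazon S3 bucket. It also assumes that the bucket already exists in the same Region and has a bucket policy that allows Elastic Load Balancing to write to it.

```shell
# Sketch: turn on access logs for a load balancer.
# The helper names are assumptions.
enable_alb_access_logs() {
  local alb_arn="$1" bucket="$2"

  aws elbv2 modify-load-balancer-attributes \
    --load-balancer-arn "$alb_arn" \
    --attributes \
      Key=access_logs.s3.enabled,Value=true \
      "Key=access_logs.s3.bucket,Value=$bucket"
}

# Sketch: turn access logs off again after you finish troubleshooting.
disable_alb_access_logs() {
  local alb_arn="$1"

  aws elbv2 modify-load-balancer-attributes \
    --load-balancer-arn "$alb_arn" \
    --attributes Key=access_logs.s3.enabled,Value=false
}

# Example with placeholder values:
# enable_alb_access_logs EXAMPLE-ALB-ARN EXAMPLE-LOG-BUCKET
# disable_alb_access_logs EXAMPLE-ALB-ARN
```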

Troubleshoot other causes

If the preceding steps don't resolve your issue, then see Troubleshooting service load balancers in Amazon ECS.

Related information

Create a target group for your Application Load Balancer

Use load balancing to distribute Amazon ECS service traffic

HTTP 504: Gateway timeout

How do I troubleshoot 504 errors that are returned when I use an Application Load Balancer?

AWS OFFICIAL, updated 2 months ago