How do I troubleshoot issues when I use the AWS Load Balancer Controller to create a load balancer?
I want to troubleshoot issues that occur when I try to create a load balancer with the AWS Load Balancer Controller.
Short description
The AWS Load Balancer Controller manages Elastic Load Balancing for an Amazon Elastic Kubernetes Service (Amazon EKS) cluster.
The controller provides the following resources:
- An Application Load Balancer when you create a Kubernetes ingress.
- A Network Load Balancer when you create a Kubernetes service of the LoadBalancer type.
Note: With AWS Load Balancer Controller version 2.3.0 or later, you can create a Network Load Balancer with either the instance or IP target type.
Resolution
Make sure that you meet all prerequisites to install and use AWS Load Balancer Controller
For a list of initial actions to take, see Prerequisites.
Run the following command to verify that you successfully deployed the AWS Load Balancer Controller:
kubectl get deployment -n kube-system aws-load-balancer-controller
Note: It's a best practice to use version 2.4.4 or later.
Example output:
NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
aws-load-balancer-controller   2/2     2            2           84s
If you're using an Application Load Balancer, then check that you have at least two subnets in different Availability Zones. A Network Load Balancer must have at least one subnet. The subnets must have at least eight available IP addresses. For more information, see Create a virtual private cloud (VPC).
You must use the following tag in certain scenarios:
- Key: "kubernetes.io/cluster/cluster-name"
- Value: "shared" or "owned"
Application Load Balancers
You must tag one security group in the following scenarios:
- You use multiple security groups that are attached to a worker node.
- You use the AWS Load Balancer Controller version v2.1.1 or earlier.
Network Load Balancers
If you use the AWS Load Balancer Controller version v2.1.1 or earlier, then you must add tags to the subnets.
If you didn't specify subnet IDs in your service or ingress annotations, then make sure that your subnets have the required tags for subnet auto discovery. For more information, see Subnet auto discovery on the GitHub website.
For private subnets, use the following tags:
- Key: "kubernetes.io/role/internal-elb"
- Value: "1"
For public subnets, use the following tags:
- Key: "kubernetes.io/role/elb"
- Value: "1"
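As an illustrative sketch, you can apply these tags with the AWS CLI. The subnet IDs below are placeholders; replace them with your own:

```shell
# Tag a private subnet for internal load balancer discovery
# (subnet IDs are placeholders; replace with your own)
aws ec2 create-tags \
  --resources subnet-0abc1234def567890 \
  --tags Key=kubernetes.io/role/internal-elb,Value=1

# Tag a public subnet for internet-facing load balancer discovery
aws ec2 create-tags \
  --resources subnet-0fed9876cba543210 \
  --tags Key=kubernetes.io/role/elb,Value=1
```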
Check the annotations of the ingress or service object
Make sure that the annotations on the service object or ingress object are correct.
Note: In the following commands, replace SERVICE-NAME, INGRESS-NAME, and NAMESPACE with your values.
To view the service object, run the following command:
kubectl describe service SERVICE-NAME -n NAMESPACE
To view the ingress object, run the following command:
kubectl describe ingress INGRESS-NAME -n NAMESPACE
To edit the service object, run the following command:
kubectl edit service SERVICE-NAME -n NAMESPACE
To edit the ingress object, run the following command:
kubectl edit ingress INGRESS-NAME -n NAMESPACE
Annotations that you don't set use their default values. For a list of annotations that AWS Load Balancer Controller supports for Application Load Balancers, see Ingress annotations on the GitHub website. For a list of supported annotations for Network Load Balancers, see Service annotations on the GitHub website.
Application Load Balancer
In Kubernetes versions earlier than 1.18, ingresses used the kubernetes.io/ingress.class annotation to reference the ingress controller name. In Kubernetes 1.18 and later, ingresses use the spec.ingressClassName field, which references an IngressClass resource.
For more information, see Deprecated kubernetes.io/ingress.class annotation on the GitHub website.
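For example, on Kubernetes 1.18 or later, an ingress for the AWS Load Balancer Controller might look like the following sketch. The resource names and port are placeholders:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress                # placeholder name
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
spec:
  ingressClassName: alb                # replaces kubernetes.io/ingress.class
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-service  # placeholder backend service
                port:
                  number: 80
```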
Network Load Balancer
Use the following annotations:
- With IP targets, use service.beta.kubernetes.io/aws-load-balancer-type: "external" and service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip".
- With instance targets, use service.beta.kubernetes.io/aws-load-balancer-type: "external" and service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "instance".
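For example, a service that requests a Network Load Balancer with IP targets might look like the following sketch. The service name, selector, and ports are placeholders:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: example-service              # placeholder name
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "external"
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
spec:
  type: LoadBalancer
  selector:
    app: example-app                 # placeholder selector
  ports:
    - port: 80
      targetPort: 8080
```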
Troubleshoot issues when you create the ingress or service type load balancer in Amazon EKS
You receive the "AccessDenied" error
You receive the following error message:
"Failed deploy model due to AccessDenied"
This error occurs because the permissions that the controller requires to create resources changed to include elasticloadbalancing:AddTags. To resolve the issue, attach the latest AWS Identity and Access Management (IAM) policy to the AWSLoadBalancerController role. To get the latest policy, see the IAM policy JSON on the GitHub website.
For more information, see Create IAM role using eksctl.
The load balancer isn't supported in the Availability Zone
If you specify a subnet in a constrained Availability Zone, then you might receive an error message that's similar to the following one:
"Load balancers with type 'network' are not supported in availability-zone-name"
To resolve this issue, specify a subnet in another Availability Zone that isn't constrained. Then, use cross-zone load balancing to distribute traffic to targets in the constrained Availability Zone.
To use different subnets, add the kubernetes.io/role/internal-elb=1 tag for subnets that you use to create an internal Network Load Balancer. For more information, see Tag a Network Load Balancer.
Or, add the following annotation to specify the subnets in the service manifest file:
service.beta.kubernetes.io/aws-load-balancer-subnets: subnet-xxxx, mySubnet
Note: Replace subnet-xxxx with your subnet ID and mySubnet with your subnet name.
You can't use auto-discovery for subnets
If you don't tag your subnets for auto discovery, then you might receive the following error message:
"couldn't auto-discover subnets: unable to resolve at least one subnet"
The AWS Load Balancer Controller automatically discovers network subnets by default. For Application Load Balancers, you must have at least two subnets across different Availability Zones. A Network Load Balancer requires only one subnet.
For automatic discovery to work, you must apply the appropriate tags to your subnets. The controller selects one subnet from each Availability Zone. If an Availability Zone has multiple tagged subnets, then the controller chooses the subnet whose ID comes first in alphabetical order.
For more information about the required subnet tags for private and public subnets, see Subnet auto discovery on the GitHub website.
There's a certificate manager or webhook configuration issue
If your webhook validation fails, then you might receive the following error message:
"Internal error occurred: failed calling webhook "vingress.elbv2.k8s.aws": Post "https://aws-load-balancer-webhook-service.kube-system.svc:443/validate-networking-v1beta1-ingress?timeout=10s": x509: certificate has expired or is not yet valid"
This error occurs when there are issues with the certificates that cert-manager manages for your webhooks.
To resolve this issue, check whether the cert-manager pods are running.
To get the pod status, run the following command:
kubectl describe pod your-pod-name -n your-namespace
To collect logs, run the following command:
kubectl logs your-pod-name -n your-namespace
Note: In the preceding commands, replace your-pod-name with the name of your Pod and your-namespace with the name of your namespace.
The target group binding creation failed
If your target group binding creation fails, then you might receive the following error message:
"Warning FailedDeployModel 11m (x2 over 39m) ingress Failed deploy model due to Internal error occurred: failed calling webhook "vtargetgroupbinding.elbv2.k8s.aws": failed to call webhook: Post "https://aws-load-balancer-webhook-service.kube-system.svc:443/validate-elbv2-k8s-aws-v1beta1-targetgroupbinding?timeout=10s": context deadline exceeded"
This error occurs when security group restrictions block access to the webhook service. The service uses port 9443 by default.
To resolve this issue, modify your node security group. Allow inbound traffic from the control plane security group on port 9443. For more information, see Controller configuration options on the GitHub website.
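As an illustrative sketch, you can add the required rule with the AWS CLI. The security group IDs below are placeholders; replace them with your node and cluster security group IDs:

```shell
# Allow the cluster (control plane) security group to reach the
# webhook port (9443) on the node security group
# (security group IDs are placeholders; replace with your own)
aws ec2 authorize-security-group-ingress \
  --group-id sg-0node1234567890abc \
  --protocol tcp \
  --port 9443 \
  --source-group sg-0cluster123456789
```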
AssumeRoleWithWebIdentity failed for the node role
If your node role can't assume the role that you specified in the service account, then you might receive the following error message:
"WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: c6241a7d-d8a8-452c-bb67-bf1ff9bab0c0"
This error occurs because you incorrectly configured IAM Roles for Service Accounts (IRSA).
To resolve this issue, use the correct role in the service account, and define a trust policy for the role.
For more information, see Why do I get the "WebIdentityErr" error when I use the AWS Load Balancer Controller in Amazon EKS? and How do I troubleshoot an OIDC provider and IRSA in Amazon EKS?
The data is insufficient in the controller pod logs
If you require more debug information than what the default controller pod logs provide, then add the --log-level debug flag to your controller pod configuration.
For more information, see Controller command line flags on the GitHub website.
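For example, the relevant part of the controller's deployment spec might look like the following sketch after you edit it with kubectl edit deployment -n kube-system aws-load-balancer-controller. The cluster name is a placeholder:

```yaml
# Excerpt from the aws-load-balancer-controller deployment spec
spec:
  containers:
    - name: aws-load-balancer-controller
      args:
        - --cluster-name=CLUSTER-NAME  # replace with your cluster name
        - --ingress-class=alb
        - --log-level=debug            # raises verbosity from the default "info"
```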
Review the AWS Load Balancer Controller pod's logs for additional information
To review the AWS Load Balancer Controller logs, run the following command:
kubectl logs -n kube-system deployment.apps/aws-load-balancer-controller
If there's an issue, then you get a "Reconciler error" with a detailed error message that explains why an ingress object or load balancer service failed to create or update.
This failure can occur for the following reasons:
- If the error occurs when the controller tries to make AWS API calls, then there's a permissions or connectivity issue. Review the controller's IAM permissions. Then, make sure that security groups or network access control lists (network ACLs) don't explicitly deny outbound connections.
- If the error occurs in the object's configuration, then you incorrectly configured the ingress or service specification or annotations. Review the annotations for Application Load Balancer or Network Load Balancer on the GitHub website.
If none of the controller pods show logs, then run the following command to confirm that the controller pods are running:
kubectl get deployment -n kube-system aws-load-balancer-controller
Upgrade to a supported controller version
If you use a version of the AWS Load Balancer Controller that's no longer supported, then you can't upgrade to a later version. Instead, you must remove the existing controller and then install the latest version.
Use AWS Load Balancer Controller rather than the legacy cloud provider
Kubernetes includes a legacy cloud provider for AWS that can provide Classic Load Balancers. If you don't install the AWS Load Balancer Controller, then Kubernetes uses the legacy cloud provider. However, it's a best practice to use the AWS Load Balancer Controller.
AWS Load Balancer Controller version 2.5 and later is the default controller for Kubernetes service resources of the LoadBalancer type and creates a Network Load Balancer for each service. These versions also implement a mutating webhook for services that sets the spec.loadBalancerClass field to service.k8s.aws/nlb on new services of type LoadBalancer.
To upgrade to AWS Load Balancer Controller, run the following command:
helm upgrade aws-load-balancer-controller eks/aws-load-balancer-controller -n kube-system --set clusterName=CLUSTER-NAME --set serviceAccount.create=false --set serviceAccount.name=aws-load-balancer-controller --set enableServiceMutatorWebhook=false
Note: Replace CLUSTER-NAME with the name of your cluster.
If you must use the legacy cloud provider, then set the enableServiceMutatorWebhook Helm chart value to false so that the webhook doesn't set the spec.loadBalancerClass field on new services. If the webhook stays on, then the legacy cloud provider can't provide new Classic Load Balancers, and only the existing Classic Load Balancers continue to function.
Verify that you created a Fargate profile for the namespace where the ingress or service object is
When target pods run on AWS Fargate, you must use the IP target type. To verify that you have a Fargate profile for the namespace where the ingress or service object is located, run the following command:
eksctl get fargateprofile --cluster CLUSTER-NAME -o yaml
Note: Replace CLUSTER-NAME with the name of your cluster.
To create a Fargate profile, run the following command:
eksctl create fargateprofile --cluster CLUSTER-NAME --region REGION --name FARGATE-PROFILE-NAME --namespace NAMESPACE
Note: Replace CLUSTER-NAME, REGION, FARGATE-PROFILE-NAME, and NAMESPACE with your values.
Check that you meet the requirements to route traffic
To make sure that you meet all the requirements, see Prerequisites for Application Load Balancers and Prerequisites for Network Load Balancers. For example, if you use an Application Load Balancer in instance traffic mode, then the service object must specify type NodePort or LoadBalancer.
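For example, a backend service for an Application Load Balancer in instance traffic mode might look like the following sketch. The name, selector, and ports are placeholders:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: example-backend    # placeholder name
spec:
  type: NodePort           # required for instance traffic mode
  selector:
    app: example-app       # placeholder selector
  ports:
    - port: 80
      targetPort: 8080
```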
Amazon EKS adds the following rules to the node's security group:
- An inbound rule for client traffic
- An inbound rule for each load balancer subnet in the VPC for each Network Load Balancer that you create for health checks
If the rules that Amazon EKS adds cause your security group to exceed the maximum number of rules, then your load balancer deployment might fail.
