How do I resolve leader election issues with the AWS Load Balancer Controller in Amazon EKS?


My AWS Load Balancer Controller pods are failing, and I receive a "Forbidden" error in Amazon Elastic Kubernetes Service (Amazon EKS). Or, I receive a "problem running manager: error leader election lost" error in the AWS Load Balancer Controller pod logs.

Resolution

You receive the "Forbidden" error when AWS Load Balancer controller pods fail

The Forbidden error occurs when the aws-load-balancer-controller-role cluster role doesn't have permissions on the leases resource in the coordination.k8s.io API group. As a result, new ingress resources don't work as expected, and modifications to existing ingress resources don't take effect.

You receive the following error:

"E0830 08:37:41.717952 1 leaderelection.go:330] error retrieving resource lock kube-system/aws-load-balancer-controller-leader: leases.coordination.k8s.io "aws-load-balancer-controller-leader" is forbidden: User "system:serviceaccount:kube-system:aws-load-balancer-controller" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-system""

The error is continuously written to the controller pod logs, and no other errors exist. A controller pod restart doesn't resolve the issue.
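To confirm that missing permissions are the cause, you can ask the API server what the controller's service account is allowed to do. The following is a minimal sketch. It assumes the service account and namespace shown in the error message, and that your own user is allowed to impersonate service accounts (cluster administrators usually are). The check_lease_access function name is only illustrative.

```shell
# Identity and namespace taken from the error message above.
SA="system:serviceaccount:kube-system:aws-load-balancer-controller"
NS="kube-system"

# Print one "verb: yes|no" line for each verb that the controller needs on leases.
check_lease_access() {
  for verb in get list watch create patch update; do
    # "kubectl auth can-i" exits non-zero when the answer is "no",
    # so the failure is ignored and only the printed answer is kept.
    answer=$(kubectl auth can-i "$verb" leases.coordination.k8s.io --as="$SA" -n "$NS" 2>/dev/null) || true
    echo "$verb: ${answer:-unknown}"
  done
}

check_lease_access
```

If any verb reports "no", the cluster role is missing the leases rule. A result of "unknown" means that kubectl couldn't reach the cluster or returned no answer.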

Note: Update the AWS Load Balancer Controller to version 2.5.x. Also, make sure that the controller pods are in the Running state.

To resolve the issue, complete the following steps:

  1. Add the following entries to aws-load-balancer-controller-role:

    ...
    ...
    - apiGroups: ["coordination.k8s.io"]
      resources: [leases]
      verbs: [get, list, watch, create, patch, update]
  2. Restart the controller pod.

To open the cluster role for editing so that you can add the entries, run the following command:

kubectl edit clusterrole aws-load-balancer-controller-role
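If you prefer not to edit the cluster role interactively, a JSON patch can append the same rule. The following is a sketch, not the only way to apply the change. The add_lease_rule function name is illustrative, and the patch assumes that the cluster role already has a rules list to append to.

```shell
# JSON Patch that appends one rule to the end of the ClusterRole's rules array.
PATCH='[{"op":"add","path":"/rules/-","value":{"apiGroups":["coordination.k8s.io"],"resources":["leases"],"verbs":["get","list","watch","create","patch","update"]}}]'

# Wrapped in a function so that the change is applied only when you call it.
add_lease_rule() {
  kubectl patch clusterrole aws-load-balancer-controller-role --type=json -p "$PATCH"
}
```

Run add_lease_rule one time. Running it again appends a duplicate rule, which is harmless but redundant.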

To restart the deployment, run the following command:

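kubectl rollout restart deployment DEPLOYMENT_NAME -n NAMESPACE

Note: Replace DEPLOYMENT_NAME with the name of your controller deployment and NAMESPACE with the namespace that it runs in.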
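After the restart, you can check that the new pods become ready and that the Forbidden error is gone from recent logs. The following sketch assumes that the deployment is named aws-load-balancer-controller and runs in kube-system, the namespace from the error message. Adjust both values if your installation differs. The verify_restart function name is illustrative.

```shell
# Assumed names: change these if your deployment or namespace differs.
DEPLOY="aws-load-balancer-controller"
NS="kube-system"

verify_restart() {
  # Wait until the restarted pods are ready, or give up after two minutes.
  kubectl rollout status "deployment/$DEPLOY" -n "$NS" --timeout=120s || return 1
  # Count recent log lines that still report the forbidden lease error.
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
  count=$(kubectl logs "deployment/$DEPLOY" -n "$NS" --since=5m | grep -c 'is forbidden' || true)
  echo "forbidden errors in the last 5 minutes: $count"
}
```

Note that kubectl logs with a deployment reference reads from one pod only. To check every replica, run kubectl logs against each pod.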

You receive the "problem running manager: error leader election lost" in the controller pod log

You receive the "problem running manager: error leader election lost" error when multiple instances of the same controller are active in the cluster and the leader election process determines that one of the instances lost its leader status. When multiple instances are active, it's expected that an instance loses its leader status. The leader election mechanism makes sure that only one instance performs specific critical operations. You don't need to take action.
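To see which instance currently holds leadership, you can read the lease object that the controller uses as its lock. The following sketch takes the lease name and namespace from the Forbidden error message earlier in this article. The current_leader function name is illustrative.

```shell
# Lease name and namespace as they appear in the controller's log message.
LEASE="aws-load-balancer-controller-leader"
NS="kube-system"

# Print the identity of the instance that currently holds the lease.
current_leader() {
  kubectl get lease "$LEASE" -n "$NS" -o jsonpath='{.spec.holderIdentity}'
}
```

The holder identity typically starts with the name of the leader pod. If the value changes right after the "leader election lost" message appears, another replica took over as expected.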

To make sure that there's always an elected leader pod, set the replica count from 2 to 3 or to another odd number.
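One way to change the replica count is with kubectl scale. The following sketch assumes that the deployment is named aws-load-balancer-controller in the kube-system namespace. The set_replicas function name is illustrative.

```shell
# Assumed names: change these if your deployment or namespace differs.
DEPLOY="aws-load-balancer-controller"
NS="kube-system"

# Scale the controller deployment to the requested number of replicas.
set_replicas() {
  replicas="$1"
  # Reject empty or non-numeric input before calling kubectl.
  case "$replicas" in
    ''|*[!0-9]*) echo "replica count must be a non-negative integer" >&2; return 1 ;;
  esac
  kubectl scale deployment "$DEPLOY" -n "$NS" --replicas="$replicas"
}
```

For example, set_replicas 3 scales the controller to three pods. If Helm or another deployment tool manages the controller, a later upgrade might reset a manual change, so also set the replica count in that tool's configuration.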

AWS OFFICIAL
Updated 25 days ago