I have an EKS cluster (version 1.30) with ingress-nginx installed via Helm. There are 45 nodes in the cluster.

Looking at the "Target instances" tab on the ELB page, only 2-4 instances are active while the rest are "out-of-service", with this health status:

    Instance has failed at least the unhealthy threshold number of health checks consecutively.

Why is this happening, and how can I fix it?
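For reference, the per-instance health states can also be pulled from the CLI (assuming the default Classic ELB created by the legacy in-tree controller; the load balancer name below is a placeholder):

    aws elb describe-instance-health --load-balancer-name <my-elb-name>

The ingress-nginx HelmRelease: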
    apiVersion: helm.toolkit.fluxcd.io/v2
    kind: HelmRelease
    metadata:
      name: ingress-nginx
      namespace: infra
    spec:
      chart:
        spec:
          chart: ingress-nginx
          reconcileStrategy: ChartVersion
          sourceRef:
            kind: HelmRepository
            name: ingress-nginx
          version: 4.10.1
      interval: 1m0s
      values:
        controller:
          replicaCount: 4
          config:
            use-forwarded-headers: "true"
            use-proxy-protocol: "true"
          service:
            externalTrafficPolicy: Cluster
            targetPorts:
              http: http
              https: http
            annotations:
              service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
              service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:eu-west-1:***:certificate/***
              service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "https"
              service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"
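To rule out drift between the HelmRelease and what is actually deployed, the values Helm rendered can be inspected directly (release name and namespace as in the HelmRelease above):

    helm get values ingress-nginx -n infra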
The live Service object:

    kubectl get svc -n infra ingress-nginx-controller -o yaml
    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        meta.helm.sh/release-name: ingress-nginx
        meta.helm.sh/release-namespace: infra
        service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
        service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: '*'
        service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:***:certificate/***
        service.beta.kubernetes.io/aws-load-balancer-ssl-ports: https
      creationTimestamp: "2024-07-04T17:03:45Z"
      finalizers:
      - service.kubernetes.io/load-balancer-cleanup
      labels:
        app.kubernetes.io/component: controller
        app.kubernetes.io/instance: ingress-nginx
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/name: ingress-nginx
        app.kubernetes.io/part-of: ingress-nginx
        app.kubernetes.io/version: 1.10.1
        helm.sh/chart: ingress-nginx-4.10.1
        helm.toolkit.fluxcd.io/name: ingress-nginx
        helm.toolkit.fluxcd.io/namespace: infra
      name: ingress-nginx-controller
      namespace: infra
      resourceVersion: "74014"
      uid: ***
    spec:
      allocateLoadBalancerNodePorts: true
      clusterIP: ***
      clusterIPs:
      - ***
      externalTrafficPolicy: Cluster
      internalTrafficPolicy: Cluster
      ipFamilies:
      - IPv4
      ipFamilyPolicy: SingleStack
      ports:
      - appProtocol: http
        name: http
        nodePort: 30343
        port: 80
        protocol: TCP
        targetPort: http
      - appProtocol: https
        name: https
        nodePort: 32133
        port: 443
        protocol: TCP
        targetPort: http
      selector:
        app.kubernetes.io/component: controller
        app.kubernetes.io/instance: ingress-nginx
        app.kubernetes.io/name: ingress-nginx
      sessionAffinity: None
      type: LoadBalancer
    status:
      loadBalancer:
        ingress:
        - hostname: ***.eu-west-1.elb.amazonaws.com
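Since externalTrafficPolicy is Cluster, kube-proxy should answer the ELB's TCP health check on the NodePorts (30343/32133) from every node, regardless of where the four controller pods are scheduled, so pod placement alone shouldn't explain 40+ out-of-service instances. To see where the pods actually landed (selector labels taken from the Service above):

    kubectl get pods -n infra -o wide -l app.kubernetes.io/name=ingress-nginx,app.kubernetes.io/component=controller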
In our previous cluster (1.22) we had the same setup with 35 nodes (and 4 ingress-nginx pods), and all of them were healthy. Why is it different now?
What is the externalTrafficPolicy value in the ingress-nginx service? According to the snippet you provided it's Cluster, but can you confirm this is what's actually being used and not Local? Best if you can share the output of kubectl get svc -n ingress-nginx ingress-nginx -o yaml attached to the question.
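For example, to print just that field, using the service name and namespace from the question (infra/ingress-nginx-controller):

    kubectl get svc -n infra ingress-nginx-controller -o jsonpath='{.spec.externalTrafficPolicy}'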