Kubernetes ALB Ingress marks ports other than 80 as unhealthy

I have a Kubernetes cluster running on AWS EKS, and I'm using an Application Load Balancer (ALB) with Ingress to route traffic to different services in my cluster. I've configured paths in my Ingress resource to forward traffic to different services on various ports:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
spec:
  ingressClassName: alb
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: service-webapp
            port:
              number: 80
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: service-api
            port:
              number: 3000
      - path: /ml
        pathType: Prefix
        backend:
          service:
            name: service-ml
            port:
              number: 8000
      - path: /socket
        pathType: Prefix
        backend:
          service:
            name: service-socket
            port:
              number: 8002

However, the ALB marks the targets behind ports other than 80 as unhealthy. Requests to the /api, /ml, and /socket paths fail, while requests to the / path (port 80) work fine.
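One common cause worth ruling out first: by default the ALB health check probes each target group on the traffic port at the path /. A service that only serves under /api or /ml may answer / with a 404 ("not found"), so its targets are marked unhealthy even though they are reachable. With the AWS Load Balancer Controller, health-check annotations can also be set on the backend Service, where they take priority over Ingress-level settings. A minimal sketch, assuming a hypothetical /api/health endpoint and an `app: api` selector label (both are assumptions, not from the original config):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: service-api
  annotations:
    # Probe a path the API actually serves, not "/"
    alb.ingress.kubernetes.io/healthcheck-path: /api/health   # assumed endpoint
    alb.ingress.kubernetes.io/healthcheck-port: traffic-port
spec:
  selector:
    app: api   # assumed label
  ports:
    - port: 3000
      targetPort: 3000
```

If the targets turn healthy after a change like this, the problem is the health-check path rather than networking.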

I've checked the following:

  1. Ingress Configuration: The Ingress resource seems to be configured correctly, with paths pointing to services on different ports.

  2. AWS Security Groups and Network ACLs: I've ensured that the AWS Security Groups and Network ACLs allow traffic on the required ports. I have ingress rules for ports 80 and 443 configured in my security group, but traffic on other ports (3000, 8000, 8002) might be blocked.

  3. Service Health Checks: Health checks are configured for my backend services, but they might not be correctly monitoring the endpoints on ports other than 80.

  4. Logging and Monitoring: Logging and monitoring are enabled for my ALB and backend services, but I haven't found any relevant information that could help diagnose the issue.

  5. Pod Readiness Probes: Readiness probes are configured for my backend services, but I'm not sure if they're correctly monitoring the endpoints on ports other than 80.
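For reference, a readiness probe only gates whether the Pod appears in the Service's endpoints; the ALB runs its own health check independently, so both need to target a path the container actually serves. A minimal sketch of a probe on the container port, assuming a hypothetical /health endpoint and image:

```yaml
# Pod template fragment for service-api (path and image are assumptions)
containers:
  - name: api
    image: example/api:latest   # hypothetical image
    ports:
      - containerPort: 3000
    readinessProbe:
      httpGet:
        path: /health           # assumed endpoint
        port: 3000
      initialDelaySeconds: 5
      periodSeconds: 10
```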

I have tried to address the issue by adding aws_network_acl_rule entries for ports 3000 and 8000, and also updated the security groups to include ingress rules for these ports. However, this didn't resolve the issue. In fact, when I added rules for ports 3000 and 8000, requests to port 80 also stopped working, indicating that there might be some interference or conflict between the configurations.

I'm not entirely sure why adding rules for ports 3000 and 8000 would affect port 80, as they are distinct ports with separate configurations. It's possible that there's an issue with how the rules are being applied or interpreted by the ALB and Kubernetes Ingress.

I've double-checked the syntax and configuration of the aws_network_acl_rule and security group rules to ensure they are correct. However, I'm still facing the same issue with ports other than 80 being marked as unhealthy.

The following are the configurations I tried to update in my Terraform scripts.

Helm ALB Controller

# alb controller
resource "helm_release" "alb_controller" {
  name       = "alb-controller"
  chart      = "aws-load-balancer-controller"
  repository = "https://aws.github.io/eks-charts"
  version    = "1.4.6"
  # Set values for the ALB controller
  set {
    name  = "autoDiscoverAwsRegion"
    value = "true"
  }
  set {
    name  = "vpcId"
    value = aws_vpc.alb.id
  }
  set {
    name  = "clusterName"
    value = aws_eks_cluster.main.name
  }
  set {
    name  = "subnetTags.kubernetes.io/role/elb"
    value = "1"
  }
  # Define namespace for the ALB controller
  namespace = "kube-system"
}
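One detail worth double-checking in the last set block above: in Helm's --set key syntax, a literal dot inside a key segment must be escaped with a backslash, otherwise subnetTags.kubernetes.io/role/elb is parsed as the nested keys subnetTags → kubernetes → io/role/elb. If the chart version in use actually accepts a subnetTags map (verify against the chart's values file — this is an assumption), the HCL would need a double backslash:

```hcl
# Escape the dot that is part of the tag key itself (sketch; confirm
# the chart supports a subnetTags value before relying on this)
set {
  name  = "subnetTags.kubernetes\\.io/role/elb"
  value = "1"
}
```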

Security Group for Cluster

resource "aws_security_group" "cluster_sg" {
  name   = var.cluster_sg_name
  vpc_id = aws_vpc.main.id

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = [aws_vpc.alb.cidr_block]
  }
  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [aws_vpc.alb.cidr_block]
  }
  ingress {
    from_port   = 8000
    to_port     = 8000
    protocol    = "tcp"
    cidr_blocks = [aws_vpc.alb.cidr_block]
  }
  ingress {
    from_port   = 3000
    to_port     = 3000
    protocol    = "tcp"
    cidr_blocks = [aws_vpc.alb.cidr_block]
  }
  ingress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    self        = true
  }
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
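Because the Ingress uses target-type: ip, the ALB connects straight to the Pod IPs on the container ports, so the security group that covers the nodes/pods must admit those ports from the ALB's security group. Referencing the ALB SG directly is usually more robust than a CIDR block, and note that port 8002 (service-socket) has no rule at all in the groups shown here. A sketch, assuming cluster_sg is the group actually attached to the worker nodes:

```hcl
# Allow the ALB to reach all pod target ports (the 3000-8002 range
# covers 3000, 8000 and 8002); source is the ALB's SG, not a CIDR
resource "aws_security_group_rule" "alb_to_targets" {
  type                     = "ingress"
  from_port                = 3000
  to_port                  = 8002
  protocol                 = "tcp"
  security_group_id        = aws_security_group.cluster_sg.id
  source_security_group_id = aws_security_group.alb_sg.id
}
```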

Security Group For Load Balancer

resource "aws_security_group" "alb_sg" {
  name        = "alb-sg"
  description = "Security group for ALB"
  vpc_id      = aws_vpc.main.id

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  ingress {
    from_port   = 8000
    to_port     = 8000
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  ingress {
    from_port   = 3000
    to_port     = 3000
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port       = 0
    to_port         = 0
    protocol        = "-1"
    cidr_blocks     = ["0.0.0.0/0"]
  }

  tags = {
    Name = "alb-sg"
  }
}
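A side note on the 3000/8000 rules above: clients never connect to the ALB itself on those ports. The ALB only opens the listeners declared via the listen-ports annotation (HTTP 80 by default) and forwards /api, /ml, etc. to the backend ports on its own, so the internet-facing group normally needs only the listener ports. If HTTPS is wanted as well, the Ingress annotation looks like:

```yaml
# Ingress annotation controlling which listeners the ALB creates
alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS": 443}]'
```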

Network Rules

resource "aws_network_acl_rule" "allow_alb_traffic_inbound" {
  network_acl_id  = aws_vpc.main.default_network_acl_id
  rule_number     = 200
  protocol        = "tcp"
  rule_action     = "allow"
  cidr_block      = aws_vpc.alb.cidr_block
  from_port       = 80
  to_port         = 80
  egress          = false
}

resource "aws_network_acl_rule" "allow_alb_traffic_outbound" {
  network_acl_id  = aws_vpc.main.default_network_acl_id
  rule_number     = 200
  protocol        = "tcp"
  rule_action     = "allow"
  cidr_block      = aws_vpc.alb.cidr_block
  from_port       = 80
  to_port         = 80
  egress          = true
}

resource "aws_network_acl_rule" "allow_alb_traffic_inbound_8000" {
  network_acl_id  = aws_vpc.main.default_network_acl_id
  rule_number     = 201
  protocol        = "tcp"
  rule_action     = "allow"
  cidr_block      = aws_vpc.alb.cidr_block
  from_port       = 8000
  to_port         = 8000
  egress          = false
}

resource "aws_network_acl_rule" "allow_alb_traffic_outbound_8000" {
  network_acl_id  = aws_vpc.main.default_network_acl_id
  rule_number     = 201
  protocol        = "tcp"
  rule_action     = "allow"
  cidr_block      = aws_vpc.alb.cidr_block
  from_port       = 8000
  to_port         = 8000
  egress          = true
}

resource "aws_network_acl_rule" "allow_alb_traffic_inbound_3000" {
  network_acl_id  = aws_vpc.main.default_network_acl_id
  rule_number     = 202
  protocol        = "tcp"
  rule_action     = "allow"
  cidr_block      = aws_vpc.alb.cidr_block
  from_port       = 3000
  to_port         = 3000
  egress          = false
}

resource "aws_network_acl_rule" "allow_alb_traffic_outbound_3000" {
  network_acl_id  = aws_vpc.main.default_network_acl_id
  rule_number     = 202
  protocol        = "tcp"
  rule_action     = "allow"
  cidr_block      = aws_vpc.alb.cidr_block
  from_port       = 3000
  to_port         = 3000
  egress          = true
}
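Network ACLs are stateless, so return traffic is evaluated on its own: a reply to a request that arrived on port 80 or 8000 leaves with an ephemeral source port (1024-65535), not the service port. If the default allow-all entry has been removed from this ACL, egress rules that permit only 80/3000/8000 would drop those replies, which could explain why port 80 also broke once the new rules took effect. A sketch of the usual companion rule (the rule number 210 is an arbitrary choice):

```hcl
# Stateless NACLs need an explicit egress rule for ephemeral return ports
resource "aws_network_acl_rule" "allow_ephemeral_outbound" {
  network_acl_id = aws_vpc.main.default_network_acl_id
  rule_number    = 210
  protocol       = "tcp"
  rule_action    = "allow"
  cidr_block     = aws_vpc.alb.cidr_block
  from_port      = 1024
  to_port        = 65535
  egress         = true
}
```

An equivalent ingress rule for ephemeral ports would be needed for connections initiated from inside toward the ALB VPC, and port 8002 (service-socket) would also need its own 80/8000-style rules.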

I'm unsure what else to check or how to resolve this issue. Any insights or suggestions on how to troubleshoot and fix this problem would be greatly appreciated. Thank you!

Screenshot of my Application Load Balancer

1 Answer

Hi,

Can you connect to the pods hosting the containers that serve pyapi, api, etc., and do a curl (with the proper TCP ports) on those endpoints to see if they respond properly to health-check requests when made locally?

I say that because your dashboard says "not found": so, I'd personally start at the origin (i.e. locally on the same node) and work backward from there toward the outside of your cluster (i.e. the ALB).

Best,

Didier

AWS EXPERT
answered 21 days ago
  • Hi @Didier, I currently don't have access to the system (as it is Sunday). I will definitely try, but I have a doubt: when I do kubectl get pods, I can see all the pods as Running. In fact, I can see the IPs of these pods as well. Am I correct to assume that the pods are healthy and should be reachable through the Ingress?
