
Karpenter-provisioned nodes cannot join EKS cluster (Terraform)


We’re trying to configure Karpenter via Terraform for our EKS cluster, but the nodes provisioned by Karpenter fail to join the cluster. From what we observe, the nodes are being launched without the correct security group attached, which prevents them from connecting to the control plane.

We are using the official terraform-aws-modules/eks module, version 20.36.0. Here is a snippet of our EKS module configuration:

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "20.36.0"

  cluster_name    = "${var.environment}-eks-cluster"
  cluster_version = "1.32"

  cluster_endpoint_public_access = true

  vpc_id     = var.vpc_id
  subnet_ids = var.private_subnet_ids

  access_entries = {
    admin-access = {
      principal_arn = "arn:aws:iam::<ACCOUNT_ID>:user/<USERNAME>"
      policy_associations = {
        admin = {
          policy_arn = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"
          access_scope = {
            type = "cluster"
          }
        }
      }
    }
  }

  eks_managed_node_groups = {
    "${var.environment}-eks-node-group" = {
      use_name_prefix = false
      desired_size    = 1
      max_size        = 1
      min_size        = 1

      instance_types = ["t3a.medium"]

      tags = {
        Name = "${var.environment}-eks-node"
      }
    }
  }

  tags = var.tags
}
2 Answers
Accepted Answer

If you don't explicitly set node_security_group_tags in your EKS Terraform module, the node security group the module creates won't carry the tag Karpenter uses for discovery, so Karpenter can't tell which security group to attach to the nodes it launches. The instances will then likely come up with the default VPC security group, which usually doesn't have the rules needed to talk to the EKS control plane.

Karpenter figures out which subnets and security groups to use based on a tag like this:

"karpenter.sh/discovery" = "<your-cluster-name>"

This tag needs to be present both on your private subnets and on the security group you want Karpenter to attach to the instances. Without it, Karpenter can't discover them, and your nodes stay in a NotReady state.
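
For context, here is the side of Karpenter that consumes that tag: the EC2NodeClass selects subnets and security groups through selector terms matching it. The following is a minimal sketch kept in Terraform via the kubernetes_manifest resource; the resource label, node role name, and AMI alias are assumptions to adapt to your setup.

# Sketch: EC2NodeClass that discovers subnets and security groups by the tag above.
# Assumes the Karpenter v1 API, an AL2023 AMI alias, and a placeholder node role name.
resource "kubernetes_manifest" "karpenter_node_class" {
  manifest = {
    apiVersion = "karpenter.k8s.aws/v1"
    kind       = "EC2NodeClass"
    metadata = {
      name = "default"
    }
    spec = {
      role = "KarpenterNodeRole-${var.environment}" # placeholder; use your node IAM role name
      amiSelectorTerms = [
        { alias = "al2023@latest" }
      ]
      subnetSelectorTerms = [
        { tags = { "karpenter.sh/discovery" = "${var.environment}-eks-cluster" } }
      ]
      securityGroupSelectorTerms = [
        { tags = { "karpenter.sh/discovery" = "${var.environment}-eks-cluster" } }
      ]
    }
  }
}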

You can read more in the Karpenter docs here:

https://karpenter.sh/docs/getting-started/getting-started-with-karpenter/#3-create-a-cluster (search for "karpenter.sh/discovery")

To fix this, just add the following block to your EKS module config:

node_security_group_tags = {
  "karpenter.sh/discovery" = "${var.environment}-eks-cluster"
}

That way, the security group created by the EKS module gets the discovery tag automatically, and Karpenter will pick it up correctly. Once that's in place, new nodes should launch with the proper SG and successfully join the cluster.
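
The same discovery tag also needs to end up on your private subnets, as mentioned above. If those subnets are created outside this module, one option is to tag them from Terraform with aws_ec2_tag; this sketch assumes var.private_subnet_ids is the same list you pass to the EKS module.

# Sketch: add the Karpenter discovery tag to existing private subnets.
resource "aws_ec2_tag" "karpenter_subnet_discovery" {
  for_each = toset(var.private_subnet_ids)

  resource_id = each.value
  key         = "karpenter.sh/discovery"
  value       = "${var.environment}-eks-cluster"
}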

answered a month ago
AWS
EXPERT
reviewed a month ago

The issue with your Karpenter-provisioned nodes failing to join the EKS cluster is likely related to one or more of the following common problems:

  1. Security Group Configuration: As you've observed, the nodes may be launching without the correct security groups. Karpenter needs to be configured to use the cluster security group to allow proper communication with the control plane.

  2. Bootstrap Script Configuration: The nodes may not have the correct bootstrap script in their UserData. The bootstrap script must contain the correct EKS cluster name for nodes to join successfully.

  3. IAM Role Configuration: The IAM role used by Karpenter-provisioned nodes might not have the necessary permissions, or it might not be properly mapped in the cluster's access entries.

To resolve these issues:

  1. Security Group Configuration:
  • Ensure that Karpenter is configured to use the cluster security group
  • Verify that the security group allows necessary traffic between the nodes and the control plane
  2. Bootstrap Script Configuration:
  • Check that Karpenter is correctly passing the cluster name to the bootstrap script in the UserData
  • The bootstrap script should include something like: /etc/eks/bootstrap.sh your-cluster-name
  3. IAM Role Configuration:
  • Verify that the IAM role used by Karpenter for provisioning nodes has all required EKS permissions
  • Ensure the node IAM role ARN (not the instance profile ARN) is properly configured in your access entries (see the Terraform sketch after this list)
  • If you're using the aws-auth ConfigMap instead of access entries, make sure the node role is properly mapped
  4. Network Configuration:
  • Ensure that the VPC where the nodes are created has the correct DHCP options
  • Verify that the AWS STS endpoint for your region is enabled for your account
  • Check that the subnets used by Karpenter have proper routing to the EKS control plane
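
For the access-entry point under item 3, a minimal Terraform sketch looks like the following; the aws_iam_role.karpenter_node reference and the resource label are assumptions for illustration. With type = "EC2_LINUX", EKS grants the node's Kubernetes permissions itself, so no policy association is required.

# Sketch: register the Karpenter node IAM role with the cluster as an access entry.
resource "aws_eks_access_entry" "karpenter_node" {
  cluster_name  = module.eks.cluster_name
  principal_arn = aws_iam_role.karpenter_node.arn # node role ARN, not the instance profile ARN
  type          = "EC2_LINUX"
}

Depending on your setup, the EKS module's Karpenter submodule may already create this entry for you, so check for duplicates before adding it.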

You can debug the issue by:

  1. Checking the kubelet logs on the failing nodes with journalctl -u kubelet -f
  2. Looking for errors like "Failed to ensure lease exists", "Unable to register node with API server", or "Failed to contact API server"
  3. Using kubectl get nodeclaim and kubectl describe nodeclaim <node_claim> to identify issues preventing nodes from joining

If you're using EKS Auto Mode with Karpenter, these commands will help identify instances that didn't join the cluster and provide more detailed error information.
Sources
EKS node creation failed | AWS re:Post
Troubleshooting hybrid nodes - Amazon EKS
Troubleshoot problems with Amazon EKS clusters and nodes - Amazon EKS
Troubleshoot EKS Auto Mode - Amazon EKS
Troubleshoot EKS node join cluster errors | AWS re:Post

answered a month ago
AWS
EXPERT
reviewed a month ago
