re:Invent 2025 - Amazon EKS Auto Mode: Evolving Kubernetes ops to enable innovation
Amazon EKS Auto Mode removes the operational overhead of managing Kubernetes infrastructure so your teams can focus on building applications. This post covers the key announcements and capabilities demonstrated in session CNS354 at AWS re:Invent 2025.
Running Kubernetes in production has always required more than just writing application code. Platform teams spend significant time provisioning nodes, installing plugins, patching operating systems, managing GPU drivers, and keeping cluster components up to date. For large organizations, this overhead multiplies across dozens or hundreds of clusters. In this post, we'll walk through how Amazon EKS Auto Mode shifts this operational responsibility to AWS, explore a live GPU inference deployment that shows what "batteries included" means in practice, and hear how Capital One adopted Auto Mode in a highly regulated enterprise environment. The session featured Sai Vennam, Principal Containers Specialist at AWS, Alex Kestner, Principal Product Manager - Technical at AWS, and Daniel Levine, Senior Lead Software Engineer at Capital One.
From managed control plane to managed infrastructure
When AWS launched Amazon EKS at re:Invent 2017, the goal was to offload the undifferentiated work of hosting the Kubernetes control plane. Over the following eight years, more than 250 feature launches expanded EKS to cover managed node groups, open-source projects like Karpenter, Amazon EKS Add-ons, and much more. Yet customers consistently told AWS that data-plane operations remained their burden: plugin installation, node lifecycle, OS patching, and component upgrades.
EKS Auto Mode, launched at re:Invent 2024, addresses this directly. With a single API call or console click, Auto Mode provides fully AWS-managed, Kubernetes-conformant compute, networking, and storage for any EKS cluster. Every Auto Mode cluster ships pre-configured with application load balancing, Amazon EBS CSI (Container Storage Interface) for persistent volumes, Karpenter-powered compute autoscaling, GPU support including NVIDIA device plugin and DCGM (Data Center GPU Manager) Exporter, and all core networking components including kube-proxy, VPC CNI (Container Network Interface), and CoreDNS. None of these run as pods in your data plane. They run in the AWS-managed control plane, meaning you pay nothing for them and are responsible for nothing when it comes to upgrades.
The shift in the shared responsibility model is meaningful. With standard EKS, AWS manages the control plane and customers manage everything above it: cluster capabilities, compute lifecycle, OS patching, health monitoring, and repair. Auto Mode moves that boundary considerably. AWS takes on OS patching, node health monitoring, node repair, and the full lifecycle of cluster components. You remain responsible for your application, application-level security and observability, and any additional plugins your workload requires.
EC2 instances in Auto Mode launch in your account as EC2 Managed Instances. This is the same operational model used by Amazon ECS and AWS Lambda, where the EC2 service holds the lifecycle responsibility while instances continue to appear as normal EC2 resources in your account with standard billing and visibility.
Infrastructure that moves at the speed of your ideas
The session opened with a live demonstration that illustrated the zero-to-production experience. Starting from an entirely empty cluster (zero pods, zero nodes), Sai deployed a retail storefront UI and scaled it to 10 replicas. Karpenter selected and booted a c6a.large and then a c6a.xlarge within seconds, with no scripted instance list and no Auto Scaling Group (ASG) configuration required.
The more revealing demonstration involved deploying a 20-billion-parameter open-source GPT model for LLM (Large Language Model) inference. Sai created a single NodePool YAML specifying G5 and G6 instance families with NVIDIA GPUs, applied a deployment, and watched Karpenter select a g5.4xlarge spot instance automatically. The 14 GB container image pulled in just over one minute. This is where SOCI (Seekable OCI) parallel pull and unpack, one of the new Auto Mode features launched this fall, made a visible difference. By downloading multiple container image layers concurrently, SOCI reduces total pull time by up to 60% compared to sequential pulls. For AI/ML workloads where time-to-first-token is a critical metric, this is significant. SOCI is enabled automatically on accelerated instances (G, P, and Trainium series) and on instances with local NVMe storage, with no configuration required.
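To see why pulling layers concurrently helps, here is a toy Python model. It is not SOCI's actual implementation. It assumes each download stream has a fixed throughput and that the network link itself is never the bottleneck. The layer sizes (chosen to sum to the 14 GB image from the demo), the 100 MB/s per-stream figure, and the stream count are all invented for illustration.

```python
import heapq

# Hypothetical layer sizes in MB; they sum to 14,000 MB (about 14 GB).
LAYERS_MB = [6000, 4000, 2000, 1000, 1000]


def sequential_pull_seconds(layer_mb, stream_mb_per_s):
    """One stream downloads every layer back to back."""
    return sum(layer_mb) / stream_mb_per_s


def parallel_pull_seconds(layer_mb, stream_mb_per_s, streams):
    """Greedy model: hand each layer (largest first) to the least-loaded stream."""
    loads = [0] * streams
    heapq.heapify(loads)
    for size in sorted(layer_mb, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + size)
    # The pull finishes when the busiest stream finishes.
    return max(loads) / stream_mb_per_s


if __name__ == "__main__":
    seq = sequential_pull_seconds(LAYERS_MB, 100)
    par = parallel_pull_seconds(LAYERS_MB, 100, streams=4)
    print(f"sequential: {seq:.0f}s, parallel: {par:.0f}s, saved: {1 - par / seq:.0%}")
```

In this model the largest layer bounds the parallel time, which is one reason real-world gains depend on how an image is layered.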
With no prior NVIDIA kernel alignment, no device plugin manifest, and no DCGM Exporter setup, the model was running and responding to inference requests within minutes. This is what "batteries included" means in practice: the infrastructure is GPU-aware, and you simply describe your workload requirements.
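The session's manifests are not reproduced in this post, so the following is only a sketch of what a GPU NodePool and workload of this kind might look like. It builds both objects as Python dictionaries and prints them as JSON, which `kubectl apply -f` accepts alongside YAML. The resource names, the container image, and the choice to allow both Spot and On-Demand capacity are assumptions. The `eks.amazonaws.com/instance-family` label and the `default` NodeClass reflect our reading of the Auto Mode documentation and should be checked against the user guide.

```python
import json

# NodePool limiting Karpenter to G5/G6 instance families (hypothetical name).
gpu_node_pool = {
    "apiVersion": "karpenter.sh/v1",
    "kind": "NodePool",
    "metadata": {"name": "gpu-inference"},
    "spec": {
        "template": {
            "spec": {
                # Assumed: the built-in Auto Mode NodeClass is named "default".
                "nodeClassRef": {
                    "group": "eks.amazonaws.com",
                    "kind": "NodeClass",
                    "name": "default",
                },
                "requirements": [
                    {"key": "eks.amazonaws.com/instance-family",
                     "operator": "In", "values": ["g5", "g6"]},
                    {"key": "karpenter.sh/capacity-type",
                     "operator": "In", "values": ["spot", "on-demand"]},
                ],
            }
        }
    },
}

# Workload that only states its need: one NVIDIA GPU per replica.
gpu_deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "llm-inference"},
    "spec": {
        "replicas": 1,
        "selector": {"matchLabels": {"app": "llm-inference"}},
        "template": {
            "metadata": {"labels": {"app": "llm-inference"}},
            "spec": {
                "containers": [{
                    "name": "server",
                    # Placeholder image, not the one used in the demo.
                    "image": "registry.example.com/llm-server:latest",
                    "resources": {"limits": {"nvidia.com/gpu": "1"}},
                }]
            },
        },
    },
}

# A v1 List lets both objects be applied from a single file.
manifest = {"apiVersion": "v1", "kind": "List",
            "items": [gpu_node_pool, gpu_deployment]}

if __name__ == "__main__":
    print(json.dumps(manifest, indent=2))
```

Note what is absent: no device plugin DaemonSet, no driver installation, and no fixed instance type. The GPU request on the container is what steers the pod toward GPU-capable capacity.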
Adoption at enterprise scale: Capital One's journey
Daniel Levine described a decade-long Kubernetes journey at Capital One that reflects the experience of many large enterprises. Early Kubernetes adoption was fragmented across multiple platform teams with different tooling choices, creating high SRE (Site Reliability Engineering) effort, constant churn in compliance container management, and scalability friction.
Capital One's response was a federated model: a centralized tooling team that curated Kubernetes tooling for all internal platform teams, using a "do it with them, not for them" philosophy. This reduced arbitrary uniqueness and raised the operational floor, but infrastructure management and compliance container maintenance remained expensive. Teams were spending hundreds of hours managing EBS CSI driver versions, upgrading load balancer controllers, and troubleshooting managed node group update failures.
Capital One's approach to Auto Mode adoption was deliberate. They began by dogfooding it on their own central delivery platform, gathered data on how it performed with their compliance software stack, and only expanded once they had evidence rather than promises. The two primary gains were automated infrastructure management through Karpenter (right-sizing instances, eliminating over-provisioned capacity) and simplified container management (AWS now owns the lifecycle of EBS CSI, the load balancer controller, and other table-stakes components).
The metric Daniel highlighted was not a number. It was silence. Slack channels that previously surfaced constant questions about stuck node group updates, crashlooping drivers, or version mismatches are now quiet. Capital One is now evaluating Auto Mode for ML workloads and multi-tenant platforms, where the cluster-agnostic scheduling model provides the highest return.
What shipped since launch
Alex Kestner outlined six areas of investment over the past year, each driven by direct customer feedback. Four of them are summarized below.
Regional availability. EKS Auto Mode is now available in all commercial EKS regions (excluding the AWS China regions), both AWS GovCloud regions, and AWS Local Zones. This opens Auto Mode to public sector workloads, federal compliance requirements, and latency-sensitive edge deployments.
Advanced networking and configuration. The most-requested additions since launch focused on giving customers control over networking without sacrificing operational simplicity. You can now assign separate subnets and security groups to pods via podSubnetSelectorTerms and podSecurityGroupSelectorTerms (equivalent to VPC CNI custom networking). Additional options include toggling public IP address association for nodes, configuring forward proxy routing for traffic leaving nodes, and providing private certificate authority (CA) material through the certificateBundles parameter for internal PKI (Public Key Infrastructure) environments.
Security and compliance. Auto Mode now supports custom AWS Key Management Service (AWS KMS) encryption for both root and data volumes. FIPS (Federal Information Processing Standards) validated cryptographic modules are available via the advancedSecurity.fips setting in the NodeClass, which is critical for GovCloud and federal compliance attestation. An instance-profile-only IAM model means teams can run Auto Mode without needing permissions to create or attach IAM roles, which is a practical application of least-privilege access for large organizations.
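As a rough illustration of how the NodeClass settings named in the networking and security paragraphs above fit together, the sketch below assembles a partial NodeClass in Python and prints it as JSON. The four field names (podSubnetSelectorTerms, podSecurityGroupSelectorTerms, certificateBundles, advancedSecurity.fips) come from the session. Everything else is an assumption: the tag-based selector shape, the name/data layout of a certificate bundle entry, the base64 encoding, the API version, and all names and tag values. A complete NodeClass needs further fields that are omitted here.

```python
import base64
import json

# Placeholder PEM text; a real bundle would hold your private CA certificate.
PLACEHOLDER_CA_PEM = "-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----\n"


def node_class_fragment(pod_subnet_tag, pod_sg_tag, ca_pem, fips):
    """Return a partial NodeClass covering pod networking, private CA and FIPS."""
    encoded_ca = base64.b64encode(ca_pem.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "eks.amazonaws.com/v1",  # assumed API group/version
        "kind": "NodeClass",
        "metadata": {"name": "regulated-workloads"},  # hypothetical name
        "spec": {
            # Pods get their own subnets and security groups (assumed tag selectors).
            "podSubnetSelectorTerms": [{"tags": {"Name": pod_subnet_tag}}],
            "podSecurityGroupSelectorTerms": [{"tags": {"Name": pod_sg_tag}}],
            # Private CA material for internal PKI (assumed entry shape).
            "certificateBundles": [{"name": "internal-ca", "data": encoded_ca}],
            # FIPS-validated cryptographic modules on the node.
            "advancedSecurity": {"fips": fips},
        },
    }


if __name__ == "__main__":
    fragment = node_class_fragment(
        "example-pod-subnet", "example-pod-sg", PLACEHOLDER_CA_PEM, fips=True
    )
    print(json.dumps(fragment, indent=2))
```

Keeping these settings in one NodeClass means a NodePool can opt in to the whole compliance posture by referencing a single object.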
Capacity management. Auto Mode can now prioritize On-Demand Capacity Reservations and Capacity Blocks for ML via capacityReservationSelectorTerms, ensuring pre-purchased capacity is consumed before on-demand or spot instances. A new Static Capacity mode lets you maintain a fixed number of instances regardless of pod scheduling activity, supporting mission-critical workloads that require pre-provisioned resources.
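A minimal sketch of the reservation-first setup described above, again as Python dictionaries. Only the capacityReservationSelectorTerms field name comes from the session. The selector-by-id shape, the `reserved` capacity type on the NodePool side, and the reservation ID are assumptions modeled on upstream Karpenter and should be confirmed in the Auto Mode user guide. Static Capacity mode is not shown.

```python
import json

# NodeClass side: which reservations this class may draw from.
# The reservation ID is a made-up placeholder.
node_class_capacity_fragment = {
    "capacityReservationSelectorTerms": [{"id": "cr-0123456789abcdef0"}],
}

# NodePool side (assumed): allow reserved capacity alongside the usual types,
# so other capacity is used only when the reservation cannot satisfy a pod.
capacity_type_requirement = {
    "key": "karpenter.sh/capacity-type",
    "operator": "In",
    "values": ["reserved", "on-demand", "spot"],
}

if __name__ == "__main__":
    print(json.dumps(
        {
            "nodeClassSpec": node_class_capacity_fragment,
            "nodePoolRequirement": capacity_type_requirement,
        },
        indent=2,
    ))
```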
Getting started
EKS Auto Mode is available today across all supported regions. The console's Quick Configuration flow creates a production-ready cluster with auto-generated IAM roles in a single form submission. For teams already running EKS, Auto Mode can be enabled on existing clusters.
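For teams enabling Auto Mode on an existing cluster programmatically, the request is small. The sketch below only assembles the settings and prints them, without calling AWS. The field names (computeConfig, kubernetesNetworkConfig, storageConfig) and the built-in node pool names reflect our understanding of the public EKS API rather than anything shown in the session, and the cluster name and role ARN are placeholders.

```python
import json


def auto_mode_settings(cluster_name, node_role_arn):
    """Build settings that switch on Auto Mode compute, networking and storage."""
    return {
        "name": cluster_name,
        "computeConfig": {
            "enabled": True,
            # Assumed names of the built-in node pools.
            "nodePools": ["general-purpose", "system"],
            "nodeRoleArn": node_role_arn,
        },
        "kubernetesNetworkConfig": {"elasticLoadBalancing": {"enabled": True}},
        "storageConfig": {"blockStorage": {"enabled": True}},
    }


if __name__ == "__main__":
    settings = auto_mode_settings(
        "demo-cluster",  # placeholder cluster name
        "arn:aws:iam::111122223333:role/ExampleAutoModeNodeRole",  # placeholder ARN
    )
    print(json.dumps(settings, indent=2))
```

If you use boto3, these keys are intended to map onto keyword arguments of the EKS client's `update_cluster_config` call. Treat that mapping as an assumption and confirm it against the current API reference before running it against a real cluster.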
The engineering priorities for Auto Mode are driven by customer feedback through the EKS public roadmap on GitHub. If your use case requires a specific configuration option that Auto Mode does not yet support, filing an issue there is the direct path to influencing what ships next.
For complete configuration details, see the Amazon EKS Auto Mode user guide and the Auto Mode release notes.
Watch the full session: AWS re:Invent 2025 - Amazon EKS Auto Mode: Evolving Kubernetes ops to enable innovation (CNS354)