re:Invent 2025 - Navigate the Cloud Landscape with Fargate and ECS Managed Instances
Fargate provides a fully managed container compute experience, but it has limits: no GPU support, no privileged containers, and a 120 GB memory ceiling. ECS Managed Instances, launched two months before this session, fills that gap. This post covers when to use each option, what the security and cost implications are, and how Qube Research & Technologies scaled to tens of thousands of running ECS services.
Amazon Elastic Container Service (Amazon ECS) now runs approximately three billion container tasks per week globally. About 65% of customers who start new containerized workloads on AWS choose ECS, and the majority use AWS Fargate for compute. Fargate simplifies operations by abstracting the underlying instances entirely: you specify vCPU and memory, and ECS handles the rest. But Fargate has specific constraints, and a meaningful segment of workloads cannot run on it today. Mats Lannér, Director of Software Development for Amazon ECS, Alexandr Moroz, Product Manager for Amazon ECS, and Ruben di Battista, Quantitative Technologist at Qube Research & Technologies, walked through the new ECS Managed Instances option and the reasoning behind it. In this post, we'll cover how Managed Instances works, what it changes for security, compliance, and cost, and how QRT went from a four-person team to running tens of thousands of ECS services.
ECS Managed Instances: when Fargate is not enough
ECS Managed Instances is a new capacity provider that sits between Fargate and self-managed Amazon EC2. Like Fargate, the compute is fully managed: ECS provisions, patches, and replaces instances automatically. Like EC2, you have visibility into the instances and control over instance type selection. The key constraint is that you cannot take mutating actions on these instances directly; lifecycle management goes through the ECS API.
Instances appear in your EC2 console but cannot be modified by you. ECS replaces them approximately every two weeks to maintain patching SLAs, shifting your tasks to new instances transparently. You control when this happens by defining EC2 event windows, for example restricting replacements to a 4-hour window on Sunday mornings. An instance that becomes idle remains available for up to one hour (configurable) before being released, because tasks launched on an already-running instance start approximately three times faster than tasks requiring a cold instance provisioned from scratch.
For instance type selection, you have two paths. The default capacity provider lets ECS pick cost-optimized types based on your task requirements. A custom capacity provider uses attribute-based instance type selection, where you define characteristics such as GPU type, minimum vCPU count, network optimization requirements, or CPU architecture. ECS selects matching instance types at launch, which avoids maintaining a fixed list as the EC2 catalog evolves.
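The idea behind attribute-based selection can be sketched in a few lines of Python. This is only an illustration of the matching logic, not the ECS or EC2 API: the `InstanceType` and `Requirements` classes, the `matching_types` function, and the catalog entries (including the instance names and the `example-gpu` label) are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class InstanceType:
    """One entry in a hypothetical instance catalog."""
    name: str
    vcpus: int
    memory_gib: int
    architecture: str
    gpu: Optional[str] = None


@dataclass(frozen=True)
class Requirements:
    """Attributes a workload asks for; None means 'no preference'."""
    min_vcpus: int = 0
    min_memory_gib: int = 0
    architecture: Optional[str] = None
    gpu: Optional[str] = None


def matching_types(catalog: List[InstanceType], req: Requirements) -> List[str]:
    """Return the names of all catalog entries that satisfy the requirements."""
    return [
        t.name
        for t in catalog
        if t.vcpus >= req.min_vcpus
        and t.memory_gib >= req.min_memory_gib
        and (req.architecture is None or t.architecture == req.architecture)
        and (req.gpu is None or t.gpu == req.gpu)
    ]


# Made-up catalog entries, for illustration only.
CATALOG = [
    InstanceType("small-arm", 2, 8, "arm64"),
    InstanceType("large-x86", 16, 64, "x86_64"),
    InstanceType("gpu-x86", 8, 32, "x86_64", gpu="example-gpu"),
]

print(matching_types(CATALOG, Requirements(min_vcpus=8, architecture="x86_64")))
```

Because the requirements describe attributes rather than names, a new entry added to the catalog is picked up by the same query without any change to the requirements, which is the maintenance benefit described above.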
Managed Instances enables workloads that Fargate does not support: privileged containers, eBPF (extended Berkeley Packet Filter) based observability and security agents, GPU workloads, and tasks requiring more than 120 GB of memory. It uses Bottlerocket as the container operating system, a purpose-built Linux distribution from AWS with a minimal package set designed for containerized workloads. At the time of the session, a Spot option like Fargate Spot, which offers up to 80% savings for fault-tolerant workloads, was not available with Managed Instances. (Update: Spot support was announced after re:Invent and is now available.) If you are currently running ECS with self-managed EC2, the session recommendation is to evaluate Managed Instances as a replacement, since ECS takes on the patching and lifecycle work you currently handle yourself.
Security, compliance, and cost
The shift to Fargate or Managed Instances changes the shared responsibility model in a meaningful way. With self-managed EC2, you are responsible for capacity provisioning, selecting and configuring Amazon Machine Images (AMIs), running ECS agents, patching instances, and monitoring them. With Fargate and Managed Instances, AWS takes on these responsibilities. You remain responsible for your application, security group configuration, and IAM roles.
Fargate provides the strongest isolation: every task runs on a separate EC2 instance provisioned for that task only, never shared, and discarded after the task stops. This eliminates cross-task data contamination. Managed Instances runs multiple tasks on a single instance using bin packing to optimize utilization. If task isolation between different applications is a hard requirement, you can use separate capacity providers to achieve it. Both Fargate and Managed Instances use AWS VPC networking mode, which assigns each task an individual IP address. This enables VPC flow logs and security group rules scoped to individual tasks, giving you network visibility that is not achievable with bridge networking mode.
The compliance implication of this shift is practical: when you use Fargate or Managed Instances, the evidence you need to provide for compliance audits covers only your application. AWS provides evidence for the infrastructure layer, which significantly shortens the compliance checklist.
On cost, Fargate delivers 100% utilization for what you request: you pay for exactly the vCPU and memory specified, with no bin packing decisions. Seekable OCI lazy loading speeds up task starts on Fargate by beginning container execution before the image is fully downloaded. For Managed Instances, ECS continuously scans your cluster for underutilized instances, compacts running tasks onto fewer instances, and shuts down idle ones. This active optimization runs automatically and can be disabled per service if tasks need to run uninterrupted.
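To make the compaction idea concrete, here is a minimal bin-packing sketch in Python. It uses a first-fit-decreasing heuristic, chosen here only because it is simple; the session did not describe the algorithm ECS actually uses, and the `pack_tasks` function, task names, and sizes are hypothetical.

```python
from typing import List, Tuple

Task = Tuple[str, int, int]  # (name, vCPU, memory in GiB)


def pack_tasks(tasks: List[Task], capacity: Tuple[int, int]) -> List[List[str]]:
    """Place tasks on as few identical instances as a first-fit-decreasing pass finds.

    capacity is (vCPU, memory in GiB) per instance. Returns one list of task
    names per instance.
    """
    instances = []  # each entry: {"free": [vcpu, mem], "tasks": [names]}
    # Largest tasks first, so small tasks can fill the remaining gaps.
    for name, vcpu, mem in sorted(tasks, key=lambda t: (t[1], t[2]), reverse=True):
        if vcpu > capacity[0] or mem > capacity[1]:
            raise ValueError(f"task {name} does not fit on any instance")
        for inst in instances:
            if inst["free"][0] >= vcpu and inst["free"][1] >= mem:
                inst["free"][0] -= vcpu
                inst["free"][1] -= mem
                inst["tasks"].append(name)
                break
        else:
            # No existing instance has room: start a new one.
            instances.append(
                {"free": [capacity[0] - vcpu, capacity[1] - mem], "tasks": [name]}
            )
    return [inst["tasks"] for inst in instances]


tasks = [("a", 2, 4), ("b", 2, 4), ("c", 1, 2), ("d", 3, 6)]
print(pack_tasks(tasks, capacity=(4, 8)))
```

In this example four tasks that could have occupied four instances end up on two, which is the kind of saving that compaction aims for. A real scheduler also has to weigh the cost of interrupting running tasks, which is why the optimization can be disabled per service.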
A specific cost signal from the session: tasks completing in under two minutes are significantly more expensive on Fargate than on Managed Instances. On Fargate, you pay for a startup window that includes image download time on a cold instance for each task. Managed Instances avoids this through container image caching on already-running instances and the instance warm period. If your workload includes high volumes of short-lived tasks, Managed Instances is the more cost-efficient option. Managed Instances also supports EC2 Reserved Instances, Savings Plans, and On-Demand Capacity Reservations. Fargate usage is covered by Compute Savings Plans but not by the other two, so Managed Instances gives you additional purchasing flexibility.
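A back-of-the-envelope calculation shows why short tasks are hit hardest. The sketch below assumes a simplified model in which billed time is startup time plus run time; the `startup_overhead_fraction` function and the 30-second startup figure are hypothetical, not numbers from the session or from AWS pricing.

```python
def startup_overhead_fraction(startup_seconds: float, run_seconds: float) -> float:
    """Share of billed time spent on startup, under the simplified model
    billed time = startup time + run time."""
    total = startup_seconds + run_seconds
    if total <= 0:
        raise ValueError("total billed time must be positive")
    return startup_seconds / total


# Hypothetical 30-second startup window (image download on a cold instance).
for run in (90, 600, 3570):
    share = startup_overhead_fraction(30, run)
    print(f"run={run:>5}s  startup share of bill={share:.1%}")
```

Under these assumptions the same startup window is a quarter of the bill for a 90-second task and under 1% for an hour-long one, so the benefit of a warm instance with a cached image shrinks as tasks get longer.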
How Qube Research & Technologies scaled to tens of thousands of ECS services
Ruben di Battista shared QRT's journey from a small team building a new systematic trading platform to operating several tens of thousands of running ECS services across a multi-account, multi-cluster deployment.
When QRT started, the engineering team was four people with financial domain expertise rather than cloud infrastructure backgrounds. They chose ECS and Fargate because the serverless model let them focus on their core business, Fargate's default security posture (task isolation, no SSH access, encryption) matched financial industry compliance requirements, and the service could scale from a small starting point without requiring infrastructure expertise upfront.
The architecture QRT built centers on two service types. Quant Computational Units (QCUs) perform data processing and algorithmic computation. Each QCU is paired with a head node service that handles request aggregation and caching, reducing database load by fetching the maximum requested time range and slicing locally rather than querying per request. Each service runs exactly one task to avoid consensus problems that arise when multiple instances publish overlapping data. Communication uses gRPC (Google's open-source remote procedure call framework) throughout, and service discovery runs through AWS Cloud Map, registering a DNS entry per deployment.
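The head node's "fetch the maximum range, slice locally" pattern can be sketched as follows. This is an assumption-laden illustration, not QRT's code: the `serve_batch` function, the half-open integer time ranges, and the `fetch` callback standing in for the database query are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Range = Tuple[int, int]   # half-open time range [start, end)
Row = Tuple[int, float]   # (timestamp, value)


def serve_batch(
    requests: List[Range],
    fetch: Callable[[int, int], List[Row]],
) -> Dict[Range, List[Row]]:
    """Answer several overlapping range requests with a single backend query.

    The widest range covering all requests is fetched once, then each
    request is answered by slicing the fetched rows in memory.
    """
    if not requests:
        return {}
    start = min(s for s, _ in requests)
    end = max(e for _, e in requests)
    rows = fetch(start, end)  # one query instead of one per request
    return {(s, e): [r for r in rows if s <= r[0] < e] for s, e in requests}


def fake_fetch(start: int, end: int) -> List[Row]:
    """Stand-in for a database query; returns one row per timestamp."""
    return [(t, float(t)) for t in range(start, end)]


print(serve_batch([(0, 3), (2, 5)], fake_fetch))
```

The trade-off is that the single query may return rows no request asked for when the ranges are far apart, so this approach pays off mainly when requests overlap heavily, as they would for many consumers reading the same recent window.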
As the deployment grew from single-cluster to multi-cluster and then multi-account, QRT hit ECS service and vCPU quotas per account. They evolved to a shared VPC stretched across accounts, with head node accounts and QCU accounts separated. Service discovery works across accounts via DNS (registered on the account where the VPC was created) or through the Cloud Map API for cross-VPC lookups.
At the scale of tens of thousands of services, QRT moved away from sidecar-per-service telemetry collection. Amazon CloudWatch log subscriptions and metric streams (with Container Insights enabled) forward telemetry from each account to a centralized account, where Amazon Data Firehose routes it to their third-party observability vendor. For application-level traces and metrics, a centralized OpenTelemetry collector cluster running in the shared VPC receives push-based data from services across accounts and forwards it over AWS PrivateLink. Centralizing collection removed the need to size sidecars per service and gave QRT a single configuration point across the entire fleet.
Looking ahead, QRT is evaluating ECS Managed Instances for GPU workloads and tasks exceeding Fargate's 120 GB memory limit, and is planning a multi-region expansion to improve compute flexibility and resilience.