re:Invent 2025 - Build AI-powered developer experiences with MCP on ECS and EKS
AWS announced fully managed, remotely hosted Model Context Protocol (MCP) servers for Amazon ECS and Amazon EKS at re:Invent 2025. These servers give AI coding assistants real-time awareness of your container infrastructure, enabling natural language operations and in-console troubleshooting powered by Amazon Q Developer.
AI coding assistants are increasingly part of how teams build and operate container workloads, but they have a structural limitation: they're trained on data up to a fixed cutoff date, and they weren't designed to navigate a cloud console. Ask an AI assistant to help troubleshoot a failing ECS task or check whether your EKS cluster is ready for a version upgrade, and you'll typically get generic guidance drawn from older training data. Steve Kendrex, Senior Manager of Product Management, and George John, Senior Product Manager, presented session CNS358 at AWS re:Invent 2025 to address exactly this gap. In this post, we'll walk through why AWS built MCP servers for container services, how the hosted architecture works, what you can do with the EKS and ECS MCP servers today, and the new in-console troubleshooting experience powered by Amazon Q Developer.
Why AI tools need a purpose-built interface for container services
Standard large language models (LLMs) face two fundamental challenges when interacting with AWS services. First, they don't have access to data beyond their training cutoff, so anything released after that date, whether a new feature, a new API, or a behavior change, may produce unreliable guidance. Second, even for well-documented services, LLMs tend to rely on their training data rather than fetching current documentation, which means they frequently gravitate toward older patterns and miss newer, more effective approaches.
For container services specifically, this matters in concrete ways. Amazon EKS and Amazon ECS evolve quickly. EKS alone has four major tool categories worth teaching an agent: cluster lifecycle management, Kubernetes resource management, documentation retrieval, and troubleshooting. When an agent lacks an up-to-date interface to these services, it defaults to what it knows, often recommending kubectl commands it already understands rather than applying the most current guidance for your specific situation.
The Model Context Protocol (MCP), introduced by Anthropic as an open standard for AI-tool integration, offers a way to close this gap. Rather than asking developers to inject context manually or hoping the agent finds the right documentation through web search, an MCP server exposes a defined set of tools and resources that any MCP-compatible client can discover and use. The closest analogy is what REST APIs did for web backends: a standardized contract between the AI client and the data or actions it needs.
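To make that contract concrete, here is a minimal sketch of the two JSON-RPC 2.0 messages an MCP client sends: tools/list to discover what a server offers, and tools/call to invoke one tool. The method names come from the MCP specification. The tool name is one mentioned later in this post, and the query argument is a hypothetical placeholder, since the real input schema is whatever the server reports in its tools/list response.

```python
import json
from itertools import count
from typing import Optional

# JSON-RPC request ids only need to be unique within a session.
_ids = count(1)


def jsonrpc_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request envelope, the wire format MCP uses."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


def list_tools() -> dict:
    """Discovery step: ask the server which tools it exposes."""
    return jsonrpc_request("tools/list")


def call_tool(name: str, arguments: dict) -> dict:
    """Invocation step: call one tool by name with its arguments."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})


if __name__ == "__main__":
    print(json.dumps(list_tools(), indent=2))
    # "query" is a hypothetical argument name used only for illustration.
    print(json.dumps(call_tool("search_eks_documentation",
                               {"query": "upgrade readiness"}), indent=2))
```

In practice your MCP client builds and sends these messages for you; the sketch only shows how small the shared contract is.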
Hosted MCP servers for EKS and ECS
Earlier in 2025, AWS released open-source, locally installed MCP servers for Amazon EKS, Amazon ECS, Finch, and AWS Lambda through the AWS Labs GitHub repository. These remain available and are a good starting point for individual developers. The hosted servers announced at re:Invent address a different set of needs: enterprise security requirements, centralized patching control, and the ability for third-party agents and SaaS tools to connect without requiring a local installation on every developer machine.
The hosted MCP servers for Amazon EKS and Amazon ECS are available in Preview in all commercial AWS regions outside of China and GovCloud. Each service has a separate regional endpoint, and connecting to both from the same MCP client requires only two configuration blocks, not two separate proxy processes.
Authentication uses AWS IAM with SigV4 request signing. Because the MCP protocol does not natively support SigV4, AWS provides a lightweight proxy that handles signing on behalf of the MCP client. You install the proxy once from the Python Package Index using uvx, configure it with your AWS profile and the target region, and it handles credential management for all AWS-hosted MCP servers you configure. The configuration for the EKS server points to the eks-mcp endpoint in your chosen region; ECS uses ecs-mcp at a corresponding regional endpoint.
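As a rough sketch of what the two configuration blocks might look like, the script below generates a client configuration with one entry per service. Several details are assumptions rather than confirmed values: the endpoint URL pattern, the --service, --profile, and --region flag names, and the mcpServers top-level key (a convention many MCP clients use). The package name is the one given in this post. Copy the exact snippet from the EKS or ECS user guide before relying on any of it.

```python
import json

# Package name as stated in this post; verify it against the user guides.
PROXY_PACKAGE = "aws-mcp-servers-auth-proxy"


def endpoint_url(service: str, region: str) -> str:
    # ASSUMPTION: illustrative URL pattern only. Use the real regional
    # endpoint published in the EKS or ECS user guide.
    return f"https://{service}.{region}.api.aws/mcp"


def server_block(service: str, region: str, profile: str) -> dict:
    """One MCP client entry that launches the signing proxy via uvx."""
    return {
        "command": "uvx",
        "args": [
            PROXY_PACKAGE,
            endpoint_url(service, region),
            # ASSUMPTION: flag names are hypothetical placeholders.
            "--service", service,
            "--profile", profile,
            "--region", region,
        ],
    }


def build_config(region: str, profile: str) -> dict:
    """Two blocks, one per service, sharing the same profile and region."""
    return {
        "mcpServers": {
            service: server_block(service, region, profile)
            for service in ("eks-mcp", "ecs-mcp")
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_config("us-west-2", "default"), indent=2))
```

Generating the file keeps the region and profile consistent across both entries, which matters because each service has its own regional endpoint.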
IAM permissions are intentionally tiered. The InvokeMCP permission allows the client to discover available tools. CallReadOnlyTools grants access to read operations: listing clusters, describing resources, fetching logs and metrics. CallPrivilegedTools adds write and mutating operations such as creating clusters or applying Kubernetes manifests. For production environments, the recommendation from the session is to start with only InvokeMCP and CallReadOnlyTools. A pre-built IAM managed policy called AmazonEKSMCPReadOnlyAccess covers all permissions needed by the read-only tool set, including calls to Amazon CloudWatch, AWS STS, and Amazon EC2 that some tools make internally.
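The tiering can be sketched as an IAM policy document built in code. The permission names below are the ones used in this post. The eks-mcp action prefix and the wildcard resource are assumptions made for illustration, so check the exact action strings in the user guide. The sketch also covers only the MCP-level permissions, not the CloudWatch, STS, and EC2 calls that the managed policy includes.

```python
import json


def build_mcp_policy(service_prefix: str = "eks-mcp",
                     allow_privileged: bool = False) -> dict:
    """Return an IAM policy document for the MCP permission tiers.

    ASSUMPTION: the service prefix and exact action strings are
    illustrative; only the tier names come from the post.
    """
    actions = [
        f"{service_prefix}:InvokeMCP",          # discover available tools
        f"{service_prefix}:CallReadOnlyTools",  # list, describe, logs, metrics
    ]
    if allow_privileged:
        # Write and mutating operations; leave off for production to start.
        actions.append(f"{service_prefix}:CallPrivilegedTools")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": actions, "Resource": "*"}
        ],
    }


if __name__ == "__main__":
    # Read-only by default, matching the session's production recommendation.
    print(json.dumps(build_mcp_policy(), indent=2))
```

Making the privileged tier an explicit opt-in mirrors the recommendation to start read-only and widen access only when a workflow needs it.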
What the EKS and ECS MCP servers can do
The EKS MCP server exposes 20 tools organized across four areas: cluster management, Kubernetes resources, documentation, and troubleshooting. Cluster management tools cover creating and describing clusters and add-ons. Kubernetes resource tools let your AI assistant create, delete, describe, and list native Kubernetes objects like pods, services, and namespaces, without requiring a configured kubeconfig or kubectl installation. Among the documentation and troubleshooting tools, two stand out in practice.
The first, search_eks_documentation, connects to a live index of AWS documentation, blog posts, and What's New announcements. When an LLM hits a knowledge gap on a recently released feature, this tool retrieves current content and supplies it to the model for its response. The second, eks_troubleshooting_guide, accesses a knowledge base built from internal runbooks accumulated through managing a large number of Kubernetes clusters at AWS. This gives the agent structured troubleshooting steps for common failure patterns rather than having it reason from first principles.
The demo in the session showed this in action with a load balancer service stuck in a pending state. The agent called the troubleshooting guide, followed the runbook steps to gather events and describe the service object, and identified that required Kubernetes tags were missing from the associated subnets. It returned the root cause and exact remediation steps.
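The session did not name the missing tags. As a sketch, assume they are the standard subnet role tags used for load balancer subnet discovery: kubernetes.io/role/elb for internet-facing load balancers and kubernetes.io/role/internal-elb for internal ones. Under that assumption, the check the agent effectively performed looks like this. The input follows the shape of EC2 DescribeSubnets results, and the subnet IDs are made up.

```python
# ASSUMPTION: these are the tags in question; the session only said
# "required Kubernetes tags".
ROLE_TAG_BY_SCHEME = {
    "internet-facing": "kubernetes.io/role/elb",
    "internal": "kubernetes.io/role/internal-elb",
}


def tag_keys(subnet: dict) -> set:
    """Collect tag keys from a DescribeSubnets-style subnet record."""
    return {tag["Key"] for tag in subnet.get("Tags", [])}


def untagged_subnets(subnets: list, scheme: str = "internet-facing") -> list:
    """Return ids of subnets missing the role tag for the given scheme."""
    required = ROLE_TAG_BY_SCHEME[scheme]
    return [s["SubnetId"] for s in subnets if required not in tag_keys(s)]


if __name__ == "__main__":
    # Hypothetical sample data for illustration.
    sample = [
        {"SubnetId": "subnet-aaa",
         "Tags": [{"Key": "kubernetes.io/role/elb", "Value": "1"}]},
        {"SubnetId": "subnet-bbb",
         "Tags": [{"Key": "Name", "Value": "public-b"}]},
        {"SubnetId": "subnet-ccc"},  # no tags at all
    ]
    for subnet_id in untagged_subnets(sample):
        print(f"{subnet_id}: missing {ROLE_TAG_BY_SCHEME['internet-facing']}")
```

The value of the MCP flow is that the agent arrives at this check by following the runbook, without you having to know in advance that subnet tags are the likely culprit.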
The EKS MCP server also handles upgrade readiness assessments. In the demo, a single prompt asking for upgrade readiness on a 1.31 cluster produced a structured report that identified a blocking add-on version incompatibility with the target version, listed passing checks across cluster health and kube-proxy skew, and produced an ordered list of required steps before the upgrade could proceed safely.
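One of the passing checks in that report, kube-proxy skew, is easy to illustrate. The sketch below is not the MCP server's implementation. It applies the upstream Kubernetes version skew policy as I understand it for recent releases: kube-proxy must not be newer than the API server and may be up to three minor versions older. The limit is a parameter so you can adjust it for your versions.

```python
def minor_version(version: str) -> int:
    """Extract the minor number from strings like '1.31' or 'v1.31.2'."""
    parts = version.lstrip("v").split(".")
    return int(parts[1])


def kube_proxy_skew_ok(control_plane: str, kube_proxy: str,
                       max_minor_versions_older: int = 3) -> bool:
    """Check kube-proxy against the control plane's minor version.

    ASSUMPTION: default limit of three minor versions, per the upstream
    skew policy for recent Kubernetes releases.
    """
    skew = minor_version(control_plane) - minor_version(kube_proxy)
    # Negative skew means kube-proxy is newer than the control plane.
    return 0 <= skew <= max_minor_versions_older


if __name__ == "__main__":
    # Evaluate against the upgrade target, not the current version.
    target = "1.32"
    for proxy in ("v1.31.2", "1.28", "1.33"):
        status = "ok" if kube_proxy_skew_ok(target, proxy) else "blocked"
        print(f"kube-proxy {proxy} vs control plane {target}: {status}")
```

Checking against the target version rather than the current one is the point of a readiness assessment: a component that is within limits today can fall out of range once the control plane moves forward.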
The ECS MCP server is currently read-only, so CallPrivilegedTools is not applicable. Its tools fall into three categories: operational tools for deployment status and network configuration retrieval, resource management tools for inspecting task definitions and service configurations, and troubleshooting tools targeting the most common ECS failure patterns. Several tools call Amazon EC2 APIs internally (for network configuration details), so the required IAM permissions extend beyond ECS-specific actions, with the full list available in the ECS user guide.
Troubleshooting directly from the console
Beyond IDE integration, the hosted MCP servers power a new troubleshooting experience inside the AWS Management Console. Across multiple places in both the EKS and ECS consoles where errors or degraded states appear, there is now an "Inspect with Amazon Q" button. Clicking it opens an Amazon Q Developer chat panel pre-populated with context about the specific failure. The agent then calls EKS or ECS MCP tools automatically to gather telemetry, consult relevant runbooks, and surface a root cause with recommended remediation steps.
In the ECS demo, a task stuck in a stopped state due to an image pull failure triggered this flow. Amazon Q called ECS tools to describe the task definition, retrieve task health data, and pull relevant knowledge. After a few retries on one API call, it identified the contributing network configuration factors and provided step-by-step guidance for resolving the issue. Critically, Q does not automatically apply fixes. It presents findings and lets you decide whether the recommended action is correct for your environment.
The same integration is available at multiple entry points in the EKS console: the observability dashboard for cluster health issues, upgrade insights, control plane monitoring, and node health.
Getting started
To connect your MCP client to the hosted EKS or ECS MCP server, you need three things: an IAM identity with the appropriate permissions attached, the aws-mcp-servers-auth-proxy package installed via uvx, and an MCP configuration block in your client (such as Kiro, Cursor, or Cline) pointing to the regional endpoint with your AWS profile specified. The EKS user guide and ECS user guide each include copy-ready configuration snippets for Mac and Windows. The one-time setup typically takes a few minutes, after which all configured tools are immediately discoverable and usable through natural language in your AI assistant.
The hosted MCP servers are available today in Preview. The open-source versions remain on the AWS Labs GitHub repository for teams that prefer to run the servers locally.
Watch the full session recording: CNS358 - Build AI-powered developer experiences with MCP on ECS and EKS
