Accelerating AWS Health Maturity: A Practitioner's Guide
Accelerating AWS Health maturity improves security and resilience by helping teams proactively respond to changes that could cause service disruptions, security vulnerabilities, or cost increases. This implementation guide shares proven workflow patterns for systematically managing AWS Health events across distributed teams to achieve operational excellence.
Overview
Accelerating AWS Health maturity improves your security and resilience posture by helping your teams proactively respond to changes that could otherwise lead to service disruptions, security vulnerabilities, or billing increases. By integrating AWS Health events, especially planned or scheduled operational events, into your cloud operations processes, you can respond more smoothly to essential updates such as certificate rotations, end of standard support, and version upgrades, and your teams can address these changes independently.
Many organizations have developed mature workflow patterns that help distributed teams systematically address affected resources in an organized manner. This implementation guide shares proven approaches to help you systematically prioritize and resolve planned changes communicated by AWS Health, supporting your security and resilience posture through operational excellence.
Understanding organizational patterns for implementation
Your organizational operating model is the primary factor in determining the right implementation approach. We've identified three distinct operating models: Hub and Spoke, Cross-Functional, and Siloed.
To identify which pattern fits your organization, answer these questions:
1. Does your central platform team have mandate authority over operational standards and tools?
   - YES → Pattern 1: Hub and Spoke (Most Common)
   - NO → Continue to Question 2
2. Do you have multiple autonomous business units with independent P&L and tech leadership?
   - YES → Pattern 3: Siloed (Autonomous Units)
   - NO → Continue to Question 3
3. Are operations organized by functional specialty teams (database, networking, compute, etc.)?
   - YES → Pattern 2: Cross-Functional
   - NO → Review all three patterns below or check Hybrid Scenarios
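The three questions above form a short decision tree that is evaluated in order. A minimal sketch of that logic follows; the `OrgProfile` class and its field names are hypothetical and exist only to make the ordering explicit.

```python
from dataclasses import dataclass


@dataclass
class OrgProfile:
    # Hypothetical field names; each one maps to one of the three questions.
    central_team_has_mandate: bool
    autonomous_business_units: bool
    functional_specialty_teams: bool


def choose_pattern(profile: OrgProfile) -> str:
    """Walk the questions in order and return the first matching pattern."""
    if profile.central_team_has_mandate:
        return "Pattern 1: Hub and Spoke"
    if profile.autonomous_business_units:
        return "Pattern 3: Siloed"
    if profile.functional_specialty_teams:
        return "Pattern 2: Cross-Functional"
    return "No clear match: review all three patterns or the hybrid scenarios"
```

Because the questions are ordered, an organization that answers YES to Question 1 is treated as Hub and Spoke even if later questions would also match.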
Pattern 1: Hub and Spoke (Most Common)
Organizational Characteristics:
- Central platform team (a Cloud Center of Excellence (CCoE) or Platform Engineering team) with authority over operational standards
- Distributed DevOps/application teams responsible for building and deploying applications
- Shared/common tooling mandated across all operations teams (ITSM, monitoring, logging, etc.)
- Central team sets standards; distributed teams follow them
- Strong governance and compliance culture
- Clear ownership model with centralized coordination
Implementation Approach:
The central platform team integrates Health events into shared tooling (ITSM and other operational tooling) and mandates adoption across all distributed teams. Distributed teams consume the standardized operational capabilities that the platform team provides.
Why this is most common: Organizations with mature cloud practices typically evolve toward centralized operational standards while maintaining distributed application development.
Pattern 2: Cross-Functional
Organizational Characteristics:
- Functional teams organized by technology domain (database team, compute team, networking team, security team)
- Application teams responsible for building features and applications
- Operations handled by functional specialists (e.g., database team manages all RDS/databases across applications)
- Application teams own their features but rely on functional teams for operational concerns
- Tool choices may vary by functional team
Implementation Approach:
Health events are routed to the appropriate functional team based on AWS service type (RDS events → database team, EC2 events → compute team, etc.). Each functional team may use different tools and processes. Application teams are notified when their resources are affected, but functional teams handle remediation.
Key consideration: Requires coordination across functional and application team boundaries. Works best when functional teams have clear SLAs and escalation paths.
Pattern 3: Siloed (Autonomous Units)
Organizational Characteristics:
- Multiple independent units (business units, product lines, geographic regions) each with complete autonomy
- Each unit owns everything from development through operations
- Separate P&L ownership and independent technology leadership per unit
- Minimal central governance or mandate authority
- Different maturity levels and tool choices across units
- Each unit operates as a separate company within the larger organization
Implementation Approach:
Use a "lighthouse" strategy: start with the most mature or willing unit, demonstrate success, and let other units adopt voluntarily based on proven value. Do not force standardization. Each unit implements independently using its own tools and processes. A central team provides guidance and shares best practices but doesn't mandate adoption.
Key consideration: Success spreads through demonstration, not mandate. Focus on showing ROI and operational benefits to drive organic adoption.
What if your organization doesn't fit these patterns?
Many organizations find themselves in hybrid scenarios or transition states that don't clearly match one pattern. Here's how to proceed:
Common Hybrid Scenarios
Scenario 1: Partial Shared Tooling
- Example: "We have shared ServiceNow for compliance tracking, but teams choose their own monitoring tools (ITSM or other Operational tooling)."
- Recommendation: Start with Hub and Spoke for ITSM integration (the mandated tool) but allow flexibility for notification channels. Central team owns ticketing; distributed teams choose how they receive alerts.
Scenario 2: Regional Variation
- Example: "Our US operations follow Hub and Spoke, but EMEA is more autonomous."
- Recommendation: Implement pattern per region. Don't force uniformity. Hub and Spoke for US, Siloed for EMEA. Share learnings but respect regional operating models.
Scenario 3: Transitioning Between Patterns
- Example: "We're moving from Siloed to Hub and Spoke but not there yet."
- Recommendation: Start with Siloed + Level 2 for quick wins. As centralized tooling gets adopted, migrate units to Hub and Spoke + Level 3 incrementally. Don't wait for perfect centralization.
Scenario 4: Mixed Functional and Application Teams
- Example: "Our database and networking teams are centralized, but our app teams manage their own compute resources."
- Recommendation: Use hybrid routing: RDS/Aurora events → centralized database team (Cross-Functional), EC2/ECS events → distributed app teams via tags (Hub and Spoke).
Scenario 5: Small Organization Planning to Scale
- Example: "We're 200 accounts today but growing fast. What should we build?"
- Recommendation: Start simple with AWS User Notifications (Level 2) now. Plan for Hub and Spoke + Level 3 when you hit 500+ accounts or 20+ person ops team. Don't over-engineer early.
Implementation architecture patterns
The following section provides detailed architecture patterns for different combinations of organizational operating models and maturity levels. Choose tools your teams already use and are trained on.
Hub and Spoke + Level 2: Centralized Notifications
When to use: Hub-and-spoke organization wants to reduce email overload but isn't ready for full accountability tracking.
Architecture components:
- AWS Health API (organizational view)
- EventBridge rules filtering by event type
- Lambda function with basic routing logic
- Slack/Teams/Email notifications to central platform team
- No ITSM integration, no compliance dashboards
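As a rough illustration of the EventBridge and Lambda components listed above, the sketch below shows an event pattern and a handler that formats a notification. The `aws.health` source and `AWS Health Event` detail-type are what AWS Health publishes to EventBridge; narrowing to `scheduledChange`, the message layout, and the omitted delivery step are assumptions made for this example.

```python
import json

# Event pattern for an EventBridge rule. Restricting the rule to
# scheduledChange events is an illustrative choice, not a requirement.
SCHEDULED_CHANGE_PATTERN = {
    "source": ["aws.health"],
    "detail-type": ["AWS Health Event"],
    "detail": {"eventTypeCategory": ["scheduledChange"]},
}


def build_notification(event: dict) -> str:
    """Turn one Health event into a one-line message for a chat channel."""
    detail = event.get("detail", {})
    service = detail.get("service", "UNKNOWN")
    code = detail.get("eventTypeCode", "UNKNOWN")
    # Organizational-view events are assumed to carry affectedAccount;
    # fall back to the envelope account when it is absent.
    account = detail.get("affectedAccount", event.get("account", "unknown"))
    start = detail.get("startTime", "n/a")
    count = len(detail.get("affectedEntities", []))
    return (
        f"[{service}] {code} in account {account}, "
        f"starts {start}, {count} affected resource(s)"
    )


def lambda_handler(event, context):
    message = build_notification(event)
    # Delivery is intentionally left out: post `message` to whichever
    # Slack, Teams, or email channel the central platform team already uses.
    print(json.dumps({"message": message}))
    return {"message": message}
```

Keeping the formatting in a separate function makes it testable without any AWS credentials or network access.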
Hub and Spoke + Level 3: Enterprise Implementation (Full Stack)
When to use: Hub-and-spoke organizations with enterprise ITSM and resource tagging in place.
Architecture components:
- AWS Health API (organizational view)
- EventBridge + Lambda for event processing
- AWS Config for resource inventory
- Resource tags for ownership (Owner, Application, Environment)
- DynamoDB for routing rules
- ServiceNow/JIRA for ticketing
- Custom dashboards for compliance tracking
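One way the ownership tags and routing rules above could fit together is sketched below. The in-memory `ROUTING_RULES` dictionary stands in for the DynamoDB table, and the assignment group names and ticket fields are hypothetical; a real integration would map them to your ServiceNow or JIRA schema.

```python
from typing import Optional

# Stand-in for the DynamoDB routing table: Owner tag value -> ITSM group.
# All keys and group names here are hypothetical.
ROUTING_RULES = {
    "payments-team": "ITSM-PAYMENTS",
    "search-team": "ITSM-SEARCH",
}
DEFAULT_GROUP = "ITSM-PLATFORM-TRIAGE"


def build_ticket(event_type_code: str, resource_arn: str, tags: dict,
                 rules: Optional[dict] = None) -> dict:
    """Build a ticket payload for one affected resource from its tags."""
    rules = ROUTING_RULES if rules is None else rules
    owner = tags.get("Owner")
    return {
        "assignment_group": rules.get(owner, DEFAULT_GROUP),
        "summary": f"{event_type_code}: {resource_arn}",
        "application": tags.get("Application", "untagged"),
        "environment": tags.get("Environment", "untagged"),
        # Untagged resources and unknown owners go to central triage.
        "needs_triage": owner not in rules,
    }
```

Routing unknown owners to a triage queue rather than dropping them keeps untagged resources visible, which matters when accountability tracking is the goal.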
Siloed + Level 2: Per-Unit Notifications
When to use: Siloed organization where each autonomous unit wants independent notification setup.
Architecture components:
- Shared: AWS Health API + EventBridge org-level rules
- Per-Unit: EventBridge rules filtering for unit-specific accounts
- Per-Unit: Notification channels (Slack, Teams, or Email)
- No shared ITSM, no centralized tracking
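The per-unit filtering could be expressed as one EventBridge event pattern per unit, generated from a mapping of units to account IDs. This is a sketch only: the unit names and account IDs are made up, and it assumes that organization-level Health events carry the impacted account in `detail.affectedAccount`.

```python
from typing import Dict, List

# Hypothetical mapping of autonomous units to their AWS account IDs.
UNIT_ACCOUNTS: Dict[str, List[str]] = {
    "retail": ["111111111111", "222222222222"],
    "logistics": ["333333333333"],
}


def unit_event_pattern(account_ids: List[str]) -> dict:
    """Event pattern matching Health events for one unit's accounts only."""
    return {
        "source": ["aws.health"],
        "detail-type": ["AWS Health Event"],
        # Assumption: organizational-view events include affectedAccount.
        "detail": {"affectedAccount": sorted(account_ids)},
    }


def build_unit_patterns(unit_accounts: Dict[str, List[str]]) -> Dict[str, dict]:
    """One pattern per unit, so each unit can attach its own targets."""
    return {unit: unit_event_pattern(ids) for unit, ids in unit_accounts.items()}
```

Each resulting pattern would back a separate rule, leaving every unit free to point its rule at its own Slack, Teams, or email target.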
Siloed + Level 3: Per-Unit ITSM Integration
When to use: Siloed organization where units want full accountability tracking with autonomy.
Implementation approach:
- Start with most mature unit ("lighthouse")
- Document success and benefits
- Use success story to drive adoption in other units
- Allow tool variation across units (don't force standardization)
Each unit implements Hub and Spoke + Level 3 architecture independently with unit-specific tool choices.
Cross-Functional + Level 2: Service-Based Routing
When to use: Cross-functional organization with specialized teams (database team, compute team, networking team) handling operations.
Architecture components:
- AWS Health API
- EventBridge rules filtering by AWS service
- Lambda routing logic based on service type
- Slack/Teams channels per functional team
- No ITSM integration initially
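The Lambda routing logic for this pattern can be as small as a lookup from the event's service to a team channel. The sketch below assumes the service name is available in `detail.service`; the team names, channel names, and fallback are hypothetical.

```python
# Hypothetical mapping of AWS service codes to functional team channels.
SERVICE_TO_CHANNEL = {
    "RDS": "#database-team",
    "EC2": "#compute-team",
    "VPC": "#networking-team",
}
FALLBACK_CHANNEL = "#cloud-ops-triage"


def route_by_service(event: dict) -> str:
    """Return the channel for the functional team that owns the service."""
    service = event.get("detail", {}).get("service", "")
    # Normalize case so "rds" and "RDS" route to the same team.
    return SERVICE_TO_CHANNEL.get(service.upper(), FALLBACK_CHANNEL)
```

A fallback channel means events for services without an assigned functional team still reach someone instead of being silently dropped.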
Cross-Functional + Level 3: Service-Based ITSM Integration
When to use: Cross-functional organization wanting full accountability with service-specific routing to functional teams.
Architecture components:
- AWS Health API with EventBridge
- Service-based routing to functional team queues
- ITSM integration (JIRA, ServiceNow) with team-specific projects/queues
- Application teams cc'd on events affecting their resources
- SLA tracking per functional team
Key consideration: Requires coordination between functional teams (who remediate) and application teams (who own the workload). Clear escalation paths essential.
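SLA tracking per functional team can be sketched as a due-date calculation per ticket. The SLA windows below are invented for illustration, and capping the due date at the scheduled change start time is an assumption about how a team might want to treat planned events, not something this guide prescribes.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical SLA windows, in days, per functional team.
SLA_DAYS = {"database-team": 14, "compute-team": 7}
DEFAULT_SLA_DAYS = 10


def sla_status(team: str, opened_at: datetime, now: datetime,
               event_start: Optional[datetime] = None) -> dict:
    """Compute the due date for a ticket and whether it has been breached."""
    due = opened_at + timedelta(days=SLA_DAYS.get(team, DEFAULT_SLA_DAYS))
    # Assumption: remediation should never be due after the change itself.
    if event_start is not None and event_start < due:
        due = event_start
    return {"team": team, "due": due, "breached": now > due}


# Example: a compute ticket opened on 1 January, checked on 9 January.
opened = datetime(2024, 1, 1, tzinfo=timezone.utc)
example = sla_status("compute-team", opened, datetime(2024, 1, 9, tzinfo=timezone.utc))
```

In this example the seven-day window ends on 8 January, so the ticket is reported as breached when checked on 9 January.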
Getting started
This implementation guide provides proven patterns and best practices.
Remember:
- Start simple - Most organizations should target Level 2 first, then Level 3
- Organizational pattern matters more than account count - Structure drives architecture
- Level 3 is the sweet spot - Provides accountability without automation complexity
- Level 4 is optional - Most organizations find Level 3 sufficient
- Use tools your teams already know - Don't introduce new platforms unnecessarily
For additional resources and discussion, visit the AWS Health community forums.