Centralize and analyze logs at scale using Amazon CloudWatch unified data store
Large enterprises that manage hundreds of AWS accounts face significant challenges with log aggregation, management, and analysis. This article covers how the Amazon CloudWatch unified data store, log centralization, and enablement rules work together to consolidate operational, security, and compliance logs from multiple accounts and Regions into one location, and how to run analytics on that data using CloudWatch Log Analytics, Amazon Athena, or any Apache Iceberg-compatible tool.
The challenge
Enterprise customers with several hundred AWS accounts typically struggle with:
- Fragmented log data spread across hundreds of accounts and multiple Regions, making incident investigation slow and error-prone.
- Custom-built aggregation pipelines where each source account needs its own subscription filter, cross-account IAM role, CloudWatch Logs destination resource, and a streaming layer (such as a Kinesis Data Stream or Firehose delivery stream per account) to forward logs to a central location. Adding a new account means repeating this setup, and a failed delivery in any single account requires individual troubleshooting.
- High operational overhead from onboarding new accounts, troubleshooting failed deliveries, and managing cross-account permissions.
- Multiple data stores for different use cases (operations, security, compliance), leading to data duplication, inconsistent formats, and complex ETL pipelines.
- Slow mean time to resolution (MTTR) because teams must search through multiple accounts and correlate events across different Regions during incidents.
Solution overview
Amazon CloudWatch provides three capabilities that address these challenges:
- Enablement rules automatically turn on logging for AWS resources across your organization, so you get consistent telemetry coverage without touching each resource individually.
- Centralization rules replicate log data from multiple accounts and Regions into a single destination account and integrate with AWS Organizations for scoping.
- Unified data store consolidates operational, security, and compliance data from AWS services and third-party sources into one location with built-in transformation pipelines, faceted queries, and Apache Iceberg-compatible access through Amazon S3 Tables.
These three capabilities remove the need for custom aggregation infrastructure and give you a single location for log analytics across your entire organization.
Architecture
The following diagram shows how logs flow from source accounts to a central account and become available for analytics:
┌─────────────────────────────────────────────┐
│ AWS Organization │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │Account A │ │Account B │ │Account N │...│
│ │us-east-1 │ │eu-west-1 │ │ap-south-1│ │
│ │ │ │ │ │ │ │
│ │• VPC Flow│ │• VPC Flow│ │• VPC Flow│ │
│ │• CTrail │ │• CTrail │ │• CTrail │ │
│ │• Route53 │ │• Route53 │ │• Route53 │ │
│ │• EKS Logs│ │• EKS Logs│ │• EKS Logs│ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ │
│ └─────────────┼─────────────┘ │
│ │ │
│ Centralization Rules │
│ (Org-wide / OU / account) │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────┐ │
│ │ Central Logging Account │ │
│ │ │ │
│ │ ┌────────────────────────────────┐ │ │
│ │ │ CloudWatch Unified Data Store │ │ │
│ │ │ • Pipelines │ │ │
│ │ │ • Connectors │ │ │
│ │ │ • Faceted queries │ │ │
│ │ └───────────┬──────────┬─────────┘ │ │
│ │ │ │ │ │
│ │ ▼ ▼ │ │
│ │ ┌───────────┐ ┌───────────────┐ │ │
│ │ │Log │ │S3 Tables │ │ │
│ │ │Analytics │ │(Iceberg) │ │ │
│ │ │• LogsQL │ │• Athena │ │ │
│ │ │• SQL │ │• Redshift │ │ │
│ │ │• PPL │ │• Spark │ │ │
│ │ └───────────┘ └───────────────┘ │ │
│ └──────────────────────────────────────┘ │
└─────────────────────────────────────────────┘
Understanding centralization rules and enablement rules
CloudWatch provides two types of organization-wide rules that work together for centralized logging.
Enablement rules
Enablement rules control whether log data is collected. They automatically configure telemetry collection on AWS resources.
- Turn on logging for new and existing resources that match the rule scope.
- Can be scoped to the organization, an OU, or an individual account.
- Support resource tag filters to target specific resources.
- Follow a hierarchy: organization-level rules take precedence, then OU-level, then account-level.
- Use AWS Config (via an internal service-linked recorder at no additional charge) to discover resources and track compliance.
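The precedence order described in the list above can be sketched in a few lines of Python. This is only an illustrative model: the `EnablementRule` class, its fields, and the `effective_rule` helper are hypothetical names, not the CloudWatch API, and the sketch assumes that when several rules match the same resource, the one with the broadest scope wins. It also ignores which accounts or OUs a rule covers and looks only at resource type and tags.

```python
from dataclasses import dataclass, field

# Lower number = higher precedence, following the hierarchy in the list above.
SCOPE_PRECEDENCE = {"organization": 0, "ou": 1, "account": 2}


@dataclass
class EnablementRule:
    """Hypothetical in-memory model of a rule; not the CloudWatch API shape."""
    name: str
    scope: str            # "organization", "ou", or "account"
    resource_type: str    # for example "vpc"
    tag_filter: dict = field(default_factory=dict)

    def matches(self, resource_type: str, tags: dict) -> bool:
        # A rule applies when the type matches and every filter tag is present.
        if resource_type != self.resource_type:
            return False
        return all(tags.get(k) == v for k, v in self.tag_filter.items())


def effective_rule(rules, resource_type, tags):
    """Return the matching rule with the highest precedence, or None."""
    candidates = [r for r in rules if r.matches(resource_type, tags)]
    if not candidates:
        return None
    return min(candidates, key=lambda r: SCOPE_PRECEDENCE[r.scope])


rules = [
    EnablementRule("acct-vpc", "account", "vpc"),
    EnablementRule("ou-prod-vpc", "ou", "vpc", {"env": "prod"}),
    EnablementRule("org-vpc", "organization", "vpc"),
]

winner = effective_rule(rules, "vpc", {"env": "prod"})
print(winner.name if winner else "no rule applies")
```

With all three rules present, the organization-level rule is selected for a production VPC. If only the account-level and OU-level rules exist, the OU rule wins for resources tagged `env=prod` and the account rule covers everything else.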
Centralization rules
Centralization rules control where log data is stored. They replicate logs from source accounts into a central destination account.
- Copy log data from multiple source accounts and Regions into a single destination account.
- Integrate with AWS Organizations. You can scope them to the entire organization, specific OUs, or individual accounts.
- Pick up new accounts automatically as they join the organization or OU.
- Add @aws.account and @aws.region system fields to every log event for traceability.
- Only process new log data arriving after rule creation (no backfill of historical data).
- No additional charge for the first copy to the primary destination Region.
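Two of the behaviors in the list above, the added @aws.account and @aws.region fields and the absence of backfill, are easy to picture with a small simulation. The following Python sketch is a toy model, not how the service is implemented: the `centralize` function, the event dictionaries, and the account ID are made up for illustration.

```python
from datetime import datetime, timezone


def centralize(events, account_id, region, rule_created_at):
    """Copy events that arrived after rule creation, tagging each with its origin."""
    copied = []
    for event in events:
        # No backfill: events older than the rule are left in the source account only.
        if event["timestamp"] < rule_created_at:
            continue
        # Build a new dict so the source event is not modified.
        copied.append({**event, "@aws.account": account_id, "@aws.region": region})
    return copied


rule_created_at = datetime(2025, 1, 10, tzinfo=timezone.utc)
source_events = [
    {"timestamp": datetime(2025, 1, 9, tzinfo=timezone.utc), "message": "old event"},
    {"timestamp": datetime(2025, 1, 11, tzinfo=timezone.utc), "message": "new event"},
]

central = centralize(source_events, "111122223333", "us-east-1", rule_created_at)
for event in central:
    print(event["@aws.account"], event["@aws.region"], event["message"])
```

Only the event that arrived after the rule was created reaches the central copy, and it carries the source account and Region so you can trace it back.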
How they work together
- Enablement rules make sure logs are being generated. For example, they guarantee that every VPC has flow logs turned on.
- Centralization rules then replicate those logs from each source account into your central logging account for unified analysis.
| Aspect | Enablement rules | Centralization rules |
|---|---|---|
| Purpose | Ensure logs are being generated on resources | Aggregate existing logs into a central account |
| Manages | Whether logs are collected | Where logs are stored |
| Configuration location | CloudWatch → Ingestion → Enablement rules | CloudWatch → Settings → Organization |
| Backfill | Applies to existing and new resources | No, only new data after rule creation |
| Integration | AWS Organizations + AWS Config | AWS Organizations |
For a step-by-step walkthrough of setting up centralization rules, see Simplifying Log Management using Amazon CloudWatch Logs Centralization.
The unified data store
Beyond log centralization, the CloudWatch unified data store provides managed ingestion from AWS and third-party sources through a growing catalog of connectors.
AWS sources: Organization-wide enablement is available for a range of AWS services including networking, security, and container logs. New data sources are added regularly. Check the CloudWatch console for the current list.
Third-party sources: Managed connectors cover categories such as endpoint security, identity providers, cloud security posture management, network security, productivity tools, and IT service management platforms. The connector catalog continues to expand.
Transformation pipelines: CloudWatch pipelines normalize and enrich data during ingestion. Processors include OCSF conversion for security data standardization, Grok parsing for unstructured logs, and field-level operations for enrichment. Apply transformations in the central account after logs are consolidated. Transformations applied in source accounts are not reflected in centralized logs.
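To make the idea of Grok parsing and field-level enrichment concrete, here is a rough Python equivalent using a regular expression with named groups. It is a local sketch of what such a processor does, not CloudWatch pipeline configuration. The log line format, the `parse_line` helper, and the `team` enrichment key are assumptions chosen for the example; the Grok pattern in the comment uses common Grok pattern names for comparison only.

```python
import re

# Rough regex equivalent of a Grok pattern such as
# "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:message}".
LINE_PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<message>.*)"
)


def parse_line(line, enrich=None):
    """Parse one unstructured line into fields; return None if it does not match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Field-level enrichment: add static keys such as a team or environment.
    fields.update(enrich or {})
    return fields


parsed = parse_line(
    "2025-01-11T08:15:00Z ERROR payment service timed out",
    {"team": "payments"},
)
print(parsed)
```

Once a line is split into named fields like this, you can filter and aggregate on `level` or `team` directly instead of pattern-matching the raw message in every query.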
S3 Tables integration: You can make centralized log data available as Apache Iceberg tables for querying with Athena, Redshift, SageMaker, or any Iceberg-compatible tool. Data flows to S3 Tables only after the association is created (no backfill), and retention matches the log group's retention policy.
Querying and analyzing centralized logs
Once logs are centralized, you can query across all accounts and Regions from a single interface using the @aws.account and @aws.region system fields.
CloudWatch Log Analytics (real-time operational queries)
Find errors across all accounts, broken down by Region and 5-minute interval:

    fields @timestamp, @aws.account, @aws.region, @message
    | filter @message like /(?i)(error|exception|failed)/
    | stats count() as error_count by @aws.account, @aws.region, bin(5m)
    | sort error_count desc

Cross-account security investigation:

    fields @timestamp, @aws.account, @aws.region, @message
    | filter @message like /(?i)(unauthorized|denied|forbidden)/
    | stats count() as denied_attempts by @aws.account, @aws.region
    | sort denied_attempts desc

Identify top talkers across VPC Flow Logs from all accounts:

    fields @timestamp, srcAddr, dstAddr, bytes, @aws.account
    | stats sum(bytes) as total_bytes by srcAddr, @aws.account
    | sort total_bytes desc
    | limit 20
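If you run queries like these programmatically, the results come back as a list of rows, where each row is a list of field/value pairs and every value is a string. The following Python sketch shows one way to roll up the per-Region, per-interval error counts from the first query into a total per account. It assumes the row shape returned by the CloudWatch Logs GetQueryResults API; the helper names and the sample rows are invented for illustration.

```python
def rows_to_dicts(results):
    """Convert result rows (lists of field/value pairs) into plain dictionaries."""
    return [{cell["field"]: cell["value"] for cell in row} for row in results]


def errors_by_account(results):
    """Sum error_count per account across Regions and time buckets."""
    totals = {}
    for row in rows_to_dicts(results):
        account = row["@aws.account"]
        # Values arrive as strings, so convert before adding.
        totals[account] = totals.get(account, 0) + int(row["error_count"])
    # Highest error count first.
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


# Made-up sample in the assumed result shape.
sample_results = [
    [
        {"field": "@aws.account", "value": "111122223333"},
        {"field": "@aws.region", "value": "us-east-1"},
        {"field": "error_count", "value": "7"},
    ],
    [
        {"field": "@aws.account", "value": "111122223333"},
        {"field": "@aws.region", "value": "eu-west-1"},
        {"field": "error_count", "value": "3"},
    ],
    [
        {"field": "@aws.account", "value": "444455556666"},
        {"field": "@aws.region", "value": "ap-south-1"},
        {"field": "error_count", "value": "12"},
    ],
]

print(errors_by_account(sample_results))
```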
Amazon Athena (cross-source correlation via S3 Tables)
After enabling S3 Tables integration, you can query logs using standard SQL. Table names depend on your data source configuration. Check the AWS Glue Data Catalog for the actual names created by the S3 Tables integration.
    -- Correlate network traffic with API activity from a suspicious IP range
    SELECT
        v.srcaddr,
        v.dstaddr,
        v.bytes,
        c.eventsource,
        c.eventname,
        c.useridentity.arn
    FROM vpc_flow_logs v
    JOIN cloudtrail_logs c
        ON v.srcaddr = c.sourceipaddress
    WHERE v.srcaddr LIKE '198.51.100.%'
        AND v.start_time > current_timestamp - interval '24' hour
    ORDER BY v.start_time DESC;

    -- Monthly log volume analysis by account and data source
    -- POWER returns a double, which avoids integer division truncating approx_gb
    SELECT
        account_id,
        data_source_type,
        DATE_TRUNC('month', event_time) AS month,
        COUNT(*) AS event_count,
        SUM(LENGTH(message)) / POWER(1024, 3) AS approx_gb
    FROM unified_logs
    GROUP BY account_id, data_source_type, DATE_TRUNC('month', event_time)
    ORDER BY approx_gb DESC;
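To see what the correlation join in the first Athena query returns without touching real data, you can try a simplified version locally. The sketch below uses Python's built-in sqlite3 module with a few made-up rows. It is not Athena: the time filter and the nested useridentity.arn column are left out because SQLite handles them differently, and the table and column names are the same illustrative ones used in the queries above.

```python
import sqlite3

# In-memory database with two tiny, made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vpc_flow_logs (srcaddr TEXT, dstaddr TEXT, bytes INTEGER);
    CREATE TABLE cloudtrail_logs (sourceipaddress TEXT, eventsource TEXT, eventname TEXT);

    INSERT INTO vpc_flow_logs VALUES
        ('198.51.100.7', '10.0.0.5', 4200),
        ('203.0.113.9',  '10.0.0.8', 900);

    INSERT INTO cloudtrail_logs VALUES
        ('198.51.100.7', 's3.amazonaws.com',  'GetObject'),
        ('203.0.113.9',  'iam.amazonaws.com', 'ListUsers');
""")

# Join network flows to API calls made from the same source IP,
# restricted to the suspicious 198.51.100.0/24 range.
rows = conn.execute("""
    SELECT v.srcaddr, v.dstaddr, v.bytes, c.eventsource, c.eventname
    FROM vpc_flow_logs v
    JOIN cloudtrail_logs c ON v.srcaddr = c.sourceipaddress
    WHERE v.srcaddr LIKE '198.51.100.%'
""").fetchall()

for row in rows:
    print(row)
```

Only the flow from 198.51.100.7 survives the filter, paired with the API call made from the same address. That pairing is what lets you connect network traffic to the actions taken from it.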
Natural language queries
CloudWatch Log Analytics supports natural language querying. You can type questions like:
- "Show me the top 5 accounts with the most errors in the last hour"
- "List failed authentication attempts grouped by Region"