Implementing comprehensive audit logging for Amazon Athena queries
This article demonstrates how to implement comprehensive audit logging for Amazon Athena queries with enriched user identity information. This solution helps organizations in regulated industries meet compliance requirements by providing detailed query audit trails with user attribution.
Introduction
In regulated industries such as healthcare, financial services, and government affairs, maintaining detailed audit trails of data access is not only a best practice, it's a compliance requirement. Organizations that are subject to regulations, such as HIPAA, SOX, GDPR, and FedRAMP, must be able to provide information on:
- Who accessed what data?
- When did the user access the data?
- Where did the user access the data from?
AWS Enterprise Support frequently partners with customers in these regulated industries to help them implement robust auditing solutions that meet stringent regulatory requirements.
Athena is a powerful serverless query service that allows organizations to use standard SQL to analyze data in Amazon Simple Storage Service (Amazon S3). While Athena provides basic query history, this information lacks the detailed user attribution and enriched metadata that’s required for comprehensive audit trails in regulated environments. Specifically, Athena's native query history doesn't capture the following information:
- User identity details: AWS Identity and Access Management (IAM) user or role names, Amazon Resource Names (ARNs), and principal IDs.
- Source IP addresses: Where queries originated from.
- User agent information: What tools or applications users accessed the information from.
- Enriched query metadata: Detailed execution statistics and error information.
- Long-term retention: Query history beyond Athena retention limits.
- SQL-queryable format: Ability to use Athena to analyze audit logs.
During compliance audits and security assessments, organizations must be able to answer questions such as:
- Who ran queries against our sensitive patient data last quarter?
- Which users accessed financial records from outside our corporate network?
Without enriched audit logging, organizations must manually correlate logs from multiple AWS services to answer these questions, which is a time-consuming and error-prone process.
This solution demonstrates how AWS Enterprise Support collaborates with customers to architect and implement enhanced audit logging solutions for Athena. Through a combination of Amazon EventBridge, AWS Lambda, AWS CloudTrail, and Amazon S3, this solution provides a comprehensive audit trail that captures every query with full user attribution and detailed metadata. The solution seamlessly integrates with Athena's existing architecture and provides a SQL-queryable audit log that compliance teams can easily analyze.
Solution overview
This solution implements a two-stage enrichment process that captures comprehensive audit information for Athena queries:
Real-time enrichment
When Athena completes a query, EventBridge triggers a Lambda function that:
- Fetches detailed query metadata from the Athena API, such as execution statistics, data scanned, and errors.
- Tries to look up user identity information from CloudTrail, including the IAM user or role, source IP address, and user agent.
  Note: This lookup might return null values if the CloudTrail events aren’t available yet.
- Combines all information into an enriched audit record.
- Stores the record in Amazon S3 with date-based partitioning.
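The following Python sketch illustrates how the first three steps could fit together. It is not the code from the solution's repository. The function names `query_id_from_event` and `build_audit_record` are hypothetical, and the output field names are assumptions chosen to match the columns used in the sample audit queries in this article. The input dictionaries follow the shapes of the EventBridge "Athena Query State Change" event, the `QueryExecution` object returned by the Athena `GetQueryExecution` API, and a parsed CloudTrail record. In a Lambda function, these inputs would typically come from the AWS SDK, and the finished record would be written to Amazon S3.

```python
from datetime import datetime, timezone
from typing import Optional

# States in which Athena has finished processing a query.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def query_id_from_event(event: dict) -> Optional[str]:
    """Return the query execution ID if the event marks a finished Athena query."""
    detail = event.get("detail", {})
    if event.get("source") != "aws.athena":
        return None
    if detail.get("currentState") not in TERMINAL_STATES:
        return None
    return detail.get("queryExecutionId")


def build_audit_record(query_execution: dict, trail_event: Optional[dict]) -> dict:
    """Merge Athena query metadata with CloudTrail identity data.

    trail_event is None when CloudTrail has not delivered the event yet,
    in which case the identity fields are left as None for a later backfill.
    """
    status = query_execution.get("Status", {})
    stats = query_execution.get("Statistics", {})
    trail = trail_event or {}
    identity = trail.get("userIdentity", {})
    # Assumed roles carry the role name under sessionContext.sessionIssuer.
    issuer = identity.get("sessionContext", {}).get("sessionIssuer", {})
    submitted = status.get("SubmissionDateTime")

    return {
        "query_execution_id": query_execution["QueryExecutionId"],
        "query_text": query_execution.get("Query"),
        "workgroup": query_execution.get("WorkGroup"),
        "state": status.get("State"),
        "error_message": status.get("StateChangeReason"),
        "submission_time": submitted.isoformat() if submitted else None,
        "data_scanned_bytes": stats.get("DataScannedInBytes"),
        "execution_time_ms": stats.get("EngineExecutionTimeInMillis"),
        "user_name": identity.get("userName") or issuer.get("userName"),
        "user_arn": identity.get("arn"),
        "principal_id": identity.get("principalId"),
        "source_ip": trail.get("sourceIPAddress"),
        "user_agent": trail.get("userAgent"),
    }


# Illustrative inputs only; all IDs and values below are made up.
sample_event = {
    "source": "aws.athena",
    "detail-type": "Athena Query State Change",
    "detail": {
        "currentState": "SUCCEEDED",
        "queryExecutionId": "11111111-2222-3333-4444-555555555555",
    },
}
sample_execution = {
    "QueryExecutionId": "11111111-2222-3333-4444-555555555555",
    "Query": "SELECT 1",
    "WorkGroup": "primary",
    "Status": {
        "State": "SUCCEEDED",
        "SubmissionDateTime": datetime(2024, 3, 7, 12, 0, tzinfo=timezone.utc),
    },
    "Statistics": {"DataScannedInBytes": 1048576, "EngineExecutionTimeInMillis": 850},
}
sample_trail_event = {
    "userIdentity": {
        "type": "IAMUser",
        "userName": "john.doe",
        "arn": "arn:aws:iam::123456789012:user/john.doe",
        "principalId": "AIDAEXAMPLE",
    },
    "sourceIPAddress": "10.0.0.12",
    "userAgent": "aws-cli/2.15.0",
}

pending = build_audit_record(sample_execution, None)
enriched = build_audit_record(sample_execution, sample_trail_event)
```

Here `pending` represents a record written before CloudTrail data is available (identity fields are `None`), and `enriched` represents the same record with full user attribution.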
Backfill process
Because CloudTrail events can take up to 15 minutes to appear in the LookupEvents API, some queries might initially have null user identity information. A scheduled Lambda function runs every 10 minutes to:
- Scan recent audit records for missing user identity data.
- Look up CloudTrail events that are now available.
- Update Amazon S3 audit records with complete user attribution.
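The backfill steps above can be sketched as a small, side-effect-free function. This is an illustration under stated assumptions, not the solution's actual implementation: `needs_backfill`, `backfill_records`, and the `lookup_identity` callable are hypothetical names, and the record field names are assumed. In a real deployment, `lookup_identity` would wrap a CloudTrail `LookupEvents` call; one plausible way to match events to queries is the query execution ID in the `StartQueryExecution` event, but this article doesn't specify the matching logic.

```python
from typing import Callable, List, Optional, Tuple


def needs_backfill(record: dict) -> bool:
    """A record needs backfill while its user identity is still missing."""
    return record.get("user_arn") is None


def backfill_records(
    records: List[dict],
    lookup_identity: Callable[[str], Optional[dict]],
) -> Tuple[List[dict], int]:
    """Return the records with identity data filled in where now available.

    lookup_identity receives a query execution ID and returns a dict of
    identity fields, or None if CloudTrail still has no matching event.
    """
    updated = []
    filled = 0
    for record in records:
        if needs_backfill(record):
            identity = lookup_identity(record["query_execution_id"])
            if identity is not None:
                record = {**record, **identity}  # copy; do not mutate the input
                filled += 1
        updated.append(record)
    return updated, filled


# Stand-in for CloudTrail: only one of the two pending queries is available.
fake_trail = {
    "q-1": {
        "user_name": "john.doe",
        "user_arn": "arn:aws:iam::123456789012:user/john.doe",
        "source_ip": "10.0.0.12",
    }
}
records = [
    {"query_execution_id": "q-1", "user_name": None, "user_arn": None},
    {"query_execution_id": "q-2", "user_name": None, "user_arn": None},
    {"query_execution_id": "q-3", "user_name": "jane.roe",
     "user_arn": "arn:aws:iam::123456789012:user/jane.roe"},
]

updated, filled = backfill_records(records, fake_trail.get)
```

Records that still have no CloudTrail match (such as `q-2` here) are left unchanged, so a later scheduled run can try again.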
This two-stage approach makes sure that the solution immediately creates audit records, while also providing complete user attribution when CloudTrail data becomes available.
The enriched audit records are stored in Amazon S3 with Hive-style partitioning, similar to the following example:
(`year=YYYY/month=MM/day=DD/`)
This partitioning makes the records immediately queryable through Athena. Compliance teams can then run SQL queries against the audit logs, such as the following:
```sql
-- Find all queries by a specific user in the last 30 days
SELECT query_execution_id, query_text, submission_time, data_scanned_bytes
FROM athena_audit_logs
WHERE user_name = 'john.doe'
  AND submission_time >= current_date - interval '30' day;

-- Identify queries from outside corporate network
SELECT user_name, source_ip, query_text, submission_time
FROM athena_audit_logs
WHERE source_ip NOT LIKE '10.%'
  AND submission_time >= current_date - interval '7' day;
```
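To show how the Hive-style `year=YYYY/month=MM/day=DD/` layout described earlier in this section might be produced, here is a minimal Python sketch that builds an S3 object key from a query's submission time. The function name `audit_object_key`, the `athena-audit` prefix, the `.json` suffix, and the choice of one object per query are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def audit_object_key(prefix: str, query_execution_id: str, submitted: datetime) -> str:
    """Build a Hive-style partitioned S3 key for one audit record.

    The timestamp is normalized to UTC so that a record always lands in the
    same partition regardless of the time zone it was created in.
    """
    submitted = submitted.astimezone(timezone.utc)
    return (
        f"{prefix.rstrip('/')}/"
        f"year={submitted:%Y}/month={submitted:%m}/day={submitted:%d}/"
        f"{query_execution_id}.json"
    )


key = audit_object_key(
    "athena-audit",
    "abc-123",
    datetime(2024, 3, 7, 23, 30, tzinfo=timezone.utc),
)
print(key)  # athena-audit/year=2024/month=03/day=07/abc-123.json
```

Zero-padding the month and day keeps partition names sortable and consistent with the `MM` and `DD` placeholders in the layout.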
Prerequisites
- AWS account with Athena workgroups configured.
- S3 bucket for storing audit records, or permissions to create an S3 bucket.
- IAM permissions to create Lambda functions, IAM roles, EventBridge rules, and CloudWatch log groups.
Solution implementation
For steps on how to implement the solution, including the Lambda function code and configuration examples, see Athena query audit solution on the GitHub website.
Conclusion
Through comprehensive audit logging for Athena queries, you can use existing AWS services to address complex compliance and security requirements. This solution, developed through collaboration between AWS Enterprise Support and customers in regulated industries, highlights the ability of serverless architectures to create robust audit trails without managing infrastructure.
Organizations can use this approach to achieve compliance with regulatory requirements while maintaining operational efficiency. For industries that deal with sensitive data, maintaining detailed audit trails isn’t optional. This solution provides the comprehensive query attribution needed to demonstrate compliance during audits.
The two-stage enrichment process immediately creates the audit records for real-time monitoring while providing complete user attribution when CloudTrail data becomes available. The solution is cost-effective, scalable, and non-intrusive, and doesn’t require any changes to existing Athena workloads or queries.
As your organization's compliance requirements evolve, you can easily extend this solution to:
- Add custom alerting: Use Amazon Simple Notification Service (Amazon SNS) to send notifications for suspicious query patterns.
- Implement data retention policies: Use Amazon S3 Lifecycle policies to archive old audit logs to Amazon S3 Glacier.
- Enhance with AWS Lake Formation: Integrate with Lake Formation for fine-grained access control auditing.
- Create compliance dashboards: Use Amazon QuickSight to visualize query patterns and user activity.
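As one example of these extensions, the data retention item could be expressed as an S3 Lifecycle configuration. The sketch below only builds the configuration document in Python. The helper name `glacier_lifecycle_config`, the rule ID, the `athena-audit/` prefix, and the 90-day threshold are assumptions, so choose values that match your own retention requirements. The resulting dictionary has the shape that the S3 `PutBucketLifecycleConfiguration` API expects.

```python
def glacier_lifecycle_config(prefix: str, archive_after_days: int) -> dict:
    """Build a lifecycle configuration that archives audit logs to S3 Glacier."""
    if archive_after_days < 1:
        raise ValueError("archive_after_days must be at least 1")
    return {
        "Rules": [
            {
                "ID": "archive-athena-audit-logs",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


# Assumed values: audit records under "athena-audit/", archived after 90 days.
config = glacier_lifecycle_config("athena-audit/", 90)
```

With the AWS SDK for Python, this dictionary could be passed as the `LifecycleConfiguration` argument of the S3 client's `put_bucket_lifecycle_configuration` method.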
AWS Enterprise Support can help organizations implement similar compliance and security solutions. Our team of Cloud Support Engineers (CSEs) and Technical Account Managers (TAMs) can provide tailored guidance, share industry best practices, and offer hands-on support to help you optimize your AWS environment for regulatory compliance. To learn more about our plans and offerings, see AWS Support.
About the authors
Sana Aneja
Sana is a Senior TAM at AWS located in New Jersey. She focuses on supporting independent software vendor customers, and helps them migrate, optimize, and navigate their journey in the AWS Cloud. When not at work, Sana likes to spend time with family, listen to music, travel, and explore new places.
Rashmiman Ray
Rashmiman is a TAM at AWS in New Jersey. He works with Enterprise customers to provide technical insights and cloud optimization strategies that drive their success in the cloud. Outside of work, Rashmiman enjoys hiking, playing cricket, and cooking Indian delicacies.