Hello,
I see that the DDL you are using is for creating a table over S3 access logs, while the S3 location you are using in your query is where the CloudTrail logs are stored. I suspect this is why Athena is not able to read the files. Could you please cross-check this? You can refer to https://docs.aws.amazon.com/athena/latest/ug/cloudtrail-logs.html for creating a table in Athena over CloudTrail logs, and https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-s3-access-logs-to-identify-requests.html for creating a table in Athena over S3 access logs.
Below is the DDL for creating a table over CloudTrail logs in Athena. Please note that it is just for your reference; you can modify it according to your use case.
CREATE EXTERNAL TABLE [TABLE_NAME] (
    eventVersion STRING,
    userIdentity STRUCT<
        type: STRING,
        principalId: STRING,
        arn: STRING,
        accountId: STRING,
        invokedBy: STRING,
        accessKeyId: STRING,
        userName: STRING,
        sessionContext: STRUCT<
            attributes: STRUCT<mfaAuthenticated: STRING, creationDate: STRING>,
            sessionIssuer: STRUCT<type: STRING, principalId: STRING, arn: STRING, accountId: STRING, username: STRING>,
            ec2RoleDelivery: STRING,
            webIdFederationData: MAP<STRING,STRING>>>,
    eventTime STRING,
    eventSource STRING,
    eventName STRING,
    awsRegion STRING,
    sourceIpAddress STRING,
    userAgent STRING,
    errorCode STRING,
    errorMessage STRING,
    requestParameters STRING,
    responseElements STRING,
    additionalEventData STRING,
    requestId STRING,
    eventId STRING,
    resources ARRAY<STRUCT<arn: STRING, accountId: STRING, type: STRING>>,
    eventType STRING,
    apiVersion STRING,
    readOnly STRING,
    recipientAccountId STRING,
    serviceEventDetails STRING,
    sharedEventID STRING,
    vpcEndpointId STRING,
    tlsDetails STRUCT<tlsVersion: STRING, cipherSuite: STRING, clientProvidedHostHeader: STRING>
)
COMMENT 'CloudTrail table for [S3_BUCKET_NAME] bucket'
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '[S3_BUCKET_URL]'
TBLPROPERTIES ('classification'='cloudtrail');
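Once the table is created, you can query the tlsDetails column to see who is still making calls with outdated TLS versions. The query below is only a rough sketch: [TABLE_NAME] is the table created above, and the list of TLS versions treated as outdated is an assumption you should adjust for your own requirements.

SELECT
    userIdentity.arn AS caller_arn,
    eventSource,
    eventName,
    sourceIpAddress,
    tlsDetails.tlsVersion AS tls_version,
    COUNT(*) AS call_count
FROM [TABLE_NAME]
WHERE tlsDetails.tlsVersion IN ('TLSv1', 'TLSv1.1')  -- assumed set of outdated versions; adjust as needed
GROUP BY userIdentity.arn, eventSource, eventName, sourceIpAddress, tlsDetails.tlsVersion
ORDER BY call_count DESC;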
+1 to the above answer. Another option for you to try is the CloudTrail Lake service. You could then use the sample TLS query to pull a report of who is making calls using outdated TLS versions.
Please refer to the CloudTrail Lake pricing, and consider cleaning up or deleting the event data store after you pull this report if you don't want to incur ongoing costs.
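For reference, a CloudTrail Lake query for this would look roughly like the following. This is only a sketch: [EVENT_DATA_STORE_ID] is a placeholder for your event data store ID, and the exact sample TLS query shipped in the console may differ.

SELECT
    userIdentity.arn,
    eventSource,
    eventName,
    tlsDetails.tlsVersion,
    COUNT(*) AS call_count
FROM [EVENT_DATA_STORE_ID]  -- placeholder: replace with your event data store ID
WHERE tlsDetails.tlsVersion IN ('TLSv1', 'TLSv1.1')
  AND eventTime > '2023-01-01 00:00:00'  -- assumed time window; adjust as needed
GROUP BY userIdentity.arn, eventSource, eventName, tlsDetails.tlsVersion
ORDER BY COUNT(*) DESC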
Thanks to you both! This URL in particular was very valuable:
https://docs.aws.amazon.com/athena/latest/ug/cloudtrail-logs.html
This especially was helpful:
The example uses a LOCATION value of logs for a particular account, but you can use the degree of specificity that suits your application.
For example:
To analyze data from multiple accounts, you can roll back the LOCATION specifier to indicate all AWSLogs by using LOCATION 's3://MyLogFiles/AWSLogs/'.
To analyze data from a specific date, account, and Region, use LOCATION 's3://MyLogFiles/123456789012/CloudTrail/us-east-1/2016/03/14/'.
Knowing that I could point the table to a specific month of CloudTrail logs helped bypass a lot of unnecessary data and processing time when I finally ran my query in Athena.
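For anyone else reading this, the LOCATION clause I ended up with looked roughly like the line below; the account ID, Region, and month here are placeholders rather than my real values.

LOCATION 's3://[S3_BUCKET_NAME]/AWSLogs/123456789012/CloudTrail/us-east-1/2023/03/'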