Hello,
I have currently setup cloudtrail logs to be query able in athena, however my create table looks something like
CREATE EXTERNAL TABLE cloudtrail(
eventVersion STRING,
userIdentity STRUCT<
type: STRING,
principalId: STRING,
arn: STRING,
accountId: STRING,
invokedBy: STRING,
accessKeyId: STRING,
userName: STRING,
sessionContext: STRUCT<
attributes: STRUCT<
mfaAuthenticated: STRING,
creationDate: STRING>,
sessionIssuer: STRUCT<
type: STRING,
principalId: STRING,
arn: STRING,
accountId: STRING,
userName: STRING>,
ec2RoleDelivery:string,
webIdFederationData:map<string,string>
>
>,
eventTime STRING,
eventSource STRING,
eventName STRING,
awsRegion STRING,
sourceIpAddress STRING,
userAgent STRING,
errorCode STRING,
errorMessage STRING,
requestparameters STRING,
responseelements STRING,
additionaleventdata STRING,
requestId STRING,
eventId STRING,
readOnly STRING,
resources ARRAY<STRUCT<
arn: STRING,
accountId: STRING,
type: STRING>>,
eventType STRING,
apiVersion STRING,
recipientAccountId STRING,
serviceEventDetails STRING,
sharedEventID STRING,
vpcendpointid STRING,
tlsDetails struct<
tlsVersion:string,
cipherSuite:string,
clientProvidedHostHeader:string>
)
PARTITIONED BY (accountid string,region string,year string,month string,day string)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
's3://aws-controltower-logs-423167154025-ca-central-1/o-1vy7fz11og/AWSLogs/'
TBLPROPERTIES (
'projection.accountid.range'='0,9999999999999',
'projection.accountid.type'='integer',
'projection.enabled'='true',
'projection.region.type'='enum',
'projection.region.values'='us-east-2,us-east-1,us-west-1,us-west-2,af-south-1,ap-east-1,ap-south-1,ap-northeast-3,ap-northeast-2,ap-southeast-1,ap-southeast-2,ap-northeast-1,ca-central-1,eu-central-1,eu-west-1,eu-west-2,eu-south-1,eu-west-3,eu-north-1,me-south-1,sa-east-1,eu-south-2',
'projection.year.format'='yyyy',
'projection.year.interval'='1',
'projection.year.interval.unit'='YEARS',
'projection.year.range'='2021,NOW',
'projection.year.type'='date',
'projection.month.format'='MM',
'projection.month.interval'='1',
'projection.month.interval.unit'='MONTHS',
'projection.month.range'='01,12',
'projection.month.type'='integer',
'projection.day.format'='dd',
'projection.day.interval'='1',
'projection.day.interval.unit'='DAYS',
'projection.day.range'='01,31',
'projection.day.type'='integer',
'projection.day.digits'='2',
'storage.location.template'='s3://aws-controltower-logs-423167154025-ca-central-1/o-1vy7fz11og/AWSLogs/${accountid}/CloudTrail/${region}/${year}/${month}/${day}')
While I am able to query the table however it compulsorily needs me to pass the account id in the where clause. no matter I add date and region in the where clause. Is this a caveat?
For example If I run the query:
select * from cloudtrail_db.cloudtrail where year='2023' and month='11' and day='01' and region='ca-central-1'
it gives me an error
Amazon Athena experienced an internal error while executing this query. Please try submitting the query again and if the issue reoccurs, contact AWS support for further assistance. We apologize for the inconvenience.
But if I run the same query as above along with account id it is able to query.