Query JSON files from S3 with Athena

0

Hello, Can someone please help? I set up a trail to audit all TLS calls in the account and saved all logs in S3. I tried to query the logs from S3 with Athena. This is the query I created:

CREATE EXTERNAL TABLE cloudtrail_logs_tls_calls (
    eventVersion STRING,
    userIdentity STRUCT<
        type: STRING,
        principalId: STRING,
        arn: STRING,
        accountId: STRING,
        accessKeyId: STRING,
        sessionContext: STRUCT<
            sessionIssuer: STRUCT<
                type: STRING,
                principalId: STRING,
                arn: STRING,
                accountId: STRING,
                userName: STRING>>>,
    eventTime STRING,
    eventSource STRING,
    eventName STRING,
    awsRegion STRING,
    sourceIpAddress STRING,
    userAgent STRING,
    requestParameters STRUCT<
		maxResults: STRING>,
    responseElements STRING,
    requestId STRING,
    eventId STRING,
    eventType STRING,
    managementEvent STRING,
    recipientAccountId STRING,
    eventCategory STRING,
    tlsDetails STRUCT<
        tlsVersion: STRING,
        cipherSuite: STRING,
        clientProvidedHostHeader: STRING>
)
COMMENT 'CloudTrail table for <bucket_name> bucket'
ROW FORMAT SERDE 'com.amazon.emr.hive.serde.CloudTrailSerde'
STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://<bucket_name>/AWSLogs/<Acccount_Number>/CloudTrail/'
TBLPROPERTIES ('classification'='cloudtrail');

Then, when preview the table. I get this error:

Enter image description here

Thank you in advance for help,

4 個答案
2
已接受的答案

Amazon Athena lets you parse JSON-encoded values, extract data from JSON, search for values, and find length and size of JSON arrays.

You get this error if you used an incorrect SerDe during table definition. For example, the table might be using a JSON SerDe, and the source data includes Parquet objects. [1]

https://aws.amazon.com/premiumsupport/knowledge-center/athena-hive-cursor-error -- [1]

To resolve this error, check the source data and confirm that the correct SerDe is used. For more information, see Supported SerDes and data formats [2]

Link- https://docs.aws.amazon.com/athena/latest/ug/supported-serdes.html -- [2]

profile pictureAWS
支援工程師
已回答 2 年前
AWS
專家
已審閱 2 年前
1

Hi, Thanks for response. I changed the ROW FORMAT serde to 'org.openx.data.jsonserde.JsonSerDe'. Now I am getting the information I need. Really appreciate your help.

AWS
已回答 2 年前
1

No problem. Happy to help. :)

profile pictureAWS
支援工程師
已回答 2 年前
0

Hi There In addition to the above answer, are you able to see TLSDetails in Athena? Take a look at this blog post which mentions that TLSDetails isn't yet supported in Athena.

Amazon Athena: You can query AWS CloudTrail logs in Amazon Athena, and we will be adding support for querying the TLS values in your CloudTrail logs in the coming months. Look for updates and announcements about this in future AWS Security Blog posts.

https://aws.amazon.com/blogs/security/tls-1-2-required-for-aws-endpoints/

One recommendation is to use AWS CloudTrail Lake: You can follow the steps, and use the sample TLS query, in the blog post Using AWS CloudTrail Lake to identify older TLS connections. There is also a built-in sample CloudTrail TLS query available in the AWS CloudTrail Lake console.

profile pictureAWS
專家
Matt-B
已回答 2 年前

您尚未登入。 登入 去張貼答案。

一個好的回答可以清楚地回答問題並提供建設性的意見回饋,同時有助於提問者的專業成長。

回答問題指南