Create a Glue table (Presto/Athena view) for analyzing ALB logs using CloudFormation or Terraform
I am trying to create a Glue/Athena table using CloudFormation or Terraform for analyzing ALB logs. I can easily do it using AWS Console following this guide https://docs.aws.amazon.com/athena/latest/ug/application-load-balancer-logs.html However I can't find any recipe how to do it using CloudFormation or Terraform. I was trying to follow this example ( https://stackoverflow.com/a/56347331 ) but it just creates a Presto view, not using any regex to parse anything in S3 bucket.
How can I achieve it? Could anyone please point some example or documentation?
You could start from the Athena Cloudformation documentation page.
Look at the Named Query construct, there is also an example. If you are looking to just create the same table as for the link you mentioned in the question it should look like the code snippet below.
You can also see the cloudformation template from the Athena Workshop to see how the Named Query construct has been used in it. Hope this helps.
AthenaNamedQuery:
Type: AWS::Athena::NamedQuery
Properties:
Database: "swfnetadata"
Description: "A query that selects all aggregated data"
Name: "MostExpensiveWorkflow"
QueryString: >
CREATE EXTERNAL TABLE IF NOT EXISTS alb_logs (
type string,
time string,
elb string,
client_ip string,
client_port int,
target_ip string,
target_port int,
request_processing_time double,
target_processing_time double,
response_processing_time double,
elb_status_code int,
target_status_code string,
received_bytes bigint,
sent_bytes bigint,
request_verb string,
request_url string,
request_proto string,
user_agent string,
ssl_cipher string,
ssl_protocol string,
target_group_arn string,
trace_id string,
domain_name string,
chosen_cert_arn string,
matched_rule_priority string,
request_creation_time string,
actions_executed string,
redirect_url string,
lambda_error_reason string,
target_port_list string,
target_status_code_list string,
classification string,
classification_reason string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1',
'input.regex' =
'([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*):([0-9]*) ([^ ]*)[:-]([0-9]*) ([-.0-9]*) ([-.0-9]*) ([-.0-9]*) (|[-0-9]*) (-|[-0-9]*) ([-0-9]*) ([-0-9]*) \"([^ ]*) (.*) (- |[^ ]*)\" \"([^\"]*)\" ([A-Z0-9-_]+) ([A-Za-z0-9.-]*) ([^ ]*) \"([^\"]*)\" \"([^\"]*)\" \"([^\"]*)\" ([-.0-9]*) ([^ ]*) \"([^\"]*)\" \"([^\"]*)\" \"([^ ]*)\" \"([^\s]+?)\" \"([^\s]+)\" \"([^ ]*)\" \"([^ ]*)\"')
LOCATION 's3://your-alb-logs-directory/AWSLogs/<ACCOUNT-ID>/elasticloadbalancing/<REGION>/';
Relevant questions
Cloudformation support for creating configurations
asked 3 years agoGlue processing a csv
Accepted Answerasked 2 years agoAWS firewall manager and custom protocol lists
asked 3 months agoPossible to use nest json when configuring rule target constant input
asked 2 years agoWhat's the meaning of an AWS Glue Database name in italic?
asked 2 months agoCreate a Glue table (Presto/Athena view) for analyzing ALB logs using CloudFormation or Terraform
asked 5 months agoWebSocket handshake: Unexpected response code: 403
asked 2 years agoRacing condition of IAM role creation and AWS resources
Accepted Answerasked 3 years agoCan we create a Table in Aurora during the infra setup using terraform IaaC code ?
Accepted Answerasked 21 days agoHow to create Athena View using CDK
Accepted Answerasked 4 months ago
Hi Aleksei,
See if these templates could be of any help to you. You definitely need to customize it for your requirement.
https://github.com/garystafford/athena-glue-quicksight-demo/blob/master/cloudformation/smart-hub-athena-glue.yml https://gist.github.com/gcchaan/bd4f74034aa2e10b869dc7653cba2807
Thanks Rimpal Johal