Create a Glue table (Presto/Athena view) for analyzing ALB logs using CloudFormation or Terraform

1

I am trying to create a Glue/Athena table using CloudFormation or Terraform for analyzing ALB logs. I can easily do it using AWS Console following this guide https://docs.aws.amazon.com/athena/latest/ug/application-load-balancer-logs.html However I can't find any recipe how to do it using CloudFormation or Terraform. I was trying to follow this example ( https://stackoverflow.com/a/56347331 ) but it just creates a Presto view, not using any regex to parse anything in S3 bucket.

How can I achieve it? Could anyone please point some example or documentation?

1 Answer
0

You could start from the Athena Cloudformation documentation page.

Look at the Named Query construct, there is also an example. If you are looking to just create the same table as for the link you mentioned in the question it should look like the code snippet below.

You can also see the cloudformation template from the Athena Workshop to see how the Named Query construct has been used in it. Hope this helps.

AthenaNamedQuery:
    Type: AWS::Athena::NamedQuery
    Properties:
      Database: "swfnetadata"
      Description: "A query that selects all aggregated data"
      Name: "MostExpensiveWorkflow"
      QueryString: >
                    CREATE EXTERNAL TABLE IF NOT EXISTS alb_logs (
                                type string,
                                time string,
                                elb string,
                                client_ip string,
                                client_port int,
                                target_ip string,
                                target_port int,
                                request_processing_time double,
                                target_processing_time double,
                                response_processing_time double,
                                elb_status_code int,
                                target_status_code string,
                                received_bytes bigint,
                                sent_bytes bigint,
                                request_verb string,
                                request_url string,
                                request_proto string,
                                user_agent string,
                                ssl_cipher string,
                                ssl_protocol string,
                                target_group_arn string,
                                trace_id string,
                                domain_name string,
                                chosen_cert_arn string,
                                matched_rule_priority string,
                                request_creation_time string,
                                actions_executed string,
                                redirect_url string,
                                lambda_error_reason string,
                                target_port_list string,
                                target_status_code_list string,
                                classification string,
                                classification_reason string
                                )
                                ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
                                WITH SERDEPROPERTIES (
                                'serialization.format' = '1',
                                'input.regex' = 
                            '([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*):([0-9]*) ([^ ]*)[:-]([0-9]*) ([-.0-9]*) ([-.0-9]*) ([-.0-9]*) (|[-0-9]*) (-|[-0-9]*) ([-0-9]*) ([-0-9]*) \"([^ ]*) (.*) (- |[^ ]*)\" \"([^\"]*)\" ([A-Z0-9-_]+) ([A-Za-z0-9.-]*) ([^ ]*) \"([^\"]*)\" \"([^\"]*)\" \"([^\"]*)\" ([-.0-9]*) ([^ ]*) \"([^\"]*)\" \"([^\"]*)\" \"([^ ]*)\" \"([^\s]+?)\" \"([^\s]+)\" \"([^ ]*)\" \"([^ ]*)\"')
            LOCATION 's3://your-alb-logs-directory/AWSLogs/<ACCOUNT-ID>/elasticloadbalancing/<REGION>/';
AWS
EXPERT
answered 3 years ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions