Skip to content

How do I use Athena to query my Application Load Balancer connection logs?

2 minute read
2

I want to use Amazon Athena to query my Application Load Balancer connection logs.

Resolution

To query Application Load Balancer connection logs, create a database and table for Athena. Complete the following steps:

  1. Open the Athena console.

  2. To create a database, run the following command in the Query Editor:

    CREATE DATABASE alb_db

    Note: It's a best practice to create the database in the same AWS Region as your Amazon Simple Storage Service (Amazon S3) bucket.

  3. After you create the database, create a table with the following query:

    CREATE EXTERNAL TABLE IF NOT EXISTS alb_connection_logs (  
             time string,  
             client_ip string,  
             client_port int,  
             listener_port int,  
             tls_protocol string,  
             tls_cipher string,  
             tls_handshake_latency double,  
             leaf_client_cert_subject string,  
             leaf_client_cert_validity string,  
             leaf_client_cert_serial_number string,  
             tls_verify_status string,  
             conn_trace_id string  
             )  
                PARTITIONED BY  
                (  
                 day STRING  
                )  
                ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'  
                WITH SERDEPROPERTIES (  
                'serialization.format' = '1',  
                'input.regex' =  
                 '([^ ]*) ([^ ]*) ([0-9]*) ([0-9]*) ([A-Za-z0-9.-]*) ([^ ]*) ([-.0-9]*) \"([^\"]*)\" ([^ ]*) ([^ ]*) ([^ ]*) ?([^ ]*)?( .*)?'  
                )  
                LOCATION 's3://S3-LOCATION/AWSLogs/ACCOUNT-NUMBER/elasticloadbalancing/REGION/'  
                TBLPROPERTIES  
                (  
                 "projection.enabled" = "true",  
                 "projection.day.type" = "date",  
                 "projection.day.range" = "2023/01/01,NOW",  
                 "projection.day.format" = "yyyy/MM/dd",  
                 "projection.day.interval" = "1",  
                 "projection.day.interval.unit" = "DAYS",  
                 "storage.location.template" = "s3://S3-LOCATION/AWSLogs/ACCOUNT-NUMBER/elasticloadbalancing/REGION/${day}"  
                )

    Note: Replace S3-LOCATION, ACCOUNT-NUMBER, and REGION with your S3 bucket values.

  4. To query the Application Load Balancer connection logs, use SQL statements.

Example queries

View the latest 100 occurrences of logs

SELECT *FROM alb_conn_logs  
ORDER by time desc  
LIMIT 100

Count occurrences where tls_verify_status was NOT 'Success' and group by client_IP

SELECT distinct client_ip, count() as count from alb_conn_logs WHERE tls_verify_status != 'Success'  
GROUP by client_ip  
ORDER by count() DESC;

View all occurrences where tls_handshake_latency is over two seconds in a specific time range

SELECT * FROM alb_conn_logs WHERE (parse_datetime(time,'yyyy-MM-dd''T''HH:mm:ss.SSSSSS''Z')  
BETWEEN parse_datetime('2024-01-01-00:00:00','yyyy-MM-dd-HH:mm:ss')  
AND parse_datetime('2024-03-20-00:00:00','yyyy-MM-dd-HH:mm:ss'))  
AND (tls_handshake_latency >= 2.0);

Check for connections with an exact certification validity date and group by leaf_client_cert_serial_number

SELECT DISTINCT leaf_client_cert_serial_number, count() AS count FROM alb_conn_logs  
WHERE tls_verify_status = 'Success' AND leaf_client_cert_validity like '%NotAfter=2025-01-01%'  
GROUP BY leaf_client_cert_serial_number  
ORDER BY count() DESC;

Related information

How do I use Athena to analyze my Application Load Balancer access logs?

Query Application Load Balancer logs

AWS OFFICIALUpdated 10 months ago