How do I use CloudWatch Logs Insights to analyze custom Amazon VPC flow logs?

I used Amazon Virtual Private Cloud (Amazon VPC) Flow Logs to configure custom VPC flow logs. I want to use Amazon CloudWatch Logs Insights to discover patterns and trends within the logs.

Short description

CloudWatch Logs Insights automatically discovers flow logs that are in the default format, but doesn't automatically discover flow logs in the custom format.

To use CloudWatch Logs Insights with flow logs that are in the custom format, you must modify the queries.

The following is an example of a custom flow log format:

${account-id} ${vpc-id} ${subnet-id} ${interface-id} ${instance-id} ${srcaddr} ${srcport} ${dstaddr} ${dstport} ${protocol} ${packets} ${bytes} ${action} ${log-status} ${start} ${end} ${flow-direction} ${traffic-path} ${tcp-flags} ${pkt-srcaddr} ${pkt-src-aws-service} ${pkt-dstaddr} ${pkt-dst-aws-service} ${region} ${az-id} ${sublocation-type} ${sublocation-id}

The following queries are examples of how you can customize and extend queries to match your use cases.

Resolution

Retrieve the latest flow logs

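To extract data from your log fields, use the parse command. The pattern must contain one asterisk for each space-separated field in your custom format (27 in this example), and the aliases after the as keyword must follow the same order as the fields in the format. For example, the output from the following query is sorted by the flow log event start time and restricted to the two most recent log entries.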

Query

#Retrieve latest custom VPC Flow Logs
parse @message "* * * * * * * * * * * * * * * * * * * * * * * * * * *" as account_id, vpc_id, subnet_id, interface_id,instance_id, srcaddr, srcport, dstaddr, dstport, protocol, packets, bytes, action, log_status, start, end, flow_direction, traffic_path, tcp_flags, pkt_srcaddr, pkt_src_aws_service, pkt_dstaddr, pkt_dst_aws_service, region, az_id, sublocation_type, sublocation_id
| sort start desc
| limit 2

Output

account_id    vpc_id                 subnet_id                 interface_id           instance_id  srcaddr        srcport
123456789012  vpc-0b69ce8d04278ddd   subnet-002bdfe1767d0ddb0  eni-0435cbb62960f230e  -            172.31.0.104   55125
123456789012  vpc-0b69ce8d04278ddd1  subnet-002bdfe1767d0ddb0  eni-0435cbb62960f230e  -            91.240.118.81  49422
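
Writing the parse statement by hand is error-prone when the format has many fields. The following Python sketch derives the statement from the format string. It assumes that the format is space-separated, as in the example in this article. The function name build_parse_statement is a hypothetical helper and isn't part of any AWS tool.

```python
import re

# Custom format string copied from the example earlier in this article.
CUSTOM_FORMAT = (
    "${account-id} ${vpc-id} ${subnet-id} ${interface-id} ${instance-id} "
    "${srcaddr} ${srcport} ${dstaddr} ${dstport} ${protocol} ${packets} "
    "${bytes} ${action} ${log-status} ${start} ${end} ${flow-direction} "
    "${traffic-path} ${tcp-flags} ${pkt-srcaddr} ${pkt-src-aws-service} "
    "${pkt-dstaddr} ${pkt-dst-aws-service} ${region} ${az-id} "
    "${sublocation-type} ${sublocation-id}"
)


def build_parse_statement(log_format: str) -> str:
    """Return a parse statement for a space-separated flow log format (hypothetical helper)."""
    # Pull the name out of every ${...} placeholder, in order.
    names = re.findall(r"\$\{([a-z0-9-]+)\}", log_format)
    # Replace hyphens with underscores to match the aliases used in the queries.
    aliases = [name.replace("-", "_") for name in names]
    # One asterisk per field, separated by single spaces.
    pattern = " ".join("*" for _ in aliases)
    return f'parse @message "{pattern}" as {", ".join(aliases)}'


statement = build_parse_statement(CUSTOM_FORMAT)
print(statement)
```

If you change the custom format, regenerate the statement so that the number of asterisks and the order of the aliases stay in sync with the log records.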

Summarize data transfers by source and destination IP address pairs

Use the following query to summarize the network traffic by source and destination IP address pairs. In the example query, the sum statistic aggregates the bytes field to calculate the cumulative total of the data that's transferred between each pair of hosts. The query also groups by flow_direction, so ingress and egress totals are reported separately and the direction appears in the output. The results of the aggregation are assigned to the Data_Transferred alias. Then, the results are sorted by Data_Transferred in descending order, and the two largest pairs are returned.

Query

parse @message "* * * * * * * * * * * * * * * * * * * * * * * * * * *" as account_id, vpc_id, subnet_id, interface_id,instance_id, srcaddr, srcport, dstaddr, dstport, protocol, packets, bytes, action, log_status, start, end, flow_direction, traffic_path, tcp_flags, pkt_srcaddr, pkt_src_aws_service, pkt_dstaddr, pkt_dst_aws_service, region, az_id, sublocation_type, sublocation_id
| stats sum(bytes) as Data_Transferred by srcaddr, dstaddr, flow_direction
| sort by Data_Transferred desc
| limit 2

Output

srcaddr       dstaddr        flow_direction  Data_Transferred
172.31.1.247  3.230.172.154  egress          346952038
172.31.0.46   3.230.172.154  egress          343799447
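
To make the aggregation concrete, the following Python sketch reproduces the stats, sort, and limit steps locally. It only illustrates the grouping logic and isn't how CloudWatch Logs Insights runs the query. The records and the function name top_pairs are hypothetical.

```python
from collections import defaultdict

# Hypothetical parsed records: (srcaddr, dstaddr, flow_direction, bytes).
records = [
    ("10.0.0.1", "10.0.0.2", "egress", 500),
    ("10.0.0.1", "10.0.0.2", "egress", 700),
    ("10.0.0.3", "10.0.0.2", "egress", 900),
    ("10.0.0.2", "10.0.0.1", "ingress", 100),
]


def top_pairs(rows, limit=2):
    """Sum bytes per (srcaddr, dstaddr, flow_direction) and return the largest groups."""
    totals = defaultdict(int)
    for srcaddr, dstaddr, flow_direction, nbytes in rows:
        totals[(srcaddr, dstaddr, flow_direction)] += nbytes
    # Sort by total in descending order, then keep the first `limit` groups.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


for key, total in top_pairs(records):
    print(*key, total)
```

Because flow_direction is part of the grouping key, the ingress record forms its own group instead of being added to an egress total.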

Analyze data transfers by Amazon EC2 instance ID

You can use custom flow logs to analyze data transfers by Amazon Elastic Compute Cloud (Amazon EC2) instance ID. To determine the most active EC2 instances, include the instance_id field in the query.

Query

parse @message "* * * * * * * * * * * * * * * * * * * * * * * * * * *" as account_id, vpc_id, subnet_id, interface_id,instance_id, srcaddr, srcport, dstaddr, dstport, protocol, packets, bytes, action, log_status, start, end, flow_direction, traffic_path, tcp_flags, pkt_srcaddr, pkt_src_aws_service, pkt_dstaddr, pkt_dst_aws_service, region, az_id, sublocation_type, sublocation_id
| stats sum(bytes) as Data_Transferred by instance_id
| sort by Data_Transferred desc
| limit 5

Output

instance_id          Data_Transferred
-                    1443477306
i-03205758c9203c979  517558754
i-0ae33894105aa500c  324629414
i-01506ab9e9e90749d  198063232
i-0724007fef3cb06f3  54847643
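
To run a query such as the preceding one outside the console, you can call the CloudWatch Logs StartQuery and GetQueryResults API operations. The following Python sketch assumes a boto3-style client that's passed in by the caller, for example boto3.client("logs"). The function name run_insights_query and the polling interval are illustrative choices.

```python
import time


def run_insights_query(logs_client, log_group, query, start_time, end_time, poll_seconds=1.0):
    """Start a Logs Insights query and return its rows as a list of dicts."""
    query_id = logs_client.start_query(
        logGroupName=log_group,
        startTime=start_time,  # epoch seconds
        endTime=end_time,      # epoch seconds
        queryString=query,
    )["queryId"]

    # Poll until the query leaves the Scheduled and Running states.
    while True:
        response = logs_client.get_query_results(queryId=query_id)
        if response["status"] not in ("Scheduled", "Running"):
            break
        time.sleep(poll_seconds)

    if response["status"] != "Complete":
        raise RuntimeError(f"query ended with status {response['status']}")

    # Each result row is a list of {"field": ..., "value": ...} pairs.
    return [{cell["field"]: cell["value"] for cell in row} for row in response["results"]]
```

Values come back as strings, so convert Data_Transferred to an integer before you do arithmetic on it. Passing the client in as an argument also makes it possible to exercise the function with a stub instead of a live AWS account.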

Filter for rejected SSH traffic

To analyze the traffic that your security groups and network access control lists (network ACLs) denied, filter on the REJECT action. To identify hosts whose SSH traffic was rejected, extend the filter to include the TCP protocol and a destination port of 22. In the following example query, protocol number 6 identifies TCP.

Query

parse @message "* * * * * * * * * * * * * * * * * * * * * * * * * * *" as account_id, vpc_id, subnet_id, interface_id,instance_id, srcaddr, srcport, dstaddr, dstport, protocol, packets, bytes, action, log_status, start, end, flow_direction, traffic_path, tcp_flags, pkt_srcaddr, pkt_src_aws_service, pkt_dstaddr, pkt_dst_aws_service, region, az_id, sublocation_type, sublocation_id
| filter action = "REJECT" and protocol = 6 and dstport = 22
| stats sum(bytes) as SSH_Traffic_Volume by srcaddr
| sort by SSH_Traffic_Volume desc
| limit 2

Output

srcaddr        SSH_Traffic_Volume
23.95.222.129  160
179.43.167.74  80
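
The protocol field holds an IANA protocol number and not a name. The following Python sketch shows one way to build the same filter clause from a protocol name by using the constants in Python's socket module. The helper rejected_traffic_filter is hypothetical.

```python
import socket

# IANA protocol numbers as exposed by Python's socket module.
PROTOCOL_NUMBERS = {
    "icmp": int(socket.IPPROTO_ICMP),  # 1
    "tcp": int(socket.IPPROTO_TCP),    # 6
    "udp": int(socket.IPPROTO_UDP),    # 17
}


def rejected_traffic_filter(protocol_name: str, dstport: int) -> str:
    """Build a filter clause for rejected traffic to one destination port (hypothetical helper)."""
    protocol = PROTOCOL_NUMBERS[protocol_name.lower()]
    return f'filter action = "REJECT" and protocol = {protocol} and dstport = {dstport}'


print(rejected_traffic_filter("tcp", 22))
```

The same helper can produce a clause for other services, for example rejected_traffic_filter("udp", 53) for rejected DNS traffic over UDP.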

Isolate HTTP data stream for a specific source/destination pair

To analyze the trends in your data, use CloudWatch Logs Insights to isolate bidirectional traffic between two IP addresses. In the following query, the list ["172.31.1.247","172.31.11.212"] is matched against both srcaddr and dstaddr, so the query returns flow log events where either IP address is the source or the destination. The filter also matches only events with TCP protocol number 6 and a source or destination port of 80 to isolate HTTP traffic. To return a subset of all available fields, use the display command.

Query

#HTTP Data Stream for Specific Source/Destination Pair
parse @message "* * * * * * * * * * * * * * * * * * * * * * * * * * *" as account_id, vpc_id, subnet_id, interface_id,instance_id, srcaddr, srcport, dstaddr, dstport, protocol, packets, bytes, action, log_status, start, end, flow_direction, traffic_path, tcp_flags, pkt_srcaddr, pkt_src_aws_service, pkt_dstaddr, pkt_dst_aws_service, region, az_id, sublocation_type, sublocation_id
| filter srcaddr in ["172.31.1.247","172.31.11.212"] and dstaddr in ["172.31.1.247","172.31.11.212"] and protocol = 6 and (dstport = 80 or srcport=80)
| display interface_id,srcaddr, srcport, dstaddr, dstport, protocol, bytes, action, log_status, start, end, flow_direction, tcp_flags
| sort by start desc
| limit 2

Output

interface_id           srcaddr        srcport  dstaddr        dstport  protocol  bytes    action  log_status
eni-0b74120275654905e  172.31.11.212  80       172.31.1.247   29376    6         5160876  ACCEPT  OK
eni-0b74120275654905e  172.31.1.247   29376    172.31.11.212  80       6         97380    ACCEPT  OK
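
The filter logic can be easier to follow as a predicate. The following Python sketch applies the same conditions to records that are already parsed. The first two events mirror the output rows in this section, and the third is a hypothetical event that shouldn't match. The sketch assumes that ports and protocol numbers are already converted to integers.

```python
HOSTS = {"172.31.1.247", "172.31.11.212"}


def is_http_between_hosts(event: dict) -> bool:
    """Return True when both endpoints are in HOSTS and either port is 80 over TCP."""
    return (
        event["srcaddr"] in HOSTS
        and event["dstaddr"] in HOSTS
        and event["protocol"] == 6
        and (event["dstport"] == 80 or event["srcport"] == 80)
    )


events = [
    # Response from the web server (source port 80).
    {"srcaddr": "172.31.11.212", "srcport": 80, "dstaddr": "172.31.1.247", "dstport": 29376, "protocol": 6},
    # Request to the web server (destination port 80).
    {"srcaddr": "172.31.1.247", "srcport": 29376, "dstaddr": "172.31.11.212", "dstport": 80, "protocol": 6},
    # Hypothetical event to a host outside the pair: excluded.
    {"srcaddr": "172.31.1.247", "srcport": 29376, "dstaddr": "10.0.0.9", "dstport": 80, "protocol": 6},
]

for event in events:
    print(event["srcaddr"], "->", event["dstaddr"], is_http_between_hosts(event))
```

Checking both srcport and dstport is what captures the two directions of the same connection: requests carry port 80 as the destination, and responses carry it as the source.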

Visualize results as a bar or pie chart

You can use CloudWatch Logs Insights to visualize results as a bar or pie chart. If the query includes the bin() function, then the output includes a timestamp for each interval. You can then visualize the time series as a line or stacked area graph.

To calculate the cumulative data that's transferred in 1-minute intervals, use stats sum(bytes) as Data_Transferred by bin(1m). To see this visualization, toggle between the Logs and Visualization tabs in the CloudWatch Logs Insights console.

Query

parse @message "* * * * * * * * * * * * * * * * * * * * * * * * * * *" as account_id, vpc_id, subnet_id, interface_id,instance_id, srcaddr, srcport, dstaddr, dstport, protocol, packets, bytes, action, log_status, start, end, flow_direction, traffic_path, tcp_flags, pkt_srcaddr, pkt_src_aws_service, pkt_dstaddr, pkt_dst_aws_service, region, az_id, sublocation_type, sublocation_id
| filter srcaddr in ["172.31.1.247","172.31.11.212"] and dstaddr in ["172.31.1.247","172.31.11.212"] and protocol = 6 and (dstport = 80 or srcport=80)
| stats sum(bytes) as Data_Transferred by bin(1m)

Output

bin(1m)                  Data_Transferred
2022-04-01 15:23:00.000  17225787
2022-04-01 15:21:00.000  17724499
2022-04-01 15:20:00.000  1125500
2022-04-01 15:19:00.000  101525
2022-04-01 15:18:00.000  81376
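
The bucketing that bin(1m) performs can be sketched in a few lines of Python. In CloudWatch Logs Insights, bin() rounds the @timestamp of each event down to the interval. The sketch below applies the same idea to hypothetical (timestamp, bytes) pairs and is only an illustration of the grouping.

```python
from collections import defaultdict
from datetime import datetime, timezone


def sum_bytes_per_minute(events):
    """Group (epoch_seconds, bytes) pairs into 1-minute buckets and sum the bytes."""
    totals = defaultdict(int)
    for epoch_seconds, nbytes in events:
        bucket = epoch_seconds - (epoch_seconds % 60)  # floor to the start of the minute
        totals[bucket] += nbytes
    return dict(totals)


# Hypothetical events: (timestamp in epoch seconds, bytes).
events = [(1648826580, 100), (1648826599, 250), (1648826640, 40)]

# Print the newest bucket first, with the bucket start as a UTC timestamp.
for bucket, total in sorted(sum_bytes_per_minute(events).items(), reverse=True):
    print(datetime.fromtimestamp(bucket, tz=timezone.utc).isoformat(), total)
```

The first two events fall within the same minute and are summed into one bucket, which is why the query output shows one row per interval instead of one row per flow log event.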

Related information

Supported logs and discovered fields

CloudWatch Logs Insights query syntax

AWS OFFICIAL
Updated 9 months ago