How do I find the top contributors to NAT gateway traffic in my Amazon VPC?

I want to find the top contributors of traffic through the NAT gateway in my Amazon Virtual Private Cloud (Amazon VPC).

Short description

To find the top contributors of traffic through the NAT gateway in your Amazon VPC, complete the following tasks:

  • Use Amazon CloudWatch metrics to identify the time of traffic spikes.
  • Use CloudWatch Logs Insights to identify the instances that cause traffic spikes.
  • Use Amazon Simple Storage Service (Amazon S3) or Amazon Athena to identify the instances that cause traffic spikes.

Resolution

Note: In the following steps, replace the following values with your information:

  • example-NAT-private-IP with your NAT gateway private IP address
  • example-VPC-CIDR with your Amazon VPC CIDR
  • example-database-name.example-table-name with your database and table names
  • example-y.y with the first two octets of your Amazon VPC CIDR

Use CloudWatch metrics to identify the time of traffic spikes

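To monitor the NAT gateway and identify when traffic spikes occur, use the following CloudWatch metrics:

  • BytesInFromSource - upload (bytes that the NAT gateway receives from clients in your Amazon VPC)
  • BytesInFromDestination - download (bytes that the NAT gateway receives from the destination)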

Check that you turned on Amazon VPC Flow Logs for your Amazon VPC or NAT gateway elastic network interface. If you didn't turn on Amazon VPC Flow Logs, then create a flow log to turn this option on. When you turn on Amazon VPC Flow Logs, flow log data is published to either CloudWatch Logs or Amazon S3.
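As a rough illustration of the spike-finding step, the following Python sketch pulls one of the metrics above and flags the periods whose byte count is well above the median. It assumes that boto3 and AWS credentials are available for the fetch step. The function names, the 5-minute period, and the 3x-median threshold are illustrative choices, not part of this article.

```python
from datetime import datetime, timedelta, timezone


def fetch_datapoints(nat_gateway_id, metric_name, hours=24, period=300):
    """Fetch per-period byte sums for one NAT gateway metric.

    Needs boto3 and AWS credentials; imported lazily so find_spikes()
    can be used on its own.
    """
    import boto3

    end = datetime.now(timezone.utc)
    response = boto3.client("cloudwatch").get_metric_statistics(
        Namespace="AWS/NATGateway",
        MetricName=metric_name,  # for example "BytesInFromSource"
        Dimensions=[{"Name": "NatGatewayId", "Value": nat_gateway_id}],
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
        Period=period,
        Statistics=["Sum"],
    )
    return response["Datapoints"]


def find_spikes(datapoints, factor=3.0):
    """Return datapoints whose Sum exceeds factor x the median Sum, largest first.

    The threshold is an arbitrary heuristic for this sketch.
    """
    if not datapoints:
        return []
    sums = sorted(dp["Sum"] for dp in datapoints)
    median = sums[len(sums) // 2]
    spikes = [dp for dp in datapoints if dp["Sum"] > factor * median]
    return sorted(spikes, key=lambda dp: dp["Sum"], reverse=True)
```

The Timestamp values of the returned datapoints give candidate time ranges to use in the queries that follow.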

Use CloudWatch Logs Insights to identify the instances that cause traffic spikes

Note: Optionally, use a CloudFormation template to create a CloudWatch Dashboard that incorporates the following queries.

Complete the following steps:

  1. Open the CloudWatch console.

  2. In the navigation pane, choose Logs Insights.

  3. From the dropdown list, select the log group for your NAT gateway.

  4. Select a predefined time range, or choose Custom to set your own time range.

  5. To identify instances that send the most traffic through your NAT gateway, run the following query:

    filter (dstAddr like 'example-NAT-private-IP' and isIpv4InSubnet(srcAddr, 'example-VPC-CIDR'))
    | stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
    | sort bytesTransferred desc
    | limit 10
  6. To identify traffic that goes to and from the instances, run the following query:

    filter (dstAddr like 'example-NAT-private-IP' and isIpv4InSubnet(srcAddr, 'example-VPC-CIDR')) or (srcAddr like 'example-NAT-private-IP' and isIpv4InSubnet(dstAddr, 'example-VPC-CIDR'))
    | stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
    | sort bytesTransferred desc
    | limit 10
  7. To identify the internet destinations that the instances in your Amazon VPC communicate the most with, run the following queries.
    For uploads:

    filter (srcAddr like 'example-NAT-private-IP' and not isIpv4InSubnet(dstAddr, 'example-VPC-CIDR'))
    | stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
    | sort bytesTransferred desc
    | limit 10

    For downloads:

    filter (dstAddr like 'example-NAT-private-IP' and not isIpv4InSubnet(srcAddr, 'example-VPC-CIDR'))
    | stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
    | sort bytesTransferred desc
    | limit 10
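To make the logic of these queries concrete, here is a small Python sketch that applies the same filter, sum, sort, and limit steps to flow log records in the default 14-field format, for example lines exported from the log group. It covers the "instances that send the most traffic" case only. The function name is hypothetical, and the sketch assumes space-separated records in the default field order.

```python
from collections import Counter
from ipaddress import ip_address, ip_network


def top_talkers(lines, nat_private_ip, vpc_cidr, limit=10):
    """Sum bytes per (srcaddr, dstaddr) for VPC sources that send to the NAT gateway."""
    vpc = ip_network(vpc_cidr)
    totals = Counter()
    for line in lines:
        fields = line.split()
        # Default format has 14 fields; skip headers, NODATA, and SKIPDATA records.
        if len(fields) != 14 or fields[13] != "OK":
            continue
        srcaddr, dstaddr, num_bytes = fields[3], fields[4], fields[9]
        # Same condition as the filter clause: destination is the NAT gateway,
        # source is inside the VPC CIDR.
        if dstaddr == nat_private_ip and ip_address(srcaddr) in vpc:
            totals[(srcaddr, dstaddr)] += int(num_bytes)
    # Equivalent of "sort bytesTransferred desc | limit N".
    return totals.most_common(limit)
```

Each returned item is a ((srcaddr, dstaddr), total_bytes) pair, which mirrors the columns of the Logs Insights result.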

Use Amazon S3 or Athena to identify the instances that cause traffic spikes

Complete the following steps:

  1. Open the Amazon S3 console or the Athena console.

  2. Create a table for your flow logs, and note the database and table names. To check for the top contributors in a specific time range, add the following filters to your queries:
    start >= (example-timestamp-start)
    end <= (example-timestamp-end)

  3. To identify instances that send the most traffic through your NAT gateway, run the following query:

    SELECT srcaddr, dstaddr, sum(bytes) FROM example-database-name.example-table-name WHERE srcaddr like 'example-y.y.%' AND dstaddr like 'example-NAT-private-IP' group by 1,2 order by 3 desc limit 10;
  4. To identify traffic that goes to and from the instances, run the following query:

    SELECT srcaddr, dstaddr, sum(bytes) FROM example-database-name.example-table-name WHERE (srcaddr like 'example-y.y.%' AND dstaddr like 'example-NAT-private-IP') or (srcaddr like 'example-NAT-private-IP' AND dstaddr like 'example-y.y.%') group by 1,2 order by 3 desc limit 10;
  5. To identify the internet destinations that the instances in your Amazon VPC communicate the most with, run the following queries.
    For uploads:

    SELECT srcaddr, dstaddr, sum(bytes) FROM example-database-name.example-table-name WHERE (srcaddr like 'example-NAT-private-IP' AND dstaddr not like 'example-y.y.%') group by 1,2 order by 3 desc limit 10;

    For downloads:

    SELECT srcaddr, dstaddr, sum(bytes) FROM example-database-name.example-table-name WHERE (srcaddr not like 'example-y.y.%' AND dstaddr like 'example-NAT-private-IP') group by 1,2 order by 3 desc limit 10;
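When the placeholders change often, it can help to generate the query text rather than edit it by hand. The following Python sketch builds the "top senders" query with the time-range filters included, so that only records that start and end inside the window are counted. The helper names and the table name in the usage example are hypothetical, and the column names start and end are assumed to match your table definition.

```python
from ipaddress import ip_address, ip_network


def vpc_like_prefix(vpc_cidr):
    """Build a LIKE pattern from the first two octets of an IPv4 VPC CIDR.

    This mirrors the example-y.y placeholder. It matches the VPC exactly
    only for a /16 CIDR; for smaller CIDRs it also matches neighboring ranges.
    """
    octets = str(ip_network(vpc_cidr).network_address).split(".")
    return f"{octets[0]}.{octets[1]}.%"


def top_senders_sql(table, nat_private_ip, vpc_cidr, start_ts, end_ts, limit=10):
    """Return SQL for the instances that send the most bytes to the NAT gateway.

    Values are interpolated directly into the text, so pass trusted input only.
    """
    ip_address(nat_private_ip)  # raises ValueError if this is not an IP address
    return (
        "SELECT srcaddr, dstaddr, sum(bytes) AS bytes_transferred "
        f"FROM {table} "
        f"WHERE srcaddr LIKE '{vpc_like_prefix(vpc_cidr)}' "
        f"AND dstaddr = '{nat_private_ip}' "
        # Double quotes keep the column names from being read as keywords.
        f'AND "start" >= {int(start_ts)} AND "end" <= {int(end_ts)} '
        f"GROUP BY 1, 2 ORDER BY 3 DESC LIMIT {int(limit)}"
    )


# Hypothetical database and table names, with Unix timestamps for the window.
print(top_senders_sql("flowlogs_db.flowlogs", "10.0.0.10", "10.0.0.0/16",
                      1700000000, 1700003600))
```

The same pattern extends to the upload and download queries by swapping the WHERE conditions.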

Use CloudWatch Logs Insights to identify the instances communicating with internet destinations

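To find the internet destinations that the instances communicate with, you must use custom VPC flow logs that include the additional fields pkt-srcaddr and pkt-dstaddr. These fields record the original source and the final destination of each packet, which the srcaddr and dstaddr fields hide when traffic passes through the NAT gateway network interface. For more information, see Traffic through a NAT gateway.

Example custom flow log format: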

${version} ${account-id} ${interface-id} ${srcaddr} ${dstaddr} ${srcport} ${dstport} ${protocol} ${packets} ${bytes} ${start} ${end} ${action} ${log-status} ${pkt-srcaddr} ${pkt-dstaddr}

Complete the following steps:

  1. Open the CloudWatch console.
  2. In the navigation pane, choose Logs Insights.
  3. From the dropdown list, select the log group for your VPC flow logs.
  4. Select a predefined time range, or choose Custom to set your own time range.
  5. To identify instances that send the most traffic through your NAT gateway to internet destinations, run the following queries.
    For upload traffic:
    parse @message "* * * * * * * * * * * * * * * *" as version, account_id, interface_id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action, log_status, pkt_srcaddr, pkt_dstaddr
    | filter (dstaddr like 'example-NAT-private-IP' and isIpv4InSubnet(pkt_srcaddr, 'example-VPC-CIDR'))
    | stats sum(bytes) as bytesTransferred by pkt_srcaddr, pkt_dstaddr
    | sort bytesTransferred desc
    | limit 10
    For download traffic:
    parse @message "* * * * * * * * * * * * * * * *" as version, account_id, interface_id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action, log_status, pkt_srcaddr, pkt_dstaddr
    | filter (srcaddr like 'example-NAT-private-IP' and not isIpv4InSubnet(pkt_srcaddr, 'example-VPC-CIDR'))
    | stats sum(bytes) as bytesTransferred by pkt_srcaddr, pkt_dstaddr
    | sort bytesTransferred desc
    | limit 10
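The upload query can also be reproduced offline. The following Python sketch parses records in the 16-field custom format shown earlier and totals bytes per (pkt_srcaddr, pkt_dstaddr) pair. The function name and the optional internet_ip argument are hypothetical additions for illustration. The argument narrows the result to one public destination, which answers the question of which private IP addresses send the most bytes to a given internet address.

```python
from collections import Counter
from ipaddress import ip_address, ip_network

# Field order of the custom format, with hyphens replaced by underscores.
FIELDS = ("version account_id interface_id srcaddr dstaddr srcport dstport protocol "
          "packets bytes start end action log_status pkt_srcaddr pkt_dstaddr").split()


def _in_network(address, network):
    """Return True if address parses as an IP inside network."""
    try:
        return ip_address(address) in network
    except ValueError:  # "-" or other placeholder values
        return False


def top_upload_pairs(lines, nat_private_ip, vpc_cidr, internet_ip=None, limit=10):
    """Sum upload bytes per (pkt_srcaddr, pkt_dstaddr) for traffic sent to the NAT gateway."""
    vpc = ip_network(vpc_cidr)
    totals = Counter()
    for line in lines:
        values = line.split()
        if len(values) != len(FIELDS):
            continue
        record = dict(zip(FIELDS, values))
        if record["log_status"] != "OK" or record["dstaddr"] != nat_private_ip:
            continue
        if not _in_network(record["pkt_srcaddr"], vpc):
            continue
        if internet_ip is not None and record["pkt_dstaddr"] != internet_ip:
            continue
        totals[(record["pkt_srcaddr"], record["pkt_dstaddr"])] += int(record["bytes"])
    return totals.most_common(limit)
```

For download traffic, the same approach applies with the conditions from the download query: srcaddr equal to the NAT gateway private IP and pkt_srcaddr outside the VPC CIDR.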

Related information

Sample queries

Querying Amazon VPC flow logs

How do I use Amazon Athena to analyze VPC flow logs?

Using AWS Cost Explorer to analyze data transfer costs

AWS OFFICIAL
Updated a month ago
4 Comments

My NAT Usage per day is ~ 800 GB but using the above information, I was able to track 3GB of usage for a 2 day time period. I only have NAT Gateway logs in cloudwatch and not the VPC Flow logs. Also, there is a private IP 172.168.x.x which is doing a lot of data transfer but my VPC only has 10.0.x.x private IP range. AWS Resource finder service also was not able to track the IP to a service inside my account.

replied 2 years ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
MODERATOR
replied 2 years ago

How do I identify which of my private IPs sends the most bytes to public IP 1.2.3.4?

Using the instructions here I have identified 1.2.3.4 (example, not the real IP) as a destination for lots of bytes.

Now I want to know from which internal IP(s) this traffic is coming from.

Seems like these instructions only cover traffic to and from the NAT, but I want data for traffic through the NAT.

replied 4 months ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
MODERATOR
replied 4 months ago