I am trying to isolate data transfer to specific sources (and destinations where possible)


I received an alert that my AWS Free Tier data transfer has reached 85% for the month of February, and it is only 2/6. I am not concerned about the charges or costs; I'm more concerned with what data was sent that accounted for 850MB in only 5 days (I received the cost alert first thing in the morning on 2/6, so the usage must be from 2/1 - 2/5). The alert was for "Global-DataTransfer-Regional-Bytes".

I've looked at the data transfer from my 2 load balancers and neither seems to account for such substantial data transfer. It is hard to be certain, though, because the usage graph ("Processed Bytes" specifically) does not show a cumulative byte count; it simply shows spikes when usage was high.

I am not well versed in the query tools that might be available to me to create a query to show this number, but I was hoping someone could point me in the direction of where that data might be coming from.

The bill for the month of February so far shows very little transfer from all the AWS data centers but shows a larger number (nearly 1GB) for "regional data transfer under the monthly global free tier." But if this data is not from the data centers or load balancers, where is it being sourced from?

Thanks! Ron

asked a year ago, 315 views
1 Answer

The most efficient way to find the source and destination of IP flows is through VPC Flow Logs. These can be published to an S3 bucket, and you can use tools like Athena to query them. Here is a full explanation of the setup [1]. I highly recommend setting up S3 Lifecycle policies so that the flow log data is deleted automatically after some amount of time (7 days, 30 days, whatever retention you deem appropriate).
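If you want a quick look before setting up Athena, you can also download a few log files and total them locally. Below is a minimal Python sketch, assuming the flow logs use the default (version 2) record format; the sample records and the function name `bytes_by_pair` are made up for illustration, not taken from your account.

```python
from collections import Counter

# Field order of the default (version 2) VPC Flow Logs record format.
FIELDS = ("version account_id interface_id srcaddr dstaddr srcport dstport "
          "protocol packets bytes start end action log_status").split()


def bytes_by_pair(lines):
    """Sum the bytes field per (srcaddr, dstaddr) pair."""
    totals = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # skip malformed or custom-format lines
        record = dict(zip(FIELDS, parts))
        if not record["bytes"].isdigit():
            # Header rows and NODATA/SKIPDATA records have no numeric byte count.
            continue
        totals[(record["srcaddr"], record["dstaddr"])] += int(record["bytes"])
    return totals


# Illustrative records only; the addresses and IDs are placeholders.
sample = [
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 20641 22 6 10 1000 1418530070 1418530130 ACCEPT OK",
    "2 123456789010 eni-1235b8ca123456789 172.31.9.69 172.31.9.12 49761 3389 6 20 4249 1418530010 1418530070 REJECT OK",
    "2 123456789010 eni-1235b8ca123456789 - - - - - - - 1431280876 1431280934 - NODATA",
]

totals = bytes_by_pair(sample)
for (src, dst), count in totals.most_common():
    print(f"{src} -> {dst}: {count} bytes")
```

Sorting by total bytes puts the heaviest source/destination pairs first, which is usually enough to see which resource is responsible for the bulk of the transfer.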

The "Regional Bytes" in the alert name means that it's probably cross-AZ data transfer traffic. If you determine the traffic is coming from or going to AWS services (there is an option in the VPC Flow Logs setup to include which AWS service is being addressed), then you may find that enabling VPC Endpoints [2] is a cost-effective way to avoid the data transfer costs and to let the VPC resources communicate directly with the AWS service in question. Take note that there are some costs associated with VPC Endpoints.
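To check the cross-AZ theory against flow log records, one rough approach is to map each address to the Availability Zone of its subnet and label every flow. The sketch below assumes you know your subnet CIDRs and their AZs; the `SUBNET_AZ` mapping, the AZ names, and the `classify` helper are hypothetical placeholders to replace with your own values.

```python
import ipaddress

# Hypothetical subnet-to-AZ mapping; replace with your VPC's real subnets.
SUBNET_AZ = {
    "10.0.1.0/24": "us-east-1a",
    "10.0.2.0/24": "us-east-1b",
}
NETWORKS = [(ipaddress.ip_network(cidr), az) for cidr, az in SUBNET_AZ.items()]


def az_of(addr):
    """Return the AZ of the subnet containing addr, or None if unmapped."""
    ip = ipaddress.ip_address(addr)
    for network, az in NETWORKS:
        if ip in network:
            return az
    return None


def classify(src, dst):
    """Label a flow as same-az, cross-az, or unmapped (outside known subnets)."""
    src_az, dst_az = az_of(src), az_of(dst)
    if src_az is None or dst_az is None:
        return "unmapped"
    return "same-az" if src_az == dst_az else "cross-az"


print(classify("10.0.1.5", "10.0.2.9"))
print(classify("10.0.1.5", "10.0.1.77"))
print(classify("10.0.1.5", "52.94.0.1"))
```

Flows labeled "unmapped" have at least one end outside the subnets you listed, so those are the ones to inspect for traffic to AWS services or the internet.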

[1] https://docs.aws.amazon.com/athena/latest/ug/vpc-flow-logs.html
[2] https://docs.aws.amazon.com/whitepapers/latest/aws-privatelink/what-are-vpc-endpoints.html

AWS EXPERT
answered a year ago
  • Thank you for your help, Shlomo. I will have to research and learn more about how to create an effective flow log to better understand where my data transfer utilization is coming from.

    I had hoped to find some simple mapping between the claim of 850MB of utilization and some AWS resource, like a load balancer, that I could directly correlate to the usage warning, but it doesn't seem quite so clear.

    I will research your suggestion. Thank you so much.
