Reducing networking costs for NATGateway-Bytes and DataTransfer-Regional-Bytes


Use case and context: We use Databricks, with a Databricks workspace in a specific region that reads and writes files from/to S3 in the same region. We also read from a Databricks shared catalog owned by a different company, a data provider, which points to multi-region S3 buckets.

As a result, we are incurring high NATGateway-Bytes and DataTransfer-Regional-Bytes charges.

Measures that we took to reduce cost: We set up an S3 Gateway Endpoint to route traffic between the Databricks-managed instances in private subnets and any S3 bucket in the same region. The idea was that this should reduce cost when reading from and writing to our own S3 buckets in the same region, and when reading from any multi-region buckets, but we still see no reduction in NATGateway-Bytes and DataTransfer-Regional-Bytes costs. We are also monitoring the network with VPC Flow Logs and haven't seen any reduction in public traffic: the volume to/from the Internet is more or less the same. Shouldn't the S3 Gateway Endpoint redirect at least the traffic to our buckets in the same region?

Are these costs inevitable? What could be wrong in our networking setup? Is there any other alternative?

2 Answers
Accepted Answer

Thanks @Andy_P for all the suggestions.

After a few days of monitoring NAT cost, I realized that the S3 Gateway Endpoint was actually working. The problem was that I expected the change to show up in costs right away, but it can take a bit more than 24 hours to become visible in AWS Cost Explorer.

From AWS docs: https://docs.aws.amazon.com/cost-management/latest/userguide/ce-exploring-data.html

All costs reflect your usage up to the previous day. For example, if today is December 2, the data includes your usage through December 1.
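As a rough illustration of that lag, the sketch below computes the earliest day on which a full day of post-change usage could show up. The helper name `first_day_to_check` and the deployment date are hypothetical, and it assumes Cost Explorer has already refreshed on the day you look.

```python
from datetime import date, timedelta


def first_day_to_check(deploy_date: date) -> date:
    """Earliest day on which Cost Explorer can show a full day of
    usage recorded after the change (hypothetical helper)."""
    # First complete calendar day with the endpoint in place.
    first_full_day = deploy_date + timedelta(days=1)
    # Cost Explorer covers usage up to the previous day, so that
    # full day becomes visible the day after it ends.
    return first_full_day + timedelta(days=1)


# Hypothetical deployment date, used only for illustration.
print(first_day_to_check(date(2024, 12, 2)))
```

Comparing that day against a full day from before the change avoids reading a partial day as a cost reduction.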

We already had VPC Flow Logs enabled in the VPC, so using the following queries in CloudWatch Logs Insights, I saw some reduction on the first day, but I wasn't sure whether it was a real reduction or just a coincidental dip in traffic:

# 10.0.0.1: NAT private IP, 10.0.0.0/16: VPC CIDR; replace with your own values
# run each query separately
# downloads in total
filter (dstAddr = '10.0.0.1' and not isIpv4InSubnet(srcAddr, '10.0.0.0/16')) | stats sum(bytes) as bytesTransferred
# uploads in total
filter (srcAddr = '10.0.0.1' and not isIpv4InSubnet(dstAddr, '10.0.0.0/16')) | stats sum(bytes) as bytesTransferred

So I needed to confirm that all inbound/outbound traffic between the subnets and S3 was actually going through the S3 Gateway Endpoint. After some research, I found the list of all AWS IP ranges at https://docs.aws.amazon.com/vpc/latest/userguide/aws-ip-ranges.html, and used this simple script to extract only the S3 IP ranges:

import json

# Load the AWS IP ranges JSON file (saved locally as aws-ips.json)
with open('aws-ips.json') as file:
    ip_ranges = json.load(file)

# Keep only the IPv4 prefixes for S3 in a specific region, e.g., us-east-1
# (the loop variable is named "entry" to avoid shadowing the built-in range)
s3_ips = [entry["ip_prefix"] for entry in ip_ranges["prefixes"]
          if entry["service"] == "S3" and entry["region"] == "us-east-1"]

print(s3_ips)
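To sanity-check a single address seen in Flow Logs against that list, Python's standard `ipaddress` module can test membership. This is only a minimal sketch: `is_s3_address` is a hypothetical helper, and the two prefixes are a small sample taken from the us-east-1 ranges used in this answer.

```python
import ipaddress


def is_s3_address(addr: str, s3_prefixes: list[str]) -> bool:
    """Return True if addr falls inside any of the given S3 CIDR prefixes."""
    ip = ipaddress.ip_address(addr)
    return any(ip in ipaddress.ip_network(prefix) for prefix in s3_prefixes)


# Sample prefixes only; in practice pass the full list printed by the script above.
sample_prefixes = ["52.216.0.0/15", "3.5.0.0/19"]

print(is_s3_address("52.217.10.20", sample_prefixes))  # inside 52.216.0.0/15
print(is_s3_address("10.0.3.7", sample_prefixes))      # private VPC address
```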

With those ranges, I was able to write more precise Logs Insights queries to check whether there was still any traffic between our NAT and S3:

# run each query separately; 10.0.0.1 is the NAT private IP
# downloads from S3
filter (
    dstAddr = '10.0.0.1' and (
        isIpv4InSubnet(srcAddr, '18.34.0.0/19')
        or isIpv4InSubnet(srcAddr, '54.231.0.0/16')
        or isIpv4InSubnet(srcAddr, '52.216.0.0/15')
        or isIpv4InSubnet(srcAddr, '18.34.232.0/21')
        or isIpv4InSubnet(srcAddr, '16.182.0.0/16')
        or isIpv4InSubnet(srcAddr, '3.5.0.0/19')
        or isIpv4InSubnet(srcAddr, '44.192.134.240/28')
        or isIpv4InSubnet(srcAddr, '44.192.140.64/28')
    )
) | stats sum(bytes) as bytesTransferred
# uploads to S3
filter (
    srcAddr = '10.0.0.1' and (
        isIpv4InSubnet(dstAddr, '18.34.0.0/19')
        or isIpv4InSubnet(dstAddr, '54.231.0.0/16')
        or isIpv4InSubnet(dstAddr, '52.216.0.0/15')
        or isIpv4InSubnet(dstAddr, '18.34.232.0/21')
        or isIpv4InSubnet(dstAddr, '16.182.0.0/16')
        or isIpv4InSubnet(dstAddr, '3.5.0.0/19')
        or isIpv4InSubnet(dstAddr, '44.192.134.240/28')
        or isIpv4InSubnet(dstAddr, '44.192.140.64/28')
    )
) | stats sum(bytes) as bytesTransferred

After running these queries, I confirmed that there was no traffic, neither downloads nor uploads, between the NAT and S3 right after the S3 Gateway Endpoint was deployed.
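Because the published S3 prefixes change over time, the filters could also be generated from the script's output instead of being typed by hand. The following is a minimal sketch, not the method used above: `build_s3_query` is a hypothetical helper, `10.0.0.1` is the same placeholder NAT address, and the generated query uses an exact `=` match on that address. It only builds the query text and does not call CloudWatch.

```python
def build_s3_query(nat_ip: str, s3_prefixes: list[str], direction: str) -> str:
    """Build a Logs Insights query summing bytes between the NAT and S3.

    direction: "download" (S3 -> NAT) or "upload" (NAT -> S3).
    """
    if direction == "download":
        nat_field, s3_field = "dstAddr", "srcAddr"
    elif direction == "upload":
        nat_field, s3_field = "srcAddr", "dstAddr"
    else:
        raise ValueError("direction must be 'download' or 'upload'")
    if not s3_prefixes:
        raise ValueError("s3_prefixes must not be empty")

    # One isIpv4InSubnet() check per prefix, joined with "or".
    subnet_checks = "\n        or ".join(
        f"isIpv4InSubnet({s3_field}, '{prefix}')" for prefix in s3_prefixes
    )
    return (
        "filter (\n"
        f"    {nat_field} = '{nat_ip}' and (\n"
        f"        {subnet_checks}\n"
        "    )\n"
        ") | stats sum(bytes) as bytesTransferred"
    )


# Sample prefixes only; pass the full s3_ips list in practice.
print(build_s3_query("10.0.0.1", ["52.216.0.0/15", "3.5.0.0/19"], "download"))
```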

NOTE: A great tool worth mentioning is AWS Reachability Analyzer, which I used to check connectivity between the instances' ENIs in the private subnets and the S3 Gateway Endpoint.

answered 2 months ago
EXPERT
reviewed a month ago

Please see this post to help troubleshoot the local S3 Gateway Endpoint.
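Independently of the linked post, one common first check is whether the route tables of the private subnets actually contain a route to the gateway endpoint. The sketch below inspects a dictionary shaped like one entry of the `RouteTables` list returned by the EC2 `DescribeRouteTables` API. The helper name and all sample IDs are hypothetical, and no AWS call is made.

```python
def gateway_endpoint_routes(route_table: dict) -> list[dict]:
    """Return the active routes that point a prefix list at a gateway VPC endpoint."""
    return [
        route for route in route_table.get("Routes", [])
        if route.get("DestinationPrefixListId", "").startswith("pl-")
        and route.get("GatewayId", "").startswith("vpce-")
        and route.get("State") == "active"
    ]


# Hypothetical sample data with made-up IDs.
sample_route_table = {
    "RouteTableId": "rtb-0example",
    "Routes": [
        {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local", "State": "active"},
        {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-0example", "State": "active"},
        {"DestinationPrefixListId": "pl-0example", "GatewayId": "vpce-0example", "State": "active"},
    ],
}

matches = gateway_endpoint_routes(sample_route_table)
print(len(matches))
```

An empty result for a subnet's route table would mean S3 traffic from that subnet still follows the default route to the NAT. Note that this check does not tell an S3 prefix list apart from a DynamoDB one.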

For remote/inter-region S3 buckets, have you considered S3 Multi-Region Access Points?

If Access Point restrictions are a blocker, consider the following:

  1. Establish private connectivity between the local and remote regions if not already present (VPC Peering, Transit Gateway Peering).
  2. Create an S3 Interface Endpoint (not a Gateway Endpoint) in the remote region.
  3. If you want to resolve remote-region S3 to the global name (to avoid updating apps), you can use Private DNS support for Amazon S3 with AWS PrivateLink and Route 53 Resolver endpoints. Use an Outbound Endpoint in the local VPC to forward resolution of the remote S3 bucket to an Inbound Endpoint in the remote VPC. This will return the private S3 Interface Endpoint IPs for lookups made in the local VPC.
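For step 2 only, the request parameters for an S3 Interface Endpoint could look roughly like the sketch below. The helper `s3_interface_endpoint_params`, the region, and every ID are hypothetical placeholders; the keys follow the EC2 `CreateVpcEndpoint` parameters, and nothing is sent to AWS.

```python
def s3_interface_endpoint_params(region: str, vpc_id: str,
                                 subnet_ids: list[str],
                                 security_group_ids: list[str]) -> dict:
    """Build CreateVpcEndpoint parameters for an S3 interface endpoint."""
    return {
        "VpcEndpointType": "Interface",  # a Gateway endpoint would use RouteTableIds instead
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "SubnetIds": subnet_ids,
        "SecurityGroupIds": security_group_ids,
    }


# Made-up remote-region values, for illustration only.
params = s3_interface_endpoint_params(
    region="eu-west-1",
    vpc_id="vpc-0example",
    subnet_ids=["subnet-0example"],
    security_group_ids=["sg-0example"],
)
print(params["ServiceName"])
```

With boto3 installed and credentials for the remote region, these could be passed as `boto3.client("ec2", region_name="eu-west-1").create_vpc_endpoint(**params)`; that call is deliberately left out here.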
AWS
Andy_P
answered 2 months ago
