I'm trying to turn on enhanced virtual private cloud (VPC) routing in Amazon Redshift.
Short description
In Amazon Redshift, network traffic for COPY, UNLOAD, and Amazon Redshift Spectrum flows through a network interface. This network interface is internal to the Amazon Redshift cluster, and is located outside of your Amazon Virtual Private Cloud (Amazon VPC). This traffic doesn't pass through your VPC route tables, security groups, or network access control lists (network ACLs).
By default, the network traffic routes through the public internet to reach its destination. If you turn on Amazon Redshift enhanced VPC routing, then Amazon Redshift routes the network traffic through your VPC instead. Amazon Redshift uses a VPC endpoint as the first choice of route. If a VPC endpoint isn't available, then Amazon Redshift routes the network traffic through an internet gateway, NAT instance, or NAT gateway.
Resolution
Understand routing prioritization
When you turn on enhanced VPC routing, you must also create a VPC endpoint and specify it in the route table of the cluster's subnet. Enhanced VPC routing by itself doesn't automatically create a network path for the traffic through your VPC.
If multiple network pathways exist, then Amazon Redshift routes the traffic through the most specific route available.
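The prioritization described above can be sketched as a small shell helper. This is only an illustration, not how Amazon Redshift selects routes internally. The function name `s3_route_target` is hypothetical, and the sketch assumes that any `pl-` destination with a `vpce-` target is the Amazon S3 prefix list. It reads a route table in "Destination | Target" form and prints the target that Amazon S3 traffic would likely use: the gateway endpoint if one exists, otherwise the default route, otherwise `none`.

```shell
# Hypothetical helper: read "Destination | Target" rows on stdin and print
# the target that S3-bound traffic would likely use.
# Assumption: a pl-* destination with a vpce-* target is the S3 prefix list.
s3_route_target() {
  awk -F'|' '
    {
      dest = $1; target = $2
      gsub(/[ \t]+/, "", dest)      # trim padding around each column
      gsub(/[ \t]+/, "", target)
    }
    dest ~ /^pl-/ && target ~ /^vpce-/ { endpoint = target }   # most specific
    dest == "0.0.0.0/0"                { fallback = target }   # default route
    END {
      if (endpoint != "")      print endpoint
      else if (fallback != "") print fallback
      else                     print "none"
    }'
}
```

For example, piping a table that contains only the `10.0.0.0/16 | local` row into `s3_route_target` prints `none`, which corresponds to the case where COPY and UNLOAD traffic can't reach Amazon S3.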
Example 1: Amazon Simple Storage Service (Amazon S3) gateway endpoint
In the following example, Amazon Redshift routes the network traffic through an Amazon S3 gateway endpoint ("vpce-xxxxx"):
Destination | Target
-------------------------
10.0.0.0/16 | local
0.0.0.0/0 | igw-xxxxx
pl-6fa54006 | vpce-xxxxx
Note: You must associate each subnet in your VPC with a route table.
Example 2: Internet, NAT gateway, or NAT instance
In the following example, the subnet route table has no VPC endpoint route, so Amazon S3 traffic routes through the internet gateway ("igw-xxxxx"):
Destination | Target
-------------------------
10.0.0.0/16 | local
0.0.0.0/0 | igw-xxxxx
Example 3: No available route to destination
If there are no routing methods available, and the route table can't reach S3, then the network traffic for COPY and UNLOAD times out:
Destination | Target
-------------------------
10.0.0.0/16 | local
If the routing method can't reach S3, then you get the following error message:
"ERROR: S3CurlException: Connection timed out after 50001 milliseconds, CurlError 28, multiCurlError 0, CanRetry 1, UserError 0"
To resolve this error, create an S3 VPC gateway endpoint and add it to your subnet route table.
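As a minimal sketch of that fix, the following shell function wraps the `aws ec2 create-vpc-endpoint` command to create an S3 gateway endpoint and associate it with a route table. The function name `create_s3_gateway_endpoint`, the `AWS` override variable, and the argument order are assumptions made for illustration. The VPC ID, Region, and route table ID are placeholders that you supply.

```shell
# Hypothetical wrapper around "aws ec2 create-vpc-endpoint".
# Arguments: <vpc-id> <region> <route-table-id>
# Set AWS=echo to print the command instead of running it.
create_s3_gateway_endpoint() {
  vpc_id=$1
  region=$2
  route_table_id=$3
  ${AWS:-aws} ec2 create-vpc-endpoint \
    --vpc-id "$vpc_id" \
    --service-name "com.amazonaws.${region}.s3" \
    --vpc-endpoint-type Gateway \
    --route-table-ids "$route_table_id"
}
```

A dry run such as `AWS=echo create_s3_gateway_endpoint vpc-xxxxx us-east-1 rtb-xxxxx` prints the command for review before you run it against your account.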
Turn on enhanced VPC routing
Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.
To turn on enhanced VPC routing and confirm that traffic flows through your VPC, take the following actions:
- Turn on enhanced VPC routing in the console.
- To confirm that enhanced VPC routing is turned on, run the following describe-clusters AWS CLI command:
$ aws redshift describe-clusters --cluster-identifier <cluster-id> | grep EnhancedVpcRouting
|| EnhancedVpcRouting | True
Note: Replace <cluster-id> with your cluster ID.
- Use VPC Flow Logs to capture information about the IP address traffic going to and from network interfaces in your VPC.
Example of a VPC flow log that shows the COPY network traffic between a private Amazon Redshift IP address and an S3 bucket:
Version Account_ID ENI Source_IP Destination_IP Source_Port Destination_Port Protocol Packets Bytes Start_Time End_Time Action Log_Status
......
2 540754XXXXXX eni-01783841dad81XXXX 52.216.29.118 172.31.13.236 443 37516 6 279740 390798072 1589668161 1589668221 ACCEPT OK
2 540754XXXXXX eni-01783841dad81XXXX 172.31.13.236 52.216.29.118 37516 443 6 9206 368276 1589668161 1589668221 ACCEPT OK
......
- Configure your AWS Glue interface endpoint so that traffic flows privately from Redshift Spectrum to AWS Glue through a VPC. Or, use a NAT gateway or internet gateway.
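The first three actions in the list above can be sketched as shell helpers. The function names and the `AWS` override variable are hypothetical. The sketch assumes that the flow log uses the default record format, where fields 4 to 7 are the source address, destination address, source port, and destination port.

```shell
# Hypothetical helpers. Set AWS=echo to print a command instead of running it.

# Turn on enhanced VPC routing for one cluster. Argument: <cluster-id>
enable_enhanced_vpc_routing() {
  ${AWS:-aws} redshift modify-cluster \
    --cluster-identifier "$1" \
    --enhanced-vpc-routing
}

# Print the cluster's EnhancedVpcRouting setting. Argument: <cluster-id>
show_enhanced_vpc_routing() {
  ${AWS:-aws} redshift describe-clusters \
    --cluster-identifier "$1" \
    --query 'Clusters[0].EnhancedVpcRouting' \
    --output text
}

# Read default-format flow log records on stdin and print only the HTTPS
# (port 443) records that involve the given private IP address.
# Argument: <private-ip>
filter_https_flows() {
  awk -v ip="$1" '
    ($4 == ip && $7 == 443) ||   # traffic from the IP to a remote port 443
    ($5 == ip && $6 == 443)      # replies from a remote port 443 to the IP
  '
}
```

With a flow log saved to a local file, a call such as `filter_https_flows 172.31.13.236 < flow.log` keeps only the records that look like Amazon S3 traffic for that address. The file name `flow.log` is a placeholder.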