If the VPN connection appears to be up but your EC2 instance still cannot reach on-premises services through the Virtual Private Gateway (VGW), several factors could be responsible. Let's go through the likely causes and what to check for each:
- Route Table Configuration: Ensure that the route table associated with the subnet where your EC2 instance resides has the correct routes propagated from the VGW. Check if there's a route for the on-premises network CIDR pointing to the VGW.
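- Security Groups and Network ACLs: While you've mentioned outbound to 0.0.0.0/0 is enabled, double-check both layers for the protocols you're trying to use (ICMP for ping, TCP for specific services). Security groups are stateful, so replies to traffic the instance initiates are allowed automatically, but connections initiated from on-premises need an inbound rule. Network ACLs are stateless, so the subnet's NACL must explicitly allow both the outbound traffic and the inbound return traffic, including ephemeral ports for TCP.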
- BGP Configuration: Using BGP for AWS Site-to-Site VPN is perfectly valid and often preferred for dynamic routing. There's no need to switch to static routes unless specifically required. Verify that the BGP session is established and that routes are being properly advertised and received on both ends of the VPN connection.
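- MTU and TCP MSS Clamping: VPN encapsulation adds overhead, so packets close to the default MTU can be fragmented or dropped inside the tunnel. A typical symptom is that ping and small requests work while larger TCP transfers hang. Try lowering the MTU on your EC2 instance (e.g., to 1400 or 1300). You can also enable TCP MSS clamping on your Linux instances.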
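- VPN Configuration: The VPN connection does not filter by instance type, so focus on its traffic selectors. Verify that the connection's local and remote IPv4 network CIDRs (0.0.0.0/0 by default) cover both the on-premises range and your VPC CIDR, and that there are no filters on the customer gateway side blocking traffic from your VPC CIDR.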
- On-premises Firewall: Although the on-premises team reports everything is working, double-check with them that their firewall is allowing return traffic to your VPC CIDR range.
- VGW Attachment: Confirm that the VGW is attached to your VPC and that route propagation is enabled on the route table used by the instance's subnet.
- Asymmetric Routing: Ensure that return traffic from on-premises is routed back through the VPN and not taking a different path.
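To make the route table check concrete, here is a minimal Python sketch of the lookup the VPC performs: the most specific (longest prefix) route that contains the destination wins. It assumes the route entries have the shape returned by boto3's `describe_route_tables` (`DestinationCidrBlock`, `GatewayId`, `State`). The sample CIDRs, gateway IDs, and the helper names `effective_route` and `routes_via_vgw` are hypothetical and only for illustration.

```python
import ipaddress

# Hypothetical sample routes, shaped like the "Routes" list of a route table.
SAMPLE_ROUTES = [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local", "State": "active"},
    {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0example", "State": "active"},
    {"DestinationCidrBlock": "192.168.0.0/16", "GatewayId": "vgw-0example", "State": "active"},
]


def effective_route(routes, destination_ip):
    """Return the route with the longest prefix containing destination_ip, or None."""
    dest = ipaddress.ip_address(destination_ip)
    best, best_len = None, -1
    for route in routes:
        cidr = route.get("DestinationCidrBlock")
        if cidr is None:
            # Skip IPv6 and prefix-list routes in this simplified sketch.
            continue
        network = ipaddress.ip_network(cidr)
        if dest in network and network.prefixlen > best_len:
            best, best_len = route, network.prefixlen
    return best


def routes_via_vgw(routes, destination_ip):
    """True if the matching route is active and targets a virtual private gateway."""
    route = effective_route(routes, destination_ip)
    if route is None:
        return False
    gateway = route.get("GatewayId") or ""
    return route.get("State") == "active" and gateway.startswith("vgw-")


if __name__ == "__main__":
    print(routes_via_vgw(SAMPLE_ROUTES, "192.168.10.5"))  # True
    print(routes_via_vgw(SAMPLE_ROUTES, "172.16.1.1"))    # False
```

In the second call the destination only matches the default route, so traffic would leave through the internet gateway instead of the VPN. That is the pattern to look for when the on-premises CIDR is missing from the propagated routes.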
If after checking these points the issue persists, you may want to:
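- Use VPC Flow Logs to confirm whether traffic to the on-premises range leaves the instance's network interface, whether it is accepted or rejected by security groups and Network ACLs, and whether any return traffic arrives.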
- Perform packet captures on both the EC2 instance and the on-premises endpoint to compare traffic patterns.
- Consider opening a support ticket with AWS for more in-depth troubleshooting if the problem continues.
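As a rough illustration of the flow log step, the sketch below parses records in the default (version 2) VPC Flow Logs format and keeps only those that involve the on-premises range. The sample records, addresses, interface ID, and the helper names `parse_flow_record` and `records_involving` are made up for the example; adjust the field list if your flow log uses a custom format.

```python
import ipaddress

# Field order of the default (version 2) flow log format.
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()

# Hypothetical sample records: an outbound ping, unrelated traffic, and a NODATA record.
SAMPLE_LINES = [
    "2 123456789012 eni-0example 10.0.1.25 192.168.10.5 0 0 1 4 336 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0example 10.0.1.25 10.0.2.7 49152 443 6 10 5200 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0example - - - - - - - 1700000060 1700000120 - NODATA",
]


def parse_flow_record(line):
    """Split one space-separated flow log line into a dict keyed by field name."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def records_involving(lines, cidr):
    """Return parsed records whose source or destination falls inside cidr."""
    network = ipaddress.ip_network(cidr)
    matches = []
    for line in lines:
        record = parse_flow_record(line)
        addresses = (record["srcaddr"], record["dstaddr"])
        if "-" in addresses:
            # NODATA and SKIPDATA records carry no addresses.
            continue
        if any(ipaddress.ip_address(addr) in network for addr in addresses):
            matches.append(record)
    return matches


if __name__ == "__main__":
    for rec in records_involving(SAMPLE_LINES, "192.168.0.0/16"):
        print(rec["srcaddr"], "->", rec["dstaddr"], rec["action"])
```

An outbound record with `ACCEPT` and no matching inbound record suggests the packet left the instance and the reply never came back, which points at routing, the tunnel, or the on-premises side. A `REJECT` record points at a security group or Network ACL instead.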
Remember, BGP is a valid and often preferred method for AWS Site-to-Site VPN connections, so there's no inherent issue with using it instead of static routes.
Sources
Troubleshoot VPN routing issues | AWS re:Post
Troubleshooting AWS Direct Connect - AWS Direct Connect
Site to Site VPN working with Windows but not Linux EC2 | AWS re:Post
Impossible to ping from AWS to On-Premise Server with site-to-site AWS vpn | AWS re:Post