
EC2 instance not routing through virtual private gateway


I have set up a site-to-site VPN with a virtual private gateway and BGP. The on-prem network engineers say that everything is working on their side and that they can see services inside our VPC.

However, nothing is getting out from an EC2 instance inside the VPC to the on-prem services. Reachability Analyzer confirms all is OK. I can see that the tunnels are up, the gateway is attached, and the route table is populated via propagation.

Pings to the on-prem services report 100% packet loss, and traceroute and tcptraceroute just return 30 rows of asterisks (i.e. no route found). The instance connects locally (i.e. to other services in the local VPC) and to internet sites OK.

Outbound to 0.0.0.0/0 is (temporarily) enabled on the firewall and ACL.

I found some YouTube explainers for connecting AWS and Azure via site-to-site VPN, but they all use static routes. Is using BGP not allowed? Should I change the configuration to use static routes? (The on-site network people specified BGP, which is why I've configured it in the VPC.)

2 Answers

The issue you're experiencing with your EC2 instance not routing through the Virtual Private Gateway (VGW) to reach on-premises services, despite the VPN connection appearing to be up and functional, could be caused by several factors. Let's go through some potential causes and solutions:

  1. Route Table Configuration: Ensure that the route table associated with the subnet where your EC2 instance resides has the correct routes propagated from the VGW. Check if there's a route for the on-premises network CIDR pointing to the VGW.

  2. Security Groups and Network ACLs: While you've mentioned outbound to 0.0.0.0/0 is enabled, double-check that both the security group associated with the EC2 instance and the Network ACL for its subnet allow both inbound and outbound traffic for the protocols you're trying to use (ICMP for ping, TCP for specific services).

  3. BGP Configuration: Using BGP for AWS Site-to-Site VPN is perfectly valid and often preferred for dynamic routing. There's no need to switch to static routes unless specifically required. Verify that the BGP session is established and that routes are being properly advertised and received on both ends of the VPN connection.

  4. MTU and TCP MSS Clamping: IPsec encapsulation adds overhead to every packet, so packets sized for a standard 1500-byte MTU may be fragmented or dropped inside the tunnel. Try lowering the MTU on your EC2 instance (e.g., to 1400 or 1300). You can also enable TCP MSS clamping on your Linux instances.

  5. VPN Configuration: Verify that there are no filters or policy restrictions on the customer gateway side blocking traffic from your VPC CIDR.

  6. On-premises Firewall: Although the on-premises team reports everything is working, double-check with them that their firewall is allowing return traffic to your VPC CIDR range.

  7. VGW Association: Confirm that the VGW is properly associated with your VPC and that route propagation is enabled.

  8. Asymmetric Routing: Ensure that return traffic from on-premises is routed back through the VPN and not taking a different path.
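To make the route table check in point 1 concrete, here is a minimal sketch of a longest-prefix-match lookup, which is how a VPC route table picks the route for a destination. The CIDRs and the target names (`igw-example`, `vgw-example`) are made-up placeholders, not values from the question. It only illustrates the idea: if the on-premises CIDR is missing from the table, traffic follows the default route instead of the VGW.

```python
import ipaddress


def select_route(routes, destination):
    """Return the (cidr, target) pair with the longest prefix containing
    destination, or None if no route matches."""
    dest = ipaddress.ip_address(destination)
    best = None
    best_prefixlen = -1
    for cidr, target in routes:
        network = ipaddress.ip_network(cidr)
        if dest in network and network.prefixlen > best_prefixlen:
            best = (cidr, target)
            best_prefixlen = network.prefixlen
    return best


# Hypothetical route table: local VPC range, default route to an internet
# gateway, and one on-premises prefix propagated from the VGW.
routes = [
    ("10.0.0.0/16", "local"),
    ("0.0.0.0/0", "igw-example"),
    ("192.168.10.0/24", "vgw-example"),
]

print(select_route(routes, "192.168.10.25"))  # covered by the propagated prefix
print(select_route(routes, "192.168.20.25"))  # not propagated: uses the default route
```

In this sketch the second lookup ends up at the internet gateway, which is one way an instance can reach the internet fine while traffic meant for on-premises never enters the tunnel.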

If after checking these points the issue persists, you may want to:

  • Use VPC Flow Logs to track the traffic and see where it's being dropped or misrouted.
  • Perform packet captures on both the EC2 instance and the on-premises endpoint to compare traffic patterns.
  • Consider opening a support ticket with AWS for more in-depth troubleshooting if the problem continues.
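As a rough illustration of the flow log suggestion, the sketch below parses records in the default VPC Flow Logs format (version 2) and keeps only those destined for a given on-premises CIDR. The sample record, the ENI ID, and the 192.168.10.0/24 range are invented for the example, and the helper names are mine, not part of any AWS tooling.

```python
import ipaddress

# Field order of the default (version 2) VPC flow log record format.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_record(line):
    """Split one space-separated flow log line into a dict, or return None
    if it does not have the expected number of fields."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def records_to_network(lines, cidr):
    """Return the parsed records whose destination address is inside cidr."""
    network = ipaddress.ip_network(cidr)
    matches = []
    for line in lines:
        record = parse_record(line)
        if record is None:
            continue
        try:
            dst = ipaddress.ip_address(record["dstaddr"])
        except ValueError:
            continue  # NODATA / SKIPDATA records carry "-" instead of addresses
        if dst in network:
            matches.append(record)
    return matches


# Hypothetical sample records (protocol 1 is ICMP, 6 is TCP).
sample = [
    "2 123456789012 eni-0example 10.0.1.15 192.168.10.25 0 0 1 4 336 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0example 10.0.1.15 93.184.216.34 44321 443 6 10 840 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0example - - - - - - - 1700000000 1700000060 - NODATA",
]

for record in records_to_network(sample, "192.168.10.0/24"):
    print(record["srcaddr"], "->", record["dstaddr"], record["action"])
```

Keep in mind that ACCEPT only means the security group and network ACL allowed the packet at the network interface. It does not show that the packet made it across the tunnel. If outbound records to the on-premises range are ACCEPT but there are no matching records coming back, the next places to look are routing and the far end.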

Remember, BGP is a valid and often preferred method for AWS Site-to-Site VPN connections, so there's no inherent issue with using it instead of static routes.
Sources
Troubleshoot VPN routing issues | AWS re:Post
Troubleshooting AWS Direct Connect - AWS Direct Connect
Site to Site VPN working with Windows but not Linux EC2 | AWS re:Post
Impossible to ping from AWS to On-Premise Server with site-to-site AWS vpn | AWS re:Post

answered a year ago
EXPERT
reviewed a year ago

Hello,

From your notes, I understand that you are facing a connectivity issue over an AWS Site-to-Site VPN connection using dynamic routing (BGP) between AWS and Azure: your on-prem network engineers can reach services inside the AWS VPC, but there is no return traffic from AWS towards Azure.

  • Based on the notes, I am assuming BGP is up and that your customer gateway (Azure) is both advertising routes to AWS and receiving the routes AWS advertises. Since the VPN terminates on a VGW, Azure should be receiving the VPC CIDR over the BGP session, and AWS should be receiving your Azure-advertised networks. See also: How do I troubleshoot BGP connection issues over VPN?

  • I would recommend creating VPC flow logs, which capture information about the IP traffic going to and from network interfaces in your VPC. This way, we can confirm whether traffic from the Azure end is reaching AWS and whether AWS is responding.

  • Also, I would recommend checking the OS-level firewall on your EC2 instance to ensure traffic is not being blocked or dropped when replying to Azure.

  • One common issue I have observed between AWS and Azure with dynamic routing involves the "Inside Tunnel IP Address" assigned to Tunnel 1 and Tunnel 2. Generally this is a CIDR block from 169.254.0.0/16. For Azure, however, the CIDR must fall within the Azure-reserved APIPA range for VPN, 169.254.21.0 to 169.254.22.255, as mentioned in the linked documentation.

Hence, I would like to confirm whether this requirement was followed when creating the AWS Site-to-Site VPN connection, and whether BGP and the tunnels are in the UP state.
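If it helps, the range requirement can be checked with a few lines of Python. This is only a sketch: the function name and the sample CIDRs are hypothetical, and it assumes the inside tunnel CIDR is the /30 block chosen when the AWS VPN connection was created.

```python
import ipaddress

# Azure-reserved APIPA range for VPN BGP peering, as quoted in the answer above.
AZURE_APIPA_FIRST = ipaddress.ip_address("169.254.21.0")
AZURE_APIPA_LAST = ipaddress.ip_address("169.254.22.255")


def in_azure_apipa_range(tunnel_cidr):
    """Return True if every address of tunnel_cidr lies inside
    169.254.21.0 - 169.254.22.255."""
    network = ipaddress.ip_network(tunnel_cidr)
    return (AZURE_APIPA_FIRST <= network.network_address
            and network.broadcast_address <= AZURE_APIPA_LAST)


# Hypothetical inside tunnel CIDRs for Tunnel 1 and Tunnel 2.
for cidr in ("169.254.21.0/30", "169.254.10.0/30"):
    print(cidr, in_azure_apipa_range(cidr))
```

A tunnel whose inside CIDR falls outside the range would need to be recreated or modified with a custom inside CIDR before the BGP session with Azure can be expected to work.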

Please let me know if you have any questions. Thank you!

AWS
SUPPORT ENGINEER
answered a year ago
