AWS Site-to-Site VPN connection


Dear AWS Support Team, I am writing to seek your assistance with a technical issue related to our AWS Site-to-Site VPN connection.

Problem Description:

I have set up a Site-to-Site VPN connection between our AWS environment and an on-premises server. The objective is to establish communication between specific instances in an AWS subnet and the on-premises server while allowing flexibility to change to another VPC and subnet on our end. However, I am encountering connectivity issues as follows:

The VPN tunnel is successfully established. I have verified that ICMP traffic (ping) is allowed through firewall rules and security groups on both the on-premises and AWS sides. I have made the necessary modifications, including switching the VPC that the virtual private gateway (VGW) is attached to and updating the remote IP address in the Site-to-Site VPN configuration. I have also enabled route propagation to the VGW in the route table of the AWS subnet that should have connectivity to the on-premises server. Despite these steps, instances on both sides of the tunnel are unable to ping each other when I change to another VPC and subnet on our end.

Steps Taken:

  • Switched the VGW to the preferred VPC.
  • Updated the remote IP address in the Site-to-Site VPN configuration.
  • Enabled route propagation to the VGW in the route table of the preferred subnet.

Desired Change:

I am seeking to change to a different VPC and subnet on our end while maintaining uninterrupted communication with AWS instances across the VPN connection. The objective is to transition seamlessly to this new configuration.

Firewall Rules and Security Group Settings:

  • AWS Security Group Inbound Rule: Allows ICMP from All
  • AWS Security Group Outbound Rule: Allows ICMP to All
  • On-Premises Firewall Rule: Allows ICMP to and from any subnet IP address I intend to update to

Additional Information:

All appropriate network ACLs and firewall rules have been checked and are not blocking traffic. VPN tunnel status indicates that it is "up." I have conducted extensive troubleshooting, and I am unable to determine the root cause of the connectivity issue when transitioning to the new VPC and subnet. I would greatly appreciate your guidance and support in resolving this matter promptly.

Please let me know if you require any further information or log details to assist with troubleshooting.

Thank you for your prompt attention to this matter.

Olive
asked 8 months ago, 426 views
2 Answers

Hello Olive,

I have a couple of suggestions here:

"The objective is to establish communication between specific instances in an AWS subnet and the on-premises server while allowing flexibility to change to another VPC and subnet on our end."

  1. If you are trying to connect to multiple VPCs, try using a Transit Gateway (TGW) as the target gateway of the VPN connection. You will not have to attach and detach anything, since a TGW allows you to reach multiple VPCs.

[+] AWS Transit Gateway and AWS Site-to-Site VPN: https://docs.aws.amazon.com/whitepapers/latest/aws-vpc-connectivity-options/aws-transit-gateway-vpn.html

Note: If you modify the VPC-VGW association, there will be some interruption in traffic.

  2. As you said, the VPN tunnel status indicates that it is "up", so you don't need to verify the Phase 1 and Phase 2 configuration. However, the document below has all the steps that you can follow to troubleshoot the connectivity issue.

[+] How do I troubleshoot VPN tunnel connectivity to an Amazon VPC? https://repost.aws/knowledge-center/vpn-tunnel-troubleshooting

In addition to the above, you can monitor the CloudWatch TunnelDataIn and TunnelDataOut metrics to check whether there is ESP traffic over the tunnel. Also, try capturing VPC flow logs to see whether traffic is reaching the instances.

[+] Monitoring VPN tunnels using Amazon CloudWatch - VPN metrics and dimensions - https://docs.aws.amazon.com/vpn/latest/s2svpn/monitoring-cloudwatch-vpn.html#metrics-dimensions-vpn

[+] Create a flow log that publishes to CloudWatch Logs: https://docs.aws.amazon.com/vpc/latest/userguide/flow-logs-cwl.html#flow-logs-cwl-create-flow-log
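
As a rough illustration of the CloudWatch check, here is a minimal Python sketch. It assumes boto3 is installed and credentials are configured; the VPN connection ID in the usage note is a made-up placeholder, and the helper names are mine, not part of any AWS tooling. The `AWS/VPN` namespace, the `TunnelDataIn`/`TunnelDataOut` metric names and the `VpnId` dimension are the ones listed in the monitoring document above.

```python
from datetime import datetime, timedelta, timezone


def tunnel_metric_request(vpn_id, metric_name, minutes=60):
    """Build the keyword arguments for CloudWatch get_metric_statistics."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/VPN",
        "MetricName": metric_name,  # "TunnelDataIn" or "TunnelDataOut"
        "Dimensions": [{"Name": "VpnId", "Value": vpn_id}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 300,  # 5-minute buckets
        "Statistics": ["Sum"],
    }


def total_bytes(vpn_id, metric_name, minutes=60):
    """Sum the metric over the window. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helper above works without it

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **tunnel_metric_request(vpn_id, metric_name, minutes)
    )
    return sum(point["Sum"] for point in response["Datapoints"])
```

For example, calling `total_bytes("vpn-0123456789abcdef0", "TunnelDataIn")` and then the same for `"TunnelDataOut"` while a ping is running should show which direction is carrying traffic.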

EXPERT
answered 8 months ago
  • Thanks for your response. However, the end goal of this setup is not to constantly change VPCs; we just want to move it to a different VPC that is better suited to the purpose of this configuration. The configuration of the working tunnel has been replicated on the VPC I would like to change to. The tunnel comes up, but the instances cannot ping. When I go back to the initial VPC, the tunnel comes up and the instances can still communicate. I also created a brand new tunnel and pointed it to a totally different VPC while replicating the setup of the working VPN. The tunnel comes up, but the servers still fail to communicate across it.

    • Do you see traffic in the CloudWatch metric TunnelDataIn?
    • How about the phase 2 traffic selectors/security associations? Do they have the correct ranges defined?
    Note: Policy-based VPNs with more than one pair of security associations will drop existing connections when new connections with different security associations initiate. This behavior indicates that a new VPN connection has interrupted an existing one.
    

    Reference: https://repost.aws/knowledge-center/vpn-connection-instability
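
To make the traffic selector question concrete, here is a small Python sketch of the check. It is only one possible cause and rests on an assumption: that the phase 2 selectors on the customer gateway device might still describe the old VPC CIDR. All CIDRs and addresses in the usage note are made-up examples, not values from this thread.

```python
import ipaddress


def selectors_cover(on_prem_cidr, vpc_cidr, on_prem_ip, vpc_ip):
    """Return True if both hosts fall inside the phase 2 traffic selectors.

    on_prem_cidr / vpc_cidr are the two sides of the selector pair;
    on_prem_ip / vpc_ip are the hosts that should talk to each other.
    """
    on_prem_net = ipaddress.ip_network(on_prem_cidr)
    vpc_net = ipaddress.ip_network(vpc_cidr)
    return (
        ipaddress.ip_address(on_prem_ip) in on_prem_net
        and ipaddress.ip_address(vpc_ip) in vpc_net
    )
```

With a hypothetical old VPC of 10.0.0.0/16 and a new one of 10.1.0.0/16, `selectors_cover("192.168.10.0/24", "10.0.0.0/16", "192.168.10.5", "10.1.1.20")` is false: a selector pair left over from the old VPC would not match traffic to an instance in the new one, even though the tunnel itself shows as up.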

  • When I ping out and check the TunnelDataOut metric, I can see a tunnel data out value, but tunnel data in is 0 (and the ping requests time out). When the instance is pinged, it is unreachable, and I see a tunnel data in value, but tunnel data out is 0. The correct range is configured in phase 2. I really don't know what to make of this behaviour.

  • If you see tunnel data IN, that means traffic initiated from on-premises is reaching the VPN endpoint. How about VPC flow logs? Does that show anything? Do we have the correct VPN static route back to on-premises?
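
For the flow log check, a minimal Python sketch along these lines can pull out the relevant records. It assumes the default (version 2) flow log format with space-separated fields; the sample record in the usage note is invented for illustration, and the function names are hypothetical.

```python
# Field order of the default (version 2) VPC flow log format.
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()


def parse_flow_record(line):
    """Split one default-format flow log line into a dict of fields."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def icmp_records(lines, on_prem_ip):
    """Yield ICMP (protocol 1) records to or from the on-premises host."""
    for line in lines:
        record = parse_flow_record(line)
        if record["protocol"] == "1" and on_prem_ip in (
            record["srcaddr"],
            record["dstaddr"],
        ):
            yield record
```

Feeding it a line such as `2 123456789012 eni-0abc1234 192.168.10.5 10.1.1.20 0 0 1 4 336 1700000000 1700000060 ACCEPT OK` returns a record whose `action` is `ACCEPT`. If no ICMP records for the on-premises address show up on the instance's network interface at all, the traffic is probably not being routed to the instance; `REJECT` records would point to a security group or network ACL instead.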


Hello.

Has the VPN been successful in Phase 1?
If phase 1 is failing, troubleshoot according to the following document.
https://repost.aws/knowledge-center/vpn-tunnel-phase-1-ike

If phase 2 is failing, troubleshoot according to the following documentation.
https://repost.aws/knowledge-center/vpn-tunnel-phase-2-ipsec

Also, if you check the VPN logs and other logs of the customer gateway, you will probably see an error output that leads to the cause of the connection failure.

EXPERT
answered 8 months ago
AWS
EXPERT
reviewed 8 months ago
  • Thank you, the tunnel is up. The VPN logs give no insight, since they show the tunnel as up and working as it should, so I am still stuck.

  • If you add a static route, or some other route, for routing from on-premises to the AWS VPC, can you communicate? Also, has a route to the VGW been added to the AWS-side route table for communicating with on-premises?

  • All of this has been done: the route table points to the VGW, and there is a static route to the on-premises IP address.
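
One way to double-check the route table of the new subnet is to work out which target an on-premises address would actually resolve to. The Python sketch below does a simplified longest-prefix match; it ignores the tie-breaking between static and propagated routes, and the route entries, gateway IDs and addresses in the usage note are hypothetical.

```python
import ipaddress


def route_target(routes, destination_ip):
    """Return the target of the most specific route covering destination_ip.

    routes is a list of (destination_cidr, target) pairs, for example
    copied from the route table associated with the subnet.
    Returns None when no route matches.
    """
    destination = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if destination in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    # The route with the longest prefix is the most specific one.
    return max(matches, key=lambda match: match[0].prefixlen)[1]
```

For instance, with routes `[("10.1.0.0/16", "local"), ("0.0.0.0/0", "igw-0aaa1111"), ("192.168.10.0/24", "vgw-0bbb2222")]`, the address 192.168.10.5 resolves to the VGW. If the on-premises route is missing from the route table associated with the new subnet, the same address falls through to the default route, and replies never enter the tunnel.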
