I want to troubleshoot connectivity issues with the AWS Site-to-Site VPN tunnels on my customer gateway device.
Resolution
The following common issues on a customer gateway device can cause Site-to-Site VPN tunnels to be inactive, unstable, flapping, or down:
- Internet Key Exchange (IKE)/Phase 1 or Internet Protocol security (IPsec)/Phase 2 failures cause a down tunnel.
- Problems with IPsec dead peer detection (DPD) monitoring.
- Idle timeouts because of low traffic on a Site-to-Site VPN tunnel or vendor-specific customer gateway configuration issues.
- Rekey issues for Phase 1 or Phase 2 of your Site-to-Site VPN tunnel.
- A policy-based Site-to-Site VPN connection on the customer gateway device causes intermittent connectivity.
- Static routing causes intermittent connectivity.
- The customer gateway device isn't configured correctly.
Use tunnel activity logs to monitor Site-to-Site VPN tunnels and collect information about tunnel outages and other tunnel issues. Then, take the following actions:
Troubleshoot IKE/Phase 1 or IPsec/Phase 2 failures that cause a down tunnel
Verify that the Site-to-Site VPN connection's tunnels are UP. If a tunnel is DOWN, then troubleshoot IKE/Phase 1 and IPsec/Phase 2 failures.
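As a minimal sketch of this check, the following Python function reads the VgwTelemetry list that the EC2 DescribeVpnConnections API returns for each connection and reports the tunnels whose status isn't UP. The sample response, including its IDs, IP addresses, and status message, is made up for illustration. In practice, the dictionary would come from an API call such as boto3's describe_vpn_connections.

```python
def find_down_tunnels(response):
    """Return (connection ID, outside IP, status message) for each tunnel that isn't UP."""
    down = []
    for connection in response.get("VpnConnections", []):
        for tunnel in connection.get("VgwTelemetry", []):
            if tunnel.get("Status") != "UP":
                down.append((
                    connection.get("VpnConnectionId"),
                    tunnel.get("OutsideIpAddress"),
                    tunnel.get("StatusMessage", ""),
                ))
    return down


# Hypothetical response fragment, shaped like DescribeVpnConnections output.
sample_response = {
    "VpnConnections": [{
        "VpnConnectionId": "vpn-0123456789abcdef0",
        "VgwTelemetry": [
            {"OutsideIpAddress": "203.0.113.10", "Status": "UP", "StatusMessage": ""},
            {"OutsideIpAddress": "203.0.113.20", "Status": "DOWN", "StatusMessage": "IPSEC IS DOWN"},
        ],
    }]
}

for vpn_id, outside_ip, message in find_down_tunnels(sample_response):
    print(f"{vpn_id} tunnel {outside_ip} is DOWN: {message}")
```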
Troubleshoot problems with DPD monitoring
When you experience a DPD timeout, your logs display the following message: "Peer is not responsive - Declaring peer dead." By default, Site-to-Site VPN sends a "DPD R_U_THERE" message to the customer gateway every 10 seconds. After three successive messages without response, Site-to-Site VPN considers the peer dead. Then, Site-to-Site VPN closes the connection.
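The default timing above works out to roughly 30 seconds of silence before the peer is declared dead. The following Python sketch shows that arithmetic and a simple search for the timeout message in exported log lines. The function names and the sample log lines are assumptions for illustration; real log formats vary.

```python
DPD_TIMEOUT_MESSAGE = "Peer is not responsive - Declaring peer dead"
DPD_INTERVAL_SECONDS = 10  # default interval between R_U_THERE messages
DPD_MAX_MISSED = 3         # unanswered messages before the peer is declared dead


def dpd_detection_window(interval=DPD_INTERVAL_SECONDS, missed=DPD_MAX_MISSED):
    """Approximate seconds without a response before the peer is declared dead."""
    return interval * missed


def find_dpd_timeouts(log_lines):
    """Return the log lines that report a DPD timeout."""
    return [line for line in log_lines if DPD_TIMEOUT_MESSAGE in line]


# Placeholder log lines; only the DPD message text comes from the article.
sample_log = [
    "10:00:00 placeholder line: tunnel negotiation complete",
    "10:05:30 Peer is not responsive - Declaring peer dead",
]

print(dpd_detection_window())
print(find_dpd_timeouts(sample_log))
```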
If DPD is active on your customer gateway device, then check the following:
- Your customer gateway device is configured to receive and respond to DPD messages.
- Your customer gateway device is available to respond to DPD messages from AWS peers.
- If intrusion prevention system features are active in the firewall, then verify that your customer gateway device allows DPD messages without rate limiting.
- Your customer gateway device has stable and reliable internet connectivity.
If the Site-to-Site VPN must take no action when a DPD timeout occurs, then change your DPD timeout action to none.
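One way to make this change programmatically is the EC2 ModifyVpnTunnelOptions API. The following Python sketch only builds the request parameters, so it runs without AWS credentials. The helper name and the connection ID and IP address are placeholders, and the commented boto3 call assumes that boto3 is installed and configured.

```python
VALID_DPD_ACTIONS = {"clear", "none", "restart"}


def build_dpd_action_request(vpn_connection_id, tunnel_outside_ip, action="none"):
    """Build keyword arguments for ModifyVpnTunnelOptions that set the DPD timeout action."""
    if action not in VALID_DPD_ACTIONS:
        raise ValueError(f"Unsupported DPD timeout action: {action!r}")
    return {
        "VpnConnectionId": vpn_connection_id,
        "VpnTunnelOutsideIpAddress": tunnel_outside_ip,
        "TunnelOptions": {"DPDTimeoutAction": action},
    }


# Placeholder identifiers; replace with your connection ID and tunnel outside IP.
request = build_dpd_action_request("vpn-0123456789abcdef0", "203.0.113.10")
print(request)

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("ec2").modify_vpn_tunnel_options(**request)
```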
Troubleshoot idle timeouts
Confirm that there is constant bidirectional traffic between your local network and your virtual private cloud (VPC). To generate the traffic, create a host that sends Internet Control Message Protocol (ICMP) echo requests to an instance in your VPC every 5 seconds.
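A keepalive host like this can be sketched in Python by calling the operating system's ping command in a loop. The function names and the 10.0.0.10 address are hypothetical, and the sketch assumes that a ping executable is on the PATH and that the instance's security group allows ICMP.

```python
import platform
import subprocess
import time


def build_ping_command(host, system=None):
    """Build a command that sends a single ICMP echo request to host."""
    system = system or platform.system()
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, "1", host]


def keep_tunnel_active(host, interval_seconds=5, iterations=None):
    """Ping host every interval_seconds; run until stopped when iterations is None."""
    sent = 0
    while iterations is None or sent < iterations:
        result = subprocess.run(build_ping_command(host), capture_output=True)
        status = "reply" if result.returncode == 0 else "no reply"
        print(f"{time.strftime('%H:%M:%S')} {host}: {status}")
        sent += 1
        time.sleep(interval_seconds)


# Example call (10.0.0.10 is a placeholder for an instance's private IP in your VPC):
# keep_tunnel_active("10.0.0.10", interval_seconds=5)
```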
Use information from your device's vendor to review your VPN device's idle timeout settings. If traffic doesn't pass through a Site-to-Site VPN tunnel for the duration of your vendor-specific VPN idle time, then the IPsec session ends.
Troubleshoot rekey issues for Phase 1 or Phase 2
Review the Phase 1 or Phase 2 lifetime fields on the customer gateway. Make sure that the fields match the AWS parameters. It's a best practice to only select the necessary Site-to-Site VPN tunnel options.
Make sure that you activate perfect forward secrecy (PFS) on the customer gateway device. PFS is activated by default on the AWS side.
Note: With IKEv2, each peer applies its own lifetime value, and the value isn't negotiated between peers. The peer with the lower lifetime value initiates the rekey. It's a best practice to configure only one peer to initiate a rekey.
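The lifetime comparison can be sketched as a small Python check. The defaults below (28,800 seconds for Phase 1 and 3,600 seconds for Phase 2) are the AWS default tunnel options; your connection might use other values, and the customer gateway settings shown are hypothetical.

```python
# AWS default tunnel lifetimes in seconds; your connection might use other values.
AWS_DEFAULT_LIFETIMES = {"phase1": 28800, "phase2": 3600}


def find_lifetime_mismatches(cgw_lifetimes, aws_lifetimes=AWS_DEFAULT_LIFETIMES):
    """Return {phase: (customer gateway value, AWS value)} for phases that differ."""
    return {
        phase: (cgw_lifetimes.get(phase), aws_value)
        for phase, aws_value in aws_lifetimes.items()
        if cgw_lifetimes.get(phase) != aws_value
    }


# Hypothetical customer gateway settings, for illustration only.
cgw_settings = {"phase1": 28800, "phase2": 28800}
print(find_lifetime_mismatches(cgw_settings))
```

An empty result means that both lifetimes match the AWS side.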
Troubleshoot connectivity issues with policy-based VPNs
Make sure that the policy on the customer gateway device covers the Site-to-Site VPN connection's local and remote IPv4 or IPv6 network CIDR ranges. The default range is 0.0.0.0/0 for IPv4 and ::/0 for IPv6.
If the Site-to-Site VPN on the customer gateway side is policy-based, then specify an encryption domain that covers the intended traffic.
Note: Site-to-Site VPN supports only one encryption domain. If the VPN includes multiple networks, then summarize the encryption domain on the customer gateway device to maintain only one pair of security associations.
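To illustrate the summarization, the following Python sketch uses the standard ipaddress module to find the smallest single network that contains a set of CIDR ranges. The function name and the sample ranges are assumptions for illustration.

```python
import ipaddress


def summarize_encryption_domain(cidrs):
    """Return the smallest single network that contains every CIDR in cidrs."""
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    if not networks:
        raise ValueError("At least one CIDR is required")
    if len({net.version for net in networks}) > 1:
        raise ValueError("Summarize IPv4 and IPv6 ranges separately")
    summary = networks[0]
    # Widen the candidate one prefix bit at a time until it covers every network.
    while not all(net.subnet_of(summary) for net in networks):
        summary = summary.supernet()
    return summary


print(summarize_encryption_domain(["10.0.0.0/24", "10.0.1.0/24"]))
print(summarize_encryption_domain(["10.0.1.0/24", "10.0.2.0/24"]))
```

The second call returns a /22, which also covers addresses outside the two input ranges. Check that the wider range is acceptable before you use it as the encryption domain.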
Troubleshoot connectivity issues with static routing
Important: It's a best practice to use dynamic routing instead of static routing. For more information, see Static and dynamic routing in AWS Site-to-Site VPN.
AWS Site-to-Site VPN tunnels that use static routing and are configured with an active/active setup may experience connectivity issues. Make sure that your customer gateway device supports dynamic routing. If the customer gateway device doesn't support dynamic routing, then configure your static VPN to avoid asymmetric routing.
Troubleshoot connectivity issues caused by the customer gateway configuration
Complete the following steps:
- Check whether the customer gateway is behind a NAT device or whether acceleration is turned on for the Site-to-Site VPN connection. If either is true, then make sure that NAT-traversal (NAT-T) is active on the customer gateway device, and verify that UDP packets can pass between the network and the VPN endpoints on port 4500.
- If the customer gateway isn't behind a NAT device, then verify that UDP packets on port 500 and Encapsulating Security Payload (ESP, IP protocol 50) packets can pass between the network and the VPN endpoints.
- Verify that the firewall rules for your customer gateway device allow traffic between the on-premises network and AWS.
- Troubleshoot connection issues associated with your specific customer gateway device.
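The first two checks can be summarized as a small Python helper that lists the traffic a firewall must allow between the customer gateway and the VPN endpoints. This is a sketch under stated assumptions: IKE uses UDP port 500, NAT-T carries IKE and ESP over UDP port 4500, and ESP without NAT-T is IP protocol 50 rather than a UDP port. The function name and the tuple layout are hypothetical.

```python
def required_flows(behind_nat, acceleration_enabled=False):
    """List (kind, number, purpose) entries that the firewall must allow."""
    # NAT-T is needed behind a NAT device or when acceleration is turned on.
    nat_traversal = behind_nat or acceleration_enabled
    flows = [("udp", 500, "IKE")]
    if nat_traversal:
        flows.append(("udp", 4500, "IKE and ESP over NAT-T"))
    else:
        flows.append(("ip-protocol", 50, "ESP"))
    return flows


for kind, number, purpose in required_flows(behind_nat=True):
    print(f"allow {kind} {number} ({purpose})")
```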
Related information
The VPN tunnel between my customer gateway and my virtual private gateway is Up, but I am unable to pass traffic through it. What can I do?
How do I configure my Site-to-Site VPN connection to prefer tunnel A over tunnel B?