How do I troubleshoot connectivity issues from Direct Connect to AWS resources?


I want to troubleshoot connectivity issues I'm having between AWS Direct Connect and AWS resources.

Short description

Direct Connect uses private, public, and transit virtual interfaces (VIF) depending on the resource that you're accessing within AWS. The connectivity issues that you can run into when you connect from on-premises to AWS resources depend on the type of VIF that you're using.

Resolution

If the VIF you're having issues with was in use and suddenly stopped working, then complete the following steps:

Troubleshoot your connectivity issues based on your VIF type

To troubleshoot your connectivity issues, determine your VIF type and complete the following steps:

Private virtual interfaces

Private virtual interfaces are used to access resources within an Amazon Virtual Private Cloud (Amazon VPC). They access resources by using the private IP address assigned from the Amazon VPC CIDR range. If you're having issues connecting to a resource within an Amazon VPC, then complete the following steps:

1.    Check that the security group of the destination instance and the subnet network access control list (ACL) have appropriate inbound and outbound rules. Bidirectional connectivity between AWS and on-premises should be allowed depending on the source and destination IP address and port being used.
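The rule check in step 1 can be reasoned about with a small model. The following is a minimal sketch that doesn't call any AWS API. The AclRule class, the acl_allows function, and the 10.50.0.0/16 on-premises range are hypothetical names and values chosen for illustration. It relies on two behaviors: network ACLs are stateless and evaluated in ascending rule-number order, and security groups are stateful, so only the network ACL needs an explicit rule for return traffic.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class AclRule:
    """Hypothetical, simplified network ACL entry (TCP only)."""
    number: int
    cidr: str        # source CIDR for inbound rules, destination CIDR for outbound rules
    port_from: int
    port_to: int
    allow: bool


def acl_allows(rules, peer_ip, port):
    """Evaluate rules in ascending rule-number order; the first match decides."""
    for rule in sorted(rules, key=lambda r: r.number):
        in_cidr = ip_address(peer_ip) in ip_network(rule.cidr)
        if in_cidr and rule.port_from <= port <= rule.port_to:
            return rule.allow
    return False  # implicit deny when no rule matches


# Assumed example: an on-premises host in 10.50.0.0/16 connects to SSH on an instance.
inbound = [AclRule(100, "10.50.0.0/16", 22, 22, True)]
outbound = [AclRule(100, "10.50.0.0/16", 1024, 65535, True)]

# Stateless check: the request must be allowed inbound on port 22, and the
# reply must be allowed outbound to the client's ephemeral port.
request_ok = acl_allows(inbound, "10.50.1.10", 22)
reply_ok = acl_allows(outbound, "10.50.1.10", 50000)
print(request_ok and reply_ok)
```

If either direction evaluates to False in a model like this, then the network ACL is a likely place for the traffic to drop.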

2.    Check the BGP routing configuration at the on-premises router to make sure that the required on-premises routes are advertised to AWS. If you're using route propagation on the Amazon VPC route table, then the advertised routes should be visible on the Amazon VPC route table, with the correct virtual private gateway as the target.

3.    Check that your on-premises router is receiving routes for the Amazon VPC CIDR over the BGP. Routes should be received from the AWS peer IP address associated with the Direct Connect VIF.

  • If you're not receiving routes from the AWS peer IP address, then check if the virtual private gateway is associated to the correct Amazon VPC.
  • If your private VIF terminates on the Direct Connect gateway, then make sure that the correct virtual private gateway is associated with the Direct Connect gateway. Make sure that the allowed prefixes are configured to permit the Amazon VPC CIDR to be advertised to the on-premises router.
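To confirm step 3 against a list of prefixes exported from your router, you can compare them with the Amazon VPC CIDR. The following is a minimal sketch that assumes you already have the BGP-received prefixes as strings. The covering_routes function and the sample prefixes are hypothetical.

```python
from ipaddress import ip_network


def covering_routes(received_prefixes, vpc_cidr):
    """Return the received prefixes that cover the entire Amazon VPC CIDR."""
    target = ip_network(vpc_cidr)
    covering = []
    for prefix in received_prefixes:
        network = ip_network(prefix)
        # subnet_of() requires both networks to use the same IP version.
        if network.version == target.version and target.subnet_of(network):
            covering.append(prefix)
    return covering


# Assumed example: prefixes learned from the AWS peer IP address.
received = ["10.0.0.0/16", "172.31.0.0/16"]
print(covering_routes(received, "10.0.0.0/16"))
```

An empty result suggests that the router isn't learning the Amazon VPC CIDR, which points back to the virtual private gateway association or the allowed prefixes described in the preceding bullets.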

4.    Perform a traceroute from the on-premises router to the Amazon VPC instance, and then in the reverse direction, as follows:

ICMP-based traceroute:

sudo traceroute -n -I <private-IP-of-EC2-instance/on-premises-host>

Note: If your on-premises router or firewall blocks ICMP-based traceroute requests, then run a TCP-based traceroute on the appropriate TCP port.

TCP-based traceroute:

sudo traceroute -n -T -p 22 <private-IP-of-EC2-instance/on-premises-host>

Note: In the preceding command, -n skips DNS name resolution, -T sends TCP SYN probes, and -p 22 sets the destination port to 22. You can use any port that your application is listening on.
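If you want to script the two traceroute variants, the following minimal sketch builds the same commands and parses numeric traceroute output into a hop list. It assumes the Linux traceroute output layout (a hop number followed by an address or asterisks). The build_traceroute and parse_hops functions and the sample output are hypothetical, and the sketch only prints the commands instead of running them.

```python
import shlex
from ipaddress import ip_address


def build_traceroute(target, tcp_port=None):
    """Return the argument list for the ICMP-based or TCP-based traceroute."""
    cmd = ["sudo", "traceroute", "-n"]
    if tcp_port is None:
        cmd.append("-I")                      # ICMP echo probes
    else:
        cmd += ["-T", "-p", str(tcp_port)]    # TCP SYN probes to the given port
    return cmd + [target]


def parse_hops(output):
    """Return one entry per hop: the first address seen, or None if no reply."""
    hops = []
    for line in output.splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # skip the header line
        address = None
        for token in fields[1:]:
            try:
                address = str(ip_address(token))
                break
            except ValueError:
                continue  # "*", latency values, and "ms" are not addresses
        hops.append(address)
    return hops


# Assumed sample output for illustration only.
sample = """traceroute to 10.0.1.25 (10.0.1.25), 30 hops max, 60 byte packets
 1  192.168.1.1  0.412 ms  0.380 ms  0.365 ms
 2  * * *
 3  10.0.1.25  1.204 ms  1.150 ms  1.098 ms
"""

print(shlex.join(build_traceroute("10.0.1.25")))
print(shlex.join(build_traceroute("10.0.1.25", tcp_port=22)))
print(parse_hops(sample))
```

To run a command for real, you could pass the argument list to subprocess.run with capture_output=True and text=True, and then give the captured standard output to parse_hops.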

5.    Check your traceroute results to confirm the visibility and behavior of the on-premises router and AWS peer IPs associated with your VIF.

  • If the traceroute stops at the on-premises router peer IP, then the traffic drops after it reaches the on-premises router. Check the on-premises network firewall settings to make sure that bidirectional connectivity is allowed on the selected port.
  • If the traceroute stops at the AWS peer IP, then check the network ACL and security group configuration from step 1. You can also use Amazon VPC Flow Logs to check if the packets sent from the on-premises router are received at a specific elastic network interface.
  • If you don't see the AWS or on-premises peer IP associated with the VIF, then traffic is being forwarded over an incorrect path. Check your on-premises router to confirm if it has a more specific or preferred route for the same CIDR through a different peer.
  • If the traceroute from AWS to the on-premises router doesn't contain the AWS peer IP address, then check whether another VIF that terminates on the same virtual private gateway or Direct Connect gateway is advertising the same on-premises route. Also check for any existing Site-to-Site VPN connections that advertise more specific routes for the on-premises network to the Amazon VPC route table.
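The decision logic in step 5 can be summarized in a few lines. The following is a minimal sketch that assumes you have a hop list with one address per hop and None for a hop that didn't respond. The diagnose function, its return strings, and the 169.254.10.x peer addresses are hypothetical.

```python
def diagnose(hops, onprem_peer, aws_peer, destination):
    """Map a traceroute hop list to the next check suggested by the bullets above."""
    responding = [hop for hop in hops if hop is not None]
    if destination in responding:
        return "reached destination"
    if onprem_peer not in responding and aws_peer not in responding:
        return "peers not on path: look for a more specific or preferred route through a different peer"
    if responding[-1] == onprem_peer:
        return "stops at on-premises peer: check the on-premises firewall"
    if responding[-1] == aws_peer:
        return "stops at AWS peer: check network ACLs, security groups, and VPC Flow Logs"
    return "drops after the VIF peers: inspect the path beyond " + responding[-1]


# Assumed example: the last responding hop is the AWS peer IP address.
hops = ["192.168.1.1", "169.254.10.1", "169.254.10.2", None, None]
print(diagnose(hops, "169.254.10.1", "169.254.10.2", "10.0.1.25"))
```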

6.    Compare the traceroute from AWS to the on-premises router with the traceroute from the on-premises router to AWS. If the two traceroutes show different hops, then this indicates asymmetric routing. Use routing policies to make sure that the same Direct Connect private virtual interface is preferred in both directions.
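One way to compare the two directions in step 6 is to check which VIF's BGP peering subnet appears in each trace, because the individual hop addresses rarely match exactly between directions. The following is a minimal sketch under that assumption. The VIF names, peering subnets, and function names are hypothetical.

```python
from ipaddress import ip_address, ip_network


def vifs_on_path(hops, vif_peer_subnets):
    """Return the names of VIFs whose BGP peering subnet contains a hop."""
    seen = []
    for hop in hops:
        if hop is None:
            continue  # hop didn't respond
        for name, subnet in vif_peer_subnets.items():
            if ip_address(hop) in ip_network(subnet) and name not in seen:
                seen.append(name)
    return seen


def is_asymmetric(forward_hops, reverse_hops, vif_peer_subnets):
    """True when the two directions traverse different sets of VIFs."""
    forward = set(vifs_on_path(forward_hops, vif_peer_subnets))
    reverse = set(vifs_on_path(reverse_hops, vif_peer_subnets))
    return forward != reverse


# Assumed example: two private VIFs with their BGP peering subnets.
vifs = {"vif-primary": "169.254.10.0/30", "vif-secondary": "169.254.20.0/30"}
to_aws = ["192.168.1.1", "169.254.10.2", "10.0.1.25"]
to_onprem = ["10.0.0.1", "169.254.20.1", "192.168.1.10"]
print(is_asymmetric(to_aws, to_onprem, vifs))
```

A True result only flags that the directions used different VIFs. The fix still comes from your routing policies, such as the preferences that you set on the routes that you advertise and receive.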

Public virtual interfaces

Public virtual interfaces access all AWS public services using public IP addresses. To troubleshoot your public virtual interface connectivity issues, complete the following steps:

1.    Check if the on-premises router that's hosting your public virtual interface is receiving routes for the AWS public prefixes from the AWS peer IP address. If you're using an inbound prefix filter and route map to filter the routes, then make sure that the prefix filter matches the required prefixes.

2.    Check that you're advertising the public peer IP address to AWS over the BGP if you're performing network address translation (NAT) for the on-premises networks.

Example scenario:

  • Local peer IP address is 69.210.74.146/31
  • Remote peer IP address is 69.210.74.147/31
  • If performing NAT for the on-premises local network traffic to the local peer IP address, then advertise 69.210.74.146/32 to AWS.

Note: Make sure that you connect from a prefix that's advertised from on-premises to AWS over the public VIF. You can't connect from a prefix that isn't advertised to a public VIF.
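The example scenario and the note can be checked with Python's ipaddress module. The following is a minimal sketch that derives the host route for the local peer IP address and tests whether a source address falls inside an advertised prefix. The nat_host_route and source_is_advertised functions are hypothetical names.

```python
from ipaddress import ip_address, ip_interface, ip_network


def nat_host_route(local_peer):
    """Host route (/32 for IPv4) for the local peer address used as the NAT source."""
    address = ip_interface(local_peer).ip
    return ip_network(f"{address}/{address.max_prefixlen}")


def source_is_advertised(source_ip, advertised_prefixes):
    """True when the source address falls inside a prefix advertised to AWS."""
    source = ip_address(source_ip)
    return any(source in ip_network(prefix) for prefix in advertised_prefixes)


route = nat_host_route("69.210.74.146/31")
print(route)  # 69.210.74.146/32

# Traffic translated to the local peer IP matches the advertised host route.
print(source_is_advertised("69.210.74.146", [str(route)]))
# An untranslated private source address isn't covered by any advertised prefix.
print(source_is_advertised("192.168.1.10", [str(route)]))
```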

3.    Perform a traceroute from on-premises to AWS to check if the traffic is being forwarded over the Direct Connect public VIF.

  • If traffic is being forwarded over the public VIF, then the traceroute should show the local (on-premises) and remote (AWS) peer IPs associated with the VIF.
  • If you need to check the network path used within AWS, then launch a public Amazon Elastic Compute Cloud (Amazon EC2) instance. The instance must be in the same Region as your AWS service. After launching the instance, perform a traceroute to on-premises. If the traceroute indicates that traffic is being forwarded over the internet or through a different VIF, then a more specific route might be advertised over that path.

Note: AWS uses AS_PATH and longest prefix match to determine the routing path. Direct Connect is the preferred path for traffic sourced from Amazon.
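To see why a more specific advertisement can pull traffic away from the VIF that you expect, consider the following minimal sketch of the two criteria named in the note. It's a simplified model, not the full BGP best-path algorithm. The Route class, the VIF names, the 203.0.113.0/24 documentation prefix, and ASN 64496 are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Route:
    """Hypothetical view of a prefix that you advertise to AWS."""
    prefix: str
    as_path: tuple
    via: str


def select_route(routes, destination):
    """Prefer the longest matching prefix, then the shortest AS_PATH."""
    target = ip_address(destination)
    candidates = [r for r in routes if target in ip_network(r.prefix)]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (-ip_network(r.prefix).prefixlen, len(r.as_path)),
    )


routes = [
    Route("203.0.113.0/24", (64496,), "public-vif-a"),
    Route("203.0.113.0/25", (64496, 64496, 64496), "public-vif-b"),
]

# The /25 wins for addresses it covers, even though its AS_PATH is longer.
print(select_route(routes, "203.0.113.10").via)
# Addresses outside the /25 fall back to the /24.
print(select_route(routes, "203.0.113.200").via)
```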

4.    Check that your connectivity to a public AWS service, such as Amazon Simple Storage Service (Amazon S3), is working for the correct destination Region. Then, check if you're using BGP community tags on the public prefixes that you advertise to Amazon.

Note: BGP community tags determine how far to propagate your prefixes on the Amazon network.

Transit virtual interfaces

Transit virtual interfaces access one or more Amazon VPC transit gateways (TGW) associated with Direct Connect gateways. To troubleshoot connectivity issues, complete the following steps:

1.    Check that the destination resource's Amazon VPC subnet route table has a route for the on-premises CIDR toward the TGW. Make sure that the instance or resource security groups and the subnet's network ACL allow bidirectional connectivity. For more information, see How network ACLs work with transit gateways.

2.    Check that the on-premises router associated with your transit VIF is receiving the correct routes over the BGP from the AWS peer. The routes are for the destination Amazon VPC CIDR. If you're not receiving the required routes, then check the Allowed Prefixes section. Check that the allowed prefixes for the Direct Connect gateway association with TGW are configured with the required prefixes. Only routes configured under the Allowed Prefixes section are advertised from AWS over a transit VIF.

3.    Check if your on-premises router associated with the transit VIF is advertising the required on-premises network prefixes to AWS.

  • If you're propagating routes from the Direct Connect gateway to a TGW route table, then check if the routes are visible on the route table. If the routes are not visible, then check the AS path on the advertised routes to make sure that it doesn't include the TGW ASN.
  • If you are advertising a specific route that contains the TGW ASN on the AS path, then the route won't be installed on the route table. Make sure that the ASN used by the customer gateway device (on-premises router) is different from the TGW ASN.
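The AS path check in step 3 is simple to automate if you can export the AS path for each advertised prefix. The following is a minimal sketch under that assumption. The blocked_prefixes function, the sample prefixes, and the ASNs are hypothetical.

```python
def blocked_prefixes(advertised, tgw_asn):
    """Return the prefixes whose AS path already contains the TGW ASN."""
    return [
        prefix
        for prefix, as_path in advertised.items()
        if tgw_asn in as_path
    ]


# Assumed example: the on-premises router uses ASN 65010 and the TGW uses ASN 64512.
advertised = {
    "10.50.0.0/16": [65010],
    "10.60.0.0/16": [65010, 64512],
}
print(blocked_prefixes(advertised, tgw_asn=64512))
```

Any prefix that a check like this returns is a candidate for a route that's missing from the TGW route table.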

4.    Check that the TGW route tables associated with the Direct Connect gateway attachment and the destination Amazon VPC attachment have the correct routes for the destination.

  • The TGW route table associated with the Direct Connect gateway must have a route for the Amazon VPC CIDR directed to the Amazon VPC attachment.
  • The TGW route table associated with the Amazon VPC attachment must have a route for the on-premises CIDR directed to the Direct Connect gateway attachment.
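The two route table requirements in step 4 can be expressed as a pair of lookups. The following is a minimal sketch that models each TGW route table as a dictionary that maps a prefix to an attachment name. The function names, attachment names, and CIDRs are hypothetical, and the sketch doesn't query any AWS API.

```python
from ipaddress import ip_network


def lookup(route_table, cidr):
    """Longest-prefix lookup; returns the attachment for the CIDR, or None."""
    target = ip_network(cidr)
    matches = [
        prefix for prefix in route_table
        if ip_network(prefix).version == target.version
        and target.subnet_of(ip_network(prefix))
    ]
    if not matches:
        return None
    best = max(matches, key=lambda prefix: ip_network(prefix).prefixlen)
    return route_table[best]


def check_tgw_routing(dxgw_table, vpc_table, vpc_cidr, onprem_cidr,
                      vpc_attachment, dxgw_attachment):
    """Return a list of problems; an empty list means both directions are routed."""
    problems = []
    if lookup(dxgw_table, vpc_cidr) != vpc_attachment:
        problems.append("DXGW-associated table lacks a route to the VPC attachment")
    if lookup(vpc_table, onprem_cidr) != dxgw_attachment:
        problems.append("VPC-associated table lacks a route to the DXGW attachment")
    return problems


# Assumed example: the return route toward on-premises is missing.
dxgw_table = {"10.0.0.0/16": "vpc-attachment"}
vpc_table = {}
print(check_tgw_routing(dxgw_table, vpc_table, "10.0.0.0/16", "10.50.0.0/16",
                        "vpc-attachment", "dxgw-attachment"))
```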

5.    Perform a traceroute in both directions between AWS and on-premises to identify the traffic path and the hop where traffic drops.

  • If the traffic drops after reaching the AWS peer IP address, then check the following:
  • If the traceroute from AWS to on-premises drops at the on-premises peer IP address, then check the firewall configuration on the on-premises network. Make sure that bidirectional connectivity is allowed on the correct ports between the source and destination.

6.    If the traceroute from AWS to on-premises doesn't include the peer IP associated with your VIF, then check whether other transit VIFs on the same Direct Connect gateway are advertising the same on-premises routes. If so, use the Direct Connect routing policies for transit VIFs to identify the VIF that's preferred for outbound connectivity.

7.    Check that the TGW Amazon VPC attachment has a subnet associated from the same Availability Zone as the destination resource. For example, if your instance is located in a specific Availability Zone, then the TGW Amazon VPC attachment must have a subnet in that same Availability Zone.

Note: You can use Amazon VPC Flow Logs to check if the on-premises traffic is reaching a specific instance elastic network interface. This helps to identify if there is any bidirectional traffic on the elastic network interface.
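If you export flow log records in the default format, you can scan them for traffic in each direction. The following is a minimal sketch that assumes the default field order (version, account ID, interface ID, source address, destination address, source port, destination port, protocol, packets, bytes, start, end, action, log status). The directions_seen function, the network interface ID, and the sample records are hypothetical.

```python
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()


def directions_seen(lines, eni_id, onprem_ip, instance_ip):
    """Report which directions have accepted traffic on the network interface."""
    seen = {"to_instance": False, "from_instance": False}
    for line in lines:
        record = dict(zip(FIELDS, line.split()))
        # Skip other interfaces, rejected traffic, and incomplete records.
        if record.get("interface_id") != eni_id or record.get("action") != "ACCEPT":
            continue
        pair = (record["srcaddr"], record["dstaddr"])
        if pair == (onprem_ip, instance_ip):
            seen["to_instance"] = True
        elif pair == (instance_ip, onprem_ip):
            seen["from_instance"] = True
    return seen


# Assumed sample records for illustration only.
records = [
    "2 123456789012 eni-0123456789abcdef0 10.50.1.10 10.0.1.25 50000 22 6 10 840 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0123456789abcdef0 10.0.1.25 10.50.1.10 22 50000 6 8 720 1700000000 1700000060 ACCEPT OK",
]
print(directions_seen(records, "eni-0123456789abcdef0", "10.50.1.10", "10.0.1.25"))
```

If only to_instance is True, then the on-premises traffic is arriving but the replies aren't leaving the network interface. Records with a REJECT action for the same addresses point to a security group or network ACL rule.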

AWS OFFICIAL
Updated 10 months ago