
Diagnosing packet loss and latency with MTR

10 minute read
Content level: Intermediate

This article discusses how to use the My Traceroute (MTR) tool to diagnose and isolate network path issues that are related to latency or packet loss. The article also provides information on other tools that can be used with MTR.

Introduction

Identifying the cause of slow networked application performance or intermittent connectivity issues in AWS resources can be difficult. In some cases, network path issues between clients, servers, or dependencies might cause connectivity issues. For example, an internet path issue between an on-premises network and a public IP address in AWS might cause a poor user experience. To help identify these issues, AWS Cloud Support Engineers and customers often use MTR. MTR combines the functionalities of ping and traceroute to provide insights into network performance. This article provides guidance on how to use the MTR tool and interpret and validate its results.

How MTR works

MTR sends Internet Control Message Protocol (ICMP), Transmission Control Protocol (TCP), or User Datagram Protocol (UDP) packets to a specific IP address. MTR sends the packets in series with an incrementing Time to Live (TTL) value. The TTL value starts at 1, so each subsequent packet reaches one hop further in the network path. This is because each router that a packet traverses decrements the TTL value by 1. When the TTL reaches zero, the router discards the packet and might send an ICMP Time Exceeded message back to the sender. MTR collects these replies and calculates statistics, such as packet loss, round trip time (RTT), and jitter, for each hop. This information helps isolate the location in a network path that’s associated with latency or packet loss.
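The TTL mechanism can be illustrated with a small simulation. The following is a minimal sketch and not MTR's implementation: it sends no packets, the path and its addresses are hypothetical, and the function names probe and trace are made up for this example.

```shell
# Simulated path: four routers, then the destination (hypothetical addresses).
PATH_HOPS="10.0.0.1 10.0.1.1 10.0.2.1 10.0.3.1 203.0.113.10"
DEST="203.0.113.10"

# probe TTL: print the node that answers a probe sent with the given TTL.
probe() {
    remaining=$1
    for hop in $PATH_HOPS; do
        # The destination answers any probe that reaches it.
        if [ "$hop" = "$DEST" ]; then
            echo "$hop destination-reply"
            return
        fi
        # Each router decrements the TTL by 1. At zero, the router
        # discards the probe and answers with ICMP Time Exceeded.
        remaining=$((remaining - 1))
        if [ "$remaining" -eq 0 ]; then
            echo "$hop time-exceeded"
            return
        fi
    done
}

# trace: raise the TTL from 1 until the destination itself replies.
trace() {
    ttl=1
    while :; do
        answer=$(probe "$ttl")
        echo "$ttl $answer"
        case $answer in
            *destination-reply) break ;;
        esac
        ttl=$((ttl + 1))
    done
}

trace
```

Each output line pairs a TTL value with the node that answered, which is how a traceroute-style tool maps the hops of a path.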

Install MTR

MTR is available on most operating systems. However, Unix-like operating systems provide versions with more features and options than those of Windows, such as TCP, UDP, and report mode. When available, it’s a best practice to run MTR from a Unix-like host to make the best use of these features.

To install MTR on Amazon Linux, RHEL, or CentOS, run the following commands:

sudo yum update
sudo yum install mtr

To install MTR on Ubuntu or Debian, run the following commands:

sudo apt update && sudo apt upgrade
sudo apt install mtr

To install MTR on Windows, download the WinMTR application from the WinMTR website.

To install MTR on macOS, use Homebrew to run the following command:

brew install mtr

Run MTR tests

On Windows, you can run MTR from a graphical user interface (GUI) with a few inputs. On Linux or macOS, you can adjust the MTR options according to your needs. For example, if the destination doesn’t respond to ICMP, then you can use TCP instead.

Run an ICMP-based MTR test:

sudo mtr -c 100 example-ip --report

Run a TCP-based MTR test:

sudo mtr -T -c 100 example-ip -P port-number --report

Note:

Replace example-ip with the target IP address. Replace port-number with the relevant port number.

MTR runs for the number of cycles specified by the -c option. The --report option puts MTR into report mode. The -T option performs a TCP-based MTR, and the appended -P is for the port number. For more information, run one of the following commands:

mtr --help

-or-

man mtr

Interpret the output

MTR provides the statistics for each hop under the following seven columns:

  • Loss%: Packet loss percentage

  • Snt: Number of packets sent

  • Last: Last packet RTT

  • Avg: Average RTT

  • Best: Shortest RTT

  • Wrst: Longest RTT

  • StDev: RTT standard deviation
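To make the columns concrete, the following sketch derives the same seven values from a list of RTT samples for one hop. It's an illustration only: hop_stats is a hypothetical helper, a lost probe is written as "-", and the standard deviation here is the population form, which might differ from the formula that MTR uses internally.

```shell
# hop_stats RTT...: print "Loss% Snt Last Avg Best Wrst StDev" for one hop.
# RTT values are in milliseconds. Use "-" for a probe that got no reply.
hop_stats() {
    printf '%s\n' "$@" | awk '
        { sent++ }
        $1 == "-" { next }        # lost probe: counted as sent, no RTT
        {
            n++
            last = $1
            sum += $1
            sumsq += $1 * $1
            if (n == 1 || $1 < best) best = $1
            if (n == 1 || $1 > worst) worst = $1
        }
        END {
            if (sent == 0) exit 1
            loss = 100 * (sent - n) / sent
            avg = n ? sum / n : 0
            var = n ? sumsq / n - avg * avg : 0
            if (var < 0) var = 0  # guard against rounding error
            printf "%.1f%% %d %.1f %.1f %.1f %.1f %.1f\n",
                loss, sent, last, avg, best, worst, sqrt(var)
        }'
}

# Four probes, one lost: 25% loss, and RTT statistics over the three replies.
hop_stats 10 20 - 30
```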

Check for packet loss and latency

The loss percentage or RTT at any hop might indicate a problem with that hop. However, some network providers configure routers not to generate an ICMP Time Exceeded reply to the sender. Also, a router might limit the number of ICMP replies that it emits or how quickly the router sends these replies. These behaviors might appear to be packet loss, latency, or both. However, packet loss or a high RTT that continues on all subsequent hops indicates an issue.

The following example output shows a healthy network path with some behaviors on particular hops that might be misleading:

[Image: example MTR report for a healthy network path]

You can see packet loss on hops 2 and 4, and a higher RTT on hop 5. However, neither the loss nor the higher RTT continues on the subsequent hops. The destination statistics show an end-to-end packet loss of 0% with an RTT of less than 1 ms.

If you have a healthy network with misleading MTR results, then consider the following items:

  • If a network hop shows any percentage of loss, then check subsequent hops. If any of the subsequent hops shows 0% loss, then the loss at the earlier hop isn't actual packet loss. Instead, that hop is probably limiting or not generating ICMP Time Exceeded replies.

  • An RTT spike at a certain hop that decreases in subsequent hops doesn't indicate network latency. It shows a delay in the ICMP Time Exceeded packet that the hop's router generated.

  • If you don't see any loss or concerning RTT on the final hop, then you have a healthy network path despite the statistics on intermediate hops.

The following example output shows an end-to-end packet loss:

[Image: example MTR report that shows end-to-end packet loss]

You can see packet loss between hops 10 and 13. An MTR test in the reverse direction might better confirm the hops that are causing the issue. The end-to-end packet loss is 96.6% with an RTT of 36 ms.

If you have a degraded network, then consider the following items:

  • If you see loss on a hop and all subsequent hops, then there probably is packet loss. In most cases, the last responding hop provides the most accurate measure of end-to-end packet loss.
  • A high RTT that continues for multiple hops might indicate network latency. The last responding hop provides the most accurate end-to-end network latency.
  • If you see different values for packet loss and RTT on different hops, then focus on the highest values that don’t decrease in subsequent hops. Or, work backwards from the last hop.
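The "work backwards from the last hop" approach can be sketched with awk. This is a rough illustration under stated assumptions: first_lossy_hop and sample_report are hypothetical helpers, the report text is invented sample data in the layout of mtr --report, and the script treats any nonzero loss as significant.

```shell
# first_lossy_hop: read an "mtr --report" style report on stdin and walk
# backwards from the last hop to find where persistent loss begins.
first_lossy_hop() {
    awk '
        /^ *[0-9]+\.\|--/ {
            n = $1 + 0          # "10.|--" converts to 10
            host[n] = $2
            loss[n] = $3 + 0    # "96.6%" converts to 96.6
            if (n > last) last = n
        }
        END {
            if (last == 0) { print "no hops found"; exit 1 }
            if (loss[last] == 0) { print "no end-to-end loss"; exit }
            start = last
            # Move toward the source while each earlier hop also shows loss.
            for (i = last - 1; i >= 1 && loss[i] > 0; i--) start = i
            printf "loss persists from hop %d (%s); end-to-end loss %.1f%%\n",
                start, host[start], loss[last]
        }'
}

# Invented sample data: hop 2 limits its ICMP replies, real loss begins at hop 4.
sample_report() {
cat <<'EOF'
HOST: client.example              Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 10.0.0.1                   0.0%   100    0.4   0.4   0.3   0.9   0.1
  2.|-- 192.0.2.1                 30.0%   100    1.1   1.2   1.0   2.4   0.2
  3.|-- 192.0.2.9                  0.0%   100    1.9   2.0   1.8   3.1   0.2
  4.|-- 198.51.100.5              12.0%   100   18.2  18.4  18.0  21.7   0.5
  5.|-- 198.51.100.77             11.0%   100   19.0  19.3  18.9  22.0   0.4
  6.|-- 203.0.113.10              12.0%   100   19.5  19.6  19.2  23.1   0.5
EOF
}

sample_report | first_lossy_hop
```

The 30% loss on hop 2 is ignored because hop 3 shows 0% loss. The reported starting point is hop 4, where the loss first appears and then continues through the destination.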

Review other considerations

Target IP not responding

You might see 100% packet loss at the last hop, even though application connectivity to that destination is working. This might be because the destination is configured to not respond to the protocol or port that MTR is testing.

[Image: example MTR report with 100% packet loss at the last hop]

If other hops show good statistics, then the loss probably reflects the destination's configuration and not a network path issue. Point MTR to a protocol and port that the destination responds to instead.

Variable length ECMP path

In some cases, multiple network paths with equal routing cost might exist to reach the same destination IP address. This technique is called Equal Cost Multi-Path Routing (ECMP). ECMP generates hashes from the packet header contents, including the IP addresses, protocol, and ports. Then, ECMP distributes traffic on these available paths. The hashes are computed for each packet. ECMP sends packets with the same hash on the same network path.

When you run MTR in TCP or UDP mode, each new sequence of probes uses an incremented source port. This results in a different hash for each sequence of probes on ECMP-enabled networks. ICMP doesn’t use ports, so each ICMP packet header generates the same hash and the packets might use a single network path.

If you’re troubleshooting an ECMP network, then TCP or UDP MTR can use all paths. This makes TCP or UDP more effective than ICMP at detecting issues on ECMP networks.
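The effect of the source port on path selection can be sketched as follows. This is a toy model and not a real router algorithm: routers use vendor-specific hash functions, and here the POSIX cksum utility stands in for the hash. The function pick_path, the addresses, the ports, and the path count are all assumptions for illustration.

```shell
N_PATHS=2   # number of equal-cost paths (assumed)

# pick_path SRC_IP DST_IP PROTO SRC_PORT DST_PORT
# Hash the header fields and map the result to one of the paths.
pick_path() {
    crc=$(printf '%s' "$1|$2|$3|$4|$5" | cksum | cut -d ' ' -f 1)
    echo $((crc % N_PATHS))
}

# ICMP has no ports, so every probe hashes identically and uses one path.
echo "ICMP probes:"
i=0
while [ "$i" -lt 4 ]; do
    echo "  probe $i -> path $(pick_path 10.0.0.5 10.0.100.1 icmp 0 0)"
    i=$((i + 1))
done

# TCP probes use an incremented source port, so the hash input changes
# and the probes can spread across the available paths.
echo "TCP probes:"
sport=33000
while [ "$sport" -lt 33008 ]; do
    echo "  sport $sport -> path $(pick_path 10.0.0.5 10.0.100.1 tcp "$sport" 443)"
    sport=$((sport + 1))
done
```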

However, when ECMP paths have different hop counts, MTR might incorrectly report packet loss. In the following example, you can see two possible paths to the target IP address. One path includes three hops, and the other includes seven hops. Multiple routers on the longer path aren’t configured to respond with the ICMP Time Exceeded reply.

[Image: network diagram with a three-hop path and a seven-hop path to the target IP address]

This network produces the following MTR report:

[Image: MTR report for this network]

In this example, MTR reports an end-to-end packet loss of 40%. However, this is a false positive. It occurs because MTR can reach the destination IP address with a TTL of 3 when it sends probes on the path with three hops. MTR then concludes that all probe attempts must reach the destination within that specific TTL value. It doesn’t send probes with the TTL of 7 that's required to reach the destination on the path with seven hops. Every probe that's sent on this longer path expires in transit and is reported as packet loss. If you test the same destination with another tool that probes the TCP port without any TTL manipulation, then you don’t see any loss.

For example, run the following command:

sudo hping3 -S -p 22 10.0.100.1

It produces the following output:

[Image: hping3 output for the same destination]
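The 40% figure in the MTR report follows from simple arithmetic if you assume the behavior described earlier: probes are limited to the TTL of the shortest path, and every probe that's hashed onto a longer path expires without a reply. The following sketch counts those probes. The function reported_loss and the per-probe path lengths are hypothetical.

```shell
# reported_loss MAX_TTL PATH_LEN...
# MAX_TTL is the TTL at which the destination first replied.
# Each PATH_LEN is the hop count of the path that one probe was hashed onto.
reported_loss() {
    max_ttl=$1
    shift
    sent=0
    lost=0
    for path_len in "$@"; do
        sent=$((sent + 1))
        # A probe on a path longer than MAX_TTL expires before the destination.
        if [ "$path_len" -gt "$max_ttl" ]; then
            lost=$((lost + 1))
        fi
    done
    if [ "$sent" -eq 0 ]; then
        echo "no probes"
        return 1
    fi
    echo "$((100 * lost / sent))%"
}

# Ten probes: six take the three-hop path, four take the seven-hop path.
reported_loss 3 3 7 3 3 7 3 7 3 3 7
```

With four of ten probes on the longer path, the sketch reports 40% loss even though no packets are dropped end to end.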

Validate MTR results

Ping and Hping

MTR manipulates the TTL of packets, which might introduce false positives. You can complement your validation with end-to-end tests that don’t manipulate the TTL, such as tests with the Ping and Hping utilities. You can use Hping to test both ICMP and TCP ports. When you use MTR in ICMP mode, run Ping or Hping to validate end-to-end loss and latency. When you run TCP MTR, use Hping to test the same TCP port where you performed the MTR test.
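One way to compare results is to extract the loss percentage from the summary line that Ping and Hping print, and then check it against the Loss% value on the last hop of the MTR report. The following is a minimal sketch. The helper end_to_end_loss is hypothetical, the sample summary line is invented, and the exact summary wording can vary between tool versions and operating systems.

```shell
# end_to_end_loss: read ping-style output on stdin and print the
# percentage from the "... N% packet loss" summary line.
end_to_end_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Example with an invented summary line in the style of Linux ping:
echo "100 packets transmitted, 97 received, 3% packet loss, time 99123ms" \
    | end_to_end_loss

# Against a live target, usage might look like the following. Replace
# example-ip and port-number as in the MTR commands. The 2>&1 redirect
# captures the summary even if the tool writes it to standard error.
#   ping -c 100 example-ip 2>&1 | end_to_end_loss
#   sudo hping3 -S -p port-number -c 100 example-ip 2>&1 | end_to_end_loss
```

If the extracted value is near 0% while MTR reports significant loss on the last hop, then the MTR result is likely a false positive.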

Bidirectional MTR

IP traffic often traverses asymmetric paths. For example, your traffic might use a specific path to reach AWS services, while the reply traffic might use a different path. This can produce inaccurate results when you test the network connectivity with MTR or similar traceroute tools. It's important to run MTR in both directions whenever there are multiple network paths between a source and destination IP address.

Conclusion

MTR is a valuable network diagnostic tool. However, it's important that you validate the results from MTR with additional tests. For more information on MTR, see How do I troubleshoot network performance issues between EC2 Linux or Windows instances in a VPC and an on-premises host over the internet gateway? and How do I troubleshoot packet loss on my VPN connection?

AWS Cloud Support Engineers can help you diagnose these network path issues and provide direction and feedback on tests that you can run. They can also check for signals of internet degradation or impact. For more information on our plans and offerings, see AWS Support.


About the authors


Michael Zimmerer

Michael Zimmerer is a Senior Cloud Support Engineer who focuses on the networking domain. He helps teams navigate challenges with application performance and reliability that are related to both public and private networks. He enjoys solving difficult technical issues or hard to diagnose behaviors.


Wilsina William Rodrigues

Wilsina Rodrigues is a Cloud Support Engineer and recognized subject matter expert in the AWS VPN and AWS Transit Gateway services. As a networking specialist, Wilsina leverages her technical knowledge to design, implement, and troubleshoot robust connectivity solutions that deliver seamless integration between cloud and on-premises environments.


Aditya Gulia

Aditya Gulia is a Cloud Support Engineer who specializes in AWS networking solutions. With expertise in network architecture, hybrid connectivity, and network security, he helps enterprises optimize their cloud infrastructure. He excels at bridging traditional networking with cloud-native architectures to solve complex customer challenges.

AWS OFFICIAL | Updated 8 months ago | 4.5K views