Command-line Basics: Network Troubleshooting

joshtronic

In our modern technological age, when a site or service goes offline it can really disrupt our day. What’s worse than unexpected downtime? Contacting support only to receive the dreaded “looks good on our end” reply. As technical people, it’s our job to be ahead of the curve. Self-diagnosing issues and starting a dialog with support with a bunch of information up front will usually turn an “LGTM” into a “we have our engineers looking into this” faster than you can ping a server.

Getting started

For this article, we’re going to be using the commands ping, traceroute, and mtr.

ping tends to be standard issue on most Unix-like operating systems, like macOS and Linux. The other commands may not be on your system. If that’s the case, just dust off your trusty package manager and install them.
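Before reaching for the package manager, it can help to check which of the three tools are already installed. The following is a minimal sketch; missing_tools is a hypothetical helper name made up for this example, and it relies only on the shell's built-in command -v lookup:

```shell
# Print the name of every given command that cannot be found on the PATH.
# missing_tools is a hypothetical helper, not part of any package.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can locate the command
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Running missing_tools ping traceroute mtr prints one line per tool that still needs installing, and nothing at all if you're already set. Package names vary between distributions, so check your package manager's search if the obvious name doesn't work.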

ping

Ah, the ping command. It’s transcended from lowly networking command to being a verb much in the same way that people “Google” and “Xerox” things.

To ping a server is to send it an ICMP ECHO_REQUEST.

In simpler terms, you’re asking the server if it’s up and available. If it is, it will kick you back a response (a pong, if you will).

To ping a host, simply run:

$ ping alligator.io

This will begin to output a stream of data, one line per reply, that looks something like this:

$ ping alligator.io
PING alligator.io (167.99.4.63) 56(84) bytes of data.
64 bytes from 167.99.4.63 (167.99.4.63): icmp_seq=1 ttl=48 time=56.10 ms
64 bytes from 167.99.4.63 (167.99.4.63): icmp_seq=2 ttl=48 time=55.7 ms
64 bytes from 167.99.4.63 (167.99.4.63): icmp_seq=3 ttl=48 time=52.1 ms
64 bytes from 167.99.4.63 (167.99.4.63): icmp_seq=4 ttl=48 time=60.1 ms
64 bytes from 167.99.4.63 (167.99.4.63): icmp_seq=5 ttl=48 time=53.2 ms

This will go on indefinitely until you hit CTRL-C.

If you wanted to, you could make things less open-ended by passing in -c to specify the number of ping attempts you’d like to make. Additionally, you can pass in -i to change the interval from 1 second to, say, 5 seconds:

$ ping -c 10 -i 5 alligator.io # 10 attempts, 5 seconds apart

Upon hitting CTRL-C, or once the specified number of attempts is completed, ping will dump out some additional statistics:

--- alligator.io ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 88ms
rtt min/avg/max/mdev = 48.883/60.812/84.296/9.125 ms

All of this information allows us to see how fast a server is responding to our requests and whether or not we’re losing packets along the way.
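If you want to pull those two numbers out for a script or a support ticket, a little sed and awk will do. This is a rough sketch that assumes the summary format shown above; packet_loss and avg_rtt are hypothetical helper names, and other ping implementations may word their summary differently:

```shell
# Read ping output on stdin and print the packet loss percentage
# (without the % sign). Assumes a summary line like
# "10 packets transmitted, 10 received, 0% packet loss, time 88ms".
packet_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Read ping output on stdin and print the average round-trip time in ms.
# Assumes a line like "rtt min/avg/max/mdev = 48.883/60.812/84.296/9.125 ms";
# splitting on "/" puts the average in the fifth field.
avg_rtt() {
    awk -F'/' '/min\/avg\/max/ { print $5 }'
}
```

For example, ping -c 10 alligator.io | packet_loss prints just the loss figure once the ten attempts have finished.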

traceroute

Where ping simply tells us whether a server is available and responding, traceroute gives us a glimpse of where our network traffic goes once it leaves our machine.

When you access a remote server, there are usually a dozen or so other systems that you touch along the way. These systems, or “hops”, are mostly routers that make sure your request is directed to the right place.

Similar to ping, the syntax is quite simple:

$ traceroute alligator.io

Which outputs all of this networking goodness:

traceroute to alligator.io (104.248.120.187), 30 hops max, 60 byte packets
1  _gateway (192.168.1.1)  0.770 ms  0.761 ms  0.819 ms
2  * * *
3  tge0-0-4.ausgtxlg01h.texas.rr.com (66.68.3.233)  30.777 ms  30.806 ms 30.874 ms
4  agg22.ausutxla01r.texas.rr.com (24.175.43.211)  14.718 ms  15.912 ms 15.903 ms
5  agg22.dllatxl301r.texas.rr.com (24.175.41.46)  20.321 ms  23.851 ms 20.254 ms
6  bu-ether14.dllstx976iw-bcr00.tbone.rr.com (66.109.6.88)  20.159 ms 66.109.1.216 (66.109.1.216)  19.298 ms bu-ether14.dllstx976iw-bcr00.tbone.rr.com (66.109.6.88)  22.947 ms
7  66.109.5.121 (66.109.5.121)  17.162 ms *  17.268 ms
8  ix-ae-52-0.tcore2.dt8-dallas.as6453.net (66.110.57.162)  17.829 ms 15.449 ms  16.725 ms
9  if-ae-2-2.tcore1.dt8-dallas.as6453.net (66.110.56.5)  53.117 ms 53.062 ms  52.972 ms
10  if-ae-37-3.tcore1.aeq-ashburn.as6453.net (66.198.154.68)  49.641 ms 49.772 ms  56.536 ms
11  if-ae-30-2.tcore2.nto-new-york.as6453.net (63.243.216.20)  56.445 ms  56.323 ms  55.215 ms
12  if-ae-12-2.tcore1.n75-new-york.as6453.net (66.110.96.5)  68.935 ms 70.431 ms  67.118 ms
13  66.110.96.26 (66.110.96.26)  54.499 ms  55.206 ms 66.110.96.22 (66.110.96.22)  55.499 ms
14  * * *
15  * * *
16  104.248.120.187 (104.248.120.187)  53.645 ms !X  56.128 ms !X 55.966 ms !X

Unlike ping, traceroute only runs once.

As you can see, the output is a little bit intimidating.

Not to fear, though. While all of the output is valuable, when you’re troubleshooting a network issue your prime suspect tends to be the last system on the list, because that is where the command will appear to freeze.

When traceroute freezes up, it’s because the next hop on the route isn’t responding at all. In that scenario, you could try to ping the server, assuming the IP or host name is showing up. It probably won’t respond.

Speaking of availability of a host name, you may have noticed the * * * for some of the hops. This means that traceroute didn’t receive a response within the expected timeout. This could be indicative of an outage, but oftentimes it’s just a machine that routed you properly but was too slow with its response, or one that is configured not to answer these probes at all.
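To quickly list which hops stayed completely silent, you can filter the traceroute output with awk. This is only a sketch; silent_hops is a hypothetical helper name, and it assumes the output format shown above, where each hop line starts with the hop number:

```shell
# Read traceroute output on stdin and print the number of every hop
# whose probes all timed out (nothing but "*" after the hop number).
# silent_hops is a hypothetical helper, not part of traceroute.
silent_hops() {
    awk '$1 ~ /^[0-9]+$/ {
        silent = (NF > 1)
        # any field other than "*" means at least one probe got a reply
        for (i = 2; i <= NF; i++) if ($i != "*") silent = 0
        if (silent) print $1
    }'
}
```

Piping the sample output above through silent_hops would list hops 2, 14, and 15. Hop 7 is left out because only one of its three probes timed out.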

mtr

mtr is what happens when ping and traceroute make a baby.

The mtr command combines the route tracing of traceroute with the repeated, interval-based checking of ping.

It’s great for monitoring the hops on your route for responses as well as packet loss. It comes in super handy when you’re working with a hosting provider, as it’s the tool they usually request the output from (both to and from your server).

Similar to the other tools we’ve discussed, mtr can be run as minimally as:

$ mtr alligator.io

Just like its older cousin ping, it will run indefinitely until you hit CTRL-C. Unlike ping, the output will look more like traceroute:

                           My traceroute  [vUNKNOWN]
galagopro.josh (192.168.1.14)                          2019-06-21T20:06:37-0500
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                       Packets               Pings
 Host                                Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. _gateway                          0.0%    89    0.8   1.3   0.5   9.7   1.8
 2. cpe-24-55-0-1.austin.res.rr.com   0.0%    89   13.4  16.9   6.7  75.7   9.3
 3. tge0-0-4.ausgtxlg01h.texas.rr.co  0.0%    89   33.5  86.4  15.4 628.1 138.4
 4. agg22.ausutxla01r.texas.rr.com    0.0%    89   13.7  16.5   6.7  31.9   4.4
 5. agg22.dllatxl301r.texas.rr.com    0.0%    89   24.0  25.0  15.0  83.9   8.4
 6. bu-ether14.dllstx976iw-bcr00.tbo  0.0%    89   19.9  24.7  13.5  57.8   6.9
 7. 66.109.5.121                     15.7%    89   16.0  22.1  14.3  55.4   7.3
 8. ix-ae-52-0.tcore2.dt8-dallas.as6  0.0%    89   18.0  22.3  10.9  58.7   7.6
 9. if-ae-2-2.tcore1.dt8-dallas.as64  0.0%    89   54.5  57.0  46.7  74.7   5.5
10. if-ae-37-3.tcore1.aeq-ashburn.as  0.0%    89   50.6  57.0  48.2  90.8   6.5
11. if-ae-30-2.tcore2.nto-new-york.a  0.0%    89   56.4  58.5  50.3  91.6   7.0
12. if-ae-12-2.tcore1.n75-new-york.a  0.0%    89   57.0  55.1  45.2  71.9   4.7
13. 66.110.96.26                      0.0%    88   53.5  56.6  44.7  90.3   7.4
14. ???
15. ???
16. 104.248.120.187                   0.0%    88   57.3  57.6  47.3  73.5   4.7

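As you can see, I was experiencing a bit of packet loss on one of the hops (15.7% at hop 7). This can mean that the router is struggling a bit. Every hop after it shows 0.0% loss, though, so the traffic passing through that router is still arriving. Fortunately, with modern routers, a little bit of packet loss on a single hop in the middle of the route may not be that big of a deal.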

The output from mtr can be a bit of a pain since it’s updating regularly (making it hard to copy and paste) and when you hit CTRL-C the output will vanish, unless you’re inside of a terminal multiplexer.

To corral the output a bit, you can pass in the -r argument to tell it to go into “report mode”. The -w argument makes the output “wide”, which helps show the complete host names. And finally, the -c argument, which you may remember from ping, limits the number of attempts:

$ mtr -rwc 100 alligator.io

This will take a moment to run and then output the final results in a nice and easy, ready to copy and paste format:

Start: 2019-06-21T20:11:52-0500
HOST: galagopro.josh                            Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- _gateway                                   0.0%   100    8.2   1.8   0.6  44.6   5.1
  2.|-- cpe-24-55-0-1.austin.res.rr.com            0.0%   100   45.5  16.3   7.6  61.5   9.0
  3.|-- tge0-0-4.ausgtxlg01h.texas.rr.com          0.0%   100   59.0  85.7  16.6 630.8 136.3
  4.|-- agg22.ausutxla01r.texas.rr.com             0.0%   100   41.4  15.8   5.6 124.3  12.1
  5.|-- agg22.dllatxl301r.texas.rr.com             0.0%   100   20.2  24.9  14.5 191.7  17.6
  6.|-- bu-ether14.dllstx976iw-bcr00.tbone.rr.com  0.0%   100   27.9  24.7  16.1 146.7  12.8
  7.|-- 66.109.5.121                               5.0%   100   24.0  23.5  10.4 140.9  16.5
  8.|-- ix-ae-52-0.tcore2.dt8-dallas.as6453.net    0.0%   100   17.1  19.9  11.1  46.3   5.0
  9.|-- if-ae-2-2.tcore1.dt8-dallas.as6453.net     0.0%   100   64.8  57.3  45.7 168.0  12.4
 10.|-- if-ae-37-3.tcore1.aeq-ashburn.as6453.net   0.0%   100   55.4  57.0  47.2 242.6  19.3
 11.|-- if-ae-30-2.tcore2.nto-new-york.as6453.net  0.0%   100   54.8  57.5  49.6 179.8  13.7
 12.|-- if-ae-12-2.tcore1.n75-new-york.as6453.net  0.0%   100   56.3  55.6  44.8 116.9   8.6
 13.|-- 66.110.96.26                               0.0%   100   52.0  57.7  46.8 228.9  18.9
 14.|-- ???                                       100.0   100    0.0   0.0   0.0   0.0   0.0
 15.|-- ???                                       100.0   100    0.0   0.0   0.0   0.0   0.0
 16.|-- 104.248.120.187                            0.0%   100   78.3  59.2  49.0 137.7  10.7

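Ouch, 100% packet loss on two of the hops! Fortunately, this can be misleading. It is often a byproduct of routers that are slow to respond, or that don’t respond to probes at all, and not indicative of actual packet loss. The final hop, which sits beyond both of them, shows 0.0% loss, so the traffic is getting through.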
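Since the report is plain text, it’s easy to pick out the hops worth a second look. Here is a rough sketch that assumes the report format shown above; lossy_hops and final_hop_loss are hypothetical helper names made up for this example:

```shell
# Read an mtr report on stdin and print "hop host loss" for every hop
# with more than 0% loss. Hop lines start with a marker like "7.|--".
lossy_hops() {
    awk '$1 ~ /^[0-9]+[.][|]--$/ && $3 + 0 > 0 {
        sub(/[.][|]--$/, "", $1)   # strip the ".|--" marker, keep the hop number
        print $1, $2, $3 + 0       # adding 0 drops the trailing % sign
    }'
}

# Print the loss percentage of the last hop in the report, which is
# the figure that shows whether traffic reaches the destination.
final_hop_loss() {
    awk '$1 ~ /^[0-9]+[.][|]--$/ { loss = $3 + 0 } END { print loss }'
}
```

For example, mtr -rwc 100 alligator.io | lossy_hops would list hops 7, 14, and 15 for the report above. If final_hop_loss comes back as 0, the loss on those intermediate hops is probably nothing to worry about.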

Conclusion

It’s always good to be prepared when reaching out to a support representative. Oftentimes problems are temporary, so the research you do at the time of the incident could very well be the only information available about it.

Armed with three simple command-line tools, you should be able to troubleshoot quite a few networking problems with ease and efficiency.

And remember, when submitting a support issue, there’s always another human being on the other end. While they may lack empathy towards your plight, that’s no reason to be a jerk to them.
