Command-line Basics: Counting Words and More

joshtronic

Modern word processors do a great job of reporting the number of words and lines in a document. But if you’re working with a log file out on a remote server, you probably don’t want to go through the hassle of downloading the file so you can open it in your favorite word processor. In those situations, you can quickly and easily use the wc utility to get the number of lines, words, and more, right from the command line.

Getting started

For this post we’re going to be using the wc command. It’s specified by POSIX, so it’s pretty much standard issue on Unix-like systems. Linux distributions typically ship the version from the GNU coreutils package, while macOS and the BSDs ship their own implementation.

At one point, the GNU version was part of the GNU textutils package (since folded into coreutils), but even when it was, it was still readily available on nearly all Unix-like systems.

If you happen to be lacking the wc command on your system, consult your favorite package manager and see about installing GNU coreutils.

Also worth noting, none of the commands in this post are destructive in nature, so feel free to play around with the files on your local file system. In fact, you’re expected to supply your own text files for the following examples :)

Word Count

As a blogger, I know that word count is a fairly important thing, especially if you’re trying to hit certain word counts for SEO purposes and such.

Obtaining the word count of a file or files can be accomplished by passing the file name to wc with the -w or --words argument:

$ wc -w filename.txt
313 filename.txt

The wc command always prints the count at the beginning of the line, followed by the file name. If you pass in more than one file, you will be greeted by an extra line at the end with the total:

$ wc -w file1.txt file2.txt
 1037 file1.txt
 1123 file2.txt
 2160 total
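
If you want just the number, say for a script that checks a post against a target word count, one approach is to feed the file to wc on standard input, which leaves the file name out of the output. Here's a minimal sketch, assuming a throwaway sample file (the file name and contents are made up for illustration):

```shell
# Hypothetical sample file in a temporary directory, so nothing real is touched.
tmp=$(mktemp -d)
printf 'the quick brown fox\njumps over the lazy dog\n' > "$tmp/post.txt"

# Reading from standard input means wc has no file name to print.
# Some wc implementations pad the number with spaces, so strip them.
words=$(wc -w < "$tmp/post.txt" | tr -d ' ')

echo "Word count: $words"   # Word count: 9
```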

Line Count

On the flip side to my blogging, as a developer, I know that the number of lines of code, while a vanity metric, is still something that tends to be bragged about.

By swapping the -w argument for -l or --lines we can find out how many lines are in a file:

$ wc -l filename.js
73 filename.js
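
One thing that can trip you up: wc -l counts newline characters rather than "lines" as a human would see them, so a file whose last line lacks a trailing newline reports one fewer than you might expect. A small sketch with two made-up sample files shows the difference:

```shell
# Two hypothetical sample files: identical text, but only one ends in a newline.
tmp=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmp/with_newline.txt"
printf 'one\ntwo\nthree'   > "$tmp/without_newline.txt"

# Read from standard input to get the bare number, then strip any padding.
with=$(wc -l < "$tmp/with_newline.txt" | tr -d ' ')
without=$(wc -l < "$tmp/without_newline.txt" | tr -d ' ')

echo "with trailing newline: $with"         # 3
echo "without trailing newline: $without"   # 2
```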

Character Count

Full disclosure, I rarely look up the number of characters in a file, so I can’t really relate to this one.

Even without the personal connection, it’s still easy to pull off. Similar to line count, we just need to swap out arguments, passing in the -m or --chars argument:

$ wc -m filename.txt
2831 filename.txt

Byte Count

I bet you were wondering why the heck we used -m instead of -c for obtaining the number of characters. That’s because the -c argument is used for pulling the byte count.

Remember that characters and bytes aren’t always the same thing, as some characters are represented by multiple bytes. To pull the byte count from a file, pass in -c or --bytes:

$ wc -c filename.txt
3192 filename.txt
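
To see the difference for yourself, here's a small sketch using a made-up file that contains a single accented character. The byte count is the same everywhere, but the character count depends on your locale settings, so treat the second number as something to try out rather than a guarantee:

```shell
# Hypothetical sample file: "café" plus a newline.
# \303\251 is the two-byte UTF-8 encoding of the letter é.
tmp=$(mktemp -d)
printf 'caf\303\251\n' > "$tmp/cafe.txt"

bytes=$(wc -c < "$tmp/cafe.txt" | tr -d ' ')
chars=$(wc -m < "$tmp/cafe.txt" | tr -d ' ')

echo "bytes: $bytes"   # 6, regardless of locale
echo "chars: $chars"   # locale-dependent; a UTF-8 locale should report 5
```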

Longest Line Length

I’m quite the stickler about code formatting. In my perfect world, no lines would ever be over 80 characters.

Unfortunately, my totalitarian coder dream isn’t remotely close to reality as most of the popular / accepted style guides out there document soft limits (gasp) and aren’t quite as rigid as I am.

Even if you don’t have a strong opinion about line lengths, sometimes it’s good to know if you have any crazy long lines in a file.

To find out the length of the longest line in a file, pass in the -L or --max-line-length argument (a GNU extension, so it may be missing from other wc implementations):

$ wc -L filename.js
80 filename.js
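
Knowing the longest length is handy, but it doesn't tell you which lines are the offenders. That's outside of what wc does, so here's a rough sketch that swaps in awk instead. The sample file and the limit of 10 (instead of 80) are assumptions, just to keep the example small:

```shell
# Hypothetical sample file with one line that exceeds the limit.
tmp=$(mktemp -d)
printf 'short\na somewhat longer line\nmid length\n' > "$tmp/lines.txt"

# Kept tiny for the demo; you'd likely use 80 for real code.
limit=10

# For every line longer than the limit, print its line number and length.
offenders=$(awk -v limit="$limit" 'length($0) > limit { print FNR ": " length($0) }' "$tmp/lines.txt")

echo "$offenders"   # 2: 22
```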

See! I told you I go out of my way to keep it around 80 or less ;)

Putting it All Together

All of the previous examples included at least one argument, but you can actually omit the arguments altogether.

When you omit the arguments, wc will by default return the line, word and byte count (without any heading or labels), the same as if you used the argument -lwc:

$ wc filename.js
 128  691 3898 filename.js

$ wc -lwc filename.js
 128  691 3898 filename.js
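
If you'd rather verify which flags the bare command corresponds to than take anyone's word for it, a quick comparison works. This sketch assumes a throwaway sample file and compares against -lwc (lines, words, bytes), which is the default that POSIX describes:

```shell
# Hypothetical sample file in a temporary directory.
tmp=$(mktemp -d)
printf 'hello world\ngoodbye world\n' > "$tmp/sample.txt"

# Capture the output of the bare command and the explicit flags.
default_output=$(wc "$tmp/sample.txt")
explicit_output=$(wc -lwc "$tmp/sample.txt")

if [ "$default_output" = "$explicit_output" ]; then
  echo "identical"
else
  echo "different"
fi
```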

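That said, you can pass in any combination of arguments that you’d like, to create your own personalized output of information. Here’s how it looks when we report on the number of characters, bytes, lines, words AND maximum line length. Whatever order you pass the arguments in, wc prints the counts in a fixed order: lines, words, characters, bytes, then maximum line length: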

$ wc -mclwL filename.txt
 143  752 4268 4268   80 filename.txt

Of course, if you pass in multiple files, you’ll receive information on each file as well as a grand total (or maximum value) at the end:

$ wc -mclwL file1.txt file2.txt file3.txt
  212  1123  6702  6702    94 file1.txt
  218   991  6225  6225    84 file2.txt
  185  1058  6184  6184    84 file3.txt
  615  3172 19111 19111    94 total
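
If all you care about is the grand total, you can pluck it off the end of the output. Here's one way to sketch that, assuming two small made-up files:

```shell
# Two hypothetical sample files in a temporary directory.
tmp=$(mktemp -d)
printf 'one two three\n' > "$tmp/a.txt"
printf 'four five\n'     > "$tmp/b.txt"

# With more than one file, the last line of output is the total.
# awk then picks out the first field, which is the count itself.
total=$(wc -w "$tmp/a.txt" "$tmp/b.txt" | tail -n 1 | awk '{ print $1 }')

echo "$total"   # 5
```

This also happens to work with a single file, since the last (and only) line is then that file's own count.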

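If you’re feeling especially frisky, you can also pass in a wildcard character, *, to see the counts for a whole bunch of files! Just note that wc doesn’t descend into directories: hand it a directory name and it will complain instead of counting the files inside.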
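
For counting across a whole directory tree, one common approach is to let find pick the files and have wc do the counting on the combined contents. This is only a sketch: the directory layout and file names below are invented for the example, and piping everything through cat gives you a single total rather than per-file counts:

```shell
# Hypothetical directory tree with files at two levels.
tmp=$(mktemp -d)
mkdir -p "$tmp/posts/drafts"
printf 'a\nb\n'    > "$tmp/posts/one.txt"
printf 'c\nd\ne\n' > "$tmp/posts/drafts/two.txt"
printf 'ignored\n' > "$tmp/posts/notes.md"

# find walks the tree and hands every .txt file to cat;
# wc then counts the lines of the combined stream exactly once.
lines=$(find "$tmp/posts" -type f -name '*.txt' -exec cat {} + | wc -l | tr -d ' ')

echo "$lines"   # 5
```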
