Command-line Basics: Searching File Contents

joshtronic

Often when looking for files, knowing the name or relative location just isn’t enough. Sometimes we know something about the contents of the file and nothing else. Other times we know exactly which directory the file is in, but don’t have the time to open every single file to find the one we need. Knowing how to use grep to search file contents comes in handy during those times and many more.

Getting started

For the sake of example, we’re not going to create a whole bunch of files to work with. I’ll keep the syntax as simple and straightforward as possible, but I encourage you to update the sample commands with real-life scenarios based on your system.

None of the commands in this article will be destructive, so I feel it’s best to work with your own live files so you can really get a feel for how things actually work.

Search a file’s contents

Let’s say you wanted to know if there was a user account on your system named alligator. We know that all of the user accounts on the system are listed in the /etc/passwd file:

$ grep 'alligator' /etc/passwd

Unless you are in fact an alligator yourself, that search probably didn’t yield any results. Try it again with your own user name and you should be greeted with something similar to this:

username:x:1000:1000::/home/username:/bin/bash
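If you’d rather not poke at the real /etc/passwd, here’s a minimal sketch that runs the same search against a scratch copy. The scratch directory and the two account lines are made up for illustration; the point is what grep prints on a match and how it reports that nothing matched.

```shell
# Scratch file in passwd format; both accounts are hypothetical.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'alligator:x:1000:1000::/home/alligator:/bin/bash' > "$tmpdir/passwd"

# A match prints the whole matching line.
match=$(grep 'alligator' "$tmpdir/passwd")
echo "$match"

# No match prints nothing, and grep signals that through its exit status.
if grep 'crocodile' "$tmpdir/passwd"; then
  found=yes
else
  found=no
fi
echo "crocodile found: $found"

rm -rf "$tmpdir"
```

An exit status of 0 means at least one line matched and 1 means none did, which is what makes grep handy inside if statements.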

Search the contents of a directory of files

Searching a file you can name is quite efficient, since grep only has to read that one file. Quite often, though, we’re not entirely sure which file has what we’re looking for.

In those scenarios, we can give grep a path instead of a filename and pass it either -r or -R to tell it to work recursively.

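-r or --recursive will search files recursively, following symbolic links only when they are named on the command line, while -R or --dereference-recursive will follow all symbolic links. That is how GNU grep behaves; other implementations may differ. Personally speaking, I use -R exclusively.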
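To see the difference for yourself, this sketch builds a scratch directory containing one regular file and one symbolic link to a directory that lives elsewhere. The layout and file names are hypothetical, and the behavior shown assumes GNU grep.

```shell
# Hypothetical layout: etc/ holds a real file plus a symlink to elsewhere/.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/etc" "$tmpdir/elsewhere"
echo 'alligator' > "$tmpdir/etc/direct.conf"
echo 'alligator' > "$tmpdir/elsewhere/linked.conf"
ln -s "$tmpdir/elsewhere" "$tmpdir/etc/link"

# -r skips the symlink it finds while walking etc/.
no_follow=$(grep -r 'alligator' "$tmpdir/etc" | sort)
# -R follows it and also searches elsewhere/linked.conf.
follow=$(grep -R 'alligator' "$tmpdir/etc" | sort)

printf -- '-r found:\n%s\n' "$no_follow"
printf -- '-R found:\n%s\n' "$follow"

rm -rf "$tmpdir"
```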

So if we wanted to build on the previous example, and find all files in /etc that mention the alligator user name:

$ grep 'alligator' /etc -R

Technically speaking, grep isn’t aware of the “user name” context. It’s just looking for text in files.

That command more than likely printed some permission denied errors. You could run the same command as a super user (bonus points for using sudo !!) or you could pass in the -s or --no-messages argument to suppress the errors:

$ grep 'alligator' /etc -Rs
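Here’s a small sketch of both ideas on a scratch tree that stands in for /etc. All of the paths and file contents are hypothetical. Because permission errors depend on who runs the command, a path that doesn’t exist is used as the source of error messages instead; -s treats both kinds of message the same way.

```shell
# Hypothetical scratch tree standing in for /etc.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/etc/sub"
echo 'alligator:x:1000:1000::/home/alligator:/bin/bash' > "$tmpdir/etc/passwd"
echo 'alligator ALL=(ALL) ALL' > "$tmpdir/etc/sub/sudoers"
echo 'nothing to see here' > "$tmpdir/etc/hosts"

# Recursive search: each match is prefixed with the file it came from.
results=$(grep -Rs 'alligator' "$tmpdir/etc" | sort)
echo "$results"

# Capture only the error messages, first without -s and then with it.
# The "|| true" is there because grep exits non-zero when it hits an error.
errors_shown=$(grep -R 'alligator' "$tmpdir/etc" "$tmpdir/missing" 2>&1 >/dev/null || true)
errors_hidden=$(grep -Rs 'alligator' "$tmpdir/etc" "$tmpdir/missing" 2>&1 >/dev/null || true)
echo "without -s: ${errors_shown:-<nothing>}"
echo "with -s:    ${errors_hidden:-<nothing>}"

rm -rf "$tmpdir"
```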

Searching files for multiple strings

Thus far, we’ve been working with basic text searches. This will get you quite far, but what if you wanted to search for multiple things and wanted to see which files contained any of the patterns?

The most basic OR usage is to split up your patterns with \|, an escaped pipe character:

$ grep 'alligator\|crocodile' /etc -Rs

If you are comfortable with regular expressions, you can pass in the -E or --extended-regexp argument, which switches grep from basic to extended regular expression syntax:

$ grep -E 'alligator|crocodile' /etc -Rs
$ egrep 'alligator|crocodile' /etc -Rs # Also works!

You’ll notice we saved a bit of typing because we don’t need to escape the pipe character. egrep is simply an older name for grep -E, so it behaves the same way. Recent releases of GNU grep print a warning that egrep is obsolescent, though, so grep -E is the safer habit.
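To convince yourself that the two spellings are interchangeable, here’s a quick sketch against a made-up reptiles.txt. It also shows a third option, one -e argument per pattern, which avoids the escaped pipe entirely; as far as I know the escaped pipe in basic syntax is a GNU extension, so -e or -E is the more portable choice.

```shell
# Hypothetical sample file.
tmpdir=$(mktemp -d)
printf '%s\n' 'alligator' 'crocodile' 'iguana' > "$tmpdir/reptiles.txt"

# Basic syntax: the pipe has to be escaped to mean "or".
basic=$(grep 'alligator\|crocodile' "$tmpdir/reptiles.txt")
# Extended syntax: a bare pipe means "or".
extended=$(grep -E 'alligator|crocodile' "$tmpdir/reptiles.txt")
# One -e per pattern: a line matches if any of the patterns do.
multi=$(grep -e 'alligator' -e 'crocodile' "$tmpdir/reptiles.txt")

echo "$extended"

rm -rf "$tmpdir"
```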

Depending on your system, grep is probably quite colorful in your terminal while egrep may not be. This is usually due to sane defaults and aliases on the part of the distribution.

Regardless of your system, you can always pass in the --color argument to enable colorized output from grep.

Case-insensitivity

Text can exist in many different forms. ALL CAPS. all lowercase. SoME oTheR ComBiNAtioN. So it’s worth noting that the default nature of grep is case-sensitive matching.

Never fear, you can get around this restriction by passing in the -i or --ignore-case argument:

$ grep 'SoME CrAZy tExT' ~/some-file -i
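A minimal before-and-after sketch, using a made-up file with the same word in three different casings:

```shell
# Hypothetical sample file.
tmpdir=$(mktemp -d)
printf '%s\n' 'ALLIGATOR' 'alligator' 'AlLiGaToR' 'crocodile' > "$tmpdir/some-file"

# Default: only the exact lowercase spelling matches.
sensitive=$(grep 'alligator' "$tmpdir/some-file")
# With -i: every casing of the word matches.
insensitive=$(grep -i 'alligator' "$tmpdir/some-file")

printf 'case-sensitive:\n%s\n' "$sensitive"
printf 'case-insensitive:\n%s\n' "$insensitive"

rm -rf "$tmpdir"
```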

Match an entire line

Similar to how grep is case-sensitive by default, it also does partial matches by default so the text pattern can exist anywhere in a line.

If you want to be a bit more strict, and only show the lines that match completely, you can pass in -x or --line-regexp:

$ grep 'Alligator, CEO, 888-555-1212' ~/some-file -x

This is the equivalent of using a regular expression that is wrapped in ^ (match beginning of the line) and $ (match ending of the line).
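The sketch below compares a partial match, -x, and the hand-anchored pattern on a made-up contact list in which one line has extra text after the phone number:

```shell
# Hypothetical sample file: the second line has trailing text.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'Alligator, CEO, 888-555-1212' \
  'Alligator, CEO, 888-555-1212 (mobile)' > "$tmpdir/some-file"

# Default: the pattern may appear anywhere in the line, so both lines match.
partial=$(grep 'Alligator, CEO, 888-555-1212' "$tmpdir/some-file")
# -x: the pattern has to account for the entire line.
whole=$(grep -x 'Alligator, CEO, 888-555-1212' "$tmpdir/some-file")
# The same thing spelled out with ^ and $ anchors.
anchored=$(grep '^Alligator, CEO, 888-555-1212$' "$tmpdir/some-file")

echo "$whole"

rm -rf "$tmpdir"
```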

Show non-matching lines

Generally speaking, when we’re searching through files, we’re usually interested in the matching lines and not the lines that don’t match.

If you were interested in the lines that don’t match, you can invert the match with the -v or --invert-match argument:

$ grep 'alligator' ~/reptiles.txt -v

Depending on how many files you’re searching and the size of the files, this could generate a ton of output.
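Here’s what the inverted match looks like on a tiny, made-up reptiles.txt:

```shell
# Hypothetical sample file.
tmpdir=$(mktemp -d)
printf '%s\n' 'alligator' 'crocodile' 'iguana' > "$tmpdir/reptiles.txt"

# -v prints every line that does NOT contain the pattern.
inverted=$(grep -v 'alligator' "$tmpdir/reptiles.txt")
echo "$inverted"

rm -rf "$tmpdir"
```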

Count matching lines

Remember that ton of output I just mentioned? You can easily get around it by suppressing the output and only returning the number of matching lines with the -c or --count argument:

$ grep 'alligator' ~/reptiles.txt -vc

If using -c as part of a recursive search, every file will be listed out with the number of lines matched (or unmatched) next to it, which may be 0.
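A short sketch of both forms on two made-up files. Note that the first line of a.txt contains the word twice but still counts once, because -c counts lines rather than individual occurrences.

```shell
# Hypothetical sample files.
tmpdir=$(mktemp -d)
printf '%s\n' 'alligator alligator' 'alligator snapping turtle' 'iguana' > "$tmpdir/a.txt"
printf '%s\n' 'crocodile' > "$tmpdir/b.txt"

# Single file: just a number.
single=$(grep -c 'alligator' "$tmpdir/a.txt")
echo "a.txt: $single matching lines"

# Recursive: one "path:count" line per file, zeros included.
per_file=$(grep -Rc 'alligator' "$tmpdir" | sort)
echo "$per_file"

rm -rf "$tmpdir"
```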

You could also use the -l or -L arguments to only show the files with or without matches. -l or --files-with-matches and -L or --files-without-match will allow you to filter the noise out quite a bit.

It’s worth noting that the moment you include -l or -L you’ll lose out on the count, as those arguments tell grep to output ONLY the file names. If you wanted a single total across multiple files, you could leverage another command, wc, to count the matching (or, with -v, non-matching) lines:

$ grep 'alligator' /etc -Rvs | wc -l
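To make the three variations concrete, here’s a sketch over a scratch directory with three made-up files: two that mention the pattern and one that doesn’t.

```shell
# Hypothetical sample files.
tmpdir=$(mktemp -d)
printf '%s\n' 'alligator' 'alligator gar' > "$tmpdir/a.txt"
printf '%s\n' 'alligator' 'crocodile' > "$tmpdir/b.txt"
printf '%s\n' 'iguana' > "$tmpdir/c.txt"

# -l: names of files with at least one matching line.
with=$(grep -Rl 'alligator' "$tmpdir" | sort)
# -L: names of files with no matching lines at all.
without=$(grep -RL 'alligator' "$tmpdir" | sort)
# No -l or -L: pipe the matching lines to wc for one grand total.
total=$(grep -R 'alligator' "$tmpdir" | wc -l)

printf 'with matches:\n%s\n' "$with"
printf 'without matches:\n%s\n' "$without"
echo "total matching lines: $total"

rm -rf "$tmpdir"
```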

Only output file names

The -l and -L arguments we just discussed allow you to do some really amazing things on the command-line. One of my favorite tricks is to open multiple matched files returned by grep with vim:

$ vim $(grep 'function alligator' ~/MyProject -Rl)
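One caveat with this trick: an unquoted $(...) is split on whitespace, so a file name containing a space arrives as two separate arguments. The sketch below demonstrates that with a made-up project tree and a small helper, count_args, that I’ve invented purely to show how many arguments a command would receive. It then shows one way around the problem, assuming your grep supports --null and your xargs supports -0 (GNU and BSD versions do, as far as I know).

```shell
# Hypothetical project tree; one file name contains a space.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/MyProject"
echo 'function alligator() {}' > "$tmpdir/MyProject/swamp.js"
echo 'function alligator() {}' > "$tmpdir/MyProject/big swamp.js"

# Illustrative helper: reports how many arguments it was given.
count_args() { echo "$#"; }

# Unquoted $(...) splits on whitespace: two files arrive as three arguments.
naive=$(cd "$tmpdir" && count_args $(grep -Rl 'function alligator' MyProject))

# NUL-separated names survive intact: two files, two arguments.
safe=$(cd "$tmpdir" && grep -Rl --null 'function alligator' MyProject |
  xargs -0 sh -c 'echo "$#"' sh)

echo "unquoted substitution: $naive arguments"
echo "NUL-separated: $safe arguments"

rm -rf "$tmpdir"
```

For an interactive editor like vim, the plain $(...) form is fine as long as none of the matched paths contain whitespace.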

Conclusion

Like many command-line utilities, it’s hard to cover everything the command can do. It’s always highly recommended to delve into the man pages to get a full understanding of what a command has to offer.

Because the man contents are local to your system, you’ll get the best understanding of the command as it relates to your operating system. Depending on your operating system, you may have a different version of a command that may not actually have all of the arguments discussed available.

And who knows, it may even help you during your next technical interview.
