How to count the occurrences of “something” in a file

This is something we need to do from time to time: filter some useful information from a file (e.g., a log file) and count the instances of each "object". While filtering out the relevant information may be tricky and there is no general rule for it, counting the instances and sorting them is pretty simple.

Let's call "filter" the program, chain of pipes, or whatever you used to extract the information you are longing for. Then, the count and sort process is just a chain of three commands:

filter | sort | uniq -c | sort -nr

The first sort is used to group equal instances together; then this is passed to uniq -c, which will "compress" each group to a single line, prefixed by the number of times that instance appeared in the output; then we feed all this to sort again, this time with -n (numeric sort) and -r (reverse sort, so that we have the most frequent instance listed first).

Nice, isn't it? 😉

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s