Text: awk

AWK was released in 1977 and could be considered the original scripting language (Perl came in 1987, Python in 1991, and JavaScript in 1995). AWK is a complete language, but its sweet spot is small programs typed at the command line.

AWK’s strength is validating and manipulating textual data (strings and numbers). It was developed at a time when only grep and sed were available for searching files and performing basic textual substitutions. Being a Unix tool, it is designed to work well as part of the overall Unix tool chain.

AWK scans input files line by line with each line being broken out into fields. Each line is matched against a pattern and if there is a match the associated actions are taken. Common actions include manipulating the text in some way or performing a running calculation in order to generate a report.
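As a rough illustration of the pattern, action and running-calculation idea, here is a minimal sketch. The name/amount data is made up for the example, and the variable names (report, total, count) are arbitrary. The END block is standard AWK and runs once after the last line has been read.

```shell
# Hypothetical input: one "name amount" pair per line.
report=$(printf 'alice 10\nbob 25\ncarol 7\n' |
  awk '$2 > 8 { total += $2; count++ }       # runs only for lines whose 2nd field exceeds 8
       END    { print count, "rows, total", total }')
echo "$report"
```

Only the lines for alice and bob satisfy the pattern, so the report counts two rows.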

Is it still worth learning AWK? My general answer is that it is always worth learning the core Unix shell tools really well as circumstances regularly pop up where these tools are, by far, the most efficient solution. Of course you can do everything that AWK does using a general purpose scripting language like Python, but the speed of writing a short one-liner AWK program that gets the job done is hard to beat. This is especially so when you get good at combining the Unix commands using pipes.

I am not trying to be complete in my description of AWK here (that’s what the resources below are for). Rather, my aim is to cover enough ground that I can quickly refresh the AWK basics, and to provide a number of AWK examples covering common use cases (sort of a mini-cookbook).

Resources

The AWK Programming Language, 2nd Edition. This is the definitive guide to the language, written by the AWK creators themselves and updated in 2023 to keep things fresh.

Effective AWK Programming, 4th Edition

AWK Versions

AWK has been around for a long time and there are a few different versions floating around. Most of the material on this page will apply to all versions, but you should ensure that you know the version you are running as there might be subtle differences.

AWK Programs

AWK is a line-oriented language and an AWK program consists of a series of pattern-action statements and function definitions.

pattern { action }
pattern { action }
function name(parameter-list) { statements }

Each line of the input file is scanned in order, and each time a line matches the pattern the action is executed. Either the pattern or the action may be omitted, but not both. If the pattern is missing, the action is executed for every line. If the action is missing, each matching line is printed unchanged.

Of course there is a lot more detail to consider, but just knowing this line-oriented pattern-action approach gets you thinking about AWK the right way.
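To make the program layout above concrete, here is a small sketch that combines one function definition with one pattern-action statement. The function name max and the sample numbers are invented for the illustration.

```shell
# Hypothetical input: two single-digit numbers per line.
result=$(printf '3 4\n7 2\n5 5\n' | awk '
function max(a, b) { return (a > b) ? a : b }   # function definition
$1 != $2 { print max($1, $2) }                  # pattern { action }
')
echo "$result"
```

The third line has equal fields, so the pattern is false and nothing is printed for it.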

Running AWK

A common way to run AWK is as a single-line program on the command line. The program is entered between single quotes and is followed by the input file.

awk '{ print $1 }' survey.data

This program will print the first field of every line of the survey.data file. $1 is a special variable that references the first field on the input line.

awk '/foo/ { print }' survey.data

This will only print lines that contain “foo”. The print statement by itself prints the entire line. Another way to do this is print $0, where the special variable $0 represents the entire line. Omitting the action altogether would have the same effect.

awk '/foo/' survey.data
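A quick way to convince yourself that these forms are equivalent is to run all three against the same input. This sketch uses a made-up three-line sample in place of survey.data, and the helper name sample is just for the example.

```shell
# Hypothetical stand-in for survey.data.
sample() { printf 'foo bar\nbaz\nfood\n'; }

with_print=$(sample | awk '/foo/ { print }')       # bare print
with_dollar0=$(sample | awk '/foo/ { print $0 }')  # explicit $0
pattern_only=$(sample | awk '/foo/')               # action omitted

echo "$with_print"
```

Note that "food" matches as well, because /foo/ matches anywhere in the line.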

You can also specify a file containing the AWK program using the -f flag.

awk -f first-col.awk survey.data

This will execute AWK using the contents of file first-col.awk as the AWK program and run the program against the input file survey.data.
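The contents of first-col.awk are not shown here, so the sketch below assumes it holds the same one-line program as the earlier example. The sample rows written to survey.data are also made up, and everything lives in a temporary directory.

```shell
workdir=$(mktemp -d)

# Assumed contents of first-col.awk: print the first field of every line.
cat > "$workdir/first-col.awk" <<'EOF'
# Comments are allowed in a program file.
{ print $1 }
EOF

# Hypothetical survey data.
printf 'red 1\ngreen 2\n' > "$workdir/survey.data"

first_col=$(awk -f "$workdir/first-col.awk" "$workdir/survey.data")
echo "$first_col"

rm -r "$workdir"
```

Keeping the program in a file avoids shell quoting issues and leaves room for comments once the program grows beyond one line.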

Calling AWK from Vim

It can be useful to call AWK from Vim. As an example, let’s say you had two columns of text in some part of the file you are editing and you wanted to switch those columns.

111 222
111 222

You could make a visual selection of the lines in question and then run the following command, which filters them through AWK to switch the columns.

:'<,'>!awk '{print $2 " " $1}'

And end up with this:

222 111
222 111

Once you understand how AWK works, there will be times in your editing when Vim macros or substitutions can’t quite get the job done and AWK comes to the rescue. This works because Vim can pipe text through any shell command and because Unix commands are designed to read from stdin and write to stdout.
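Because the filter only reads stdin and writes stdout, the same AWK program can be tried outside Vim before using it on a selection. A minimal sketch using the two sample lines from above (the variable name swapped is arbitrary):

```shell
# Feed the two example lines to the same AWK program used in the Vim command.
swapped=$(printf '111 222\n111 222\n' | awk '{print $2 " " $1}')
echo "$swapped"
```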

Patterns

The pattern determines whether the action is performed, and an action is a series of statements. If the pattern is absent, the action is taken for every line (i.e. the absence of a pattern equates to true).
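As a brief preview of the pattern kinds listed below, here is a hedged sketch with one example of each. The four-line input and the helper name sample are invented for the illustration, and each program omits the action so that matching lines are simply printed.

```shell
# Hypothetical input: a word and a number per line.
sample() { printf 'start 1\nalpha 20\nend 3\nbeta 40\n'; }

by_expr=$(sample | awk '$2 > 10')            # expression: 2nd field greater than 10
by_regex=$(sample | awk '/^a/')              # regular expression: line starts with "a"
by_range=$(sample | awk '/^start/,/^end/')   # range: from a "start" line through an "end" line

echo "$by_range"
```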

Expressions

Regular Expressions

Range Patterns

Actions
