@dalibor-drgon
Last active June 7, 2022 08:28
AWK

AWK is a high-level language for text processing and pattern matching. It has C-like syntax and many built-in functions for working with strings.

Example programs

The following example programs show how the language works.

# Display the lines strictly between 42 and 666 (lines 43 through 665)
{
    # NR is a built-in variable holding the number of the current record (line), starting from 1
    if (NR > 42 && NR < 666) { # if line number > 42 and line number < 666
        print $0; # print the whole line
    }
}
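
AWK programs are built from `pattern { action }` rules, so the same filter can also be written with the condition as the pattern instead of an explicit `if`. Below is a minimal sketch, assuming a POSIX shell. The function name `between` and the variables `lo` and `hi` are made up for illustration and are not part of the original example.

```shell
# Hypothetical helper: print the lines whose number is strictly between $1 and $2.
between() {
    # A pattern with no action block prints every line for which it is true.
    # -v passes the shell arguments into awk as the variables lo and hi.
    awk -v lo="$1" -v hi="$2" 'NR > lo && NR < hi'
}

# Prints "c" and "d", which are lines 3 and 4 of the input.
printf '%s\n' a b c d e f | between 2 5
```

Calling `between 42 666` on a longer input should select the same lines as the program above.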

If a line starts with ".", display the content after it.

{
    if (substr($0, 1, 1) == ".") {
        print substr($0, 2); 
    }
}
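
The same check can be sketched with a regular-expression pattern, which is a common AWK idiom. This is an alternative form rather than the version shown above, and the function name `strip_dot` is hypothetical.

```shell
# Hypothetical helper: for lines starting with ".", print everything after the dot.
strip_dot() {
    # /^\./ matches lines whose first character is a literal dot;
    # substr($0, 2) returns the line from its second character onward.
    awk '/^\./ { print substr($0, 2) }'
}

# Prints "hidden" and ".twice"; the line "visible" is skipped.
printf '%s\n' '.hidden' 'visible' '..twice' | strip_dot
```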

Then we have a more complicated example that counts the occurrences of the words "dog" and "cat" on each line and displays the highest per-line count at the end of the program.

BEGIN { # executed once at the beginning
    max_occurrences = 0; # initialize the variable that we will use
    FS = " "; # input field separator; a single space (the default) splits each line on runs of whitespace into NF fields
    # for more variables like FS, NF and NR see https://www.tutorialspoint.com/awk/awk_built_in_variables.htm
}
{ # this block is executed for each line
    cur_occurrences = 0;
    for (i = 1; i <= NF; i++) { # loop over all words on the current line
        # while $0 means the whole line, $1 up to $NF are the words
        if ($i == "dog" || $i == "cat") { # if the word is "cat" or "dog"
            cur_occurrences = cur_occurrences + 1; # increment the current count by 1
        }
    }
    # keep the highest number of occurrences seen so far
    if (cur_occurrences > max_occurrences)
        max_occurrences = cur_occurrences;
}
END { # executed once at the end
    # %d is a placeholder for an integer, see the manual of awk or printf
    printf "Max occurrences per line: %d\n", max_occurrences;
}

We can test the program like this, with <SCRIPT> replaced by the program above:

$ echo -e "dog was hungry.\ndog ate cat breakfast. cat was very upset\nbut cat hide its treats" | awk '<SCRIPT>'
Max occurrences per line: 3
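
Instead of pasting the program between quotes, it can be stored in a file and run with `awk -f`. The sketch below uses a condensed version of the program and assumes a POSIX shell. The file name `count.awk` and the shorter variable names `cur` and `max` are made up for illustration.

```shell
# Write a condensed version of the program to a file (count.awk is a hypothetical name).
cat > count.awk <<'EOF'
{
    cur = 0                      # matches found on the current line
    for (i = 1; i <= NF; i++)    # loop over all fields of the line
        if ($i == "dog" || $i == "cat")
            cur++
    if (cur > max)               # max starts out unset, which compares as 0
        max = cur
}
END { printf "Max occurrences per line: %d\n", max }
EOF

# printf is used here because the handling of \n by echo differs between shells.
printf 'dog was hungry.\ndog ate cat breakfast. cat was very upset\nbut cat hide its treats\n' |
    awk -f count.awk
```

With the three input lines above, the second line contains "dog" once and "cat" twice, so the script prints `Max occurrences per line: 3`.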