Awk Tutorial

saurav omar
Feb 18, 2019

What is AWK?

  • AWK is one of the most prominent text-processing and text-filtering utilities on GNU/Linux. It is also a small but powerful programming language that can solve complex problems in very few lines of code.
  • Its name is derived from the family names of its authors: Alfred Aho, Peter Weinberger, and Brian Kernighan.
  • GNU awk (gawk), the version found on most GNU/Linux systems, is part of the GNU Project of the FSF (Free Software Foundation).
  • The basic syntax of awk is awk 'pattern { action }' file.

Print a file using awk

This is similar to cat /etc/resolv.conf: it prints the file content to the console.

awk '//{print}' /etc/resolv.conf
or
awk '{print}' /etc/resolv.conf

The difference between the two is that the first form takes a pattern between the slashes and prints only the lines that match it (an empty pattern matches every line), whereas the second form has no pattern at all and simply prints every line. For example,

awk '/8\.8\.8\.8/{print}' /etc/resolv.conf

This prints the lines that contain "8.8.8.8". The dots are escaped because an unescaped dot in a regex matches any character. The basic syntax of the first example is awk '/pattern/ {print}' file.
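To see why the dots in a pattern matter, here is a small sketch. The two input lines are made up for illustration and do not come from a real resolv.conf.

```shell
# Hypothetical input: the second line is not the address 8.8.8.8.
lines='nameserver 8.8.8.8
serial 8080808'

# Unescaped dots match any character, so both lines match.
loose=$(printf '%s\n' "$lines" | awk '/8.8.8.8/ {print}')

# Escaped dots match only a literal ".", so only the first line matches.
strict=$(printf '%s\n' "$lines" | awk '/8\.8\.8\.8/ {print}')

echo "$loose"
echo "---"
echo "$strict"
```

The loose pattern prints both lines, while the strict one prints only nameserver 8.8.8.8.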

pattern: a regular expression (a plain string is simply a regex that matches itself).

awk '/^saurav/{print}' /etc/passwd

In the above example, the lines that start with saurav are printed.

awk '/sql$/{print}' /etc/passwd

In the above example, the lines that end with sql are printed. Likewise, we can use any regex to print the matching lines.
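If you want to try the ^ and $ anchors without touching /etc/passwd, the sketch below runs them against a few made-up passwd-style lines (the user names and paths are only assumptions for the demo).

```shell
# Hypothetical stand-ins for /etc/passwd entries.
sample='saurav:x:1000:1000::/home/saurav:/bin/bash
mysql:x:27:27::/var/lib/mysql
postgres:x:26:26::/var/lib/pgsql'

# ^ anchors the match at the start of the line.
starts=$(printf '%s\n' "$sample" | awk '/^saurav/ {print}')

# $ anchors the match at the end of the line.
ends=$(printf '%s\n' "$sample" | awk '/sql$/ {print}')

echo "$starts"
echo "---"
echo "$ends"
```

Here the first command prints only the saurav line, and the second prints the mysql and postgres lines, because both end with sql.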

Print columns using awk

By default, IFS (Internal Field Separator) in bash is space, tab and newline. Similarly, awk has its own field separator variable, FS, which by default splits each line on runs of spaces or tabs.

Here is the file, example.txt, with 4 columns, which I am going to use in the examples:
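When the columns are separated by something other than spaces or tabs, awk's -F option changes the separator for that run. A minimal sketch, using a made-up passwd-style record:

```shell
# Hypothetical record; fields are separated by colons.
record='saurav:x:1000:1000:Saurav Omar:/home/saurav:/bin/bash'

# -F ':' tells awk to split each line on colons instead of whitespace.
fields=$(printf '%s\n' "$record" | awk -F ':' '{print $1, $7}')

echo "$fields"
```

This prints saurav /bin/bash, the first and seventh colon-separated fields.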

SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89

Printing the 3rd column: here we print the 3rd column of every line.

awk '//{print $3}' example.txt

Output:
Subject
Physics
Maths
Biology
English
History

Let's see how to print columns 2 and 4.

awk '//{print $2 $4}' example.txt

Output:
NameMarks
Saurav80
Deepak90
Dhoni87
Kedar85
Pandya89

Here we can see that awk prints the columns with nothing between them. If you want to separate the columns, use ',' (comma).

awk '//{print $2, $4;}' example.txt

Output:
Name Marks
Saurav 80
Deepak 90
Dhoni 87
Kedar 85
Pandya 89
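The comma puts awk's output field separator between the columns, which is a single space unless you change it. As a small sketch, the OFS variable can be set with -v to get a different separator, for example a CSV-style comma (the two input rows are copied from example.txt):

```shell
# Two rows from example.txt, fed through a pipe to keep the sketch self-contained.
csv=$(printf '%s\n' '1) Saurav Physics 80' '2) Deepak Maths 90' |
  awk -v OFS=',' '{ print $2, $4 }')

echo "$csv"
```

This prints Saurav,80 and Deepak,90 on separate lines.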

Using printf in awk

printf lets you format the output before printing it.

For example:

awk 'NR>1 {printf "Marks=%d Subject=%s\n", $4, $3 }' example.txt

Output:
Marks=80 Subject=Physics
Marks=90 Subject=Maths
Marks=87 Subject=Biology
Marks=85 Subject=English
Marks=89 Subject=History

As you can see in the above example, printf in awk works much like the printf function in C. The NR>1 condition skips the header, since NR is the number of the line being read.
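Since the format strings follow C conventions, width specifiers can be used to line columns up. A rough sketch, with two rows copied from example.txt and column widths chosen arbitrarily for the demo:

```shell
# %-8s left-aligns the name in 8 characters; %5d right-aligns the marks in 5.
table=$(printf '%s\n' '1) Saurav Physics 80' '2) Deepak Maths 90' |
  awk '{ printf "%-8s|%5d\n", $2, $4 }')

echo "$table"
```

Each name is padded to 8 characters on the right, and each mark is padded to 5 characters on the left.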

Comparison Operators in AWK:

In awk, you can compare column values and print the matching lines to the console.

For example:

awk '$4 > 85 {print;}' example.txt

Output:
SEQ Name Subject Marks
2) Deepak Maths 90
3) Dhoni Biology 87
5) Pandya History 89

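The above example prints the lines whose 4th column (marks) is greater than 85. The header line is printed too: "Marks" is not a number, so awk compares it with 85 as a string, and "Marks" sorts after "85".

The available comparison operators are: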
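If the header row is not wanted in the result, one option is to combine the comparison with the NR>1 check used in the printf example. A small sketch, with the contents of example.txt inlined so it runs on its own:

```shell
# Same contents as example.txt.
data='SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89'

# NR > 1 skips the first line, so only real marks are compared.
above=$(printf '%s\n' "$data" | awk 'NR > 1 && $4 > 85 { print }')

echo "$above"
```

This prints the Deepak, Dhoni and Pandya rows without the header.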

  1. >: greater than
  2. <: less than
  3. >=: greater than or equal to
  4. <=: less than or equal to
  5. ==: equal to
  6. !=: not equal to
  7. some_value ~ /pattern/: true if some_value matches the pattern
  8. some_value !~ /pattern/: true if some_value does not match the pattern

If we want to print the marks of Deepak:

awk '$2 ~ "Deepak" { print $0 ; }' example.txt

Output:
2) Deepak Maths 90

Similarly, we can get any matching row using the comparison operators.
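One thing worth keeping in mind is that ~ matches anywhere inside the value, while == needs the whole value to be equal. The sketch below shows the difference on the same data (the search string "Dee" is just an illustration):

```shell
# Same contents as example.txt.
data='SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89'

# ~ is a regex match, so a partial name is enough.
partial=$(printf '%s\n' "$data" | awk '$2 ~ "Dee" { print $2 }')

# == compares the whole field, so a partial name matches nothing.
exact=$(printf '%s\n' "$data" | awk '$2 == "Dee" { print $2 }')

# !~ keeps the rows whose subject does NOT end with "s".
not_s=$(printf '%s\n' "$data" | awk 'NR > 1 && $3 !~ /s$/ { print $3 }')

echo "partial: $partial"
echo "exact: $exact"
echo "$not_s"
```

The partial match finds Deepak, the exact comparison finds nothing, and the last command prints Biology, English and History.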

Compound operation in AWK:

In awk, we can combine multiple expressions to filter text, using the && (and) and || (or) operators.

Let's see some examples.

Print the rows of the people who scored 85 or more in History.

awk '($4 >= 85 ) && ($3 ~ "History") { print $0 ; }' example.txt

Output:
5) Pandya History 89

Print the rows of the people who scored 85 or more, or whose subject is History.

awk '($4 >= 85 ) || ($3 ~ "History") { print $0 ; }' example.txt

Output:
SEQ Name Subject Marks
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89

The header line shows up as well, because "Marks" is compared with 85 as a string and sorts after it. Similarly, we can combine multiple expressions to filter the text.
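Expressions can also be negated with ! and grouped with parentheses. As a sketch, the command below keeps the rows with 85 or more marks whose subject is not History, and skips the header with NR > 1:

```shell
# Same contents as example.txt.
data='SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89'

# ! negates the regex match; && requires all three conditions to hold.
rows=$(printf '%s\n' "$data" |
  awk 'NR > 1 && ($4 >= 85) && !($3 ~ "History") { print $0 }')

echo "$rows"
```

This prints the Deepak, Dhoni and Kedar rows.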

Next Keyword in AWK:

The next keyword is somewhat similar to continue in other programming languages such as Java and Scala: it stops processing the current line and moves on to the next one. This helps when there are multiple rules to evaluate and you want only the first matching rule to run, skipping all the rest.

For Example:

awk 'FNR == 1 {next}
$4 >= 85 { printf "%s\t%s\n", $0, "EXEMPTION" ; next }
$4 < 85 { printf "%s\t%s\n", $0, "PASSED" }' example.txt

Output:
1) Saurav Physics 80 PASSED
2) Deepak Maths 90 EXEMPTION
3) Dhoni Biology 87 EXEMPTION
4) Kedar English 85 EXEMPTION
5) Pandya History 89 EXEMPTION

In the above example:

The first rule, FNR == 1 {next}, checks whether this is the first line (the header) and, if so, skips to the next line.

The second rule, $4 >= 85 { ... ; next }, checks whether the 4th column (marks) is greater than or equal to 85; if so, it prints the line with EXEMPTION and goes to the next line.

The third rule, $4 < 85 { ... }, prints the remaining lines with PASSED.
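The effect of next is easiest to see when two rules overlap. In the sketch below the second rule ($4 >= 80) also matches every row that the first rule matches; the thresholds are chosen only to force that overlap.

```shell
# Two rows from example.txt.
rows='1) Saurav Physics 80
2) Deepak Maths 90'

# Without next, a line that matches both rules is printed twice.
without_next=$(printf '%s\n' "$rows" | awk '
$4 >= 85 { print $2, "EXEMPTION" }
$4 >= 80 { print $2, "PASSED" }')

# With next, awk stops at the first matching rule for that line.
with_next=$(printf '%s\n' "$rows" | awk '
$4 >= 85 { print $2, "EXEMPTION"; next }
$4 >= 80 { print $2, "PASSED" }')

echo "$without_next"
echo "---"
echo "$with_next"
```

Without next, Deepak is reported as both EXEMPTION and PASSED; with next, only EXEMPTION is printed for him.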

Variables and Numeric Expressions:

Variables are placeholders that store a value in memory, as in other programming languages. In awk you do not declare them; a variable is created the first time you use it.

Syntax:

variable=value

Example:

marks=10
name=saurav

Numeric expressions are expressions that perform arithmetic, such as adding or dividing numbers, as in other programming languages.

Syntax: operand operator operand

Example:

var1=1
var2=2
var3=var1 + var2
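Variables can also be handed to awk from the command line with the -v option, which is handy when the value comes from the shell. A minimal sketch, where threshold is a hypothetical variable name picked for this demo:

```shell
# -v threshold=85 creates the awk variable before any line is read.
names=$(printf '%s\n' '1) Saurav Physics 80' '2) Deepak Maths 90' |
  awk -v threshold=85 '$4 > threshold { print $2 }')

echo "$names"
```

Only Deepak is printed, since 90 is the only mark above the threshold.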

Let's see an example:

Print a line number in front of every line in the console.

awk 'FNR == 1 { next }    # skip the header line
{
    line = $0             # store the line read by awk
    line_no += 1          # increment line_no with every line read
    printf "%d\t%s\n", line_no, line
}' example.txt

Output:
1 1) Saurav Physics 80
2 2) Deepak Maths 90
3 3) Dhoni Biology 87
4 4) Kedar English 85
5 5) Pandya History 89
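Variables and numeric expressions also make it easy to compute totals. The sketch below adds up the marks and prints the average in an END block, which awk runs once after the last line has been read; total and count are names made up for this example.

```shell
# Same contents as example.txt.
data='SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89'

# Accumulate marks for every row except the header, then report in END.
summary=$(printf '%s\n' "$data" | awk '
NR > 1 { total += $4; count += 1 }
END    { printf "total=%d average=%.1f\n", total, total / count }')

echo "$summary"
```

For this file the result is total=431 average=86.2. Note that the sketch assumes at least one data row; with an empty file, count would be zero and the division would fail.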

Happy Coding :).
