Awk Tutorial
What is AWK?
- AWK is one of the most prominent text-processing and text-filtering utilities on GNU/Linux. It is also a small but powerful programming language that can solve complex problems in very few lines of code.
- Its name is derived from the family names of its authors: Alfred Aho, Peter Weinberger, and Brian Kernighan.
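- The implementation most commonly found on GNU/Linux, gawk (GNU awk), is maintained as part of the GNU Project of the FSF (Free Software Foundation).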
- The basic syntax is awk 'program' file, where the program is one or more pattern { action } rules; options, if any, go before the program.
Print file using awk?
It is similar to cat /etc/resolv.conf: it prints the file's content to the console.
awk '//{print}' /etc/resolv.conf or awk '{print}' /etc/resolv.conf
Both commands print every line. The difference is that the first one has a pattern slot between the slashes: an empty pattern matches every line, and putting a regex there prints only the lines that match it. The second one has no pattern at all, so it simply prints the whole file. For example,
awk '/8\.8\.8\.8/{print}' /etc/resolv.conf
prints the lines that contain "8.8.8.8" (the dots are escaped because an unescaped . matches any character). The basic syntax of the first form is awk '/pattern/{print}' file.
pattern: a regular expression; a plain string is simply a regex that matches itself.
awk '/^saurav/{print}' /etc/passwd
In the above example, the lines that start with saurav are printed.
awk '/sql$/{print}' /etc/passwd
In the above example, the lines that end with sql are printed. In the same way, any regex can be used to print the matching lines.
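To make the anchors concrete, here is a minimal sketch that runs the same kinds of patterns on three made-up passwd-style lines instead of the real /etc/passwd. The account names and the shell variable names (input, starts, ends, anywhere) are assumptions for illustration only.

```shell
# Made-up passwd-style lines (not read from the real /etc/passwd).
input='saurav:x:1000:1000::/home/saurav:/bin/bash
mysql:x:27:27::/var/lib/mysql:/bin/false
pgadmin:x:35:35::/home/pgadmin:/usr/bin/psql'

# ^ anchors the match at the start of the line.
starts=$(printf '%s\n' "$input" | awk '/^saurav/ {print}')

# $ anchors the match at the end of the line.
ends=$(printf '%s\n' "$input" | awk '/sql$/ {print}')

# Without an anchor the pattern may match anywhere in the line.
anywhere=$(printf '%s\n' "$input" | awk '/sql/ {print}')

printf 'starts with saurav:\n%s\n\nends with sql:\n%s\n\ncontains sql:\n%s\n' \
    "$starts" "$ends" "$anywhere"
```

Only the pgadmin line ends with sql, while the unanchored /sql/ also picks up the mysql line.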
Print Column using awk?
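In bash, the default IFS (Internal Field Separator) is space, tab, and newline. Similarly, awk splits each line into fields using its FS (field separator) variable, whose default treats any run of spaces or tabs as a single separator.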
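As a small sketch of how field splitting behaves, the following compares the default separator with a custom one set through awk's standard -F option. The input lines are made up for illustration, and the variable names second and login_shell are assumptions.

```shell
# Default splitting: a run of spaces or tabs counts as one separator.
second=$(printf 'a   b\tc\n' | awk '{print $2}')

# -F: splits on ":" instead; the input is a made-up passwd-style entry.
login_shell=$(printf 'saurav:x:1000:1000::/home/saurav:/bin/bash\n' | awk -F: '{print $1, $7}')

printf '%s\n%s\n' "$second" "$login_shell"
```

The first command prints b even though three spaces precede it; the second prints the login name and the 7th colon-separated field.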
Here is the file, with 4 columns, that I am going to use in the examples (saved as example.txt):
SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89
Printing the 3rd column:
awk '//{print $3}' example.txt
Output:
Subject
Physics
Maths
Biology
English
History
Let's see how to print columns 2 and 4:
awk '//{print $2 $4}' example.txt
Output:
NameMarks
Saurav80
Deepak90
Dhoni87
Kedar85
Pandya89
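Here we can see that awk prints the two columns joined together, because placing expressions side by side concatenates them. To separate the columns, use ',' (comma), which inserts the output field separator (a space by default).
awk '//{print $2, $4;}' example.txt
Output:
Name Marks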
Saurav 80
Deepak 90
Dhoni 87
Kedar 85
Pandya 89
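The separator that the comma inserts is awk's OFS (output field separator) variable. As a hedged sketch, it can be changed for one run with the standard -v option; the " | " separator and the temporary file standing in for example.txt are choices made for this illustration.

```shell
# Recreate the tutorial's example.txt in a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89
EOF

# -v OFS=... sets the string that "," places between printed items.
piped=$(awk -v OFS=' | ' '{print $2, $4}' "$data")

printf '%s\n' "$piped"
rm -f "$data"
```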
Using printf in awk?
printf lets you format the output before printing it.
For example:
awk 'NR>1 {printf "Marks=%d Subject=%s\n",$4, $3 }' example.txt
Output:
Marks=80 Subject=Physics
Marks=90 Subject=Maths
Marks=87 Subject=Biology
Marks=85 Subject=English
Marks=89 Subject=History
As you can see in the above example, awk's printf works much like the printf function in C. The NR>1 pattern skips the header row.
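Because the format strings follow C conventions, field widths work too. The sketch below pads the name to 8 characters (left-aligned) and right-aligns the marks in 3 characters; the widths and the "|" divider are arbitrary choices for illustration.

```shell
# Recreate the tutorial's example.txt in a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89
EOF

# %-8s left-aligns a string in 8 columns; %3d right-aligns an integer in 3.
table=$(awk 'NR > 1 { printf "%-8s|%3d\n", $2, $4 }' "$data")

printf '%s\n' "$table"
rm -f "$data"
```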
Comparison Operators in AWK:
In awk, you can compare a column against a value (or against another column) and print the lines for which the comparison is true.
For example:
awk '$4 > 85 {print;}' example.txt
Output:
SEQ Name Subject Marks
2) Deepak Maths 90
3) Dhoni Biology 87
5) Pandya History 89
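The above example prints the lines whose 4th column (marks) is greater than 85. The header line is printed too: its 4th field, Marks, is not a number, so awk compares it with 85 as a string, and "Marks" sorts after "85".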
The comparison operators are:
> : greater than
< : less than
>= : greater than or equal to
<= : less than or equal to
== : equal to
!= : not equal to
some_value ~ /pattern/ : true if some_value matches the pattern
some_value !~ /pattern/ : true if some_value does not match the pattern
If we want to print the marks of Deepak:
awk '$2 ~ "Deepak" { print $0 ; }' example.txt
Output:
2) Deepak Maths 90
Similarly, we can get the matching rows using any of the other comparison operators.
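The header row showed up in the $4 > 85 output above because a non-numeric field is compared as a string. One common way around this, sketched here, is to add 0 to the field so that the comparison is always numeric; the temporary file stands in for example.txt and the variable names are assumptions.

```shell
# Recreate the tutorial's example.txt in a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89
EOF

# String comparison for the header: "Marks" sorts after "85", so it is printed.
with_header=$(awk '$4 > 85 {print}' "$data")

# $4 + 0 is always a number; for the header it is 0, which is not > 85.
numeric_only=$(awk '$4 + 0 > 85 {print}' "$data")

printf 'plain comparison:\n%s\n\nforced numeric:\n%s\n' "$with_header" "$numeric_only"
rm -f "$data"
```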
Compound operation in AWK:
In awk, we can combine multiple expressions to filter text, using the && (and) and || (or) operators.
Let's see some examples.
Print the lines of the people who have at least 85 marks in History.
awk '($4 >= 85 ) && ($3 ~ "History") { print $0 ; }' example.txt
Output:
5) Pandya History 89
Print the lines of the people who have at least 85 marks or whose subject is History.
awk '($4 >= 85 ) || ($3 ~ "History") { print $0 ; }' example.txt
Output:
SEQ Name Subject Marks
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89
The header line is printed as well, because its 4th field, Marks, is compared with 85 as a string and sorts after it. In the same way, multiple expressions can be combined to filter text.
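Parentheses control how && and || group. As a sketch, the following combines the "or" filter with NR > 1 (the built-in record number, used earlier with printf) so that the header row is never considered; the temporary file stands in for example.txt.

```shell
# Recreate the tutorial's example.txt in a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89
EOF

# Skip line 1, then keep rows with marks >= 85 or with subject History.
rows=$(awk 'NR > 1 && ($4 >= 85 || $3 ~ "History") {print}' "$data")

printf '%s\n' "$rows"
rm -f "$data"
```

Without the inner parentheses, && would bind more tightly than ||, and the NR > 1 test would apply only to the marks comparison.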
Next Keyword in AWK:
The next keyword is somewhat similar to continue in other programming languages such as Java and Scala: it stops processing the current line and moves on to the next one. This helps when there are multiple rules to evaluate and you want only the first matching rule to run, skipping all the rest.
For Example:
awk 'FNR == 1 {next}
$4 >= 85 { printf "%s\t%s\n", $0, "EXEMPTION" ; next }
$4 < 85 { printf "%s\t%s\n", $0, "PASSED" }' example.txt
Output:
1) Saurav Physics 80 PASSED
2) Deepak Maths 90 EXEMPTION
3) Dhoni Biology 87 EXEMPTION
4) Kedar English 85 EXEMPTION
5) Pandya History 89 EXEMPTION
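In the above example:
The first rule, FNR == 1 {next}, checks whether the current line is the first line of the file (the header) and, if so, skips to the next line.
The second rule, $4 >= 85 { printf "%s\t%s\n", $0, "EXEMPTION" ; next }, checks whether the 4th column (marks) is at least 85; if so, it prints the line with EXEMPTION and moves on to the next line, so the third rule is never tried for it.
The third rule, $4 < 85 { printf "%s\t%s\n", $0, "PASSED" }, prints the remaining lines with PASSED.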
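The effect of next is easiest to see when two rules overlap. In this sketch, which uses hypothetical thresholds of 85 and 80 chosen only for illustration, every row that matches the first rule also matches the second, so leaving out next makes those rows print twice.

```shell
# Recreate the tutorial's example.txt in a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89
EOF

# No next in the second rule: rows with marks >= 85 fall through to rule three.
without_next=$(awk 'NR == 1 { next }
$4 >= 85 { print $2, "EXEMPTION" }
$4 >= 80 { print $2, "PASSED" }' "$data")

# With next, the first matching rule wins and the rest are skipped.
with_next=$(awk 'NR == 1 { next }
$4 >= 85 { print $2, "EXEMPTION"; next }
$4 >= 80 { print $2, "PASSED" }' "$data")

printf 'without next:\n%s\n\nwith next:\n%s\n' "$without_next" "$with_next"
rm -f "$data"
```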
Variables and Numeric Expressions:
Variables are placeholders that store a value in memory, as in other programming languages.
Syntax:
variable=value
Example:
marks=10
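name="saurav"
String values go in double quotes; an unquoted saurav would be read as the name of another (empty) variable.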
Numeric expressions perform arithmetic, such as adding or dividing numbers, as in other programming languages.
Syntax: operand operator operand
Example:
var1=1
var2=2
var3=var1 + var2
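These assignments can be tried without any input file by placing them in a BEGIN rule, which awk runs before it reads any input. The sketch below is only an illustration; the shell variable names quoted and unquoted are assumptions.

```shell
# Quoted string and a number; marks + 5 is evaluated as arithmetic.
quoted=$(awk 'BEGIN { name = "saurav"; marks = 10; print name, marks + 5 }')

# Unquoted saurav is an uninitialized variable, so name ends up empty.
unquoted=$(awk 'BEGIN { name = saurav; print "[" name "]" }')

printf '%s\n%s\n' "$quoted" "$unquoted"
```

The second command prints a pair of empty brackets, which is why string values need their double quotes.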
Let's see an example:
Print a line number in front of every line in the console.
awk 'FNR == 1 { next }
{
    line = $0        # store the line awk has just read
    line_no += 1     # increment line_no for every line read
    printf "%d\t%s\n", line_no, line
}' example.txt
Output:
1 1) Saurav Physics 80
2 2) Deepak Maths 90
3 3) Dhoni Biology 87
4 4) Kedar English 85
5 5) Pandya History 89
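Variables and numeric expressions also make it easy to aggregate a column. As a hedged sketch, the following totals the marks and computes their average in an END rule, which awk runs after the last line has been read; the names total and count are chosen for illustration, and the temporary file stands in for example.txt.

```shell
# Recreate the tutorial's example.txt in a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89
EOF

# Accumulate marks for every row after the header, then report in END.
summary=$(awk 'NR > 1 { total += $4; count += 1 }
END { printf "total=%d average=%.1f\n", total, total / count }' "$data")

printf '%s\n' "$summary"
rm -f "$data"
```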
Happy Coding :).