How to Filter with Awk
Quick Answer: Filter with Awk Conditions
To filter lines conditionally with awk, use awk '$2 > 100' data.txt to print the lines where field 2 is greater than 100. For text matching, use awk '/pattern/' data.txt to print the lines that match a regular expression. Combine multiple conditions with && (AND) and || (OR).
Quick Comparison: Awk Filtering Methods
| Syntax | Condition Type | Best For | Speed |
|---|---|---|---|
| awk '$2 > 100' | Numeric | Field comparisons | Fast |
| awk '/pattern/' | String match | Pattern matching | Fast |
| awk '$1 == "value"' | Equality | Exact matches | Fast |
| awk 'NR > 5' | Line number | Range filtering | Fast |
Bottom line: Use field comparisons for numeric data; use regex patterns for text matching.
Use conditional statements in awk to filter and process data based on patterns and field values. Learn if/else, comparison, and logical operators.
Method 1: Basic Conditions
# Print lines where field > value
awk '$2 > 100' data.txt
# Print lines where field equals value
awk '$1 == "John"' data.txt
# Print lines containing pattern
awk '/pattern/' data.txt
# Print lines where field matches regex
awk '$2 ~ /pattern/' data.txt
Detailed Example
Test file (sales.txt):
John 150 Active
Jane 200 Inactive
Bob 75 Active
Alice 320 Active
# Print sales > 100
awk '$2 > 100' sales.txt
# Output:
# John 150 Active
# Jane 200 Inactive
# Alice 320 Active
# Print only active sales
awk '$3 == "Active"' sales.txt
# Output:
# John 150 Active
# Bob 75 Active
# Alice 320 Active
Comparison Operators
| Operator | Meaning |
|---|---|
| == | Equal |
| != | Not equal |
| < | Less than |
| > | Greater than |
| <= | Less or equal |
| >= | Greater or equal |
| ~ | Match regex |
| !~ | Not match regex |
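One subtlety with the relational operators is that awk picks numeric or string comparison based on the operands. The sketch below uses hypothetical two-column data to show the difference between comparing a field with a numeric constant and with a quoted string constant; the behavior follows the usual POSIX awk rules, but it is worth checking on your own awk.

```shell
# Hypothetical sample data: name, amount
data='Bob 75
Alice 320'

# Field vs numeric constant: compared as numbers, so 75 > 100 is false
numeric=$(printf '%s\n' "$data" | awk '$2 > 100 {print $1}')

# Field vs quoted constant: compared as strings, so "75" sorts after "100"
string=$(printf '%s\n' "$data" | awk '$2 > "100" {print $1}')

echo "numeric comparison matched: $numeric"
echo "string comparison matched:"
echo "$string"
```

The numeric test matches only Alice, while the string test also matches Bob, because the character 7 sorts after 1. Leaving numbers unquoted avoids this surprise.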
Logical Operators
# AND condition
awk '$2 > 100 && $3 == "Active"' data.txt
# OR condition
awk '$1 == "John" || $1 == "Jane"' data.txt
# NOT condition
awk '!($2 < 100)' data.txt
If-Else Statements
# Basic if
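# Two pattern-action rules acting as if/else: next skips the second rule when the first one matches
awk '$2 > 100 {print "High"; next} {print "Low"}' data.txt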
# If-else block
awk '{
if ($2 > 100)
print $1 " has high value"
else
print $1 " has low value"
}' data.txt
# If-else if-else
awk '{
if ($2 > 200)
print "Very High"
else if ($2 > 100)
print "High"
else
print "Low"
}' data.txt
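As an illustration of how the if / else if / else chain resolves boundary values, here is a small run against the sales data from the Detailed Example. The label variable and the output format are choices made for this sketch only.

```shell
# Sample rows copied from sales.txt above
sales='John 150 Active
Jane 200 Inactive
Bob 75 Active
Alice 320 Active'

# The first branch whose condition is true wins; later branches are skipped
labels=$(printf '%s\n' "$sales" | awk '{
    if ($2 > 200)
        label = "Very High"
    else if ($2 > 100)
        label = "High"
    else
        label = "Low"
    print $1 ": " label
}')

echo "$labels"
```

Jane's value of 200 lands in "High" rather than "Very High" because the first test is strictly greater than 200; use >= if the boundary should be included.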
Pattern Matching Conditions
# Match regex
awk '/ERROR/ {print}' logfile.txt
# Partial match
awk '$1 ~ /^[0-9]/ {print}' file.txt # First field starts with digit
# Not matching
awk '$2 !~ /debug/ {print}' file.txt # Second field doesn't contain "debug"
# Case-insensitive match
awk 'tolower($1) ~ /john/ {print}' file.txt
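To see how these matches differ in practice, the following sketch runs three of them over a made-up three-line log (the log content is an assumption for illustration, not a real format).

```shell
# Hypothetical log lines: id, level, message
log='404 ERROR page missing
12 debug cache warm
x9 info Error-free run'

# Case-sensitive: only the literal text ERROR matches
exact=$(printf '%s\n' "$log" | awk '/ERROR/ {print $1}')

# Case-insensitive: lowercase the whole record before matching
any_case=$(printf '%s\n' "$log" | awk 'tolower($0) ~ /error/ {print $1}')

# Anchored field match combined with a negated field match
numeric_id_not_debug=$(printf '%s\n' "$log" | awk '$1 ~ /^[0-9]/ && $2 !~ /debug/ {print $1}')

echo "exact: $exact"
echo "any case:"
echo "$any_case"
echo "numeric id, not debug: $numeric_id_not_debug"
```

Only the first line matches /ERROR/, the lowercase comparison also picks up "Error-free", and the last filter drops line 2 (level is debug) and line 3 (id does not start with a digit).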
String Comparisons
# String equality
awk '$1 == "apple"' file.txt
# String inequality
awk '$1 != "apple"' file.txt
# Length comparison
awk 'length($1) > 5' file.txt
Practical Example: Data Classification
#!/bin/bash
# File: classify_data.sh
data="$1"
awk '{
score = $2
status = $3
if (score >= 90) {
grade = "A"
} else if (score >= 80) {
grade = "B"
} else if (score >= 70) {
grade = "C"
} else {
grade = "F"
}
if (status == "Active") {
result = "PROCESS"
} else {
result = "SKIP"
}
printf "%-15s Score: %3d Grade: %s Action: %s\n", $1, score, grade, result
}' "$data"
Input:
John 95 Active
Jane 75 Inactive
Bob 85 Active
Output:
John Score: 95 Grade: A Action: PROCESS
Jane Score: 75 Grade: C Action: SKIP
Bob Score: 85 Grade: B Action: PROCESS
Multiple Conditions
# All conditions must be true
awk '$2 > 100 && $3 == "Active" && $4 != "error"' file.txt
# At least one condition true
awk '$1 == "error" || $1 == "warning" || $1 == "critical"' file.txt
# Complex condition with parentheses
awk '($2 > 100 || $2 < 50) && $3 == "Active"' file.txt
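Parentheses matter because && binds more tightly than || in awk. A minimal sketch using the sales data from earlier shows how the same three tests select different rows with and without grouping.

```shell
sales='John 150 Active
Jane 200 Inactive
Bob 75 Active
Alice 320 Active'

# Without parentheses this reads as: Jane, OR (above 100 AND Active)
ungrouped=$(printf '%s\n' "$sales" | awk '$1 == "Jane" || $2 > 100 && $3 == "Active" {print $1}')

# With parentheses: (Jane OR above 100) AND Active
grouped=$(printf '%s\n' "$sales" | awk '($1 == "Jane" || $2 > 100) && $3 == "Active" {print $1}')

echo "ungrouped:"
echo "$ungrouped"
echo "grouped:"
echo "$grouped"
```

The ungrouped form keeps Jane even though she is Inactive; the grouped form drops her.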
Filtering on Field Range
# Value between two numbers
awk '$2 >= 50 && $2 <= 150' data.txt
# Not between range
awk '!($2 >= 50 && $2 <= 150)' data.txt
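If the bounds change between runs, one option is to pass them in with -v instead of editing the program text. The variable names min and max below are arbitrary choices for this sketch.

```shell
sales='John 150 Active
Jane 200 Inactive
Bob 75 Active
Alice 320 Active'

# -v sets awk variables before the program runs; both bounds are inclusive
in_range=$(printf '%s\n' "$sales" | awk -v min=50 -v max=150 '$2 >= min && $2 <= max {print $1}')

# Negating the whole range test selects everything outside it
out_of_range=$(printf '%s\n' "$sales" | awk -v min=50 -v max=150 '!($2 >= min && $2 <= max) {print $1}')

echo "in range:"
echo "$in_range"
echo "out of range:"
echo "$out_of_range"
```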
Practical Example: Log Filter
#!/bin/bash
# File: filter_logs.sh
logfile="$1"
awk '
/ERROR/ && NF >= 5 {
time = $1
level = $2
code = $3
message = $4 " " $5
if (code >= 500) {
severity = "CRITICAL"
} else if (code >= 400) {
severity = "ERROR"
} else {
severity = "WARNING"
}
printf "[%s] %s: %s %s\n", time, severity, code, message
}
' "$logfile"
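To try the same logic without a real log file, the sketch below feeds it a few invented lines. The log layout (time, level, numeric code, then message words) is an assumption inferred from the field assignments in the script, and the if/else chain is condensed into a nested ternary for brevity.

```shell
# Hypothetical log lines: time, level, code, message
log='10:15:01 ERROR 503 Service unavailable
10:15:02 INFO 200 Request ok
10:15:03 ERROR 404 Not found
10:15:04 ERROR 302 Redirect loop detected
10:15:05 ERROR 500'

report=$(printf '%s\n' "$log" | awk '
/ERROR/ && NF >= 5 {
    # Same thresholds as the script above, written as a nested ternary
    severity = ($3 >= 500) ? "CRITICAL" : (($3 >= 400) ? "ERROR" : "WARNING")
    printf "[%s] %s: %s %s\n", $1, severity, $3, $4 " " $5
}')

echo "$report"
```

The INFO line is skipped by the pattern, the last line is skipped because it has fewer than five fields, and only the first two message words are kept. Since /ERROR/ matches anywhere in the line, testing $2 == "ERROR" instead may be a stricter choice when the level is always in the second field.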
Conditional with Regex
# Match pattern AND other condition
awk '/ERROR/ && $2 > 10' file.txt
# Pattern OR number condition
awk '/critical/ || $3 > 100' file.txt
# Complex: Pattern match and field range
awk '$1 ~ /^[0-9]/ && $2 >= 100 && $2 <= 200' file.txt
Testing for Empty Fields
# Field is empty
awk '$3 == ""' file.txt
# Field is not empty
awk '$3 != ""' file.txt
# Field has length > 0
awk 'length($3) > 0' file.txt
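Whether a field can be empty depends on the field separator. With the default whitespace splitting, a missing value simply shortens the record, whereas with an explicit separator such as a comma an empty value still occupies a position. A rough sketch with made-up CSV data:

```shell
# Hypothetical CSV: name, age, role
csv='alice,30,admin
bob,,user
carol,25,'

# With -F, an empty value between (or after) separators is still a field
missing_age=$(printf '%s\n' "$csv" | awk -F, '$2 == "" {print $1}')
missing_role=$(printf '%s\n' "$csv" | awk -F, '$3 == "" {print $1}')

# With default splitting, check the field count instead
short_rows=$(printf 'a b c\nd e\n' | awk 'NF < 3 {print $1}')

echo "missing age: $missing_age"
echo "missing role: $missing_role"
echo "rows with fewer than 3 fields: $short_rows"
```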
Ternary Operator
# Conditional expression
awk '{print ($2 > 100) ? "High" : "Low"}' data.txt
# More complex
awk '{print $1, ($2 > 100 && $3 == "Active") ? "PROCESS" : "SKIP"}' data.txt
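Here is the second ternary form run against the sales data from earlier. The extra pair of parentheses around the whole ternary is a defensive habit rather than a requirement: a bare > inside print is read as output redirection, so keeping comparisons fully parenthesized avoids accidental file writes.

```shell
sales='John 150 Active
Jane 200 Inactive
Bob 75 Active
Alice 320 Active'

# Tag each row; the ternary yields one of two strings per line
tagged=$(printf '%s\n' "$sales" | awk '{print $1, (($2 > 100 && $3 == "Active") ? "PROCESS" : "SKIP")}')

echo "$tagged"
```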
Practical Example: Performance Report
#!/bin/bash
# File: performance_report.sh
data="$1"
awk '
NR > 1 {
name = $1
response = $2
errors = $3
uptime = $4
# Determine status
if (response < 100 && errors == 0 && uptime > 99.5) {
status = "GOOD"
} else if (response < 200 && errors < 5) {
status = "FAIR"
} else {
status = "POOR"
}
printf "%-10s Response: %4dms Errors: %2d Uptime: %.1f%% Status: %s\n", \
name, response, errors, uptime, status
}
' "$data"
Filtering Without Action
# Just print lines matching condition (no action)
awk '$2 > 100' data.txt
# With no action block, awk prints the entire line when the condition is true
# Same as: awk '$2 > 100 {print}' data.txt
Negation
# NOT operator
awk '!($2 > 100)' data.txt # Not greater than 100
awk '!($1 ~ /error/)' file.txt # Not containing error
awk '!/pattern/' file.txt # Lines NOT matching pattern
Common Mistakes
- Using = instead of == - assignment vs comparison
- Not quoting strings - string values need quotes: "value"
- Forgetting curly braces - an if or else branch with more than one statement needs them
- Regex without ~ - use ~ for pattern matching on a field: $1 ~ /pat/
- Logical operator confusion - use && for AND, || for OR
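The first mistake is easy to miss because awk does not report an error. A small demonstration with two rows of the sales data shows what happens when = is used where == was intended.

```shell
sales='John 150 Active
Jane 200 Inactive'

# Mistake: = assigns "Active" to field 3 on every line. The assigned value is a
# non-empty string, which counts as true, so every (now modified) line prints.
wrong=$(printf '%s\n' "$sales" | awk '$3 = "Active"')

# Intended: == compares without changing the record
right=$(printf '%s\n' "$sales" | awk '$3 == "Active"')

echo "with = :"
echo "$wrong"
echo "with == :"
echo "$right"
```

With = the output contains both rows, and Jane's status has been silently rewritten to Active; with == only John's row is printed.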
Performance Tips
- Use simple conditions when possible
- Pattern matching is fast with /regex/
- Numeric comparisons faster than string
- Avoid complex nested conditions
- Use next to skip remaining actions
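The last tip can be sketched as follows: placing a cheap rejecting rule first and ending it with next means later rules never run for those lines. The comment-line convention in the sample data is an assumption for illustration.

```shell
# Hypothetical input with a comment line to ignore
data='# name amount
John 150
Bob 75'

classified=$(printf '%s\n' "$data" | awk '
/^#/     { next }                      # drop comment lines before any other rule runs
$2 > 100 { print $1, "High"; next }    # matched: skip the fallback rule below
         { print $1, "Low" }           # fallback for everything else
')

echo "$classified"
```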
Key Points
- Use $field comparison_operator value
- Use ~ for regex matching
- Combine with && (AND) and || (OR)
- If-else branches with more than one statement need curly braces
- Ternary operator useful for simple cases
Summary
Conditional filtering in awk is powerful for data processing. Master comparison operators, logical combinations, and regex matching. Use if-else for complex logic and ternary operators for simple conditions.