How to Filter Lines by Pattern in Bash
Quick Answer: Filter Lines by Pattern in Bash
To filter lines matching a pattern, use grep "pattern" file.txt. For inverse matching (lines NOT matching), use grep -v "pattern" file.txt. For case-insensitive matching, use grep -i "pattern" file.txt. The grep command is the standard filtering tool.
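All three forms can be tried end-to-end with a throwaway sample file (the file name and contents here are made up for illustration):

```shell
# Create a small sample file (hypothetical contents)
cat > /tmp/sample.txt <<'EOF'
alpha error found
beta ok
GAMMA Error logged
EOF

grep "error" /tmp/sample.txt      # lines containing "error" (lowercase only)
grep -v "error" /tmp/sample.txt   # every other line
grep -i "error" /tmp/sample.txt   # case-insensitive: both "error" and "Error" lines
```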
Quick Comparison: Line Filtering Methods
| Method | Syntax | Best For | Speed |
|---|---|---|---|
| grep | grep pattern file | Pattern matching | Fast |
| grep -v | grep -v pattern file | Exclude pattern | Fast |
| awk | awk '/pattern/' | Complex logic | Medium |
| sed | sed -n '/pattern/p' | Advanced patterns | Medium |
Bottom line: Use grep for simple patterns; use awk for complex filtering.
This guide shows how to select and extract lines matching specific patterns from files, covering filtering with grep, awk, sed, and bash built-ins.
Method 1: Using grep - Basic Filtering
# Match lines containing pattern
grep "error" logfile.txt
# Output:
# ERROR: Database connection failed
# ERROR: Authentication failed
The simplest tool for line filtering is grep. It searches for lines matching a pattern.
grep Options
# Match lines NOT containing pattern (inverse)
grep -v "debug" logfile.txt
# Case-insensitive search
grep -i "warning" logfile.txt
# Count matching lines
grep -c "error" logfile.txt
# Show line numbers
grep -n "error" logfile.txt
# Show context (lines before/after)
grep -C 2 "error" logfile.txt
# Extended regex
grep -E "error|warn|fail" logfile.txt
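The context flag is easiest to see in action. This sketch recreates the sample log used later in this guide and asks for two lines of context around the WARNING line, so grep prints the match plus the two lines before and after it:

```shell
# Recreate the sample log from this guide
cat > /tmp/app.log <<'EOF'
2026-02-21 10:30:45 INFO: Application started
2026-02-21 10:30:46 ERROR: Database connection failed
2026-02-21 10:30:47 WARNING: Retry attempt 1
2026-02-21 10:30:48 ERROR: Timeout reached
2026-02-21 10:30:49 INFO: Application stopped
EOF

# Two lines of context on each side of the match: all five lines print
grep -C 2 "WARNING" /tmp/app.log
```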
Practical grep Examples
Test file (app.log):
2026-02-21 10:30:45 INFO: Application started
2026-02-21 10:30:46 ERROR: Database connection failed
2026-02-21 10:30:47 WARNING: Retry attempt 1
2026-02-21 10:30:48 ERROR: Timeout reached
2026-02-21 10:30:49 INFO: Application stopped
# Extract only errors
grep "ERROR:" app.log
# Extract errors and warnings
grep -E "ERROR|WARNING" app.log
# Show line numbers
grep -n "ERROR" app.log
# Count errors
grep -c "ERROR" app.log
Output (of the first and last commands):
2026-02-21 10:30:46 ERROR: Database connection failed
2026-02-21 10:30:48 ERROR: Timeout reached
2
Using awk for Conditional Filtering
Awk is more powerful for column-based filtering:
# Filter by field value
awk '$2 > 100' data.txt
# Filter by pattern and print specific fields
awk '/error/ {print $1, $3}' logfile.txt
# Multiple conditions
awk '$1 > 50 && $2 < 100' numbers.txt
Practical awk Examples
Test file (sales.txt):
John 150
Jane 200
Bob 75
Alice 180
# Show names with sales over 100
awk '$2 > 100 {print $1}' sales.txt
# Show sales over 100 with formatted output
awk '$2 > 100 {printf "%s: $%d\n", $1, $2}' sales.txt
# Sum sales for high performers
awk '$2 > 100 {total += $2} END {print "Total:", total}' sales.txt
Output:
John
Jane
Alice
John: $150
Jane: $200
Alice: $180
Total: 530
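A regex pattern and a field condition can also be combined in a single awk rule. A quick sketch using the same sales.txt data (recreated here so the snippet runs on its own):

```shell
cat > /tmp/sales.txt <<'EOF'
John 150
Jane 200
Bob 75
Alice 180
EOF

# Names starting with J whose sales exceed 100
awk '/^J/ && $2 > 100 {print $1}' /tmp/sales.txt
# Output:
# John
# Jane
```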
Using sed for Line Filtering
Sed excels at selecting specific lines:
# Keep only matching lines (delete non-matching)
sed -n '/pattern/p' file.txt
# Delete matching lines
sed '/pattern/d' file.txt
# Print lines 5-10
sed -n '5,10p' file.txt
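The first two forms are mirror images: `-n '/pattern/p'` keeps matching lines, while `/pattern/d` removes them. A minimal self-contained sketch (file name is arbitrary):

```shell
cat > /tmp/lines.txt <<'EOF'
keep this
drop this
keep that
EOF

sed -n '/keep/p' /tmp/lines.txt   # prints only the two "keep" lines
sed '/keep/d' /tmp/lines.txt      # prints only "drop this"
```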
Filtering with Bash While Loop
For more complex logic, use bash loops:
#!/bin/bash
# Filter lines where second field > 100
while IFS=' ' read -r name value; do
    if [ "$value" -gt 100 ]; then
        echo "$name: $value"
    fi
done < sales.txt
Output:
John: 150
Jane: 200
Alice: 180
Practical Example: Log File Analysis
#!/bin/bash
# File: analyze_logs.sh
logfile="$1"
error_type="$2"
if [ -z "$logfile" ] || [ ! -f "$logfile" ] || [ -z "$error_type" ]; then
    echo "Usage: $0 <logfile> <error_type>"
    exit 1
fi
echo "=== Filtering by: $error_type ==="
echo ""
# Show matching lines with numbers
grep -n "$error_type" "$logfile"
echo ""
echo "=== Statistics ==="
# Count occurrences
count=$(grep -c "$error_type" "$logfile")
echo "Total occurrences: $count"
# Show first occurrence
first=$(grep "$error_type" "$logfile" | head -1)
echo "First: $first"
# Show last occurrence
last=$(grep "$error_type" "$logfile" | tail -1)
echo "Last: $last"
Usage:
$ ./analyze_logs.sh app.log "ERROR"
=== Filtering by: ERROR ===
2:2026-02-21 10:30:46 ERROR: Database connection failed
4:2026-02-21 10:30:48 ERROR: Timeout reached
=== Statistics ===
Total occurrences: 2
First: 2026-02-21 10:30:46 ERROR: Database connection failed
Last: 2026-02-21 10:30:48 ERROR: Timeout reached
Filter and Transform
#!/bin/bash
# Extract error messages and format them.
# The log format is "DATE TIME LEVEL: message", so splitting on whitespace
# gives us the level in the third field and the message in the remainder.
while read -r date time level message; do
    [ "$level" = "ERROR:" ] && echo "ALERT: $message"
done < app.log
Output:
ALERT: Database connection failed
ALERT: Timeout reached
Multiple Filters (Pipe Operators)
# Chain filters together
grep "ERROR" app.log | grep -v "retry" | cut -d' ' -f4-
# Show errors that are not retries
grep "ERROR" logfile.txt | grep -v "Retry"
# Extract specific time range and error type
grep "2026-02-21" app.log | grep "ERROR" | awk '{print $1, $2, $NF}'
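Here is the first chain made runnable on its own, with the sample log recreated inline. grep narrows to ERROR lines, then cut drops the timestamp and level (fields 1-3) to leave just the message:

```shell
cat > /tmp/app.log <<'EOF'
2026-02-21 10:30:45 INFO: Application started
2026-02-21 10:30:46 ERROR: Database connection failed
2026-02-21 10:30:48 ERROR: Timeout reached
EOF

# ERROR lines, message text only (fields 4 onward)
grep "ERROR" /tmp/app.log | cut -d' ' -f4-
# Output:
# Database connection failed
# Timeout reached
```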
Filter by Regex Pattern
#!/bin/bash
# Extract lines with email addresses
grep -E '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}' file.txt
# Extract lines with IP addresses
grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' file.txt
# Extract lines with URLs
grep -E 'https?://[^[:space:]]+' file.txt
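Adding -o prints only the matched portion rather than the whole line, which is often what you actually want when harvesting addresses. A sketch with made-up sample data:

```shell
cat > /tmp/contacts.txt <<'EOF'
Contact alice@example.com for access
Server at 192.168.1.10 is primary
EOF

# Print just the email address, not the whole line
grep -oE '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}' /tmp/contacts.txt
# Output: alice@example.com

# Print just the IP address
grep -oE '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' /tmp/contacts.txt
# Output: 192.168.1.10
```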
Performance Comparison
| Tool | Speed | Flexibility | Best For |
|---|---|---|---|
| grep | Fastest | Medium | Simple pattern matching |
| awk | Medium | High | Column/field filtering |
| sed | Medium | High | Complex transformations |
| bash loop | Slowest | Highest | Complex logic |
Common Mistakes
- Not escaping regex special characters: use \ before ., *, and other metacharacters when matching them literally
- Forgetting that -v reverses the logic: it excludes matching lines instead of selecting them
- Ignoring case sensitivity: use -i for case-insensitive matching
- Mixing up field numbering: awk fields are 1-based ($1 is the first field)
- Not quoting patterns: unquoted spaces in patterns cause issues
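The escaping pitfall is easy to demonstrate: an unescaped dot matches any single character, so a literal version-number search needs \. or the -F (fixed-string) flag:

```shell
cat > /tmp/versions.txt <<'EOF'
version 1.5
version 105
EOF

grep "1.5" /tmp/versions.txt      # . is a wildcard: matches both lines
grep "1\.5" /tmp/versions.txt     # escaped dot: matches only "version 1.5"
grep -F "1.5" /tmp/versions.txt   # -F treats the pattern as a literal string
```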
Quick Reference
grep "text" file # Match lines containing text
grep -v "text" file # Lines NOT containing text
grep -n "text" file # Show line numbers
grep -c "text" file # Count matches
grep -i "text" file # Case-insensitive
grep -E "pat1|pat2" file # Extended regex (OR)
awk '/text/' file # Awk pattern matching
awk '$2 > 100' file # Awk field condition
sed -n '/text/p' file # Sed print matching
Summary
Filtering lines is fundamental in bash. Use grep for simple pattern matching, awk for field-based filtering, and sed for complex transformations. For simple tasks, grep is usually fastest. For field-based operations, awk is more elegant. Chain them together for powerful combinations.