How to Find Differences Between Files
Quick Answer: Compare Two Files for Differences
Use diff file1 file2 to see what differs between two files, or diff -u file1 file2 for the more readable unified format. Unified output marks removed lines with - and added lines with +, with a few lines of unchanged context around each change. Use diff -y if you want a side-by-side view.
Quick Comparison: File Comparison Methods
| Tool | Output | Best For | Pros |
|---|---|---|---|
| diff | Line-based diff | Text files, patches | Human-readable, standard |
| diff -u | Unified format | Readability, patches | Context, clear format |
| diff -y | Side-by-side | Visual comparison | Easy to spot changes visually |
| cmp | Byte-by-byte | Binary files, exact match | Fast for equality check |
| vimdiff | Interactive vim | Manual review | Interactive navigation and editing |
Bottom line: Use diff -u for most text file comparisons. Use cmp -s to quickly check if files differ without viewing details.
This guide covers comparing files and displaying their differences with diff, vimdiff, cmp, and custom comparison scripts.
Method 1: Basic diff Output Format
diff file1.txt file2.txt
This shows the lines that differ between the files. Lines starting with < come from file1, and lines starting with > come from file2.
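Each group of differing lines is preceded by a change command such as 2c2: line numbers from file1, a letter (a for add, c for change, d for delete), then line numbers from file2. The sketch below uses two hypothetical files, old.txt and new.txt, created in a temporary directory, to show the d and a commands. The file names and contents are illustrative only.

```shell
#!/bin/sh
# Hypothetical demo files; names and contents are made up for this example.
workdir=$(mktemp -d)
printf '%s\n' one two three > "$workdir/old.txt"
printf '%s\n' one three four > "$workdir/new.txt"

# diff exits with status 1 when the files differ, so capture it explicitly.
status=0
output=$(cd "$workdir" && diff old.txt new.txt) || status=$?

printf '%s\n' "$output"
# Prints:
# 2d1
# < two
# 3a3
# > four
echo "exit status: $status"
rm -rf "$workdir"
```

Here 2d1 means line 2 of old.txt was deleted, and 3a3 means a line was added after line 3 of old.txt, becoming line 3 of new.txt.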
When to Use Basic Diff
- Quick comparison of small files
- Understanding which lines changed
- Creating patches for manual review
- Checking if files have any differences
Detailed Example
Test files:
file1.txt:
apple
banana
cherry
date
file2.txt:
apple
blueberry
cherry
elderberry
diff file1.txt file2.txt
Output:
2c2
< banana
---
> blueberry
4c4
< date
---
> elderberry
Method 2: Diff Output Formats
Understanding diff output formats helps you choose the right one for your needs.
Default Format
# Shows line numbers and differences
diff file1 file2
Unified Format
# Shows context around changes (like git diff)
diff -u file1.txt file2.txt
# Output (the --- and +++ header lines with file names and timestamps are omitted):
# @@ -1,4 +1,4 @@
#  apple
# -banana
# +blueberry
#  cherry
# -date
# +elderberry
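Each @@ -1,4 +1,4 @@ line is a hunk header: the starting line and line count in file1, then the same pair for file2. As a rough illustration, the sketch below recreates the two test files in a temporary directory and asks diff for zero context lines with -U 0, so each change gets its own hunk. The workdir variable is just a name chosen for this example.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry date > "$workdir/file1.txt"
printf '%s\n' apple blueberry cherry elderberry > "$workdir/file2.txt"

# -U 0 requests unified output with no context lines around each change.
status=0
hunks=$(cd "$workdir" && diff -U 0 file1.txt file2.txt) || status=$?

# Skip the two "---" / "+++" header lines (they contain timestamps).
printf '%s\n' "$hunks" | tail -n +3

# Count hunks: one "@@" line per separate change.
hunk_count=$(printf '%s\n' "$hunks" | grep -c '^@@')
echo "hunks: $hunk_count"   # prints: hunks: 2
rm -rf "$workdir"
```

With the default of three context lines, the two changes are close enough to be merged into the single hunk shown above.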
Side-by-Side Format
# Shows files side by side
diff -y file1.txt file2.txt
# Output (column spacing abbreviated):
# apple         apple
# banana      | blueberry
# cherry        cherry
# date        | elderberry
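Side-by-side output defaults to a fairly wide layout (130 columns in GNU diff), which wraps badly in narrow terminals. A minimal sketch, assuming GNU diff: -W sets the total width and --suppress-common-lines hides rows that match. The files are recreated in a temporary directory so the snippet runs on its own.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry date > "$workdir/file1.txt"
printf '%s\n' apple blueberry cherry elderberry > "$workdir/file2.txt"

# -W 40 limits the output to 40 columns; --suppress-common-lines
# drops the rows where both files have the same line.
status=0
changed_rows=$(cd "$workdir" && diff -y -W 40 --suppress-common-lines \
    file1.txt file2.txt) || status=$?

# Only the banana/blueberry and date/elderberry rows remain.
printf '%s\n' "$changed_rows"
rm -rf "$workdir"
```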
Method 3: Diff Options and Flags
Different flags let you customize comparison behavior.
Common Diff Options
| Option | Purpose | Use Case |
|---|---|---|
| -u | Unified format | Most readable, shows context |
| -c | Context format | Older context format, still accepted by patch |
| -y | Side-by-side | Visual comparison, easier for humans |
| -w | Ignore whitespace | When formatting differences don’t matter |
| -i | Ignore case | Case-insensitive comparison |
| -r | Recursive (directories) | Comparing entire directory trees |
| --suppress-common-lines | Hide matching lines in -y | Focus only on differences |
When to Use Each Option
- Use -u for standard, readable output (most common choice)
- Use -y when you want side-by-side visual comparison
- Use -w when whitespace variations shouldn’t matter
- Use -i for case-insensitive comparisons
- Use -r to compare entire directories recursively
Method 4: Compare Directories Recursively
# Show line-level differences for files present in both directories, and list files that exist in only one
diff -r directory1 directory2
# Recursive with unified format
diff -ru directory1 directory2 > changes.patch
When to Compare Directories
- Comparing backup versions of your entire project
- Finding what changed in source code directories
- Validating file synchronization between systems
- Creating patches for entire project updates
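For large trees, the full recursive diff can be overwhelming. One option is to add -q so that diff only names the files that differ or exist on one side. The sketch below builds two small throwaway directories (dir_a and dir_b are made-up names) to show the idea.

```shell
#!/bin/sh
workdir=$(mktemp -d)
mkdir -p "$workdir/dir_a/sub" "$workdir/dir_b/sub"
echo "same"    > "$workdir/dir_a/common.txt"
echo "same"    > "$workdir/dir_b/common.txt"
echo "port=80" > "$workdir/dir_a/sub/app.conf"
echo "port=81" > "$workdir/dir_b/sub/app.conf"
echo "extra"   > "$workdir/dir_a/only_here.txt"

# -r recurses into subdirectories; -q reports names only, not line changes.
status=0
summary=$(cd "$workdir" && diff -rq dir_a dir_b) || status=$?

# Identical files (common.txt) are not mentioned at all.
printf '%s\n' "$summary"
rm -rf "$workdir"
```

Once the summary points at the files of interest, a plain diff -u on just those files gives the line-level detail.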
Practical Example: Configuration Audit
#!/bin/bash
# File: audit_config.sh
file1="$1"
file2="$2"
if [ ! -f "$file1" ] || [ ! -f "$file2" ]; then
    echo "Usage: $0 <file1> <file2>"
    exit 1
fi
echo "=== Configuration Comparison ==="
echo "File 1: $file1"
echo "File 2: $file2"
echo ""
# Count changed lines (every < or > line in the default output)
diff_count=$(diff "$file1" "$file2" | grep -c '^[<>]')
echo "Total changes: $diff_count"
echo ""
# Show differences in unified format
echo "=== Differences ==="
diff -u "$file1" "$file2" | head -30
Check If Files Are Identical
#!/bin/bash
file1="$1"
file2="$2"
if [ ! -f "$file1" ] || [ ! -f "$file2" ]; then
    echo "Usage: $0 <file1> <file2>"
    exit 1
fi
if diff -q "$file1" "$file2" > /dev/null; then
    echo "Files are identical"
    exit 0
else
    echo "Files differ"
    exit 1
fi
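diff also reports its result through the exit status: 0 means no differences, 1 means differences were found, and 2 means trouble (for example, a missing or unreadable file). A script that tests only for success treats any nonzero status as "differ". A sketch that separates the three cases might look like this; the function name compare_files is invented for this example.

```shell
#!/bin/sh
# compare_files is a hypothetical helper, not part of diff itself.
compare_files() {
    status=0
    diff -q "$1" "$2" > /dev/null 2>&1 || status=$?
    case $status in
        0) echo "identical" ;;
        1) echo "different" ;;
        *) echo "error" ;;   # status 2: missing or unreadable file, etc.
    esac
}

workdir=$(mktemp -d)
echo "alpha" > "$workdir/a.txt"
echo "alpha" > "$workdir/b.txt"
echo "beta"  > "$workdir/c.txt"

compare_files "$workdir/a.txt" "$workdir/b.txt"        # prints: identical
compare_files "$workdir/a.txt" "$workdir/c.txt"        # prints: different
compare_files "$workdir/a.txt" "$workdir/missing.txt"  # prints: error
rm -rf "$workdir"
```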
Compare Only Specific Lines
# Extract specific lines and compare
diff <(sed -n '5,10p' file1.txt) <(sed -n '5,10p' file2.txt)
# Compare only lines containing pattern
diff <(grep "error" file1.txt) <(grep "error" file2.txt)
Ignore Specific Differences
# Ignore whitespace differences
diff -w file1.txt file2.txt
# Ignore blank lines
diff -B file1.txt file2.txt
# Ignore case differences
diff -i file1.txt file2.txt
# Ignore all whitespace and blank-line changes
diff -w -B file1.txt file2.txt
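Each flag relaxes only one kind of difference, so they often need to be combined. The sketch below compares two throwaway files that differ in both spacing and letter case, and prints the exit status of diff for several flag combinations. The file contents and the check helper are invented for this example.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf '%s\n' 'Name = Alice' 'role=admin'  > "$workdir/a.txt"
printf '%s\n' 'name=alice'  'role = admin' > "$workdir/b.txt"

# check is a hypothetical helper: run diff with the given flags and
# report its exit status (0 = no differences reported, 1 = differences).
check() {
    status=0
    diff "$@" "$workdir/a.txt" "$workdir/b.txt" > /dev/null || status=$?
    echo "flags [$*] -> exit $status"
}

plain=$(check)        # exit 1: spacing and case both differ
ws=$(check -w)        # exit 1: case still differs
ci=$(check -i)        # exit 1: spacing still differs
both=$(check -w -i)   # exit 0: nothing left to report

printf '%s\n' "$plain" "$ws" "$ci" "$both"
rm -rf "$workdir"
```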
Compare CSV Files
#!/bin/bash
# Compare CSV files focusing on data differences
file1="$1"
file2="$2"
if [ ! -f "$file1" ] || [ ! -f "$file2" ]; then
    echo "Usage: $0 <file1.csv> <file2.csv>"
    exit 1
fi
echo "=== CSV Comparison ==="
echo ""
# Sort and compare (order-independent); tail skips the two ---/+++ header lines
echo "Data differences (order-independent):"
diff -u <(sort "$file1") <(sort "$file2") | tail -n +3 | grep '^[+-]'
echo ""
echo "Line count:"
echo "File 1: $(wc -l < "$file1") lines"
echo "File 2: $(wc -l < "$file2") lines"
Create Patch File
# Create a patch from differences
diff -u original.txt modified.txt > changes.patch
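# Apply patch (updates original.txt in place)
patch original.txt < changes.patch
# Apply a multi-file patch (for example, one made with diff -ru); -p0 uses the file paths exactly as written in the patch
patch -p0 < changes.patch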
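Before changing real files, it can help to rehearse the whole round trip. The following is a sketch under a few assumptions: the patch utility is installed (it is often a separate package from diff), it supports --dry-run and -R as GNU patch does, and the file contents are throwaway examples created in a temporary directory.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' apple banana cherry > original.txt
printf '%s\n' apple blueberry cherry > modified.txt

# diff exits 1 when it finds differences, which is expected here.
diff -u original.txt modified.txt > changes.patch || true

result="patch not installed"
if command -v patch > /dev/null 2>&1; then
    cp original.txt original.txt.bak    # keep a backup first
    # --dry-run checks that the patch applies without touching the file.
    patch --dry-run original.txt < changes.patch > /dev/null
    # Apply for real, then confirm the result matches modified.txt.
    patch original.txt < changes.patch > /dev/null
    if cmp -s original.txt modified.txt; then
        result="patched file matches modified.txt"
    fi
    # -R applies the patch in reverse, restoring the original content.
    patch -R original.txt < changes.patch > /dev/null
    if cmp -s original.txt original.txt.bak; then
        result="$result; reverted cleanly"
    fi
fi
echo "$result"
```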
Compare with Report
#!/bin/bash
# Generate difference report
file1="$1"
file2="$2"
if [ ! -f "$file1" ] || [ ! -f "$file2" ]; then
    echo "Usage: $0 <file1> <file2>"
    exit 1
fi
lines1=$(wc -l < "$file1")
lines2=$(wc -l < "$file2")
changed=$(diff "$file1" "$file2" | grep "^[<>]" | wc -l)
added=$(diff "$file1" "$file2" | grep "^>" | wc -l)
removed=$(diff "$file1" "$file2" | grep "^<" | wc -l)
echo "=== Difference Report ==="
echo ""
echo "File 1: $file1 ($lines1 lines)"
echo "File 2: $file2 ($lines2 lines)"
echo ""
echo "Changes Summary:"
echo " Total changes: $changed"
echo " Added lines: $added"
echo " Removed lines: $removed"
echo ""
echo "Details:"
diff -u "$file1" "$file2"
Binary File Comparison
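# Compare binary files byte-by-byte; reports the first differing byte and line
cmp file1.bin file2.bin
# List every differing byte (offset plus both byte values in octal)
cmp -l file1.bin file2.bin
# Show only the first entry of that list
cmp -l file1.bin file2.bin | head -1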
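To make the cmp output concrete, the sketch below writes two three-byte files that differ only in the last byte and inspects them with -s and -l. The file names a.bin and b.bin are placeholders for this example.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf 'abc' > "$workdir/a.bin"
printf 'abd' > "$workdir/b.bin"

# -s prints nothing; only the exit status matters (0 same, 1 differ).
status=0
cmp -s "$workdir/a.bin" "$workdir/b.bin" || status=$?
echo "cmp -s exit status: $status"   # prints: cmp -s exit status: 1

# -l lists each differing byte: 1-based offset, then both byte values
# in octal. Here byte 3 differs: 'c' is 143 and 'd' is 144 in octal.
byte_list=$(cmp -l "$workdir/a.bin" "$workdir/b.bin") || true
printf '%s\n' "$byte_list"
rm -rf "$workdir"
```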
Find Duplicate Files
#!/bin/bash
# Find identical files in directories
dir1="$1"
dir2="$2"
if [ ! -d "$dir1" ] || [ ! -d "$dir2" ]; then
    echo "Usage: $0 <dir1> <dir2>"
    exit 1
fi
echo "Identical files:"
echo ""
for file1 in "$dir1"/*; do
    if [ -f "$file1" ]; then
        # Named "name" rather than "basename" to avoid confusion with the command
        name=$(basename "$file1")
        file2="$dir2/$name"
        if [ -f "$file2" ] && cmp -s "$file1" "$file2"; then
            echo "✓ $name (identical)"
        fi
    fi
done
Using vimdiff for Interactive Comparison
# Open files in vim with side-by-side comparison
vimdiff file1.txt file2.txt
# Or equivalently
vim -d file1.txt file2.txt
Commands in vimdiff:
- ]c - next difference
- [c - previous difference
- :q - quit
Compare Sorted Files (Order-Independent)
# Compare ignoring order
diff <(sort file1.txt) <(sort file2.txt)
# Compare ignoring order and duplicates
diff <(sort -u file1.txt) <(sort -u file2.txt)
Find Common Lines (Intersection)
# Lines that appear in both files
comm -12 <(sort file1.txt) <(sort file2.txt)
# Lines only in file1
comm -23 <(sort file1.txt) <(sort file2.txt)
# Lines only in file2
comm -13 <(sort file1.txt) <(sort file2.txt)
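comm expects both inputs to be sorted, and its flags name the columns to hide: column 1 is lines only in the first file, column 2 is lines only in the second, and column 3 is lines in both. As a small worked example, the sketch below sorts the fruit lists into temporary files (an alternative to process substitution that also works in plain sh) and collects each set. The variable names are chosen for this example.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry date > "$workdir/file1.txt"
printf '%s\n' apple blueberry cherry elderberry > "$workdir/file2.txt"

# comm needs sorted input, so sort each file first.
sort "$workdir/file1.txt" > "$workdir/sorted1.txt"
sort "$workdir/file2.txt" > "$workdir/sorted2.txt"

both=$(comm -12 "$workdir/sorted1.txt" "$workdir/sorted2.txt")
only_first=$(comm -23 "$workdir/sorted1.txt" "$workdir/sorted2.txt")
only_second=$(comm -13 "$workdir/sorted1.txt" "$workdir/sorted2.txt")

echo "In both files:"
printf '%s\n' "$both"          # apple, cherry
echo "Only in file1:"
printf '%s\n' "$only_first"    # banana, date
echo "Only in file2:"
printf '%s\n' "$only_second"   # blueberry, elderberry
rm -rf "$workdir"
```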
Common Mistakes
- Not quoting filenames - fails with spaces in names
- Forgetting -r for directories - won’t recurse without it
- Misunderstanding output symbols - < is file1, > is file2
- Order-sensitive comparison - use sort for data files
- Large file diffs - very large files can be slow to compare
Performance Tips
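- Use diff -q to quickly check if files differ
- Use cmp -s for binary files (faster)
- For large files, compare checksums first with md5sum (most useful when the files are on different machines or a hash is already stored, since hashing also reads each file in full)
- Use --suppress-common-lines with -y to focus on changes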
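The checksum tip can be sketched as follows, assuming md5sum from GNU coreutils is available. The helper name file_hash and the sample files are made up for this example; on a single machine, cmp -s is usually the simpler check.

```shell
#!/bin/sh
# file_hash is a hypothetical helper: print only the hash, not the file name.
file_hash() {
    md5sum < "$1" | cut -d ' ' -f 1
}

workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/copy1.txt"
printf '%s\n' apple banana cherry > "$workdir/copy2.txt"
printf '%s\n' apple banana date   > "$workdir/other.txt"

hash1=$(file_hash "$workdir/copy1.txt")
hash2=$(file_hash "$workdir/copy2.txt")
hash3=$(file_hash "$workdir/other.txt")

if [ "$hash1" = "$hash2" ]; then
    echo "copy1 and copy2: checksums match"
fi
if [ "$hash1" != "$hash3" ]; then
    echo "copy1 and other: checksums differ, run diff for details"
fi
rm -rf "$workdir"
```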
Quick Reference
Common diff command patterns:
# Basic comparison
diff file1 file2
# Unified format (most readable)
diff -u file1 file2
# Side-by-side comparison
diff -y file1 file2
# Ignore whitespace
diff -w file1 file2
# Compare directories
diff -r dir1 dir2
# Create patch file
diff -u original.txt modified.txt > changes.patch
# Check if files are identical
diff -q file1 file2
Summary
The diff command is essential for comparing files. Use unified format for readability, recursive mode for directories, and various flags for specific comparison needs. For binary files, use cmp instead. Always create backups before applying patches. When comparing large files, use diff -q first to quickly check if they differ.