How to Merge Files in Bash
Quick Answer: Combine Files into One
Use cat file1.txt file2.txt file3.txt > combined.txt to merge files. For CSV files with headers, skip headers from subsequent files with tail -n +2 to prevent duplication.
Quick Comparison: File Merging Methods
| Method | Use Case | Speed | Notes |
|---|---|---|---|
| cat | Simple concatenation | Fastest | Best for files without headers |
| cat with tail | CSV/data files | Very fast | Handle headers properly |
| awk | Adding file source tracking | Fast | Track which file each line came from |
| tar/compression | Merging with archiving | Medium | Useful for backup/storage |
| sort/uniq | Merge with deduplication | Medium | Remove duplicates while merging |
Bottom line: Use cat for speed, but handle headers carefully in CSV files. Always validate merged output before deleting originals.
This guide shows how to combine multiple files into a single file using cat, awk, and while loops, and how to handle headers properly.
Method 1: Basic Merge with cat
The simplest way to merge files. Perfect for text files without headers.
When to Use Basic cat Merge
- Simple text files without structure
- Log files without headers
- Code snippets combining scripts
- Any files that can simply be appended one after another in the order you list them
cat file1.txt file2.txt file3.txt > combined.txt
This concatenates the files in the order listed and overwrites combined.txt if it already exists.
Method 2: Merge Multiple Files by Pattern
# Merge all .txt files in current directory
# (give the output a name the pattern does not match, or a rerun will try to read its own output)
cat *.txt > combined.out
# Merge all .log files
cat *.log > all_logs.out
# Merge files matching pattern
cat data_*.txt > all_data.txt
When to Use Pattern-Based Merge
- Consolidating log files from same source
- Combining data files with consistent naming
- Batch processing multiple similar files
- Processing output from multiple tools
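One hazard with pattern-based merges is that the output file can match the pattern itself on a second run (combined.txt matches *.txt). The following is a minimal sketch of one way to guard against that; the function name merge_glob is hypothetical and chosen only for illustration.

```shell
#!/bin/bash
# merge_glob (hypothetical helper): merge_glob <output> <file>...
# Skips any input that is the same file as the output, so a leftover
# output from an earlier run is never merged into itself.
merge_glob() {
    local output="$1"
    shift
    local file
    for file in "$@"; do
        # -ef is true when both names refer to the same existing file
        [[ "$file" -ef "$output" ]] && continue
        # Ignore directories and unmatched patterns
        [[ -f "$file" ]] || continue
        cat "$file"
    done > "$output"   # output is opened once for the whole loop
}

# Usage: merge_glob combined.txt *.txt
```

Because the redirect sits on the loop rather than on each cat, the output file is opened only once.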
Method 3: Merge with Source Tracking
# Show which line came from which file
awk '{print FILENAME ": " $0}' file1.txt file2.txt file3.txt > combined.txt
When to Use Source Tracking
- Debugging multi-source data integration
- Audit trail of which file contributed what
- Quality assurance of merged data
- Tracing issues back to source files
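For tracing issues back to a source file, it can also help to record the line number within each file. A small sketch using the standard awk variables FILENAME and FNR; the wrapper name tag_lines is hypothetical.

```shell
#!/bin/bash
# tag_lines (hypothetical helper): prefix each line with "file:line: "
# FILENAME is the file awk is currently reading; FNR is the line number
# within that file (it restarts at 1 for every input file).
tag_lines() {
    awk '{ print FILENAME ":" FNR ": " $0 }' "$@"
}

# Usage: tag_lines file1.txt file2.txt file3.txt > combined.txt
```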
Method 4: Merge CSV Files (Handle Headers Carefully)
When merging CSV files, keep the header only once. This is critical for data integrity.
When to Use CSV Header Handling
- Combining monthly/quarterly data exports
- Consolidating reports from multiple systems
- Batch processing data files
- Data warehousing operations
#!/bin/bash
# Merge CSV files, keep header from first file
output="merged.csv"
# Write header from first file
head -n 1 file1.csv > "$output"
# Append data from all files (skip headers)
for file in file*.csv; do
tail -n +2 "$file" >> "$output"
done
echo "Merged into: $output"
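The same header handling can be sketched as a single awk pass. This is an alternative to the head/tail loop above, not a replacement for it, and it assumes every file starts with the same one-line header. The function name merge_csv_awk is hypothetical.

```shell
#!/bin/bash
# merge_csv_awk (hypothetical helper): print the header once, then all data rows.
# NR  = line number across all inputs, so NR == 1 is the first header.
# FNR = line number within the current file, so FNR > 1 skips each file's header.
merge_csv_awk() {
    awk 'NR == 1 || FNR > 1' "$@"
}

# Usage: merge_csv_awk file*.csv > merged.csv
```

Since awk prints each record followed by a newline, a file that lacks a final newline does not get its last row glued to the next file's first row.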
Practical Example: Log Consolidation
#!/bin/bash
# File: merge_logs.sh
output_file="$1"
log_dir="${2:-.}"
if [ -z "$output_file" ]; then
echo "Usage: $0 <output_file> [directory]"
exit 1
fi
if [ ! -d "$log_dir" ]; then
echo "ERROR: Directory not found: $log_dir"
exit 1
fi
echo "Merging logs from: $log_dir"
# Concatenate all .log files and sort the lines (chronological when each line starts with a sortable timestamp)
cat "$log_dir"/*.log | sort > "$output_file"
# Count lines
line_count=$(wc -l < "$output_file")
echo "Merged $line_count lines into: $output_file"
# Show sample
echo ""
echo "First 5 lines:"
head -5 "$output_file"
Usage:
$ chmod +x merge_logs.sh
$ ./merge_logs.sh all_logs.txt /var/log
Merging logs from: /var/log
Merged 12450 lines into: all_logs.txt
First 5 lines:
2026-02-21 10:30:45 ERROR: Database error
2026-02-21 10:30:46 WARNING: High memory usage
...
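If each log file is already in chronological order on its own, sort -m can merge them without re-sorting everything. A minimal sketch, assuming every line begins with a sortable timestamp like the sample output above; the function name merge_sorted_logs is hypothetical.

```shell
#!/bin/bash
# merge_sorted_logs (hypothetical helper): merge files that are each already sorted.
# -m tells sort to merge pre-sorted inputs instead of sorting from scratch.
# LC_ALL=C forces plain byte-order comparison, which suits ISO-style timestamps.
merge_sorted_logs() {
    LC_ALL=C sort -m "$@"
}

# Usage: merge_sorted_logs app1.log app2.log > all_logs.txt
```

If an input is not actually sorted, the result is not guaranteed to be sorted either, so fall back to a plain sort when in doubt.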
Source Tracking with a while Loop
#!/bin/bash
# Keep track of source file for each line
for file in file1.txt file2.txt file3.txt; do
if [ -f "$file" ]; then
while IFS= read -r line; do
echo "$file: $line"
done < "$file"
fi
done > combined_with_source.txt
Merge Large Files Efficiently
#!/bin/bash
# Efficiently merge large files without loading into memory
output="$1"
if [ -z "$output" ]; then
echo "Usage: $0 <output_file>"
exit 1
fi
> "$output" # Clear output file
# Append each file to output
for file in *.txt; do
# Skip the output file in case its name also matches *.txt
if [ -f "$file" ] && [ ! "$file" -ef "$output" ]; then
cat "$file" >> "$output"
fi
done
echo "Merged to: $output"
Merge and Sort
# Merge and sort numerically
# (the output names deliberately do not match *.txt, so a rerun does not read its own output)
cat *.txt | sort -n > sorted_combined.out
# Merge and remove duplicates
cat *.txt | sort -u > unique_combined.out
# Merge, sort, and count occurrences
cat *.txt | sort | uniq -c > counted_combined.out
Merge with Separator
#!/bin/bash
# Add separator between files when merging
output="combined.txt"
separator="--- END OF FILE ---"
> "$output"
for file in file*.txt; do
if [ -f "$file" ]; then
cat "$file" >> "$output"
echo "$separator" >> "$output"
fi
done
echo "Merged with separators to: $output"
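A related pitfall: when a file does not end with a newline, cat joins its last line to the first line of the next file (or to the separator). One possible guard is sketched below; the function name cat_with_newlines is hypothetical.

```shell
#!/bin/bash
# cat_with_newlines (hypothetical helper): concatenate files, adding a newline
# after any file whose last byte is not already a newline.
cat_with_newlines() {
    local file
    for file in "$@"; do
        [ -f "$file" ] || continue
        cat "$file"
        # Command substitution strips a trailing newline, so the result is
        # empty when the last byte is a newline (or the file is empty).
        if [ -n "$(tail -c 1 "$file")" ]; then
            echo
        fi
    done
}

# Usage: cat_with_newlines file*.txt > combined.txt
```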
Merge CSV with Header Deduplication
#!/bin/bash
# Merge multiple CSV files
input_files=("$@")
output="merged.csv"
if [ ${#input_files[@]} -eq 0 ]; then
echo "Usage: $0 file1.csv file2.csv ..."
exit 1
fi
# Process first file completely
head -n 1 "${input_files[0]}" > "$output"
tail -n +2 "${input_files[0]}" >> "$output"
# Process remaining files (skip header)
for file in "${input_files[@]:1}"; do
if [ -f "$file" ]; then
tail -n +2 "$file" >> "$output"
fi
done
echo "Merged into: $output"
wc -l < "$output" | xargs echo "Total lines:"
Usage:
$ ./merge_csv.sh users_jan.csv users_feb.csv users_mar.csv
Merged into: merged.csv
Total lines: 301
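Skipping headers is only safe when every file has the same columns. A minimal sketch of a pre-check that compares the first line of each file; the function name check_headers is hypothetical, and the comparison is an exact string match (so differing line endings count as a mismatch).

```shell
#!/bin/bash
# check_headers (hypothetical helper): succeed only if every file has the
# same first line as the first file given.
check_headers() {
    local expected file header
    expected="$(head -n 1 "$1")"
    for file in "$@"; do
        header="$(head -n 1 "$file")"
        if [ "$header" != "$expected" ]; then
            echo "Header mismatch in: $file" >&2
            return 1
        fi
    done
}

# Usage: check_headers users_*.csv && ./merge_csv.sh users_*.csv
```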
Merge with Progress Indicator
#!/bin/bash
# Merge files with progress
output="$1"
shift
files=("$@")
> "$output"
total=${#files[@]}
count=0
for file in "${files[@]}"; do
if [ -f "$file" ]; then
cat "$file" >> "$output"
((count++))
echo "[$count/$total] Merged: $file"
else
echo "WARNING: File not found: $file"
fi
done
echo "Completed: $output"
Merge JSON Files
#!/bin/bash
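# Merge JSON arrays from multiple files
# Assumes each file is a non-empty array with "[" alone on its first line and "]" alone on its last line
output="merged.json"
tmp="${output}.tmp"
echo "[" > "$tmp"
for file in *.json; do
# Skip the output of a previous run so it is not merged into itself
[ "$file" = "$output" ] && continue
# Extract content without outer brackets (drop first and last lines)
sed '1d;$d' "$file" >> "$tmp"
echo "," >> "$tmp"
done
# Remove last comma and close array
sed '$d' "$tmp" > "$output"
echo "]" >> "$output"
rm -f "$tmp"
echo "Merged JSON into: $output"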
Merge and Compress
#!/bin/bash
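# Bundle files into one compressed archive
# (tar keeps the files separate inside the archive; it does not concatenate their contents)
output="archive.tar.gz"
tar -czf "$output" file1.txt file2.txt file3.txt
echo "Archived and compressed: $output"
echo "Size: $(du -h "$output" | cut -f1)"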
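If the goal is a single merged file that is also compressed, rather than an archive of separate files, the concatenated stream can be piped through gzip. A minimal sketch; the function name merge_and_gzip is hypothetical.

```shell
#!/bin/bash
# merge_and_gzip (hypothetical helper): merge_and_gzip <output.gz> <file>...
# Concatenates the inputs and compresses the combined stream into one .gz file.
merge_and_gzip() {
    local output="$1"
    shift
    cat "$@" | gzip -c > "$output"
}

# Usage:   merge_and_gzip merged.txt.gz file1.txt file2.txt file3.txt
# Inspect: gzip -dc merged.txt.gz | head
```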
Validate Before Merge
#!/bin/bash
# Verify files before merging
output="$1"
shift
total_lines=0
for file in "$@"; do
if [ ! -f "$file" ]; then
echo "ERROR: File not found: $file"
exit 1
fi
lines=$(wc -l < "$file")
echo " $file: $lines lines"
((total_lines += lines))
done
echo ""
echo "Total lines: $total_lines"
read -p "Merge these files? (y/n) " -r
if [[ $REPLY =~ ^[Yy]$ ]]; then
cat "$@" > "$output"
echo "Merged to: $output"
fi
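Validation after the merge matters as much as validation before it. For a plain cat merge, one possible check is a byte-for-byte comparison of the output against the concatenated inputs. This sketch does not apply to CSV merges that drop headers; the function name verify_merge is hypothetical.

```shell
#!/bin/bash
# verify_merge (hypothetical helper): verify_merge <output> <file>...
# Succeeds only if <output> is byte-identical to the inputs concatenated in order.
verify_merge() {
    local output="$1"
    shift
    # cmp -s is silent; "-" makes it read the first operand from stdin
    if cat "$@" | cmp -s - "$output"; then
        echo "OK: $output matches its inputs"
    else
        echo "MISMATCH: $output differs from the concatenated inputs" >&2
        return 1
    fi
}

# Usage: verify_merge combined.txt file1.txt file2.txt file3.txt
```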
Performance Tips
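- Use cat for simple concatenation (fastest)
- Avoid >> inside loops where you can (it reopens the output file on every iteration); redirect the whole loop once with done > output instead
- For large files, cat already streams data without loading whole files into memory; dd or tar are alternatives when you need block-size control or an archive rather than one merged file
- Keep merged files reasonably sized (under 1GB is typical)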
Common Mistakes
- Overwriting existing file - redirect to temp file, then move
- Not handling headers - skip headers in all but first file
- Order matters - list files in desired order
- Permissions - check write permissions on output directory
- Duplicate data - verify files don’t have overlapping content
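The first mistake above (overwriting an existing file) can be avoided by writing to a temporary file and moving it into place only on success. A minimal sketch; the function name safe_merge is hypothetical.

```shell
#!/bin/bash
# safe_merge (hypothetical helper): safe_merge <output> <file>...
# Writes to a temporary file next to the output and renames it only if
# cat succeeded, so a failed merge leaves any existing output untouched.
safe_merge() {
    local output="$1"
    shift
    local tmp
    tmp="$(mktemp "${output}.XXXXXX")" || return 1
    if cat "$@" > "$tmp"; then
        mv "$tmp" "$output"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Usage: safe_merge combined.txt file1.txt file2.txt file3.txt
```

Creating the temporary file in the same directory as the output keeps the final mv a rename on the same filesystem.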
Quick Reference
Common merge patterns:
# Simple merge
cat file1 file2 file3 > merged.txt
# Merge with pattern (output name must not match the pattern)
cat *.txt > combined.out
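# Merge CSV (skip headers after first)
# (tail prints "==> name <==" banners when given several files, so loop over them one at a time)
head -n 1 file1.csv > merged.csv
for f in file*.csv; do tail -n +2 "$f"; done >> merged.csv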
# Merge and sort
cat file1.txt file2.txt | sort > sorted.txt
# Merge with deduplication
cat file1.txt file2.txt | sort -u > unique.txt
Summary
Merging files is straightforward with cat for speed and simplicity. For CSV and structured data files, always handle headers carefully with tail -n +2 to skip them from subsequent files. Always validate merged results before deleting originals. Use sort -u when deduplication is needed. Create backups or test on copies first to ensure nothing is lost in the merge process.