
How to Merge Files in Bash


Quick Answer: Combine Files into One

Use cat file1.txt file2.txt file3.txt > combined.txt to merge files. For CSV files with headers, skip headers from subsequent files with tail -n +2 to prevent duplication.

Quick Comparison: File Merging Methods

Method             Use Case                      Speed      Notes
cat                Simple concatenation          Fastest    Best for files without headers
cat with tail      CSV/data files                Very fast  Handles headers properly
awk                Adding file source tracking   Fast       Track which file each line came from
tar/compression    Merging with archiving        Medium     Useful for backup/storage
sort/uniq          Merge with deduplication      Medium     Removes duplicates while merging

Bottom line: Use cat for speed, but handle headers carefully in CSV files. Always validate merged output before deleting originals.


Combine multiple files into a single file. This guide covers basic cat merges, loops for source tracking, and proper header handling for CSV data.

Method 1: Basic Merge with cat

The simplest way to merge files. Perfect for text files without headers.

When to Use Basic cat Merge

  • Simple text files without structure
  • Log files without headers
  • Code snippets combining scripts
  • Any file where order doesn’t matter
cat file1.txt file2.txt file3.txt > combined.txt

This concatenates the files in the order listed, which makes it ideal for files without headers.

Method 2: Merge Multiple Files by Pattern

# Merge all .txt files in current directory
cat *.txt > combined.txt

# Merge all .log files
cat *.log > all_logs.log

# Merge files matching pattern
cat data_*.txt > all_data.txt

When to Use Pattern-Based Merge

  • Consolidating log files from same source
  • Combining data files with consistent naming
  • Batch processing multiple similar files
  • Processing output from multiple tools
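One caveat with globs: expansion order is lexical, so data_10.txt merges before data_2.txt. A sketch using version sort to merge in natural numeric order (sort -V is a GNU coreutils extension; assumes filenames without whitespace, since xargs splits on it):

```shell
# Lexical glob order puts data_10.txt before data_2.txt.
# Sort the names numerically first, then merge in that order:
printf '%s\n' data_*.txt | sort -V | xargs cat > all_data.txt
```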

Method 3: Merge with Source Tracking

# Show which line came from which file
awk '{print FILENAME ": " $0}' file1.txt file2.txt file3.txt > combined.txt

When to Use Source Tracking

  • Debugging multi-source data integration
  • Audit trail of which file contributed what
  • Quality assurance of merged data
  • Tracing issues back to source files
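A lighter-weight alternative to the awk approach: grep with an empty pattern matches every line, and with more than one input file it prefixes each line with its filename (separated by a colon rather than the ": " the awk version emits):

```shell
# Tag every line with its source file (file1.txt:line):
grep '' file1.txt file2.txt file3.txt > combined.txt

# -H forces the filename prefix even with a single input file:
grep -H '' file1.txt > tagged.txt
```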

Method 4: Merge CSV Files (Handle Headers Carefully)

When merging CSV files, keep the header only once; duplicated header rows show up as data and corrupt downstream parsing.

When to Use CSV Header Handling

  • Combining monthly/quarterly data exports
  • Consolidating reports from multiple systems
  • Batch processing data files
  • Data warehousing operations
#!/bin/bash

# Merge CSV files, keep header from first file

output="merged.csv"

# Write header from first file
head -1 file1.csv > "$output"

# Append data from all files (skip headers)
for file in file*.csv; do
  tail -n +2 "$file" >> "$output"
done

echo "Merged into: $output"
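The same merge can be done in a single awk pass. FNR resets to 1 at each new input file while NR counts lines globally, so FNR==1 && NR!=1 matches the header line of every file except the first (a sketch, assuming the same file*.csv naming):

```shell
# Keep the header from the first file only, then all data rows:
awk 'FNR==1 && NR!=1 {next} 1' file*.csv > merged.csv
```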

Practical Example: Log Consolidation

#!/bin/bash

# File: merge_logs.sh

output_file="$1"
log_dir="${2:-.}"

if [ -z "$output_file" ]; then
  echo "Usage: $0 <output_file> [directory]"
  exit 1
fi

if [ ! -d "$log_dir" ]; then
  echo "ERROR: Directory not found: $log_dir"
  exit 1
fi

echo "Merging logs from: $log_dir"

# Find all log files and merge with timestamps
cat "$log_dir"/*.log | sort > "$output_file"

# Count lines
line_count=$(wc -l < "$output_file")
echo "Merged $line_count lines into: $output_file"

# Show sample
echo ""
echo "First 5 lines:"
head -5 "$output_file"

Usage:

$ chmod +x merge_logs.sh
$ ./merge_logs.sh all_logs.txt /var/log
Merging logs from: /var/log
Merged 12450 lines into: all_logs.txt

First 5 lines:
2026-02-21 10:30:45 ERROR: Database error
2026-02-21 10:30:46 WARNING: High memory usage
...

Merge with Source Tracking

#!/bin/bash

# Keep track of source file for each line

for file in file1.txt file2.txt file3.txt; do
  if [ -f "$file" ]; then
    while IFS= read -r line; do
      echo "$file: $line"
    done < "$file"
  fi
done > combined_with_source.txt

Merge Large Files Efficiently

#!/bin/bash

# Efficiently merge large files without loading into memory

output="$1"

if [ -z "$output" ]; then
  echo "Usage: $0 <output_file>"
  exit 1
fi

> "$output"  # Clear output file

# Append each file; choose an output name that does not match
# *.txt, or cat will read the partially written output back in
for file in *.txt; do
  if [ -f "$file" ]; then
    cat "$file" >> "$output"
  fi
done

echo "Merged to: $output"

Merge and Sort

# Merge and sort all numbers
cat *.txt | sort -n > sorted_combined.txt

# Merge and remove duplicates
cat *.txt | sort -u > unique_combined.txt

# Merge, sort, and count occurrences
cat *.txt | sort | uniq -c > counted_combined.txt
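If the inputs are each already sorted, sort -m (merge) combines them in one linear pass instead of re-sorting everything, and -mu deduplicates at the same time. A sketch assuming pre-sorted input files:

```shell
# Linear-time merge of files that are each already sorted:
sort -m sorted1.txt sorted2.txt > merged_sorted.txt

# Merge pre-sorted files and drop duplicates in one pass:
sort -mu sorted1.txt sorted2.txt > unique_sorted.txt
```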

Merge with Separator

#!/bin/bash

# Add separator between files when merging

output="combined.txt"
separator="--- END OF FILE ---"

> "$output"

for file in file*.txt; do
  if [ -f "$file" ]; then
    cat "$file" >> "$output"
    echo "$separator" >> "$output"
  fi
done

echo "Merged with separators to: $output"
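If a header naming each file is more useful than a trailing separator, awk can emit a banner at the start of every input, since FNR resets to 1 at each new file (a sketch with example filenames):

```shell
# Print a "--- filename ---" banner before each file's content:
awk 'FNR==1 {print "--- " FILENAME " ---"} 1' file*.txt > combined.txt
```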

Merge CSV with Header Deduplication

#!/bin/bash

# Merge multiple CSV files

input_files=("$@")
output="merged.csv"

if [ ${#input_files[@]} -eq 0 ]; then
  echo "Usage: $0 file1.csv file2.csv ..."
  exit 1
fi

# Process first file completely
head -n 1 "${input_files[0]}" > "$output"
tail -n +2 "${input_files[0]}" >> "$output"

# Process remaining files (skip header)
for file in "${input_files[@]:1}"; do
  if [ -f "$file" ]; then
    tail -n +2 "$file" >> "$output"
  fi
done

echo "Merged into: $output"
wc -l < "$output" | xargs echo "Total lines:"

Usage:

$ ./merge_csv.sh users_jan.csv users_feb.csv users_mar.csv
Merged into: merged.csv
Total lines: 301

Merge with Progress Indicator

#!/bin/bash

# Merge files with progress

output="$1"
shift
files=("$@")

> "$output"

total=${#files[@]}
count=0

for file in "${files[@]}"; do
  if [ -f "$file" ]; then
    cat "$file" >> "$output"
    ((count++))
    echo "[$count/$total] Merged: $file"
  else
    echo "WARNING: File not found: $file"
  fi
done

echo "Completed: $output"

Merge JSON Files

#!/bin/bash

# Merge JSON arrays from multiple files

output="merged.json"

echo "[" > "$output"

for file in *.json; do
  [ "$file" = "$output" ] && continue  # *.json matches the output too; skip it
  # Strip the first and last lines to drop the outer brackets
  # (assumes "[" and "]" sit on their own lines; head -n -1 is GNU)
  tail -n +2 "$file" | head -n -1 >> "$output"
  echo "," >> "$output"
done

# Remove last comma and close array
head -n -1 "$output" > "${output}.tmp"
echo "]" >> "${output}.tmp"
mv "${output}.tmp" "$output"

echo "Merged JSON into: $output"
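The line-based script above only works when each file's brackets sit on their own lines. If jq is available, it parses the JSON properly regardless of formatting: -s slurps all inputs into one array, and add concatenates them (assumes every input file is itself a JSON array):

```shell
# Robust alternative: concatenate JSON arrays with jq
jq -s 'add' file1.json file2.json > merged.json
```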

Merge and Compress

#!/bin/bash

# Merge files and compress

output="archive.tar.gz"

tar -czf "$output" file1.txt file2.txt file3.txt

echo "Merged and compressed: $output"
echo "Size: $(du -h "$output" | cut -f1)"

Validate Before Merge

#!/bin/bash

# Verify files before merging

output="$1"
shift

total_lines=0

for file in "$@"; do
  if [ ! -f "$file" ]; then
    echo "ERROR: File not found: $file"
    exit 1
  fi

  lines=$(wc -l < "$file")
  echo "  $file: $lines lines"
  ((total_lines += lines))
done

echo ""
echo "Total lines: $total_lines"
read -p "Merge these files? (y/n) " -r
if [[ $REPLY =~ ^[Yy]$ ]]; then
  cat "$@" > "$output"
  echo "Merged to: $output"
fi

Performance Tips

  • Use cat for simple concatenation (fastest)
  • Avoid using >> inside loops (reopens the output each iteration); redirect the whole loop once instead
  • cat streams data without loading whole files into memory, so it handles large files well
  • Keep merged files reasonably sized (under 1GB is typical)
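The tip about >> inside loops is easy to demonstrate: redirecting the whole loop opens the output file once, instead of once per iteration:

```shell
# Reopens combined.out on every iteration:
for f in *.log; do cat "$f" >> combined.out; done

# Opens combined.out exactly once for the whole loop:
for f in *.log; do cat "$f"; done > combined.out
```

Naming the output .out rather than .log also keeps it out of the glob, so cat never reads its own partial output back in.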

Common Mistakes

  1. Overwriting existing file - redirect to temp file, then move
  2. Not handling headers - skip headers in all but first file
  3. Order matters - list files in desired order
  4. Permissions - check write permissions on output directory
  5. Duplicate data - verify files don’t have overlapping content
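Mistake #1 can be sketched in one line: write to a temporary name, then move it into place only if the merge succeeded. On the same filesystem mv replaces the target atomically, so readers never see a half-written file:

```shell
# Merge into a temp file, then atomically replace the target:
cat file1.txt file2.txt > merged.txt.tmp && mv merged.txt.tmp merged.txt
```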

Quick Reference

Common merge patterns:

# Simple merge
cat file1 file2 file3 > merged.txt

# Merge with pattern
cat *.txt > combined.txt

# Merge CSV (skip headers after first; -q stops tail from printing
# "==> file <==" banners when given multiple files)
head -1 file1.csv > merged.csv
tail -q -n +2 file*.csv >> merged.csv

# Merge and sort
cat file1.txt file2.txt | sort > sorted.txt

# Merge with deduplication
cat file1.txt file2.txt | sort -u > unique.txt

Summary

Merging files is straightforward with cat for speed and simplicity. For CSV and structured data files, always handle headers carefully with tail -n +2 to skip them from subsequent files. Always validate merged results before deleting originals. Use sort -u when deduplication is needed. Create backups or test on copies first to ensure nothing is lost in the merge process.