
How to Merge Files in Bash


Quick Answer: Combine Files into One

Use cat file1.txt file2.txt file3.txt > combined.txt to merge files. For CSV files with headers, skip headers from subsequent files with tail -n +2 to prevent duplication.

Quick Comparison: File Merging Methods

Method             Use Case                      Speed      Notes
cat                Simple concatenation          Fastest    Best for files without headers
cat with tail      CSV/data files                Very fast  Handles headers properly
awk                Adding file source tracking   Fast       Track which file each line came from
tar/compression    Merging with archiving        Medium     Useful for backup/storage
sort/uniq          Merge with deduplication      Medium     Removes duplicates while merging

Bottom line: Use cat for speed, but handle headers carefully in CSV files. Always validate merged output before deleting originals.


Combine multiple files into a single file. This guide covers basic cat merges, loops for source tracking, and proper header handling for CSV data.

Method 1: Basic Merge with cat

The simplest way to merge files. Perfect for text files without headers.

When to Use Basic cat Merge

  • Simple text files without structure
  • Log files without headers
  • Code snippets combining scripts
  • Any file where order doesn’t matter
cat file1.txt file2.txt file3.txt > combined.txt

This concatenates the files in the order listed, which makes it ideal for files without headers.

Method 2: Merge Multiple Files by Pattern

# Merge all .txt files in current directory
cat *.txt > combined.txt

# Merge all .log files
cat *.log > all_logs.log

# Merge files matching pattern
cat data_*.txt > all_data.txt

When to Use Pattern-Based Merge

  • Consolidating log files from same source
  • Combining data files with consistent naming
  • Batch processing multiple similar files
  • Processing output from multiple tools
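One caveat with globs: expansion order is lexical, so data_10.txt merges before data_2.txt. A sketch using version sort to merge in natural numeric order (sort -V is a GNU coreutils extension; assumes filenames without whitespace, since xargs splits on it):

```shell
# Lexical glob order puts data_10.txt before data_2.txt.
# Sort the names numerically first, then merge in that order:
printf '%s\n' data_*.txt | sort -V | xargs cat > all_data.txt
```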

Method 3: Merge with Source Tracking

# Show which line came from which file
awk '{print FILENAME ": " $0}' file1.txt file2.txt file3.txt > combined.txt

When to Use Source Tracking

  • Debugging multi-source data integration
  • Audit trail of which file contributed what
  • Quality assurance of merged data
  • Tracing issues back to source files
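A lighter-weight alternative to the awk approach: grep with an empty pattern matches every line, and with more than one input file it prefixes each line with its filename (separated by a colon rather than the ": " the awk version emits):

```shell
# Tag every line with its source file (file1.txt:line):
grep '' file1.txt file2.txt file3.txt > combined.txt

# -H forces the filename prefix even with a single input file:
grep -H '' file1.txt > tagged.txt
```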

Method 4: Merge CSV Files (Handle Headers Carefully)

When merging CSV files, keep the header only once; duplicated header rows show up as data and corrupt downstream parsing.

When to Use CSV Header Handling

  • Combining monthly/quarterly data exports
  • Consolidating reports from multiple systems
  • Batch processing data files
  • Data warehousing operations
#!/bin/bash

# Merge CSV files, keep header from first file

output="merged.csv"

# Write header from first file
head -1 file1.csv > "$output"

# Append data from all files (skip headers)
for file in file*.csv; do
  tail -n +2 "$file" >> "$output"
done

echo "Merged into: $output"
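The same merge can be done in a single awk pass. FNR resets to 1 at each new input file while NR counts lines globally, so FNR==1 && NR!=1 matches the header line of every file except the first (a sketch, assuming the same file*.csv naming):

```shell
# Keep the header from the first file only, then all data rows:
awk 'FNR==1 && NR!=1 {next} 1' file*.csv > merged.csv
```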

Practical Example: Log Consolidation

#!/bin/bash

# File: merge_logs.sh

output_file="$1"
log_dir="${2:-.}"

if [ -z "$output_file" ]; then
  echo "Usage: $0 <output_file> [directory]"
  exit 1
fi

if [ ! -d "$log_dir" ]; then
  echo "ERROR: Directory not found: $log_dir"
  exit 1
fi

echo "Merging logs from: $log_dir"

# Find all log files and merge with timestamps
cat "$log_dir"/*.log | sort > "$output_file"

# Count lines
line_count=$(wc -l < "$output_file")
echo "Merged $line_count lines into: $output_file"

# Show sample
echo ""
echo "First 5 lines:"
head -5 "$output_file"

Usage:

$ chmod +x merge_logs.sh
$ ./merge_logs.sh all_logs.txt /var/log
Merging logs from: /var/log
Merged 12450 lines into: all_logs.txt

First 5 lines:
2026-02-21 10:30:45 ERROR: Database error
2026-02-21 10:30:46 WARNING: High memory usage
...

Merge with Source Tracking

#!/bin/bash

# Keep track of source file for each line

for file in file1.txt file2.txt file3.txt; do
  if [ -f "$file" ]; then
    while IFS= read -r line; do
      echo "$file: $line"
    done < "$file"
  fi
done > combined_with_source.txt

Merge Large Files Efficiently

#!/bin/bash

# Efficiently merge large files without loading into memory

output="$1"

if [ -z "$output" ]; then
  echo "Usage: $0 <output_file>"
  exit 1
fi

> "$output"  # Clear output file

# Append each file; choose an output name that does not match
# *.txt, or cat will read the partially written output back in
for file in *.txt; do
  if [ -f "$file" ]; then
    cat "$file" >> "$output"
  fi
done

echo "Merged to: $output"

Merge and Sort

# Merge and sort all numbers
cat *.txt | sort -n > sorted_combined.txt

# Merge and remove duplicates
cat *.txt | sort -u > unique_combined.txt

# Merge, sort, and count occurrences
cat *.txt | sort | uniq -c > counted_combined.txt
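If the inputs are each already sorted, sort -m (merge) combines them in one linear pass instead of re-sorting everything, and -mu deduplicates at the same time. A sketch assuming pre-sorted input files:

```shell
# Linear-time merge of files that are each already sorted:
sort -m sorted1.txt sorted2.txt > merged_sorted.txt

# Merge pre-sorted files and drop duplicates in one pass:
sort -mu sorted1.txt sorted2.txt > unique_sorted.txt
```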

Merge with Separator

#!/bin/bash

# Add separator between files when merging

output="combined.txt"
separator="--- END OF FILE ---"

> "$output"

for file in file*.txt; do
  if [ -f "$file" ]; then
    cat "$file" >> "$output"
    echo "$separator" >> "$output"
  fi
done

echo "Merged with separators to: $output"
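If a header naming each file is more useful than a trailing separator, awk can emit a banner at the start of every input, since FNR resets to 1 at each new file (a sketch with example filenames):

```shell
# Print a "--- filename ---" banner before each file's content:
awk 'FNR==1 {print "--- " FILENAME " ---"} 1' file*.txt > combined.txt
```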

Merge CSV with Header Deduplication

#!/bin/bash

# Merge multiple CSV files

input_files=("$@")
output="merged.csv"

if [ ${#input_files[@]} -eq 0 ]; then
  echo "Usage: $0 file1.csv file2.csv ..."
  exit 1
fi

# Process first file completely
head -n 1 "${input_files[0]}" > "$output"
tail -n +2 "${input_files[0]}" >> "$output"

# Process remaining files (skip header)
for file in "${input_files[@]:1}"; do
  if [ -f "$file" ]; then
    tail -n +2 "$file" >> "$output"
  fi
done

echo "Merged into: $output"
wc -l < "$output" | xargs echo "Total lines:"

Usage:

$ ./merge_csv.sh users_jan.csv users_feb.csv users_mar.csv
Merged into: merged.csv
Total lines: 301

Merge with Progress Indicator

#!/bin/bash

# Merge files with progress

output="$1"
shift
files=("$@")

> "$output"

total=${#files[@]}
count=0

for file in "${files[@]}"; do
  if [ -f "$file" ]; then
    cat "$file" >> "$output"
    ((count++))
    echo "[$count/$total] Merged: $file"
  else
    echo "WARNING: File not found: $file"
  fi
done

echo "Completed: $output"

Merge JSON Files

#!/bin/bash

# Merge JSON arrays from multiple files

output="merged.json"

echo "[" > "$output"

for file in *.json; do
  [ "$file" = "$output" ] && continue  # *.json matches the output too; skip it
  # Strip the first and last lines to drop the outer brackets
  # (assumes "[" and "]" sit on their own lines; head -n -1 is GNU)
  tail -n +2 "$file" | head -n -1 >> "$output"
  echo "," >> "$output"
done

# Remove last comma and close array
head -n -1 "$output" > "${output}.tmp"
echo "]" >> "${output}.tmp"
mv "${output}.tmp" "$output"

echo "Merged JSON into: $output"
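The line-based script above only works when each file's brackets sit on their own lines. If jq is available, it parses the JSON properly regardless of formatting: -s slurps all inputs into one array, and add concatenates them (assumes every input file is itself a JSON array):

```shell
# Robust alternative: concatenate JSON arrays with jq
jq -s 'add' file1.json file2.json > merged.json
```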

Merge and Compress

#!/bin/bash

# Merge files and compress

output="archive.tar.gz"

tar -czf "$output" file1.txt file2.txt file3.txt

echo "Merged and compressed: $output"
echo "Size: $(du -h "$output" | cut -f1)"

Validate Before Merge

#!/bin/bash

# Verify files before merging

output="$1"
shift

total_lines=0

for file in "$@"; do
  if [ ! -f "$file" ]; then
    echo "ERROR: File not found: $file"
    exit 1
  fi

  lines=$(wc -l < "$file")
  echo "  $file: $lines lines"
  ((total_lines += lines))
done

echo ""
echo "Total lines: $total_lines"
read -p "Merge these files? (y/n) " -r
if [[ $REPLY =~ ^[Yy]$ ]]; then
  cat "$@" > "$output"
  echo "Merged to: $output"
fi

Performance Tips

  • Use cat for simple concatenation (fastest)
  • Avoid using >> inside loops (reopens the output each iteration); redirect the whole loop once instead
  • cat streams data without loading whole files into memory, so it handles large files well
  • Keep merged files reasonably sized (under 1GB is typical)
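The tip about >> inside loops is easy to demonstrate: redirecting the whole loop opens the output file once, instead of once per iteration:

```shell
# Reopens combined.out on every iteration:
for f in *.log; do cat "$f" >> combined.out; done

# Opens combined.out exactly once for the whole loop:
for f in *.log; do cat "$f"; done > combined.out
```

Naming the output .out rather than .log also keeps it out of the glob, so cat never reads its own partial output back in.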

Common Mistakes

  1. Overwriting existing file - redirect to temp file, then move
  2. Not handling headers - skip headers in all but first file
  3. Order matters - list files in desired order
  4. Permissions - check write permissions on output directory
  5. Duplicate data - verify files don’t have overlapping content
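Mistake #1 can be sketched in one line: write to a temporary name, then move it into place only if the merge succeeded. On the same filesystem mv replaces the target atomically, so readers never see a half-written file:

```shell
# Merge into a temp file, then atomically replace the target:
cat file1.txt file2.txt > merged.txt.tmp && mv merged.txt.tmp merged.txt
```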

Quick Reference

Common merge patterns:

# Simple merge
cat file1 file2 file3 > merged.txt

# Merge with pattern
cat *.txt > combined.txt

# Merge CSV (skip headers after first; -q stops tail from printing
# "==> file <==" banners when given multiple files)
head -1 file1.csv > merged.csv
tail -q -n +2 file*.csv >> merged.csv

# Merge and sort
cat file1.txt file2.txt | sort > sorted.txt

# Merge with deduplication
cat file1.txt file2.txt | sort -u > unique.txt

Summary

Merging files is straightforward with cat for speed and simplicity. For CSV and structured data files, always handle headers carefully with tail -n +2 to skip them from subsequent files. Always validate merged results before deleting originals. Use sort -u when deduplication is needed. Create backups or test on copies first to ensure nothing is lost in the merge process.