Bash Regular Expressions - Pattern Matching and Validation

Quick Answer: How Do You Use Regular Expressions in Bash?

Use the =~ operator with double brackets to test if a string matches a regex pattern: [[ $text =~ pattern ]]. Common patterns: [0-9]+ (numbers), [a-z]+ (letters), ^ (start), $ (end), .* (any characters).

Quick Comparison: Regex Methods

Method | Use Case | Power | Complexity
Glob patterns [[ $var == pattern ]] | Simple matching, no regex | Limited | Very simple
Regex =~ [[ $var =~ regex ]] | Complex patterns, validation | High | Moderate
grep patterns | Finding in files | High | Moderate
sed patterns | Text replacement | High | Moderate

Bottom line: Use =~ for pattern matching in Bash, grep/sed for file operations.


Regular expressions enable powerful pattern matching and validation. This guide covers Bash regex syntax, the =~ matching operator, and practical patterns for input validation and data extraction.

Table of Contents

  1. Regex Basics
  2. Character Classes
  3. Quantifiers
  4. Anchors
  5. Matching Operators
  6. Practical Examples
  7. Validation Patterns
  8. Best Practices
  9. Frequently Asked Questions

Regex Basics

The =~ operator tests if a string matches a regex pattern. It works only inside double brackets ([[ ]]); single brackets ([ ]) do not support it.

Simple Pattern

The simplest regex is just a literal string. This matches if the string is found anywhere in the text.

text="hello world"

if [[ $text =~ world ]]; then
  echo "Match found"
fi

This outputs “Match found” because “world” is in the text. For pattern matching, you use special characters (like ., *, +) which we’ll cover next.

Literal Characters

Some characters have special meaning in regex. To match them literally, escape them with a backslash.

# Match exact characters
text="file.txt"
[[ $text =~ \.txt ]]        # Match .txt (escaped dot)

The \. matches a literal dot. Without the backslash, . would match any character, not just a dot. This is crucial for matching filenames and paths.
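To make the difference concrete, here is a minimal sketch that runs both forms against the same inputs. The helper names matches_escaped_dot and matches_unescaped_dot are hypothetical and exist only for this comparison.

```bash
# Hypothetical helpers for comparison; the names are illustrative only.
matches_escaped_dot()   { [[ $1 =~ \.txt ]]; }   # dot must be a literal "."
matches_unescaped_dot() { [[ $1 =~ .txt ]]; }    # dot matches any one character

for name in "file.txt" "file_txt"; do
  if matches_escaped_dot "$name"; then
    echo "$name: escaped pattern matches"
  else
    echo "$name: escaped pattern does not match"
  fi
  if matches_unescaped_dot "$name"; then
    echo "$name: unescaped pattern matches"
  fi
done
```

The unescaped pattern accepts "file_txt" because the dot is satisfied by the underscore, which is rarely what a filename check intends.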


Character Classes

Character classes let you match groups of characters without listing each one individually. This is essential for patterns like “any digit” or “any lowercase letter”.

Common Classes

Class | Matches | Example
. | Any character | a.c matches “abc”, “adc”, “a c”
[abc] | a, b, or c | [abc] matches single ‘a’, ‘b’, or ‘c’
[^abc] | NOT a, b, or c | Matches anything except those letters
[a-z] | a through z | Matches any lowercase letter
[0-9] | Any digit | Matches single digit 0-9
\w | Word character (letters, digits, _) | Matches alphanumeric or underscore (GNU extension, not POSIX ERE; [[:alnum:]_] is the portable form)
\s | Whitespace | Matches space, tab, newline (GNU extension, not POSIX ERE; [[:space:]] is the portable form)

Examples

These patterns match specific types of content:

# Match digit
[[ "abc123" =~ [0-9] ]] && echo "Contains digit"

# Match an identifier made only of lowercase letters and underscores
[[ "hello_world" =~ ^[a-z_]+$ ]] && echo "Valid identifier"

# Match uppercase letter
[[ "Hello" =~ [A-Z] ]] && echo "Contains uppercase"
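POSIX bracket classes such as [[:alpha:]], [[:alnum:]], and [[:space:]] are part of ERE and can be written inside a bracket expression. The sketch below uses them for two common checks; the function names is_identifier and has_whitespace are illustrative assumptions, not part of Bash.

```bash
# Hypothetical helpers; names are illustrative.
is_identifier() {
  # First character: a letter or underscore.
  # Remainder: letters, digits, or underscores.
  [[ $1 =~ ^[[:alpha:]_][[:alnum:]_]*$ ]]
}

has_whitespace() {
  # [[:space:]] covers space, tab, and newline, among others.
  [[ $1 =~ [[:space:]] ]]
}

is_identifier "hello_world2" && echo "hello_world2 is a valid identifier"
is_identifier "2fast" || echo "2fast is not a valid identifier"
has_whitespace "two words" && echo "'two words' contains whitespace"
```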

Quantifiers

Quantifiers control how many times a pattern repeats. They’re what make regex powerful—without them, you’d have to match each occurrence individually.

Quantifier | Meaning | Example
* | 0 or more | a*b matches “b”, “ab”, “aab”, etc.
+ | 1 or more | a+b matches “ab”, “aab”, but not “b”
? | 0 or 1 | a?b matches “b” or “ab” (optional)
{n} | Exactly n | a{3} matches exactly “aaa”
{n,} | n or more | a{3,} matches “aaa”, “aaaa”, etc.
{n,m} | Between n and m | a{2,4} matches “aa”, “aaa”, or “aaaa”

Examples

These patterns match repeated sequences:

# One or more digits (entire string must be digits)
[[ "123" =~ ^[0-9]+$ ]] && echo "All digits"

# Optional hyphen (useful for phone number flexibility)
[[ "555-1234" =~ ^[0-9]{3}-?[0-9]{4}$ ]] && echo "Phone format"

# Zero or more spaces (flexible spacing)
[[ "hello  world" =~ hello\ *world ]] && echo "Match with spaces"
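A quantifier can also apply to a parenthesized group, which makes a whole sub-pattern optional or repeated. As a rough sketch, the hypothetical is_us_zip function below accepts a five-digit ZIP code with an optional four-digit suffix; the format is assumed for illustration only.

```bash
# Hypothetical validator; the ZIP format here is an illustrative assumption.
is_us_zip() {
  # {5} requires exactly five digits.
  # (-[0-9]{4})? makes the hyphen plus four digits optional as one unit.
  [[ $1 =~ ^[0-9]{5}(-[0-9]{4})?$ ]]
}

for zip in "12345" "12345-6789" "1234" "12345-"; do
  if is_us_zip "$zip"; then
    echo "$zip: valid"
  else
    echo "$zip: invalid"
  fi
done
```

Grouping matters here: with -?[0-9]{0,4} instead, a dangling hyphen such as "12345-" would be accepted.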

Anchors

Anchors position patterns at specific locations—the start, end, or boundaries of strings. They don’t match characters themselves; they match positions.

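Anchor | Meaning | Example
^ | Start of string | ^hello matches “hello” only at the beginning
$ | End of string | world$ matches “world” only at the end
\b | Word boundary | \bword\b matches “word” as a complete word (GNU extension, not POSIX ERE; support depends on the system regex library)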

Examples

These patterns match at specific locations:

# Match at start (ensure string begins with expected text)
[[ "hello world" =~ ^hello ]] && echo "Starts with hello"

# Match at end (check file extensions, endings)
[[ "hello world" =~ world$ ]] && echo "Ends with world"

# Exact match
[[ "hello" =~ ^hello$ ]] && echo "Exact match"

When to Use Anchors

Use anchors when:

  • Validating complete input (email, username)
  • Checking file extensions
  • Ensuring patterns are at boundaries
  • Avoiding partial matches
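The cases in this list can be sketched with three small helpers. The names contains_digits, is_all_digits, and is_tarball are hypothetical; they only contrast an unanchored pattern with anchored ones.

```bash
# Hypothetical helpers; names are illustrative.
contains_digits() { [[ $1 =~ [0-9]+ ]]; }        # unanchored: digits anywhere
is_all_digits()   { [[ $1 =~ ^[0-9]+$ ]]; }      # anchored: digits only
is_tarball()      { [[ $1 =~ \.tar\.gz$ ]]; }    # anchored at the end only

input="123abc"
if contains_digits "$input"; then
  echo "$input contains digits"
fi
if ! is_all_digits "$input"; then
  echo "$input is not purely numeric"
fi

for file in "backup.tar.gz" "backup.tar.gz.bak"; do
  if is_tarball "$file"; then
    echo "$file: tarball"
  else
    echo "$file: not a tarball"
  fi
done
```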

Matching Operators

Basic Matching ([[ =~ ]])

The =~ operator tests if a string matches a regex. It returns 0 (success/true) if the string matches, 1 (failure/false) if it doesn’t, and 2 if the pattern is not a valid regular expression.

text="hello123"

if [[ $text =~ [0-9]+ ]]; then
  echo "Contains numbers"
fi

This checks if the text contains any numbers. The [0-9]+ pattern means “one or more digits.”
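Because =~ reports its result through the exit status, a script can tell a non-match apart from a malformed pattern. The following is a minimal sketch; match_status is a hypothetical helper that prints the status instead of branching on it.

```bash
# Hypothetical helper: print the exit status of a regex test.
match_status() {
  local text="$1" pattern="$2" status=0
  # Capture the status without letting a non-match abort under "set -e".
  [[ $text =~ $pattern ]] || status=$?
  echo "$status"
}

match_status "hello123" '[0-9]+'   # match
match_status "hello"    '[0-9]+'   # no match
match_status "hello"    '[0-9'     # unclosed bracket: invalid pattern
```

The three calls print 0, 1, and 2 respectively. Treating every non-zero status as "no match" would silently hide a typo in the pattern.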

Capture Groups

Extract parts of matched text using parentheses. The captured groups are stored in the BASH_REMATCH array.

text="Name: John"

if [[ $text =~ Name:\ ([A-Za-z]+) ]]; then
  echo "Name is ${BASH_REMATCH[1]}"  # Output: John (captured name)
fi

The parentheses create a capture group. BASH_REMATCH[0] is the entire match, BASH_REMATCH[1] is the first group, etc. This is how you extract data from strings.
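With several groups, each one lands in the next index of BASH_REMATCH. As an illustration, the hypothetical parse_version function below splits a three-part version string; the MAJOR.MINOR.PATCH format is an assumption made for the example.

```bash
# Hypothetical parser for strings shaped like MAJOR.MINOR.PATCH.
parse_version() {
  local version="$1"
  if [[ $version =~ ^([0-9]+)\.([0-9]+)\.([0-9]+)$ ]]; then
    # Index 0 holds the whole match; groups start at index 1.
    echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]} ${BASH_REMATCH[3]}"
  else
    return 1
  fi
}

read -r major minor patch < <(parse_version "2.14.3")
echo "major=$major minor=$minor patch=$patch"
```

BASH_REMATCH is overwritten by every successful =~ test, so copy the values you need before running another match.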

Case-Insensitive Matching (Bash 4+)

For case-insensitive matching, use the nocasematch option:

shopt -s nocasematch     # Enable case-insensitive

if [[ "HELLO" =~ hello ]]; then
  echo "Case-insensitive match"
fi

shopt -u nocasematch     # Disable case-insensitive

This is useful for user input validation where you don’t want to care about the user’s capitalization.
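Since nocasematch is a shell-wide setting, forgetting to turn it off changes every later comparison. One possible way to contain it is a wrapper that restores the previous state; matches_ignoring_case is a hypothetical name, and this is a sketch rather than the only approach.

```bash
# Hypothetical wrapper: case-insensitive match that leaves the option as it found it.
matches_ignoring_case() {
  local text="$1" pattern="$2" was_enabled=0 status=0
  # Remember whether the caller already had the option on.
  if shopt -q nocasematch; then
    was_enabled=1
  fi
  shopt -s nocasematch
  [[ $text =~ $pattern ]] || status=$?
  # Only switch the option off if this function switched it on.
  if (( was_enabled == 0 )); then
    shopt -u nocasematch
  fi
  return "$status"
}

if matches_ignoring_case "HELLO" '^hello$'; then
  echo "Matched regardless of case"
fi
```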

When to Use Regex Matching

Use =~ with regex when:

  • Validating user input (email, phone, URL)
  • Pattern matching complex strings
  • Extracting data with capture groups
  • You need more power than glob patterns

Use glob patterns [[ == *pattern* ]] when:

  • Simple substring checking
  • Speed is critical (glob is slightly faster)
  • The pattern doesn’t need regex features
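The trade-off can be seen by checking the same filename both ways. In this sketch the report_YYYY.csv naming scheme and the helper names is_report_glob and report_year are assumptions for illustration.

```bash
# Hypothetical helpers; the filename scheme is assumed for the example.
is_report_glob() {
  # Glob: * matches any run of characters; the whole string must match.
  [[ $1 == report_*.csv ]]
}

report_year() {
  # Regex: restricts the middle part to four digits and captures it.
  if [[ $1 =~ ^report_([0-9]{4})\.csv$ ]]; then
    echo "${BASH_REMATCH[1]}"
  else
    return 1
  fi
}

for file in "report_2024.csv" "report_draft.csv"; do
  if is_report_glob "$file"; then
    echo "$file: matches the glob"
  fi
  if year=$(report_year "$file"); then
    echo "$file: year is $year"
  else
    echo "$file: no four-digit year"
  fi
done
```

The glob is enough to recognize the file family; the regex is needed once the middle part has to be constrained or extracted.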

Quick Reference

# Basic matching
[[ $text =~ pattern ]]              # Returns true if matches

# Character classes
[[ $text =~ [0-9] ]]                # Contains digit
[[ $text =~ [a-z]+ ]]               # Contains lowercase letters
[[ $text =~ [A-Z] ]]                # Contains uppercase letter

# Quantifiers
[[ $text =~ ^[0-9]+$ ]]             # Entire string is digits
[[ $text =~ [0-9]{3}-[0-9]{4} ]]   # Specific pattern (like 123-4567)

# Anchors
[[ $text =~ ^hello ]]               # Starts with hello
[[ $text =~ world$ ]]               # Ends with world
[[ $text =~ ^exact$ ]]              # Exact match

# Capture groups
if [[ $text =~ ([0-9]+)-([a-z]+) ]]; then
  echo "${BASH_REMATCH[1]}"         # First captured group
  echo "${BASH_REMATCH[2]}"         # Second captured group
fi

# Case-insensitive
shopt -s nocasematch
[[ $text =~ pattern ]]              # Now case-insensitive
shopt -u nocasematch

Practical Examples

Email Validation

validate_email() {
  local email="$1"
  if [[ $email =~ ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$ ]]; then
    echo "Valid email"
  else
    echo "Invalid email"
  fi
}

validate_email "john@example.com"

URL Validation

validate_url() {
  local url="$1"
  if [[ $url =~ ^https?://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,} ]]; then
    echo "Valid URL"
  else
    echo "Invalid URL"
  fi
}

validate_url "https://example.com"

Phone Number

validate_phone() {
  local phone="$1"
  if [[ $phone =~ ^[0-9]{3}-[0-9]{3}-[0-9]{4}$ ]]; then
    echo "Valid phone"
  else
    echo "Invalid phone"
  fi
}

validate_phone "555-123-4567"

Validation Patterns

Username (alphanumeric and underscore)

[[ $username =~ ^[a-zA-Z0-9_]{3,20}$ ]]

IP Address (simplified)

[[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]
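The simplified pattern accepts out-of-range values such as 999.999.999.999. A stricter check, sketched here under the assumption that leading zeros are acceptable, captures each octet and compares it numerically. The function name is_valid_ipv4 is hypothetical.

```bash
# Hypothetical validator: shape check by regex, range check by arithmetic.
is_valid_ipv4() {
  local ip="$1" octet
  [[ $ip =~ ^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$ ]] || return 1
  # Skip index 0 (the whole match) and test each captured octet.
  for octet in "${BASH_REMATCH[@]:1}"; do
    # 10# forces base 10 so a leading zero (e.g. 010) is not read as octal.
    (( 10#$octet <= 255 )) || return 1
  done
}

for ip in "192.168.1.1" "256.1.1.1" "1.2.3"; do
  if is_valid_ipv4 "$ip"; then
    echo "$ip: valid"
  else
    echo "$ip: invalid"
  fi
done
```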

Date (YYYY-MM-DD)

[[ $date =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]
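The pattern above checks only the shape, so a value like 2024-13-45 passes. Alternation can narrow the month and day ranges, as in this sketch; is_plausible_date is a hypothetical name, and the check is still only approximate.

```bash
# Hypothetical validator: limits month to 01-12 and day to 01-31.
is_plausible_date() {
  # Stored in a variable so the alternation bars need no shell escaping.
  local pattern='^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$'
  [[ $1 =~ $pattern ]]
}

for date in "2024-02-29" "2024-13-01" "2024-00-10" "2024-02-31"; do
  if is_plausible_date "$date"; then
    echo "$date: plausible"
  else
    echo "$date: rejected"
  fi
done
```

Note that 2024-02-31 is still accepted: a regex cannot know month lengths or leap years, so true calendar validation needs a date tool or extra arithmetic.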

Hex Color (#RRGGBB)

[[ $color =~ ^#[0-9a-fA-F]{6}$ ]]

Best Practices

1. Quote Regex Carefully

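# Good: keep the pattern in a variable and expand it unquoted
pattern="^[0-9]+$"
[[ $text =~ $pattern ]]

# Spaces are safe inside a variable; written inline they must be escaped (hello\ world)
pattern="hello world"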
[[ "hello world" =~ $pattern ]]
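Quoting matters because, in Bash 3.2 and later, a quoted right-hand side of =~ is matched as literal text rather than as a regex. The sketch below shows the effect; regex_match and literal_match are hypothetical helpers used only for the comparison.

```bash
# Hypothetical helpers: same test, differing only in quoting.
regex_match()   { [[ $1 =~ $2 ]]; }     # unquoted: $2 is treated as a regex
literal_match() { [[ $1 =~ "$2" ]]; }   # quoted: $2 is treated as plain text

pattern='a.c'

if regex_match "abc" "$pattern"; then
  echo "unquoted: 'abc' matches because . stands for any character"
fi
if ! literal_match "abc" "$pattern"; then
  echo "quoted: 'abc' does not contain the literal text a.c"
fi
if literal_match "a.c" "$pattern"; then
  echo "quoted: 'a.c' contains the literal text a.c"
fi
```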

2. Use Proper Anchors

# Good (exact match)
[[ $text =~ ^pattern$ ]]

# Less strict (pattern anywhere)
[[ $text =~ pattern ]]

3. Test Patterns

# Test various inputs
for input in "valid" "VALID" "invalid123"; do
  if [[ $input =~ ^[a-z]+$ ]]; then
    echo "$input matches"
  fi
done

4. Document Complex Patterns

# Email validation regex
# - Local part: letters, numbers, dots, hyphens
# - @ symbol
# - Domain: letters, numbers, hyphens, dots
# - TLD: 2+ letters
email_regex="^[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"

Frequently Asked Questions

Q: What’s the difference between grep and [[ =~ ]]?

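A: grep is an external command that searches lines from files or standard input. [[ =~ ]] is built into Bash and tests a single string, usually a variable, without starting another process.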

Q: Are Bash regexes PCRE?

A: No. Bash uses POSIX ERE (Extended Regular Expressions), so PCRE features such as lookahead, lookbehind, \d, and non-greedy quantifiers are not available.

Q: How do I debug regex patterns?

A: Test incrementally: start simple, add complexity gradually.
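One way to work incrementally is a small helper that prints everything a pattern captured. The debug_match function below is a hypothetical sketch, not a Bash builtin.

```bash
# Hypothetical debugging helper: show each element of BASH_REMATCH.
debug_match() {
  local text="$1" pattern="$2" i
  if [[ $text =~ $pattern ]]; then
    # Index 0 is the whole match; higher indexes are the capture groups.
    for i in "${!BASH_REMATCH[@]}"; do
      echo "[$i] ${BASH_REMATCH[$i]}"
    done
  else
    echo "no match"
  fi
}

debug_match "abc-123" '[a-z]+'               # start simple
debug_match "abc-123" '([a-z]+)-([0-9]+)'    # then add groups
debug_match "abc-123" '^[0-9]+$'             # confirm a failing case
```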

Q: Can I use case-insensitive matching?

A: Yes. Run shopt -s nocasematch before matching, and shopt -u nocasematch afterwards to restore the default.

