How to Extract Substring in Bash
Quick Answer: Extract a Substring in Bash
To extract a substring from a Bash string, use parameter expansion: ${string:start:length}. This extracts length characters starting at position start. To get the last N characters, use ${string: -N}.
Quick Comparison: Substring Extraction Methods
| Method | Speed | Best For | Syntax |
|---|---|---|---|
| Parameter expansion | Fastest (shell built-in, no subprocess) | Strings already in a variable | ${string:pos:len} |
| cut -c | Slower (external process) | Piped input, character ranges | cut -c1-5 |
| sed regex | Slower (external process) | Pattern-based | sed 's/.*\(...\).*/\1/' |
| awk substr | Slower (external process) | Field-based | awk '{print substr($0,1,5)}' |
| grep -o | Slower (external process) | Pattern matching | grep -o '...' |
Bottom line: Use parameter expansion for strings held in variables and cut for piped input.
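As a minimal sketch of that rule of thumb, the snippet below takes the same five characters both ways. The variable names are illustrative only. Note that cut -c counts characters from 1, while parameter expansion counts from 0.

```bash
string="Hello, World!"

# String already in a variable: parameter expansion, zero-based offset
from_variable=${string:0:5}

# String arriving on a pipe: cut -c, one-based character range
from_pipe=$(printf '%s\n' "$string" | cut -c1-5)

echo "$from_variable"   # Hello
echo "$from_pipe"       # Hello
```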
Extract portions of strings using parameter expansion in Bash. Substring extraction is essential for parsing, data validation, and string manipulation tasks. This tutorial covers multiple methods and practical applications.
Basic Syntax
The fundamental syntax for substring extraction uses curly braces and position indicators:
${string:position} # From position to end
${string:position:length} # Specific length from position
${string: -number} # Last N characters
Method 1: Extract from Position to End
Extract from a starting position to the end of the string.
text="Hello, World!"
echo ${text:0} # Output: Hello, World! (entire string)
echo ${text:7} # Output: World! (from position 7 onward)
echo "${text:6}" # Output: " World!" (position 6 is the space; quote the expansion to keep it)
# More examples
echo ${text:1} # Output: ello, World!
echo ${text:13} # Output: (empty - past end)
Example with positional reference:
$ text="ABCDEFGH"
$ echo ${text:0} # ABCDEFGH (positions 0-7)
$ echo ${text:3} # DEFGH (positions 3-7)
$ echo ${text:5} # FGH (positions 5-7)
Method 2: Extract Specific Length
Extract a substring of specified length starting at a position.
text="Hello, World!"
echo ${text:0:5} # Output: Hello (5 chars from position 0)
echo ${text:7:5} # Output: World (5 chars from position 7)
echo ${text:0:13} # Output: Hello, World! (entire string)
echo ${text:7:10} # Output: World! (more length than available)
Example:
$ text="0123456789"
$ echo ${text:2:3} # 234
$ echo ${text:0:4} # 0123
$ echo ${text:5:2} # 56
Method 3: Extract Last N Characters
Use a negative index to count from the end of the string. Note the space before the minus sign.
text="Hello, World!"
echo ${text: -6} # Output: World! (last 6 characters)
echo ${text: -1} # Output: ! (last 1 character)
echo ${text: -13} # Output: Hello, World! (entire string)
# Without the space, Bash parses :- as the default-value operator (no error is raised)
echo ${text:-6} # Output: Hello, World! (text is set, so the default "6" is ignored)
Example:
$ text="filename.txt"
$ echo ${text: -4} # .txt
$ echo ${text: -8} # name.txt
$ echo ${text: -3} # txt
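The space is not the only way to keep Bash from reading :- as the default-value operator. The sketch below shows a few equivalent spellings. The variable names are illustrative, and it relies on the offset being evaluated as an arithmetic expression.

```bash
text="filename.txt"
n=4

with_space=${text: -4}           # the space separates ":" from "-"
with_parens=${text:(-4)}         # parentheses work as well
with_variable=${text: -n}        # a bare variable name is allowed in the arithmetic offset
from_length=${text:${#text}-4}   # or compute the offset from the string length

# All four lines print .txt
printf '%s\n' "$with_space" "$with_parens" "$with_variable" "$from_length"
```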
Method 4: Using Negative Index with Length
Combine a negative index with a length to take part of the tail.
text="Hello, World!"
echo ${text: -6:5} # Output: World (last 6 chars, take 5)
echo ${text: -4:2} # Output: rl (last 4 chars are "rld!", take 2)
filename="document.pdf"
echo ${filename: -4:3} # Output: .pd (last 4 chars, take 3)
Method 5: Pattern-Based Extraction
Extract text before or after specific patterns.
# Remove a suffix: % removes the shortest match, %% the longest
path="/home/user/documents/file.txt"
dir=${path%/*} # Result: /home/user/documents
dir=${path%%/*} # Result: (empty - longest match removes everything)
# Remove a prefix: # removes the shortest match, ## the longest
filename=${path##*/} # Result: file.txt
filename=${path#*/} # Result: home/user/documents/file.txt
# Extract between delimiters
data="name:John,age:30"
name=${data#*:} # Result: John,age:30
name=${name%%,*} # Result: John
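The four removal operators are easiest to tell apart when applied to a single string with more than one delimiter. The following sketch uses an illustrative filename with two dots:

```bash
file="archive.tar.gz"

echo "${file%.*}"    # archive.tar  (shortest suffix matching .* removed)
echo "${file%%.*}"   # archive      (longest suffix matching .* removed)
echo "${file#*.}"    # tar.gz       (shortest prefix matching *. removed)
echo "${file##*.}"   # gz           (longest prefix matching *. removed)
```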
Practical Examples
Example 1: Extract File Extension
#!/bin/bash
file="$1"
# Method 1: Last 3 characters (assumes a 3-character extension)
extension=${file: -3}
echo "Extension: .$extension"
# Method 2: From last dot onward
extension=${file##*.}
echo "Extension: $extension"
# Method 3: Verify it's actually an extension
if [[ "$extension" =~ ^[a-z0-9]{1,4}$ ]]; then
echo "Valid extension: $extension"
fi
Usage:
$ bash script.sh document.pdf
Extension: .pdf
Extension: pdf
Valid extension: pdf
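One caveat with ${file##*.}: when the name contains no dot, the pattern does not match and the whole name comes back unchanged. The sketch below guards against that case. The function name get_extension is hypothetical, and treating leading-dot names such as .bashrc as having no extension is an assumption of this example.

```bash
get_extension() {
    local name=${1##*/}                        # strip any directory part first
    case $name in
        ?*.*) printf '%s\n' "${name##*.}" ;;   # a dot that is not the first character
        *)    printf '\n' ;;                   # no usable dot: report an empty extension
    esac
}

get_extension "document.pdf"          # pdf
get_extension "/tmp/archive.tar.gz"   # gz
get_extension "README"                # (empty line)
get_extension ".bashrc"               # (empty line)
```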
Example 2: Parse Full Path
#!/bin/bash
path="$1"
# Extract directory
directory=${path%/*}
echo "Directory: $directory"
# Extract filename
filename=${path##*/}
echo "Filename: $filename"
# Extract extension from filename
extension=${filename##*.}
echo "Extension: $extension"
# Extract basename without extension
basename_no_ext=${filename%.*}
echo "Base name: $basename_no_ext"
Usage:
$ bash script.sh /home/user/documents/report.pdf
Directory: /home/user/documents
Filename: report.pdf
Extension: pdf
Base name: report
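The directory extraction above assumes the path contains a slash somewhere after the first character. A minimal sketch that also covers a bare filename and a file directly under the root might look like this. The function name dir_of is hypothetical, and trailing slashes are not handled.

```bash
dir_of() {
    local path=$1 dir
    case $path in
        */*) dir=${path%/*} ;;    # has a slash: strip the last component
        *)   dir=. ;;             # no slash: the file is in the current directory
    esac
    printf '%s\n' "${dir:-/}"     # "/file" leaves an empty string, which means the root
}

dir_of "/home/user/documents/report.pdf"   # /home/user/documents
dir_of "report.pdf"                        # .
dir_of "/report.pdf"                       # /
```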
Example 3: Extract from CSV Line
#!/bin/bash
# Parse CSV line with known positions
csv_line="John,30,Engineer,50000"
# Extract by fixed character positions (only works when field widths are known)
first_part=${csv_line:0:4} # "John"
after_first=${csv_line:5:2} # "30"
# Or use cut/awk for complex CSV
echo "First 5 chars: ${csv_line:0:5}" # John,
echo "Chars 6-10: ${csv_line:5:5}" # 30,En
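When field widths are not known in advance, splitting on the delimiter is usually more robust than counting characters. The sketch below uses the read builtin with IFS set to a comma. The field variable names are illustrative, and it assumes simple CSV with no quoted or embedded commas.

```bash
csv_line="John,30,Engineer,50000"

# IFS=, applies only to this read; each comma-separated field lands in its own variable
IFS=, read -r name age job salary <<< "$csv_line"

echo "Name: $name"       # Name: John
echo "Age: $age"         # Age: 30
echo "Job: $job"         # Job: Engineer
echo "Salary: $salary"   # Salary: 50000
```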
Example 4: Validate Username Length
#!/bin/bash
username="$1"
min_length=3
max_length=20
actual_length=${#username}
if [ "$actual_length" -lt "$min_length" ]; then
echo "Too short"
elif [ "$actual_length" -gt "$max_length" ]; then
echo "Too long"
else
# Show preview (first 10 chars)
preview=${username:0:10}
echo "Username accepted: $preview"
fi
Example 5: Truncate Output
#!/bin/bash
# Truncate long text for display
text="$1"
max_length=50
if [ ${#text} -gt $max_length ]; then
display=${text:0:$((max_length-3))}...
else
display=$text
fi
echo "Display: $display"
Example:
$ bash script.sh "This is a very long text that needs to be shortened"
Display: This is a very long text that needs to be short...
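The same logic can be wrapped in a reusable function. This is a sketch only: the name truncate_text and the default width of 50 are assumptions, and it expects the maximum width to be at least 3 so there is room for the ellipsis.

```bash
truncate_text() {
    local text=$1 max=${2:-50}
    if (( ${#text} > max )); then
        # The length field is an arithmetic expression, so max-3 needs no $(( ))
        printf '%s...\n' "${text:0:max-3}"
    else
        printf '%s\n' "$text"
    fi
}

truncate_text "Hello, World!" 8   # Hello...
truncate_text "Hi" 8              # Hi
```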
Example 6: Extract Version Numbers
#!/bin/bash
version="v2.3.4"
# Remove 'v' prefix
version=${version:1} # 2.3.4
# Get major version (fixed positions assume single-digit components)
major=${version:0:1} # 2
# Get minor version
minor=${version:2:1} # 3
# Get patch version
patch=${version:4:1} # 4
echo "Major: $major, Minor: $minor, Patch: $patch"
Output:
Major: 2, Minor: 3, Patch: 4
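Fixed positions stop working as soon as a component has more than one digit. A sketch that splits on the dots instead, using an illustrative multi-digit version string:

```bash
version="v12.3.45"

version=${version#v}                             # drop a leading "v" if present
IFS=. read -r major minor patch <<< "$version"   # split the rest on dots

echo "Major: $major, Minor: $minor, Patch: $patch"   # Major: 12, Minor: 3, Patch: 45
```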
Example 7: Process Multiple Substrings
#!/bin/bash
# Extract and process multiple parts
text="UserID=12345,Token=abc123def456"
# Extract user ID
user_id=${text:7:5} # 12345
echo "User ID: $user_id"
# Extract token (after "Token=")
token_part=${text#*Token=} # abc123def456
token=${token_part:0:6} # abc123
echo "Token (first 6): $token"
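The user ID above depends on a hard-coded offset and length. The token approach generalizes to any key: strip everything through "key=", then strip everything from the next comma. The helper name get_value is hypothetical, and the sketch assumes the key is present in the input. If the key is missing, the prefix pattern does not match and the first field is returned as-is.

```bash
get_value() {
    local text=$1 key=$2 rest
    rest=${text#*"$key"=}          # drop everything up to and including "key="
    printf '%s\n' "${rest%%,*}"    # drop everything from the next comma onward
}

text="UserID=12345,Token=abc123def456"
get_value "$text" UserID   # 12345
get_value "$text" Token    # abc123def456
```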
Advanced Techniques
Variable Substring Parameters
#!/bin/bash
# Using variables for positions
text="Hello, World!"
start=7
length=5
substring=${text:$start:$length}
echo "$substring" # Output: World
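Positions do not have to be literals. Because the offset and length are arithmetic expressions, they can be computed, for example from the length of a prefix. The sketch below locates the first comma that way. The variable names are illustrative.

```bash
text="Hello, World!"

prefix=${text%%,*}                  # everything before the first comma
comma_index=${#prefix}              # its length is the comma's zero-based position
after_comma=${text:comma_index+2}   # skip the comma and the space that follows it

echo "$comma_index"   # 5
echo "$after_comma"   # World!
```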
Dynamic Extraction
#!/bin/bash
# Extract based on pattern
text="email: john@example.com, phone: 555-1234"
# Find substring before comma
first_part=${text%%,*} # email: john@example.com
echo "$first_part"
# Find between delimiters
between=${text#*: } # john@example.com, phone: 555-1234
between=${between%%,*} # john@example.com
echo "$between"
Comparison: Different Methods
#!/bin/bash
text="Hello, World!"
# Method 1: Positional extraction
result1=${text:0:5} # Hello
# Method 2: Pattern-based
result2=${text%,*} # Hello
# Method 3: From end
result3=${text: -6} # World!
echo "Method 1: $result1"
echo "Method 2: $result2"
echo "Method 3: $result3"
Performance Comparison
For extracting substrings:
| Method | Speed | Use Case |
|---|---|---|
| Parameter expansion | Fastest (no subprocess) | Simple extraction |
| Pattern matching (#, ##, %, %%) | Fastest (no subprocess) | Pattern-based |
| cut command | Slower (external process) | Field extraction |
| grep/awk | Slower (external process) | Complex patterns |
Best choice: Use parameter expansion for speed and simplicity.
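To see the difference on a particular machine, a rough timing sketch like the following can help. The function names and the iteration count are arbitrary choices for illustration, and actual timings vary by system, so none are shown here.

```bash
text="Hello, World!"
iterations=200   # arbitrary; raise it to make the gap more visible

with_expansion() {
    local i result
    for (( i = 0; i < iterations; i++ )); do
        result=${text:0:5}                            # handled inside the shell
    done
    printf '%s\n' "$result"
}

with_cut() {
    local i result
    for (( i = 0; i < iterations; i++ )); do
        result=$(printf '%s\n' "$text" | cut -c1-5)   # starts an external process each time
    done
    printf '%s\n' "$result"
}

# The time keyword reports how long each function takes
time with_expansion
time with_cut
```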
Important Considerations
Index Counting
Bash uses 0-based indexing (first character is position 0):
text="ABCDE"
echo ${text:0} # ABCDE (position 0 = first char)
echo ${text:1} # BCDE (position 1 = second char)
echo ${text:4} # E (position 4 = fifth char)
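A short loop makes the zero-based positions explicit by pairing each index with the single character found there. The variable indexed is illustrative.

```bash
text="ABCDE"
indexed=""

# ${#text} is the string length; ${text:i:1} is the one character at index i
for (( i = 0; i < ${#text}; i++ )); do
    indexed+="$i=${text:i:1} "
done

echo "$indexed"   # 0=A 1=B 2=C 3=D 4=E
```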
Empty Results
Operations past the string end return empty, not errors:
text="Hello"
echo ${text:100} # (empty, no error)
echo ${text:2:100} # llo (takes what's available)
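The same silent behavior applies to negative offsets: asking for more trailing characters than the string holds yields an empty string rather than the whole string. If clamping is preferred, a small wrapper can check the length first. The function name last_n is hypothetical and this is only a sketch of one way to do it.

```bash
last_n() {
    local text=$1 n=$2
    if (( n <= 0 )); then
        printf '\n'                    # ${text: -0} means offset 0, i.e. the whole string
    elif (( n >= ${#text} )); then
        printf '%s\n' "$text"          # ${text: -n} would silently return empty here
    else
        printf '%s\n' "${text: -n}"
    fi
}

last_n "Hello" 2     # lo
last_n "Hello" 100   # Hello
```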
Negative Index Space
The space before a negative index is required:
text="Hello"
echo ${text: -2} # lo (correct)
echo ${text:-2} # Hello (parsed as the default-value operator; text is set, so the default 2 is ignored)
Key Points
- Use ${string:position} to extract from position to end
- Use ${string:position:length} for fixed-length extraction
- Use ${string: -N} for last N characters (note the space)
- Use ${string%pattern} to remove the shortest matching suffix (%% for the longest)
- Use ${string##pattern} to remove the longest matching prefix (# for the shortest)
- 0-based indexing: first character is at position 0
- A start position past the end returns an empty string; a length that runs past the end is truncated to what is available
Quick Reference
# From position to end
${text:7} # From position 7 onward
# Specific length
${text:0:5} # 5 chars from position 0
# Last N characters
${text: -6} # Last 6 chars
# Last N chars, take M
${text: -6:5} # Last 6 chars, take first 5
# Remove extension
${filename%.*} # Remove from last dot
# Remove path
${path##*/} # Get filename only
Recommended Pattern
#!/bin/bash
text="$1"
# For simple position-based extraction
substring=${text:0:10}
echo "First 10 chars: $substring"
# For pattern-based extraction
directory=${text%/*} # Remove /filename
echo "Directory: $directory"
# For last N characters
extension=${text: -3}
echo "Last 3 chars: $extension"