How to Validate a URL in Bash

Quick Answer: Validate a URL in Bash

To validate a basic URL, use: [[ "$url" =~ ^https?://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$ ]]. This checks for an http:// or https:// scheme, a host name, and a TLD of at least two letters, and rejects anything after the TLD. To accept ports, paths, and query strings, use a more comprehensive pattern.

Quick Comparison: URL Validation Methods

| Method | Complexity | Accuracy | Best For | Performance |
|---|---|---|---|---|
| Simple regex | Low | Good | Basic format check | Fastest |
| Comprehensive regex | Moderate | Better | URLs with paths/ports | Medium |
| curl -I test | High | Excellent | Actual reachability | Slowest |

Bottom line: Use simple regex for format; use curl for reachability testing.


Validating URLs

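URL validation matters wherever a script accepts URLs from forms, configuration files, or data feeds. Bash's [[ ... =~ ... ]] operator matches a string against a POSIX extended regular expression, which is enough to check basic URL structure.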

Method 1: Simple URL Pattern (Format Check)

Basic regex validation for HTTP/HTTPS URLs:

#!/bin/bash

url="https://example.com"
pattern='^https?://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'

if [[ "$url" =~ $pattern ]]; then
  echo "Valid URL"
fi
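
To see what this pattern accepts and rejects, it can help to wrap it in a function and run it over a few sample inputs. The following is a minimal sketch; the function name is_simple_url and the sample URLs are illustrative and not part of the script above.

```bash
#!/bin/bash

# Hypothetical helper: returns 0 when the URL matches the simple pattern.
is_simple_url() {
  local pattern='^https?://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
  [[ "$1" =~ $pattern ]]
}

# Sample inputs (illustrative only)
for candidate in \
  "https://example.com" \
  "http://sub.example.org" \
  "ftp://example.com" \
  "https://example" \
  "https://example.com/path"; do
  if is_simple_url "$candidate"; then
    echo "valid:   $candidate"
  else
    echo "invalid: $candidate"
  fi
done
```

Only the first two inputs pass. The last one is rejected because the pattern ends at the TLD, so URLs with paths need a broader pattern.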

Method 2: More Comprehensive Pattern

With paths and parameters:

#!/bin/bash

validate_url() {
  local url="$1"
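  # Anchored at both ends: scheme, host with TLD, optional port, optional path/query
  local pattern='^https?://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(:[0-9]+)?(/[a-zA-Z0-9./?=_%:&-]*)?$'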

  if [[ "$url" =~ $pattern ]]; then
    return 0
  fi
  return 1
}

validate_url "https://example.com/path?query=value" && echo "Valid"

Extracting URL Parts

#!/bin/bash

parse_url() {
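  local url="$1"
  local protocol domain path

  # Extract protocol (everything before "://")
  protocol="${url%%://*}"

  # Extract domain (everything up to the first "/")
  url="${url#*://}"
  domain="${url%%/*}"

  # Extract path: whatever follows the domain (empty if the URL has none)
  path="${url#"$domain"}"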

  echo "Protocol: $protocol"
  echo "Domain: $domain"
  echo "Path: $path"
}

parse_url "https://example.com/api/users"
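
Parameter expansion is one option; another is to let the regex do the splitting. When a [[ ... =~ ... ]] match succeeds, Bash stores the text matched by each parenthesised group in the BASH_REMATCH array. The sketch below takes that approach. The function name split_url and the pattern are assumptions made for illustration, not a full URL grammar.

```bash
#!/bin/bash

split_url() {
  local url="$1"
  # Groups: 1 = scheme, 2 = host, 4 = port (optional), 5 = path and query (optional)
  local pattern='^(https?)://([a-zA-Z0-9.-]+)(:([0-9]+))?(/.*)?$'

  if [[ "$url" =~ $pattern ]]; then
    echo "Protocol: ${BASH_REMATCH[1]}"
    echo "Domain: ${BASH_REMATCH[2]}"
    echo "Port: ${BASH_REMATCH[4]:-default}"
    echo "Path: ${BASH_REMATCH[5]}"
  else
    echo "Not an HTTP(S) URL: $url" >&2
    return 1
  fi
}

split_url "https://example.com:8443/api/users?id=7"
```

Because the match and the extraction happen in one step, input that is not an HTTP(S) URL is rejected rather than split into meaningless parts.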

Protocol Checking

#!/bin/bash

url="https://secure.example.com"

# Check protocol
if [[ "$url" == https://* ]]; then
  echo "Secure connection"
elif [[ "$url" == http://* ]]; then
  echo "Insecure connection"
fi
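
URL schemes are case-insensitive, so an input such as HTTPS://secure.example.com would match neither branch above. One way to handle that is to lowercase the scheme before comparing. This is a minimal sketch, assuming Bash 4 or newer for the ${var,,} expansion; url_scheme is a hypothetical helper name.

```bash
#!/bin/bash

# Print the URL's scheme in lowercase, or fail if there is no "://".
url_scheme() {
  local url="$1"
  [[ "$url" == *://* ]] || return 1
  local scheme="${url%%://*}"
  printf '%s\n' "${scheme,,}"   # ${var,,} lowercases the value (Bash 4+)
}

case "$(url_scheme "HTTPS://secure.example.com")" in
  https) echo "Secure connection" ;;
  http)  echo "Insecure connection" ;;
  *)     echo "Unsupported or missing scheme" ;;
esac
```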

Real-World Example: Download Validation

#!/bin/bash

is_valid_download_url() {
  local url="$1"

  # Must be https
  if [[ "$url" != https://* ]]; then
    echo "ERROR: Must use HTTPS" >&2
    return 1
  fi

  # Must end with an expected extension
  if [[ ! "$url" =~ \.(zip|tar\.gz|exe)$ ]]; then
    echo "ERROR: Invalid file type" >&2
    return 1
  fi

  return 0
}

is_valid_download_url "https://example.com/file.zip"
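
Download links often carry a query string or fragment, for example https://example.com/file.zip?token=abc. The extension check above would reject that link because the URL no longer ends in .zip. One way around this, sketched below, is to strip everything from the first ? or # before testing the extension. The name has_allowed_extension is hypothetical, and the extension list is carried over from the example above.

```bash
#!/bin/bash

has_allowed_extension() {
  local url="$1"
  local path="${url%%[?#]*}"   # drop the query string and fragment, if any
  local pattern='\.(zip|tar\.gz|exe)$'
  [[ "$path" =~ $pattern ]]
}

has_allowed_extension "https://example.com/file.zip?token=abc" && echo "Allowed file type"
```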

Reachability Check with Format Fallback

#!/bin/bash

check_url() {
  local url="$1"

  # If curl is available, request the URL and print the HTTP status code
  if command -v curl &>/dev/null; then
    curl -s -o /dev/null -w "Status: %{http_code}\n" "$url"
  else
    # Without curl, fall back to a basic format check
    [[ "$url" =~ ^https?:// ]] && echo "URL format OK"
  fi
}
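
The status code printed by curl still has to be interpreted. The following is a minimal sketch of that step. The helper names status_is_ok and is_reachable are illustrative, the 10-second timeout is an arbitrary choice, and treating any 2xx or 3xx code as reachable is an assumption that may not suit every use case. curl reports 000 when it receives no HTTP response at all.

```bash
#!/bin/bash

# Return 0 for 2xx and 3xx status codes, 1 for anything else (including curl's 000).
status_is_ok() {
  [[ "$1" =~ ^[23][0-9][0-9]$ ]]
}

# Send a HEAD request and classify the resulting status code.
is_reachable() {
  local url="$1" code
  code="$(curl -s -I -o /dev/null --max-time 10 -w '%{http_code}' "$url")"
  status_is_ok "$code"
}

# Usage (requires network access):
# is_reachable "https://example.com" && echo "Reachable"
```

Keeping the classification in its own function makes the policy easy to change, for instance to accept only 200.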

Batch URL Validation

#!/bin/bash

validate_url_list() {
  local file="$1"
  local valid=0 invalid=0

  while IFS= read -r url || [[ -n "$url" ]]; do
    [ -z "$url" ] && continue

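    # Use += rather than ++ so the arithmetic command never returns a
    # non-zero status, which would abort the script under `set -e`
    if [[ "$url" =~ ^https?://[a-zA-Z0-9.-]+ ]]; then
      ((valid += 1))
    else
      ((invalid += 1))
      echo "Invalid: $url"
    fi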
  done < "$file"

  echo "Valid: $valid, Invalid: $invalid"
}

validate_url_list "urls.txt"
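
URL lists edited on Windows often have CRLF line endings, and hand-maintained lists tend to contain comments. The sketch below is a stricter reader under those assumptions. The function name count_invalid_urls, the # comment convention, and the non-zero exit status when any line is invalid are choices made for illustration.

```bash
#!/bin/bash

count_invalid_urls() {
  local file="$1" url invalid=0
  local pattern='^https?://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(/.*)?$'

  while IFS= read -r url || [[ -n "$url" ]]; do
    url="${url%$'\r'}"                             # strip a trailing carriage return
    [[ -z "$url" || "$url" == \#* ]] && continue   # skip blank lines and comments
    if [[ ! "$url" =~ $pattern ]]; then
      echo "Invalid: $url" >&2
      invalid=$((invalid + 1))
    fi
  done < "$file"

  echo "$invalid"
  (( invalid == 0 ))   # exit status 0 only when every URL passed
}
```

Returning a non-zero status lets a caller write count_invalid_urls urls.txt || exit 1 to stop a pipeline on bad input.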

Important Notes

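  • A regex validates format only; it cannot tell whether the URL actually exists
  • Use curl -I (a HEAD request) to check whether the URL responds
  • https:// encrypts traffic in transit; http:// does not
  • No standard caps URL length, but some browsers and servers reject very long URLs, so staying under roughly 2,000 characters is a common rule of thumb
  • Characters outside the unreserved set (letters, digits, -, ., _, ~), such as spaces, must be percent-encoded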
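
The encoding note can be applied in the script itself. The sketch below assumes ASCII input. url_encode is a hypothetical helper that percent-encodes every character outside the RFC 3986 unreserved set, which suits a single query value rather than a whole URL.

```bash
#!/bin/bash

url_encode() {
  local LC_ALL=C            # keep the a-z / A-Z ranges strictly ASCII
  local input="$1" output="" char hex i

  for ((i = 0; i < ${#input}; i++)); do
    char="${input:i:1}"
    case "$char" in
      [a-zA-Z0-9._~-]) output+="$char" ;;      # unreserved: copy as-is
      *) printf -v hex '%%%02X' "'$char"       # "'x" yields the character's numeric code
         output+="$hex" ;;
    esac
  done
  printf '%s\n' "$output"
}

query="$(url_encode "bash & regex")"
echo "https://example.com/search?q=$query"
```

Encoding each value separately before assembling the URL keeps reserved characters such as ? and & meaningful as delimiters.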

Quick Reference

# Simple validation
[[ "$url" =~ ^https?://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$ ]]

# Check protocol
[[ "$url" == https://* ]] && echo "HTTPS"

# Extract domain
domain="${url#*://}"
domain="${domain%%/*}"

# Check if URL is reachable
curl -s -o /dev/null -w "%{http_code}" "$url"

Summary

Use regex patterns to validate URL format, but test actual connectivity with curl for production use. Always require HTTPS for sensitive operations.