Introduction to Regex
Master regular expressions: pattern matching for text processing.
A Simple Analogy
Regex is like a search template. Instead of searching for the exact string "john@example.com", you search for "any text, then @, then any text, then .com" to find every address of that shape. A regex describes a pattern rather than one specific string.
What Is Regex?
Regular expressions (regex) are patterns for matching and manipulating text. They let you search, validate, and extract data using symbolic patterns instead of literal strings.
Why Use Regex?
- Pattern matching: Find text matching criteria
- Validation: Check email, phone, password format
- Extraction: Pull data from unstructured text
- Replacement: Find and replace complex patterns
- Parsing: Extract fields from logs or documents
Basic Patterns
| Pattern | Matches |
|---------|---------|
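| . | Any single character except a newline (by default) |
| * | Zero or more of the preceding element |
| + | One or more of the preceding element |
| ? | Zero or one of the preceding element |
| [abc] | Any one of: a, b, or c |
| [^abc] | Any one character except a, b, or c |
| \d | Any digit (0-9; in .NET also other Unicode digits unless RegexOptions.ECMAScript is set) |
| \w | Word character (a-z, A-Z, 0-9, _; in .NET also other Unicode letters and digits by default) |
| \s | Any whitespace character (space, tab, newline) |
| ^ | Start of string (or of each line with RegexOptions.Multiline) |
| $ | End of string (or of each line with RegexOptions.Multiline) |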
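To make the table concrete, here is a minimal sketch that exercises a few of the rows. The class name `BasicPatternDemo` and its method names are hypothetical and chosen only for illustration.

```csharp
using System.Text.RegularExpressions;

public static class BasicPatternDemo
{
    // "." matches any single character except a newline,
    // so "c.t" accepts "cat" and "cut" but not "ct".
    public static bool DotMatches(string input) =>
        Regex.IsMatch(input, @"^c.t$");

    // "?" makes the preceding "u" optional: "color" and "colour" both match.
    public static bool OptionalU(string input) =>
        Regex.IsMatch(input, @"^colou?r$");

    // "[^abc]+" matches one or more characters that are not a, b, or c.
    public static string FirstRunWithoutAbc(string input) =>
        Regex.Match(input, @"[^abc]+").Value;

    // "\d+" matches one or more consecutive digits.
    public static string FirstNumber(string input) =>
        Regex.Match(input, @"\d+").Value;
}
```

For example, `BasicPatternDemo.FirstRunWithoutAbc("abcxyzabc")` returns `"xyz"`, and `BasicPatternDemo.FirstNumber("Order 42 shipped")` returns `"42"`.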
Common Examples
using System.Text.RegularExpressions;
// Email validation
var emailPattern = @"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$";
bool isEmail = Regex.IsMatch("user@example.com", emailPattern);
// Phone validation (10 digits in the fixed format 123-456-7890)
var phonePattern = @"^\d{3}-\d{3}-\d{4}$";
bool isPhone = Regex.IsMatch("123-456-7890", phonePattern);
// Extract numbers from text
var numbers = Regex.Matches("Price: $19.99, Quantity: 5", @"\d+");
// Result: ["19", "99", "5"] (the decimal point is not a digit, so 19.99 is split in two)
// Replace pattern
string text = "2025-02-20";
string formatted = Regex.Replace(text, @"(\d{4})-(\d{2})-(\d{2})", "$3/$2/$1");
// Result: "20/02/2025"
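The number extraction above splits 19.99 into "19" and "99" because `\d+` stops at the decimal point. If whole decimal values are wanted, one possible approach is to allow an optional fractional part in the pattern. The sketch below assumes a `.` decimal separator and ASCII digits; `NumberExtractor` and `ExtractDecimals` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public static class NumberExtractor
{
    // [0-9]+          one or more ASCII digits
    // (?:\.[0-9]+)?   optionally, a dot followed by more digits (non-capturing)
    public static List<decimal> ExtractDecimals(string text)
    {
        return Regex.Matches(text, @"[0-9]+(?:\.[0-9]+)?")
            .Cast<Match>()
            // InvariantCulture so "." is always read as the decimal separator.
            .Select(m => decimal.Parse(m.Value, CultureInfo.InvariantCulture))
            .ToList();
    }
}
```

With this sketch, `NumberExtractor.ExtractDecimals("Price: $19.99, Quantity: 5")` yields two values, 19.99 and 5.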
Common Validations
// Password: at least 8 chars, all drawn from letters, digits, and @$!%*?&,
// with at least one lowercase, one uppercase, one digit, and one of @$!%*?&
var passwordPattern = @"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$";
// URL
var urlPattern = @"^https?://[^\s/$.?#].[^\s]*$";
// Credit card (basic)
var cardPattern = @"^\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}$";
// IP address (IPv4 shape only; does not check that each octet is 0-255)
var ipPattern = @"^(\d{1,3}\.){3}\d{1,3}$";
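The IP pattern above accepts strings such as "999.999.999.999" because `\d{1,3}` allows any one to three digits. One way to tighten it is to spell out the 0-255 range for each octet. This is a sketch of that idea rather than a complete IP validator; `Ipv4Validator` is a hypothetical name, and leading zeros are rejected here by choice.

```csharp
using System.Text.RegularExpressions;

public static class Ipv4Validator
{
    // One octet, 0-255:
    //   25[0-5]      250-255
    //   2[0-4][0-9]  200-249
    //   1[0-9]{2}    100-199
    //   [1-9]?[0-9]  0-99 (no leading zero)
    private const string Octet = @"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])";

    // \z anchors at the very end of the input, so a trailing newline is rejected
    // (plain $ would also match just before a final newline).
    private static readonly Regex Pattern =
        new Regex("^" + Octet + @"(?:\." + Octet + @"){3}\z");

    public static bool IsValid(string input) => Pattern.IsMatch(input);
}
```

Under these assumptions, `Ipv4Validator.IsValid("192.168.1.1")` is true, while `"256.0.0.1"` and `"1.2.3"` are rejected.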
Practical Example
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public class TextProcessor
{
    // Extract every email-shaped substring from the text
    public static List<string> ExtractEmails(string text)
    {
        var pattern = @"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}";
        var matches = Regex.Matches(text, pattern);
        return matches.Cast<Match>().Select(m => m.Value).ToList();
    }

    // Validate password strength: at least 8 characters, including a lowercase
    // letter, an uppercase letter, a digit, and one of @$!%*?&
    public static bool IsStrongPassword(string password)
    {
        var pattern = @"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&]).{8,}$";
        return Regex.IsMatch(password, pattern);
    }

    // Collapse each run of whitespace into a single space and trim the ends
    public static string NormalizeWhitespace(string text)
    {
        return Regex.Replace(text, @"\s+", " ").Trim();
    }
}
Real-World Use Cases
- Log parsing: Extract errors from application logs
- CSV extraction: Pull fields from loosely formatted or inconsistent delimited text
- HTML scraping: Extract links and text from HTML
- Data validation: Ensure format compliance
- Data transformation: Convert between formats
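As an illustration of the log-parsing use case, the sketch below pulls error messages out of a multi-line log. The line format (`date time [LEVEL] message`) and the names `LogParser` and `ExtractErrorMessages` are assumptions made for this example, not a standard format.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LogParser
{
    // Assumed (hypothetical) line format:
    //   2025-02-20 14:03:12 [ERROR] Database connection failed
    // RegexOptions.Multiline makes ^ and $ match at the start and end of each line.
    private static readonly Regex LinePattern = new Regex(
        @"^(?<date>\d{4}-\d{2}-\d{2}) (?<time>\d{2}:\d{2}:\d{2}) \[(?<level>[A-Z]+)\] (?<message>.*)$",
        RegexOptions.Multiline);

    // Returns the message text of every line whose level is ERROR.
    public static List<string> ExtractErrorMessages(string log)
    {
        var errors = new List<string>();
        foreach (Match m in LinePattern.Matches(log))
        {
            if (m.Groups["level"].Value == "ERROR")
            {
                // "." also matches '\r', so strip it in case of Windows line endings.
                errors.Add(m.Groups["message"].Value.TrimEnd('\r'));
            }
        }
        return errors;
    }
}
```

Named groups such as `(?<level>...)` keep the extraction readable: the code asks for `m.Groups["level"]` rather than a positional index that shifts whenever the pattern changes.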
Related Concepts to Explore
- Lookahead and lookbehind assertions
- Backreferences in replacements
- Named groups for clarity
- Performance optimization
- Regex in different languages
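A brief sketch of several of these topics in .NET follows, as a starting point for exploration. The class and method names are hypothetical, and the 250 ms timeout is an arbitrary value picked for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RelatedConceptsDemo
{
    // Named groups: reference them in the replacement as ${name}.
    public static string ToDayMonthYear(string isoDate) =>
        Regex.Replace(isoDate, @"(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})", "${d}/${m}/${y}");

    // Lookbehind: (?<=\$) requires a "$" before the digits without including it in the match.
    public static string FirstDollarAmount(string text) =>
        Regex.Match(text, @"(?<=\$)\d+(?:\.\d{2})?").Value;

    // Backreference: \1 must repeat whatever group 1 captured, which detects a doubled word.
    public static bool HasRepeatedWord(string text) =>
        Regex.IsMatch(text, @"\b(\w+)\s+\1\b", RegexOptions.IgnoreCase);

    // Performance and safety: a match timeout bounds the cost of a pathological pattern.
    // Treating a timeout as "no match" is a design choice made for this sketch.
    public static bool IsMatchWithTimeout(string input, string pattern)
    {
        try
        {
            return Regex.IsMatch(input, pattern, RegexOptions.None, TimeSpan.FromMilliseconds(250));
        }
        catch (RegexMatchTimeoutException)
        {
            return false;
        }
    }
}
```

For instance, `RelatedConceptsDemo.ToDayMonthYear("2025-02-20")` returns `"20/02/2025"`, and `RelatedConceptsDemo.FirstDollarAmount("Price: $19.99, Quantity: 5")` returns `"19.99"`.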
Summary
Regular expressions provide powerful pattern matching for text processing. Master basic patterns to validate input, extract data, and transform strings efficiently in any application.