4.1. Regular expressions

4.1.1. What are regular expressions?

A regular expression is a pattern that describes a set of strings. Regular expressions are constructed analogously to arithmetic expressions by using various operators to combine smaller expressions.

The fundamental building blocks are the regular expressions that match a single character. Most characters, including all letters and digits, are regular expressions that match themselves. Any metacharacter with special meaning may be quoted by preceding it with a backslash.

4.1.2. Regular expression metacharacters

A regular expression may be followed by one of several repetition operators (metacharacters):

Table 4-1. Regular expression operators

OperatorEffect
.Matches any single character.
?The preceding item is optional and will be matched, at most, once.
*The preceding item will be matched zero or more times.
+The preceding item will be matched one or more times.
{N}The preceding item is matched exactly N times.
{N,}The preceding item is matched N or more times.
{N,M}The preceding item is matched at least N times, but not more than M times.
-represents the range if it's not first or last in a list or the ending point of a range in a list.
^Matches the empty string at the beginning of a line; also represents the characters not in the range of a list.
$Matches the empty string at the end of a line.
\bMatches the empty string at the edge of a word.
\BMatches the empty string provided it's not at the edge of a word.
\<Match the empty string at the beginning of word.
\>Match the empty string at the end of word.

Two regular expressions may be concatenated; the resulting regular expression matches any string formed by concatenating two substrings that respectively match the concatenated subexpressions.

Two regular expressions may be joined by the infix operator "|"; the resulting regular expression matches any string matching either subexpression.

Repetition takes precedence over concatenation, which in turn takes precedence over alternation. A whole subexpression may be enclosed in parentheses to override these precedence rules.

4.1.3. Basic versus extended regular expressions

In basic regular expressions the metacharacters "?", "+", "{", "|", "(", and ")" lose their special meaning; instead use the backslashed versions "\?", "\+", "\{", "\|", "\(", and "\)".

Check in your system documentation whether commands using regular expressions support extended expressions.