Search for string patterns (POSIX)
grep [-E|-F] [-chilnqsvx] [-e expression | -f expression_file]... [file...]
grep [-E|-F] [-chilnqsvx] expression [file...]
Historical UNIX versions:
egrep [-chilnqsvx] [-e expression | -f expression_file]... [file...]
egrep [-chilnqsvx] expression [file...]
fgrep [-chilnqsvx] [-e expression | -f expression_file]... [file...]
fgrep [-chilnqsvx] expression [file...]
The grep utility searches input for lines matching the expression(s) given. When an input line matches any of the expressions, it is said to be "selected." By default, selected lines are written to standard output.
Numerous options vary the output format or the selection criteria. For example, the -v option inverts the selection, so that only lines matching none of the expressions are written.
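As a minimal sketch of how a few of these options change the result, the commands below run grep on three lines of made-up input (the `sample` function and its contents are assumptions for illustration only). They show the default behavior, -v (write non-matching lines), -c (write only a count of selected lines), and -n (prefix each selected line with its line number).

```shell
# Hypothetical sample input, invented for this illustration.
sample() { printf 'apple\nbanana\ncherry\n'; }

# Default: write the lines that match the expression.
selected=$(sample | grep 'an')     # banana

# -v: write the lines that do NOT match.
inverted=$(sample | grep -v 'an')  # apple, cherry

# -c: write only the number of selected lines.
count=$(sample | grep -c 'an')     # 1

# -n: prefix each selected line with its line number.
numbered=$(sample | grep -n 'an')  # 2:banana

printf '%s\n' "$selected" "$inverted" "$count" "$numbered"
```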
There are three types of regular expressions understood by grep: basic, extended, and fixed. If you don't specify -E or -F, the expression(s) are taken to be basic regular expressions.
Basic and extended regular expressions are similar to arithmetic expressions in that larger expressions are formed by combining smaller expressions and operators according to some precedence rule.
Regular expressions have an "invisible" operator, i.e. concatenation. The concatenation of two expressions means match the one on the left, then the one on the right.
The smallest expression is a single character.
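The effect of precedence can be sketched with a repetition operator, which binds more tightly than concatenation. In the example below (the input lines are made up, and -x is used so that the whole line must match), `ab*` means "a followed by any number of b", while `\(ab\)*` repeats the whole subexpression.

```shell
# Hypothetical sample input, invented for this illustration.
sample() { printf 'a\nabb\nabab\n'; }

# * applies only to the single character b: matches "a" and "abb".
star_on_char=$(sample | grep -x 'ab*')

# * applies to the subexpression \(ab\): matches "abab".
star_on_group=$(sample | grep -x '\(ab\)*')

printf '%s\n' "$star_on_char" "$star_on_group"
```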
The following table summarizes the Basic Regular Expressions (BRE), and the precedence of the operators:
Expression | Meaning |
---|---|
\( expression \) | Subexpression. Match the pattern expression. Used for back-references (see below) and for overriding precedence |
\N | Back-reference. Match the exact string that the Nth subexpression matched |
. | (Dot) match any single character |
[charset] | Match any member of the set charset (see below) |
c | Match any nonspecial character |
\c | Match literal c. The character may not be (, ), {, }, or any digit from 1 through 9. The \ is usually used to escape *, $, ^, ., [ and ]. \\ matches a literal "\". \ has no special meaning inside a bracket expression. |
limited_expression* | Match any number of repetitions of limited_expression including zero. |
limited_expression\{M\} | Match exactly M repetitions of limited_expression |
limited_expression\{,N\} | Match zero to N repetitions of limited_expression |
limited_expression\{M,N\} | Match M to N repetitions of limited_expression |
expr0expr1 | (Concatenation) match expr0 then expr1 |
^expression | Match expression only at beginning of line |
expression$ | Match expression only at end of line |
A limited_expression is restricted to a back-reference, a subexpression, or a BRE matching a single character.
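A few of the BRE operators from the table can be tried out as follows. This is only a sketch on invented input (the `sample` function is an assumption); it exercises a back-reference, an interval applied to a subexpression, and the two anchors.

```shell
# Hypothetical sample input, invented for this illustration.
sample() { printf 'hello\nworld\nbookkeeper\nabcabc\n'; }

# \(.\)\1 : any character followed by the same character again.
doubled=$(sample | grep '\(.\)\1')        # hello, bookkeeper

# \(abc\)\{2\} : the subexpression abc repeated exactly twice.
repeated=$(sample | grep '\(abc\)\{2\}')  # abcabc

# ^ and $ anchor the match to the start and end of the line.
starts_w=$(sample | grep '^w')            # world
ends_r=$(sample | grep 'r$')              # bookkeeper

printf '%s\n' "$doubled" "$repeated" "$starts_w" "$ends_r"
```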
A charset is formed by concatenation of the following operators:
Expression | Meaning |
---|---|
c | Any character c |
c-d | Any character in the range from c to d |
[:alpha:] | Any alphabetic character |
[:upper:] | Any uppercase character |
[:lower:] | Any lowercase character |
[:digit:] | Any numeric character |
[:alnum:] | Any numeric or alphabetic character |
[:xdigit:] | Any character used to represent a hexadecimal number |
[:space:] | Any character that is a whitespace |
[:print:] | Any printable character |
[:punct:] | Any character that is punctuation |
[:graph:] | Any character with a graphic representation |
[:cntrl:] | Any character used for control |
If the charset begins with the caret (^), the set is inverted. For example:
[^[:alpha:]]
means match any nonalphabetic character. (In the POSIX locale, this can also be expressed by [^a-zA-Z]; in other locales, the characters covered by a range may differ.)
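The charset operators can be illustrated with a short sketch on invented input (the `sample` function is an assumption). Note that a character class such as [:digit:] is written inside the brackets of the bracket expression, giving [[:digit:]]. LC_ALL=C is set here so that the range behaves as in the POSIX locale.

```shell
# Use the POSIX locale so that ranges such as x-z are well defined.
export LC_ALL=C

# Hypothetical sample input, invented for this illustration.
sample() { printf 'abc\n123\nA-1\nx_y\n'; }

# Lines containing at least one digit.
has_digit=$(sample | grep '[[:digit:]]')       # 123, A-1

# Lines consisting only of lowercase letters.
only_lower=$(sample | grep '^[[:lower:]]*$')   # abc

# Inverted set: lines containing a character that is not alphanumeric.
non_alnum=$(sample | grep '[^[:alnum:]]')      # A-1, x_y

# Range: lines containing x, y, or z.
in_range=$(sample | grep '[x-z]')              # x_y

printf '%s\n' "$has_digit" "$only_lower" "$non_alnum" "$in_range"
```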
The Extended Regular Expressions (ERE) are an enriched set of regular expression operators. In particular, the Extended Regular Expressions support an operator for alternation, thus allowing a match of one expression or another. It is also important to note that the parenthesis syntax is different from Basic Regular Expressions, and the semantics are subtly different. There are no back-references in Extended Regular Expressions.
The following table summarizes the Extended Regular Expressions:
Expression | Meaning |
---|---|
(expression) | Match expression; useful for altering precedence |
. | (Dot) match any single character |
c | Match any nonspecial character c |
\c | Match literal c. Normally used to escape ERE special characters. |
[charset] | Match any element of charset |
limited_expression* | Match any number of repetitions of limited_expression, including zero |
limited_expression+ | Match 1 to any number of repetitions of limited_expression |
limited_expression? | limited_expression is optional (match 0 or 1 repetition) |
limited_expression{M} | Match exactly M repetitions of limited_expression |
limited_expression{,N} | Match zero to N repetitions of limited_expression |
limited_expression{M,N} | Match M to N repetitions of limited_expression |
expr0expr1 | (Concatenation) match expr0 then expr1 |
expr0|expr1 | (Alternation) match either expr0 or expr1 |
^expression | Match expression only at the beginning of a line |
expression$ | Match expression only at the end of a line |
For extended regular expressions, a limited_expression is restricted to an expression matching a single character or an expression enclosed in parentheses.
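The operators that ERE adds over BRE can be sketched as follows, using -E and some invented input (the `sample` function is an assumption). The example covers alternation, the optional operator ?, the one-or-more operator +, and parentheses used to group an alternation between anchors.

```shell
# Hypothetical sample input, invented for this illustration.
sample() { printf 'color\ncolour\ncat\ndog\ngoood\ncatalog\n'; }

# Alternation: lines containing cat or dog.
either=$(sample | grep -E 'cat|dog')        # cat, dog, catalog

# ? makes the preceding single-character expression optional.
optional=$(sample | grep -E 'colou?r')      # color, colour

# + requires one or more repetitions.
one_or_more=$(sample | grep -E 'go+d')      # goood

# Parentheses group the alternation so both anchors apply to it.
whole=$(sample | grep -E '^(cat|dog)$')     # cat, dog

printf '%s\n' "$either" "$optional" "$one_or_more" "$whole"
```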
Fixed Regular Expressions consist of a set of strings of characters. They don't permit the operators of extended or basic regular expressions. The algorithm used is extremely efficient for locating one of a set of strings within another string. Thus, if you don't need the various operators of basic or extended regular expressions, the fixed expressions are a better choice.
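The difference between fixed strings and regular expressions shows up when the search text contains characters such as . or *. In this sketch (the input is invented for illustration), the same text is searched for once as a BRE and once with -F, and several fixed strings are combined with repeated -e options.

```shell
# Hypothetical sample input, invented for this illustration.
sample() { printf 'a.c\nabc\n1*2\n'; }

# As a BRE, the dot matches any single character.
as_bre=$(sample | grep 'a.c')                  # a.c, abc

# With -F, the dot is an ordinary character.
as_fixed=$(sample | grep -F 'a.c')             # a.c

# Several fixed strings: a line is selected if it contains any of them.
any_of=$(sample | grep -F -e 'a.c' -e '1*2')   # a.c, 1*2

printf '%s\n' "$as_bre" "$as_fixed" "$any_of"
```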
Display lines in Phone.List containing telephone numbers:
grep '[[:digit:]]\{3\}-[[:digit:]]\{4\}' Phone.List
Display all lines in the Phone.List file that contain the string "steve" or the string "barney":
grep -F -e steve -e barney Phone.List