grep regular expression

grep command

The grep program is a standard UNIX utility that searches a set of files for a text pattern specified as a regular expression. By default, grep is case-sensitive (use -i to ignore case). By default, grep matches the pattern anywhere in a line, even inside a longer word (use -w to match whole words only). By default, grep shows the lines that match (use -v to show those that don’t match).
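
The three options above can be tried on a throwaway file. This is a minimal sketch; the sample file and its contents are made up for illustration, and -c (count matching lines) is used only to keep the output short.

```shell
# Build a small sample file (contents are illustrative only).
sample=$(mktemp)
printf '%s\n' 'smug' 'Smug' 'smuggler' 'plain' > "$sample"

# Default: case-sensitive match anywhere in the line.
default_hits=$(grep -c 'smug' "$sample")      # counts: smug, smuggler

# -i: ignore case.
nocase_hits=$(grep -c -i 'smug' "$sample")    # counts: smug, Smug, smuggler

# -w: match whole words only.
word_hits=$(grep -c -w 'smug' "$sample")      # counts: smug

# -v: show the lines that do NOT match.
inverted=$(grep -v 'smug' "$sample")          # prints: Smug, plain

echo "$default_hits $nocase_hits $word_hits"
echo "$inverted"
rm -f "$sample"
```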



Understanding Regular Expressions

Regular Expressions are a feature of UNIX. They describe a pattern to match, a sequence of characters, not words, within a line of text. Here is a quick summary of the special characters used in the grep tool and their meaning:

^ (Caret)        =    match expression at the start of a line, as in ^A.
$ (Dollar Sign)  =    match expression at the end of a line, as in A$.
\ (Back Slash)   =    turn off the special meaning of the next character, as in \^.
[ ] (Brackets)   =    match any one of the enclosed characters, as in [aeiou].
                      Use Hyphen "-" for a range, as in [0-9].
[^ ]             =    match any one character except those enclosed in [ ], as in [^0-9].
. (Period)       =    match a single character of any value, except end of line.
* (Asterisk)     =    match zero or more of the preceding character or expression.
\{x,y\}          =    match x to y occurrences of the preceding.
\{x\}            =    match exactly x occurrences of the preceding.
\{x,\}           =    match x or more occurrences of the preceding.
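
In grep's default (basic) syntax, the braces of a repeat count are written with backslashes, as \{ and \}. The sketch below tries the three repeat forms together with the ^ and $ anchors; the sample lines are invented for the example.

```shell
# Sample lines with one to four leading a's (illustrative only).
sample=$(mktemp)
printf '%s\n' 'ab' 'aab' 'aaab' 'aaaab' > "$sample"

# Exactly two a's, anchored at both ends of the line.
exactly_two=$(grep '^a\{2\}b$' "$sample")        # prints: aab

# Two to three a's.
two_to_three=$(grep -c '^a\{2,3\}b$' "$sample")  # counts: aab, aaab

# Three or more a's.
three_plus=$(grep -c '^a\{3,\}b$' "$sample")     # counts: aaab, aaaab

echo "$exactly_two $two_to_three $three_plus"
rm -f "$sample"
```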


Examples

grep smug files         {search files for lines with 'smug'}
grep '^smug' files      {'smug' at the start of a line}
grep 'smug$' files      {'smug' at the end of a line}
grep '^smug$' files     {lines containing only 'smug'}
grep '\^s' files        {lines containing '^s', "\" escapes the ^}
grep '[Ss]mug' files    {search for 'Smug' or 'smug'}
grep 'B[oO][bB]' files  {search for BOB, Bob, BOb or BoB}
grep '^$' files         {search for blank lines}
grep '[0-9][0-9]' files {search for pairs of numeric digits}

Back Slash “\” is used to escape the next symbol, that is, to turn off the special meaning that it has. To look for a Caret “^” at the start of a line, the expression is ^\^.
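
A short sketch of escaping the caret, using invented sample lines. The first pattern combines the anchor with an escaped caret; the second drops the anchor.

```shell
# Sample lines with a caret in different positions (illustrative only).
sample=$(mktemp)
printf '%s\n' '^start' 'mid^dle' 'start' > "$sample"

# \^ is a literal caret; the leading ^ anchors it to the start of the line.
at_start=$(grep '^\^' "$sample")      # prints: ^start

# Without the anchor, a literal caret anywhere on the line matches.
anywhere=$(grep -c '\^' "$sample")    # counts: ^start, mid^dle

echo "$at_start $anywhere"
rm -f "$sample"
```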

Period “.” matches any single character. So b.b will match “bob”, “bib”, “b-b”, etc.
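
To see the difference between a Period and an escaped Period, here is a small sketch on made-up sample lines:

```shell
# Sample lines (illustrative only).
sample=$(mktemp)
printf '%s\n' 'bob' 'bib' 'b-b' 'b.b' 'bb' 'boob' > "$sample"

# "." stands for any single character, so four lines match;
# 'bb' has nothing between the b's and 'boob' has two characters.
dot_hits=$(grep -c 'b.b' "$sample")   # counts: bob, bib, b-b, b.b

# "\." matches only a literal period.
literal=$(grep 'b\.b' "$sample")      # prints: b.b

echo "$dot_hits $literal"
rm -f "$sample"
```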

Asterisk “*” does not mean the same thing in regular expressions as in wildcarding; it is a modifier that applies to the preceding single character, or expression such as [0-9]. An asterisk matches zero or more of what precedes it. Thus [A-Z]* matches any number of upper-case letters, including none, while [A-Z][A-Z]* matches one or more upper-case letters.
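
The zero-or-more behaviour is easy to check on a few invented lines. The sketch sets LC_ALL=C as an assumption, so that [A-Z] means only the 26 upper-case ASCII letters regardless of the system locale.

```shell
# Sample lines (illustrative only).
sample=$(mktemp)
printf '%s\n' 'ABC' 'abc' '123' > "$sample"

# [A-Z]* can match zero letters, so every line matches.
zero_or_more=$(LC_ALL=C grep -c '[A-Z]*' "$sample")      # counts all 3 lines

# [A-Z][A-Z]* requires at least one upper-case letter.
one_or_more=$(LC_ALL=C grep '[A-Z][A-Z]*' "$sample")     # prints: ABC

echo "$zero_or_more $one_or_more"
rm -f "$sample"
```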


Here are a few more examples of grep to show you what can be done:

grep '[a-zA-Z]'                  {any line with at least one letter}
grep '[^a-zA-Z0-9]'              {any line with a character that is not a letter or number}
grep '[0-9]\{3\}-[0-9]\{4\}'     {999-9999, like phone numbers}
grep '^.$'                       {lines with exactly one character}
grep '"smug"'                    {'smug' within double quotes}
grep '"*smug"*'                  {'smug', with or without quotes}
grep '^\.'                       {any line that starts with a Period "."}
grep '^\.[a-z][a-z]'             {lines that start with "." and 2 lowercase letters}
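
A few of these patterns can be run against a small test file to confirm what they select. This is only a sketch: the file contents are invented, and the repeat counts and the literal period are written with backslashes, as basic grep syntax requires.

```shell
# Sample lines, including one empty line (illustrative only).
sample=$(mktemp)
printf '%s\n' 'call 555-1234 now' 'call 55-1234' '.profile' 'x' '' > "$sample"

# Three digits, a hyphen, four digits.
phone=$(grep '[0-9]\{3\}-[0-9]\{4\}' "$sample")   # prints: call 555-1234 now

# A literal period at the start of the line, then two lowercase letters.
dotfile=$(grep '^\.[a-z][a-z]' "$sample")          # prints: .profile

# Exactly one character on the line (the empty line does not match).
one_char=$(grep '^.$' "$sample")                   # prints: x

echo "$phone"
echo "$dotfile"
echo "$one_char"
rm -f "$sample"
```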

Source:
http://www.robelle.com/smugbook/regexpr.html
