Special character sequences
Printable characters, such as “a” or “b”, are defined by simply typing them into a regular expression. In addition, there are some shorthands for common non-printable characters and character classes.
Special character sequences are listed in the following table:
Sequence | Description |
---|---|
\a | Bell (BEL) = \x07 |
\t | Horizontal tab (HT) = \x09 |
\n | Linefeed (LF) = \x0A |
\f | Formfeed (FF) = \x0C |
\r | Carriage return (CR) = \x0D |
\e | Escape (ESC) = \x1B |
\OOO | Octal code OOO of the character. |
\xHH | Hexadecimal code HH of the character. Case-insensitive. For example, "\xaa" is treated the same as "\xAA". |
\c<char> | Control character that corresponds to Ctrl+<char>, where <char> is an uppercase letter. |
\w | "word" class character = [A-Za-z0-9_] |
\W | Non-"word" class character = [^A-Za-z0-9_] |
\s | Whitespace character = [ \t\r\n\f] |
\S | Non-whitespace character = [^ \t\r\n\f] |
\d | Digit character = [0-9] |
\D | Non-digit character = [^0-9] |
\b | Backspace (BS) = \x08. Note: Allowed only in bracket expressions. |
\Q | Quotes all metacharacters between \Q and \E. Backslashes are regarded as normal characters. For example, " |
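The mappings in the table can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: `control_char` and `expand_classes` are hypothetical helpers, not part of the product, and Python is used purely to show the arithmetic and the explicit bracket expressions that the shorthands stand for.

```python
import re

# Explicit bracket expressions for the class shorthands, copied from the table.
CLASSES = {
    r"\w": "[A-Za-z0-9_]",
    r"\W": "[^A-Za-z0-9_]",
    r"\s": r"[ \t\r\n\f]",
    r"\S": r"[^ \t\r\n\f]",
    r"\d": "[0-9]",
    r"\D": "[^0-9]",
}


def control_char(letter: str) -> str:
    """Hypothetical helper: return the character produced by Ctrl+<letter>.

    Assumes the usual ASCII convention that Ctrl+A is 0x01, Ctrl+B is 0x02,
    and so on, which is the letter's code minus 0x40.
    """
    if len(letter) != 1 or not ("A" <= letter <= "Z"):
        raise ValueError("expected a single uppercase letter")
    return chr(ord(letter) - 0x40)


def expand_classes(pattern: str) -> str:
    """Hypothetical helper: rewrite class shorthands as bracket expressions.

    Deliberately naive: it does not handle escaped backslashes or
    shorthands that already sit inside a bracket expression.
    """
    return re.sub(r"\\[wWsSdD]", lambda m: CLASSES[m.group(0)], pattern)


# Hexadecimal codes are case-insensitive: both spellings name the same value.
same_hex_value = int("aa", 16) == int("AA", 16)

# An octal code such as 101 names the character with that code ("A").
octal_example = chr(int("101", 8))
```

For instance, `control_char("G")` yields the same character as \a (BEL), and `expand_classes(r"Content-Length: \d{5}")` yields `Content-Length: [0-9]{5}`.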
Example of using special character sequences
# This fingerprint matches HTTP content
# for which the length is >= 10000
# The situation context for this regular expression could be either
# "HTTP Request Header Line" or "HTTP Reply Header Line"
Content-Length: \d\d\d\d\d
# The regular expression could also be written as shown below
Content-Length: \d{5}
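As a rough check that the two forms of the fingerprint behave the same way, the sketch below tries both against a few sample header lines using Python's `re` module. Python's engine is only a stand-in here and is assumed, not confirmed, to agree with the product's matching on this simple pattern. The `re.ASCII` flag keeps \d limited to [0-9], as in the table, and `has_large_body` is a hypothetical helper name.

```python
import re

# The two equivalent spellings from the example above.
LONG_FORM = re.compile(r"Content-Length: \d\d\d\d\d", re.ASCII)
SHORT_FORM = re.compile(r"Content-Length: \d{5}", re.ASCII)


def has_large_body(header_line: str) -> bool:
    """Hypothetical helper: True if the header carries at least five digits.

    Five or more consecutive digits means a length of 10000 or more,
    assuming the value has no leading zeros.
    """
    return SHORT_FORM.search(header_line) is not None


samples = [
    "Content-Length: 9999",      # four digits: no match
    "Content-Length: 10000",     # exactly five digits: match
    "Content-Length: 1234567",   # more than five digits: first five match
]

# Both spellings agree on every sample line.
forms_agree = all(
    bool(LONG_FORM.search(line)) == bool(SHORT_FORM.search(line))
    for line in samples
)
```

Because the pattern is not anchored at the end, any value with more than five digits also matches, which is what makes the fingerprint mean "10000 or more" rather than "exactly five digits".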