Meta Characters
Character | Description |
---|---|
\a | Match a BELL, \u0007. |
\A | Match at the beginning of the input. Differs from ^ in that \A will not match after a new line within the input. |
\b, outside of a [Set] | Match if the current position is a word boundary. Boundaries occur at the transitions between word (\w) and non-word (\W) characters, with combining marks ignored. For better word boundaries, see ICU Boundary Analysis. |
\b, within a [Set] | Match a BACKSPACE, \u0008. |
\B | Match if the current position is not a word boundary. |
\cX | Match a control-X character. |
\d | Match any character with the Unicode General Category of Nd (Number, Decimal Digit). |
\D | Match any character that is not a decimal digit. |
\e | Match an ESCAPE, \u001B. |
\E | Terminates a \Q ... \E quoted sequence. |
\f | Match a FORM FEED, \u000C. |
\G | Match if the current position is at the end of the previous match. |
\n | Match a LINE FEED, \u000A. |
\N{UNICODE CHARACTER NAME} | Match the named character. |
\p{UNICODE PROPERTY NAME} | Match any character with the specified Unicode Property. |
\P{UNICODE PROPERTY NAME} | Match any character not having the specified Unicode Property. |
\Q | Quotes all following characters until \E. |
\r | Match a CARRIAGE RETURN, \u000D. |
\s | Match a white space character. White space is defined as [\t\n\f\r\p{Z}]. |
\S | Match a non-white space character. |
\t | Match a HORIZONTAL TABULATION, \u0009. |
\uhhhh | Match the character with the hex value hhhh. |
\Uhhhhhhhh | Match the character with the hex value hhhhhhhh. Exactly eight hex digits must be provided, even though the largest Unicode code point is \U0010ffff. |
\w | Match a word character. Word characters are [\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}]. |
\W | Match a non-word character. |
\x{hhhh} | Match the character with hex value hhhh. |
\xhh | Match the character with two digit hex value hh. |
\X | Match a Grapheme Cluster. The implementation in ICU 2.4 is partial and does not handle Hangul syllables. |
\Z | Match if the current position is at the end of input, but before the final line terminator, if one exists. |
\z | Match if the current position is at the end of input. |
\0nnn | Match the character with octal value nnn. |
\n | Back Reference. Match whatever the nth capturing group matched. Here n is a decimal number (for example \1), not the letter n; it must be ≥ 1 and ≤ the total number of capture groups in the pattern. |
[pattern] | Match any one character from the set. See UnicodeSet for a full description of what may appear in the pattern. |
. | Match any character. |
^ | Match at the beginning of a line. |
$ | Match at the end of a line. |
\ | Quotes the following character. Characters that must be quoted to be treated as literals are * ? + [ ( ) { } ^ $ . \ and the vertical bar. |
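
The differences between ^ and \A, between \Z and \z, and the effect of \Q ... \E are easier to see in a small experiment. The sketch below is only an illustration: it uses Java's java.util.regex rather than ICU itself, on the assumption that these particular metacharacters behave the same way in both engines. The class name `AnchorDemo`, its helper methods, and the sample strings are hypothetical, and `Pattern.MULTILINE` is Java's flag for making ^ match at the start of each line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnchorDemo {
    // Count how many times the regex matches within the input.
    public static int countMatches(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Return the start offset of the first match, or -1 if there is none.
    public static int firstMatchStart(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        String text = "one\ntwo\n";

        // ^ in multi-line mode matches at the start of every line.
        System.out.println(countMatches("^\\w+", Pattern.MULTILINE, text));    // 2
        // \A matches only at the very beginning of the input.
        System.out.println(countMatches("\\A\\w+", Pattern.MULTILINE, text));  // 1

        // \Z matches before the final line terminator; \z only at the true end.
        System.out.println(firstMatchStart("two\\Z", text));  // 4
        System.out.println(firstMatchStart("two\\z", text));  // -1

        // \Q ... \E quotes the + so that it matches literally.
        System.out.println(firstMatchStart("\\Q1+1\\E", "2*3 1+1"));  // 4
    }
}
```

Note that each backslash is doubled only because the patterns are written as Java string literals; the regular expressions themselves are `^\w+`, `\A\w+`, `two\Z`, `two\z`, and `\Q1+1\E`.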