Character | Description | Notes | |
---|---|---|---|
\a | Match a BELL, \u0007. | ||
\A | Match at the beginning of the input. Differs from ^ in that \A will not match after a new line within the input. | ||
\b, outside of a [Set] | Match if the current position is a word boundary. Boundaries occur at the transitions between word (\w) and non-word (\W) characters, with combining marks ignored. For better word boundaries, see ICU Boundary Analysis. | ||
\b, within a [Set] | Match a BACKSPACE, \u0008. | ||
\B | Match if the current position is not a word boundary. | ||
\cX | Match a control-X character. | ||
\d | Match any character with the Unicode General Category of Nd (Number, Decimal Digit). | ||
\D | Match any character that is not a decimal digit. | ||
\e | Match an ESCAPE, \u001B. | ||
\E | Terminates a \Q ... \E quoted sequence. | ||
\f | Match a FORM FEED, \u000C. | ||
\G | Match if the current position is at the end of the previous match. | ||
\n | Match a LINE FEED, \u000A. | ||
\N{UNICODE CHARACTER NAME} | Match the named character. | post 2.4 | |
\p{UNICODE PROPERTY NAME} | Match any character with the specified Unicode Property. | ||
\P{UNICODE PROPERTY NAME} | Match any character not having the specified Unicode Property. | ||
\Q | Quotes all following characters until \E. | ||
\r | Match a CARRIAGE RETURN, \u000D. | ||
\s | Match a white space character. White space is defined as [\t\n\f\r\p{Z}]. | ||
\S | Match a non-white space character. | ||
\t | Match a HORIZONTAL TABULATION, \u0009. | ||
\uhhhh | Match the character with the hex value hhhh. | ||
\Uhhhhhhhh | Match the character with the hex value hhhhhhhh. Exactly eight hex digits must be provided, even though the largest Unicode code point is \U0010ffff. | ||
\w | Match a word character. Word characters are [\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}]. | ||
\W | Match a non-word character. | ||
\x{hhhh} | Match the character with hex value hhhh. | post 2.4 | |
\xhh | Match the character with two-digit hex value hh. | post 2.4 | |
\X | Match a Grapheme Cluster. The ICU 2.4 implementation is partial and does not handle Hangul syllables. | ||
\Z | Match if the current position is at the end of input, but before the final line terminator, if one exists. | ||
\z | Match if the current position is at the end of input. | ||
\0nnn | Match the character with octal value nnn. | post 2.4 | |
\n | Back Reference. Match whatever the nth capturing group matched. n must be a number ≥ 1 and ≤ the total number of capture groups in the pattern. | post 2.4 | |
[pattern] | Match any one character from the set. See UnicodeSet for a full description of what may appear in the pattern. | ||
. | Match any character. | ||
^ | Match at the beginning of a line. | ||
$ | Match at the end of a line. | ||
\ | Quotes the following character. Characters that must be quoted to be treated as literals are * ? + [ ( ) { } ^ $ \| \ . / | ||
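
A few of the rows above can be tried out with a short sketch. It uses Python's standard `re` module as a stand-in, which is an assumption: it is not ICU, and the two engines differ. For example, `re` has no `\G`, `\Q ... \E`, `\X`, or `\p{...}`, its `\w` is not the set listed above, and its `\Z` matches only at the very end of the input, like `\z` in this table. The sample strings and variable names are illustrative only.

```python
import re

text = "first line\nsecond line"

# ^ (in multi-line mode) matches at the start of every line,
# while \A matches only at the beginning of the input.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)   # ['first', 'second']
input_start = re.findall(r"\A\w+", text, flags=re.MULTILINE)  # ['first']

# \b matches at transitions between word and non-word characters,
# so "cat" inside "concat" is skipped.
whole_words = re.findall(r"\bcat\b", "cat concat cat's")      # ['cat', 'cat']

# \d matches any decimal digit (General Category Nd), not only 0-9.
arabic_indic = re.fullmatch(r"\d+", "\u0663\u0664") is not None

# \uhhhh takes exactly four hex digits, \Uhhhhhhhh exactly eight.
bmp_char = re.fullmatch(r"\u00e9", "\u00e9") is not None
astral_char = re.fullmatch(r"\U0001F600", "\U0001F600") is not None

# Back reference: \1 must match the same text that group 1 captured.
doubled = re.search(r"(\w+) \1", "hello hello world")

# Quoting: metacharacters such as + and ? need a backslash to be literal.
# re.escape adds the backslashes (a Python helper, not part of ICU).
literal = "1+1=2?"
quoted_hit = re.search(re.escape(literal), "is 1+1=2? yes") is not None
unquoted_hit = re.search(literal, literal) is not None  # False: + and ? act as operators
```

The last two lines show why the quoting row matters: the unquoted pattern reads `+` and `?` as quantifiers, so it does not even match its own text.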