Character	Description	Notes

`\a`	Match a `BELL`, `\u0007`.

`\A`	Match at the beginning of the input. Differs from `^` in that `\A` will not match after a new line within the input.

`\b`, outside of a `[Set]`	Match if the current position is a word boundary. Boundaries occur at the transitions between word (`\w`) and non-word (`\W`) characters, with combining marks ignored. For better word boundaries, see ICU Boundary Analysis.

`\b`, within a `[Set]`	Match a `BACKSPACE`, `\u0008`.

`\B`	Match if the current position is not a word boundary.

`\cX`	Match a control-X character.

`\d`	Match any character with the Unicode General Category of `Nd` (Number, Decimal Digit).

`\D`	Match any character that is not a decimal digit.

`\e`	Match an `ESCAPE`, `\u001B`.

`\E`	Terminates a `\Q` ... `\E` quoted sequence.

`\f`	Match a `FORM FEED`, `\u000C`.

`\G`	Match if the current position is at the end of the previous match.

`\n`	Match a `LINE FEED`, `\u000A`.

`\N{UNICODE CHARACTER NAME}`	Match the named character.	post 2.4

`\p{UNICODE PROPERTY NAME}`	Match any character with the specified Unicode Property.

`\P{UNICODE PROPERTY NAME}`	Match any character not having the specified Unicode Property.

`\Q`	Quotes all following characters until `\E`.

`\r`	Match a `CARRIAGE RETURN`, `\u000D`.

`\s`	Match a white space character. White space is defined as `[\t\n\f\r\p{Z}]`.

`\S`	Match a non-white space character.

`\t`	Match a `HORIZONTAL TABULATION`, `\u0009`.

`\uhhhh`	Match the character with the hex value `hhhh`.

`\Uhhhhhhhh`	Match the character with the hex value `hhhhhhhh`. Exactly eight hex digits must be provided, even though the largest Unicode code point is `\U0010ffff`.

`\w`	Match a word character. Word characters are `[\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}]`.

`\W`	Match a non-word character.

`\x{hhhh}`	Match the character with hex value `hhhh`.	post 2.4

`\xhh`	Match the character with two digit hex value `hh`.	post 2.4

`\X`	Match a Grapheme Cluster. Partial implementation for ICU 2.4; does not handle Hangul syllables.

`\Z`	Match if the current position is at the end of input, but before the final line terminator, if one exists.

`\z`	Match if the current position is at the end of input.

`\0nnn`	Match the character with octal value `nnn`.	post 2.4

`\n`	Back Reference. Match whatever the nth capturing group matched. `n` must be a number ≥ 1 and ≤ the total number of capture groups in the pattern.	post 2.4

`[pattern]`	Match any one character from the set. See UnicodeSet for a full description of what may appear in the pattern.

`.`	Match any character.

`^`	Match at the beginning of a line.

`$`	Match at the end of a line.

`\`	Quotes the following character. Characters that must be quoted to be treated as literals are `* ? + [ ( ) { } ^ $ | \ . /`
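
A few of the entries in the table above can be tried out in a short sketch. This uses Python's standard `re` module as a stand-in, not ICU itself, on the assumption that `\A`, `^`, `\b`, `\B`, `\d`, back references and backslash quoting behave as the table describes; the sample strings and variable names are illustrative only. The end-of-input anchors are left out because Python's `\Z` does not follow the definition given here.

```python
import re

text = "alpha beta\ngamma delta"

# In multi-line mode, ^ matches at the start of every line,
# while \A matches only at the very start of the input.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)
input_start = re.findall(r"\A\w+", text, flags=re.MULTILINE)

# \b matches at a transition between word and non-word characters,
# so "cat" is found only where it stands as a whole word.
whole_words = re.findall(r"\bcat\b", "cat concat cat.")

# \B is the complement: "cat" must start in the middle of a word.
inside_word = re.findall(r"\Bcat", "cat concat")

# \d matches any decimal digit (General Category Nd), not only 0-9.
# U+0663 is ARABIC-INDIC DIGIT THREE.
digits = re.findall(r"\d", "a1\u0663b")

# \1 is a back reference to whatever the first capturing group matched.
repeated = re.search(r"(\w+) \1", "it is is fine").group(1)

# A backslash quotes the following metacharacter so it matches literally.
literal_plus = re.findall(r"1\+1", "1+1=2")

print(line_starts, input_start, whole_words, inside_word, digits, repeated, literal_plus)
```

An ICU-based program would compile the same patterns through ICU's own regular expression API; only the pattern syntax is the subject of this sketch.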
