Flags control certain aspects of the matching behavior of a pattern.
Syntax
The following flags are supported:
i - Ignore Case. Matches character classes using a case-insensitive comparison.
m - Multiline. Causes the anchors ^ and $ to match the start and end of each line (respectively), rather than the start and end of the input.
n - Explicit captures. Regular Capturing Groups are not captured; only Named Capturing Groups are captured.
s - Singleline. Causes the wildcard . to match newline characters.
x - Extended Mode. Ignores whitespace in a pattern. Spaces must instead be represented by \s or by \ followed by a space (an escaped space).
xx - "Extended More" Mode. Same as x, but unescaped spaces and horizontal tab characters are also ignored inside character classes.
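The effect of the i, m, s, and x flags can be sketched with Python's re module. This is an assumption for illustration only: re is a different engine from PCRE, but it accepts the same inline spelling for these four flags (it has no n or xx flag). The variable names are hypothetical.

```python
import re

text = "Hello\nworld"

# i: case-insensitive comparison
ignore_case = re.search(r"(?i)hello", text) is not None

# Without m, ^ only matches at the start of the input
first_line_only = re.findall(r"^\w+", text)

# m: ^ matches at the start of every line
every_line = re.findall(r"(?m)^\w+", text)

# s: the wildcard . also matches the newline between the two words
dot_matches_newline = re.fullmatch(r"(?s)Hello.world", text) is not None

# x: unescaped whitespace in the pattern is ignored; "\ " is a literal space
extended = re.fullmatch(r"(?x) a \  b ", "a b") is not None
```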
Anchors match the start or end of a line.
Syntax
^ - Matches the start of a line when the m (multiline) flag is set. Otherwise, matches the start of the input.
$ - Matches the end of a line when the m (multiline) flag is set. Otherwise, matches the end of the input.
A Buffer Boundary is an Atom that matches the start or the end of the input. This differs slightly from ^ and $, which can be affected by RegExp flags like m.
Syntax
\A - Matches the start of the input.
\z - Matches the end of the input.
\Z - A zero-width assertion consisting of an optional newline at the end of the buffer. Equivalent to (?=\n?\z).
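The difference between ^ and \A under the m flag can be sketched as follows, assuming Python's re module as a stand-in engine. Only \A is shown: Python's \Z does not have the PCRE meaning described above, and \z is not assumed to be available.

```python
import re

text = "one\ntwo"

# With m set, ^ matches at the start of each line
caret_hits = re.findall(r"(?m)^\w+", text)

# \A is unaffected by m and matches only at the start of the input
buffer_hits = re.findall(r"(?m)\A\w+", text)
```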
A Word Boundary is an Atom that matches the start or the end of a word.
Syntax
\b - Matches the start or the end of a word.
\B - Matches when not at the start or the end of a word.
[[:<:]] - Matches the start of a word. Equivalent to: \b(?=\w)
[[:>:]] - Matches the end of a word. Equivalent to: \b(?<=\w)
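A short sketch of \b and \B, assuming Python's re module (which shares these two escapes with PCRE but does not support the [[:<:]] and [[:>:]] forms). The sample text is illustrative.

```python
import re

text = "cat concat catalog"

# \bcat\b: "cat" as a whole word only
whole_word = [m.start() for m in re.finditer(r"\bcat\b", text)]

# \bcat: "cat" at the start of a word
word_start = [m.start() for m in re.finditer(r"\bcat", text)]

# \Bcat: "cat" that does not begin at a word boundary (inside "concat")
inside_word = [m.start() for m in re.finditer(r"\Bcat", text)]
```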
See Also: Feature: Continuation Escape
❌ This feature is not supported.
A Continuation Escape is a zero-width assertion that matches either the start of the input or the position where the previous match ended (the starting offset of the current match attempt).
Syntax
\G - Matches either the start of the input or the position where the previous match ended.
An Alternative represents two or more branches in a pattern. If the first branch fails to match, each remaining alternative is attempted from left to right until a match is found.
Syntax
…|… - Matches the pattern to the left of the |. If that fails, matches the pattern to the right of the |.
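Because alternatives are tried from left to right, the order of the branches can change what is matched. A minimal sketch, assuming Python's re module as the engine:

```python
import re

# The first branch that succeeds wins, even if a later one would match more
short_first = re.search(r"cat|category", "category").group()

# Putting the longer branch first lets it match
long_first = re.search(r"category|cat", "category").group()
```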
A Wildcard matches a single, non-newline character.
Syntax
. - Matches any character except newline characters. If the s (single-line) flag is set, this matches any character.
A Character Class is an Atom that specifies a set of characters and matches a single character in the set.
Syntax
[…] - Where … is one or more single characters or character class escapes, excluding ^ at the start and - between two entries in the set. Matches a character in the set. Example: [abc] matches a, b, or c.
[^…] - Where … is one or more single characters or character class escapes, excluding - between two entries in the set. Matches any character not in the set. Example: [^abc] matches d, e, or f, etc., but not a, b, or c.
[a-z] - Where a and z are single characters or character escapes. Matches any character in the range between a and z (inclusive). Example: [a-c] matches a, b, or c, but not d.
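The three examples above can be checked with a short sketch, assuming Python's re module, which uses the same character class syntax:

```python
import re

# [abc]: any one of a, b, c
in_set = re.findall(r"[abc]", "abcdef")

# [^abc]: any character other than a, b, c
not_in_set = re.findall(r"[^abc]", "abcdef")

# [a-c]: any character in the inclusive range a..c
in_range = re.findall(r"[a-c]", "abcd")
```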
A Posix Character Class is a member of a Character Class set that specifies a named, pre-defined set of characters.
Syntax
[[:name:]] - Where name is in a set of predefined names. Matches any character in the set.
A Negated Posix Character Class is a member of a Character Class set that specifies a named, pre-defined set of excluded characters.
Syntax
[[:^name:]] - Where name is in a set of predefined names. Matches any character not in the set.
❌ This feature is not supported.
❌ This feature is not supported.
A Character Class Escape is a single character escape that represents an entire character class. It can be used as an element of a Character Class or as an Atom. Often the lower-case escape is the inclusive set, while the upper-case variant of the same character excludes that set.
Syntax
\d - A decimal digit character in the range 0-9. Equivalent to [0-9]. With Unicode property support enabled (the UCP option), equivalent to \p{Nd} instead.
\D - Any character not in the range 0-9. Equivalent to [^0-9]. With the UCP option, equivalent to \P{Nd} instead.
\w - Any "word" character. Equivalent to [a-zA-Z0-9_]. With the UCP option, equivalent to [\p{L}\p{N}_] instead.
\W - Any non-"word" character. Equivalent to [^a-zA-Z0-9_]. With the UCP option, equivalent to [^\p{L}\p{N}_] instead.
\s - Any whitespace character. Equivalent to [\x09-\x0d\x20], but may depend on locale. With the UCP option, equivalent to [\p{Z}\h\v] instead (where \h and \v are defined below).
\S - Any non-whitespace character. With the UCP option, equivalent to [^\p{Z}\h\v] instead (where \h and \v are defined below).
\h - Any horizontal whitespace character. Equivalent to [\x09\x20\xa0\u{1680}\u{180e}\u{2000}-\u{200a}\u{202f}\u{205f}\u{3000}].
\H - Any non-horizontal whitespace character. Equivalent to [^\x09\x20\xa0\u{1680}\u{180e}\u{2000}-\u{200a}\u{202f}\u{205f}\u{3000}].
\v - Any vertical whitespace character. Equivalent to [\x0a-\x0d\x85\u{2028}\u{2029}].
\V - Any non-vertical whitespace character. Equivalent to [^\x0a-\x0d\x85\u{2028}\u{2029}].
\N - Any character that is not a newline. Similar to ., but is not affected by the s RegExp flag.
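The distinction between the ASCII-only and Unicode-aware meanings of \d can be sketched in Python, with the caveat that this is only an analogy: Python's re module treats \d as Unicode-aware by default for str patterns and uses the (?a) flag to restrict it to [0-9], which is the reverse of the PCRE default described above.

```python
import re

# "1", ARABIC-INDIC DIGIT TWO (U+0662), "3"
text = "1\u06623"

# Unicode-aware \d (Python's default for str patterns) matches all three
unicode_digits = re.findall(r"\d", text)

# ASCII-only \d, i.e. [0-9], skips the Arabic-Indic digit
ascii_digits = re.findall(r"(?a)\d", text)

# \s+ matches a run of whitespace characters
fields = re.split(r"\s+", "a \t\nb")
```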
A Line Endings Escape is an Atom that matches any line ending character sequence.
Syntax
\R - Equivalent to (?>\r\n?|[\x0A-\x0C\x85\u{2028}\u{2029}])
A Character Property Escape is an escape sequence used to match a character with a specific character property.
Syntax
\pX - Where X is a single character. Matches a character that has the property X.
\p{name} - Where name is a predefined property name. Matches a character that has the property name.
\PX - Where X is a single character. Matches a character that does not have the property X.
\P{name} - Where name is a predefined property name. Matches a character that does not have the property name.
❌ This feature is not supported.
❌ This feature is not supported.
❌ This feature is not supported.
❌ This feature is not supported.
❌ This feature is not supported.
❌ This feature is not supported.
Quoted Characters are a sequence of characters treated as literal characters rather than RegExp characters.
Syntax
\Q…\E - All characters following \Q and preceding the next \E are treated as literal characters. Example: \Q.+\E matches .+ but not aa.
\Q… - If there is no trailing \E, all characters until the end of the pattern are treated as literal characters.
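Python's re module does not support \Q…\E, so the sketch below swaps in re.escape, which produces a pattern that matches its argument literally. It is offered only as an analogy to the \Q.+\E example above, not as the same feature.

```python
import re

# re.escape(".+") yields a pattern that matches the two characters ".+"
literal = re.escape(".+")

matches_literal = re.fullmatch(literal, ".+") is not None
matches_aa = re.fullmatch(literal, "aa") is not None

# For contrast: unquoted, .+ is a wildcard with a quantifier
unquoted_matches_aa = re.fullmatch(r".+", "aa") is not None
```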
Quantifiers specify repetition of an Atom. By default, quantifiers are "greedy" in that they attempt to match as many instances of the preceding Atom as possible to satisfy the pattern before backtracking.
Syntax
* - Matches the preceding Atom zero or more times. Example: a*b matches b, ab, aab, aaab, etc.
+ - Matches the preceding Atom one or more times. Example: a+b matches ab, aab, aaab, etc., but not b.
? - Matches the preceding Atom zero or one times. Example: a?b matches b and ab.
{n} - Where n is an integer. Matches the preceding Atom exactly n times. Example: a{2} matches aa but not a or aaa.
{n,} - Where n is an integer. Matches the preceding Atom at least n times. Example: a{2,} matches aa, aaa, aaaa, etc., but not a.
{n,m} - Where n and m are integers, and m >= n. Matches the preceding Atom at least n times and at most m times. Example: a{2,3} matches aa and aaa, but not a or aaaa.
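The quantifier examples can be verified with a small sketch, assuming Python's re module (which shares this quantifier syntax). The helper name matches is hypothetical; it uses fullmatch so that the whole candidate string must be consumed.

```python
import re

def matches(pattern: str, candidate: str) -> bool:
    """Return True if the whole candidate string matches the pattern."""
    return re.fullmatch(pattern, candidate) is not None

candidates = ["a", "aa", "aaa", "aaaa"]

exactly_two = [s for s in candidates if matches(r"a{2}", s)]
at_least_two = [s for s in candidates if matches(r"a{2,}", s)]
two_to_three = [s for s in candidates if matches(r"a{2,3}", s)]

star_allows_zero = matches(r"a*b", "b")
plus_requires_one = matches(r"a+b", "b")
```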
Lazy Quantifiers specify repetition of an Atom, but attempt to match as few instances of the preceding Atom as possible to satisfy the pattern before advancing.
Syntax
*? - Matches the preceding Atom zero or more times.
+? - Matches the preceding Atom one or more times.
?? - Matches the preceding Atom zero or one times.
{n}? - Where n is an integer. Matches the preceding Atom exactly n times.
{n,}? - Where n is an integer. Matches the preceding Atom at least n times.
{n,m}? - Where n and m are integers, and m >= n. Matches the preceding Atom at least n times and at most m times.
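The contrast between greedy and lazy repetition is easiest to see on input with more than one possible end point. A minimal sketch, assuming Python's re module and an illustrative input string:

```python
import re

text = "<a><b>"

# Greedy: .+ takes as much as possible, so the match runs to the last >
greedy = re.search(r"<.+>", text).group()

# Lazy: .+? takes as little as possible, so the match stops at the first >
lazy = re.search(r"<.+?>", text).group()
```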
Possessive Quantifiers are like greedy (i.e., regular) quantifiers, except that backtracking is not performed.
Syntax
*+ - Matches the preceding Atom zero or more times without backtracking.
++ - Matches the preceding Atom one or more times without backtracking.
?+ - Matches the preceding Atom zero or one times without backtracking.
{n,}+ - Where n is an integer. Matches the preceding Atom at least n times without backtracking.
{n,m}+ - Where n and m are integers, and m >= n. Matches the preceding Atom at least n times and at most m times without backtracking.
A Capturing Group is a subexpression that can be treated as an Atom and can be repeated using Quantifiers and referenced using Backreferences by index. A Capturing Group can be captured and returned by the matching algorithm.
Syntax
(…) - Groups the subexpression as a single Atom. The result is captured and returned by the matching algorithm.
A Named Capturing Group is a subexpression that can be captured and returned by the matching algorithm. A Named Capturing Group is also an Atom and can be repeated using Quantifiers and referenced using Backreferences by name.
Syntax
(?<name>…) - Groups the subexpression as a single Atom associated with the provided name. The result is captured and returned by the matching algorithm.
(?'name'…) - Groups the subexpression as a single Atom associated with the provided name. The result is captured and returned by the matching algorithm.
(?P<name>…) - Groups the subexpression as a single Atom associated with the provided name. The result is captured and returned by the matching algorithm.
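A sketch of numbered and named captures, assuming Python's re module. Of the three named-group spellings listed above, only (?P<name>…) is used here because it is the one Python is assumed to accept in all versions. The group names year and month are illustrative.

```python
import re

m = re.fullmatch(r"(?P<year>\d{4})-(?P<month>\d{2})", "2024-05")

# Named groups are returned by name...
year = m.group("year")
by_name = m.groupdict()

# ...and are still numbered like regular capturing groups
first_group = m.group(1)
```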
A Non-capturing Group is a subexpression that can be treated as an Atom and can be repeated using Quantifiers but cannot be referenced using Backreferences. A Non-capturing Group is not captured by the matching algorithm.
Syntax
(?:…) - Groups the subexpression as a single Atom.
Backreferences allow a pattern to re-match a previously matched capture group either by number (n) or by name.
Syntax
\n - Where n is an integer >= 1. Matches the same string as the capture group n.
\gn - Where n is an integer >= 1. Matches the same string as the capture group n.
\g-n - Where n is an integer >= 1. Matches the nth previous capture group.
\g+n - Where n is an integer >= 1. Matches the nth next capture group.
\g{n} - Where n is an integer >= 1. Matches the same string as the capture group n.
\g{-n} - Where n is an integer >= 1. Matches the nth previous capture group.
\g{+n} - Where n is an integer >= 1. Matches the nth next capture group.
\g{name} - Matches the named capture group with the name name.
\k{name} - Matches the named capture group with the name name.
\k<name> - Matches the named capture group with the name name.
\k'name' - Matches the named capture group with the name name.
(?P=name) - Matches the named capture group with the name name.
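A sketch of backreferences by number and by name, assuming Python's re module, which accepts the \n and (?P=name) forms from the list above (the \g and \k forms are not used here). The group name q is illustrative.

```python
import re

# \1 must match the same text that group 1 captured: a doubled letter
doubled = re.search(r"(\w)\1", "hello").group()

# (?P=q) must match the same quote character that opened the string
quoted = re.search(r"""(?P<q>['"]).*?(?P=q)""", 'say "hi" now').group()

# A mismatched closing quote does not satisfy the backreference
mismatched = re.search(r"""(?P<q>['"]).*?(?P=q)""", "say \"hi' now")
```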
A Comment is a sequence of characters that is ignored by pattern matching and can be used to document a pattern.
Syntax
(?#…) - The entire expression is removed from the pattern. A comment may not contain other ( or ) characters.
A Line Comment is a sequence of characters starting with # and ending with \n (or the end of the pattern) that is ignored by pattern matching and can be used to document a pattern.
Syntax
#…\n - The rest of the line is removed from the pattern. Only supported when either the x (extended mode) or xx (extended more mode) RegExp flag is set.
Modifiers allow you to change the currently active RegExp flags within a subexpression.
Syntax
(?imnsxx-imnsxx) - Sets or unsets (using -) the specified RegExp flags starting at the current position until the next closing ) or the end of the pattern. Example: (?-i)A(?i)B(?-i)C matches ABC and AbC.
(?imnsxx-imnsxx:…) - Sets or unsets (using -) the specified RegExp flags for the subexpression. Example: (?-i:A(?i:B)C) matches ABC and AbC.
(?^) - Unsets all RegExp flags.
(?^imnsxx) - Unsets all RegExp flags and sets the requested flags.
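The scoped form can be sketched with Python's re module, which is assumed here to accept (?i:…) for a subexpression; the unscoped mid-pattern form and (?^) are not used because Python does not support them in the same way. The pattern is a simplified variant of the example above.

```python
import re

# Only B is matched case-insensitively; A and C remain case-sensitive
scoped = re.compile(r"A(?i:B)C")

upper_b = scoped.fullmatch("ABC") is not None
lower_b = scoped.fullmatch("AbC") is not None
lower_a = scoped.fullmatch("aBC") is not None
```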
A Branch Reset resets the subexpression count at the start of each Alternative (|), which affects numbering for Backreferences and captured results returned from the matching algorithm.
Syntax
(?|…) - Resets the subexpression count at the start of each Alternative.
A Lookahead is a zero-width assertion that matches if the provided pattern would match the characters to the right of the current position.
Syntax
(?=…) - Positive Lookahead. Matches if the provided pattern would match, but does not advance the current position.
(?!…) - Negative Lookahead. Matches if the provided pattern would not match, but does not advance the current position.
A Lookbehind is a zero-width assertion that matches if the provided pattern would match the characters to the left of the current position.
Syntax
(?<=…) - Positive Lookbehind. Matches if the provided pattern would match the preceding characters, but does not advance the current position. The pattern must have a fixed length (unbounded quantifiers are not permitted).
(?<!…) - Negative Lookbehind. Matches if the provided pattern would not match the preceding characters, but does not advance the current position. The pattern must have a fixed length (unbounded quantifiers are not permitted).
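Lookahead and lookbehind assertions test the text around the current position without consuming it. A minimal sketch of all four forms, assuming Python's re module and illustrative sample strings:

```python
import re

prices = "10 USD, 20 EUR"

# Positive lookahead: digits that are followed by " USD"
followed_by_usd = re.findall(r"\d+(?= USD)", prices)

# Negative lookahead: whole numbers that are not followed by " USD"
not_followed_by_usd = re.findall(r"\b\d+\b(?! USD)", prices)

amounts = "$5 and 7"

# Positive lookbehind: digits that are preceded by "$"
after_dollar = re.findall(r"(?<=\$)\d+", amounts)

# Negative lookbehind: numbers that are not preceded by "$"
not_after_dollar = re.findall(r"(?<!\$)\b\d+", amounts)
```

In each case the asserted text (" USD" or "$") is not part of the returned match.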
A Non-Backtracking Expression is matched independently of neighboring patterns: once it has matched, the engine will not backtrack into it if a later part of the pattern fails. This is often used to improve performance.
Syntax
(?>…) - Matches the provided pattern, but no backtracking into it is performed if the rest of the match fails.
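The effect of refusing to backtrack can be sketched in Python. As an assumption made for portability, the sketch does not use (?>…) itself; it swaps in a well-known emulation, a lookahead that captures followed by a backreference, which behaves like (?>a+) because a lookahead is not re-entered once it has succeeded.

```python
import re

text = "aaab"

# Ordinary greedy a+ gives back one "a" so that the trailing "ab" can match
with_backtracking = re.search(r"a+ab", text)

# Emulation of (?>a+)ab: the captured run of a's is fixed once matched,
# so no "a" is left over for the trailing "ab" and the search fails
without_backtracking = re.search(r"(?=(a+))\1ab", text)
```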
A Recursive Expression provides a mechanism for re-evaluating a capture group inside of itself, to handle cases such as matching balanced parentheses or brackets.
Syntax
(?R) - Re-evaluates the entire pattern starting at the current position.
(?0) - Re-evaluates the entire pattern starting at the current position.
(?n) - Where n is an integer >= 1. Re-evaluates the capture group whose offset is n.
(?-n) - Where n is an integer >= 1. Re-evaluates the capture group whose offset is the nth capture group declared to the left of the current Atom. Example: (?-1) would re-evaluate the last declared capture group.
(?+n) - Where n is an integer >= 1. Re-evaluates the capture group whose offset is the nth capture group declared to the right of the current Atom. Example: (?+1) would evaluate the next declared capture group.
(?&name) - Re-evaluates the named capture group with the provided name.
(?P>name) - Re-evaluates the named capture group with the provided name.
A Conditional Expression checks a condition and evaluates its first alternative if the condition is true; otherwise, it evaluates its second alternative.
Syntax
(?(condition)yes-pattern|no-pattern) - Matches yes-pattern if condition is true; otherwise, matches no-pattern.
(?(condition)yes-pattern) - Matches yes-pattern if condition is true; otherwise, matches the empty string.
The following conditions are supported:
(?(?=test-pattern)…) - Evaluates to true if a lookahead for test-pattern matches; otherwise, evaluates to false.
(?(?!test-pattern)…) - Evaluates to true if a negative lookahead for test-pattern matches; otherwise, evaluates to false.
(?(n)…) - Evaluates to true if the capture group at offset n was successfully matched; otherwise, evaluates to false.
(?(<name>)…) - Evaluates to true if the named capture group with the name name was successfully matched; otherwise, evaluates to false.
(?('name')…) - Evaluates to true if the named capture group with the name name was successfully matched; otherwise, evaluates to false.
(?(R)…) - Evaluates to true if inside a recursive expression; otherwise, evaluates to false.
(?(Rn)…) - Evaluates to true if inside a recursive expression for the capture group at offset n; otherwise, evaluates to false.
(?(R&name)…) - Evaluates to true if inside a recursive expression for the named capture group with the name name; otherwise, evaluates to false.
(?(DEFINE)…) - Always evaluates to false. This allows you to define Subroutines.
(?(VERSION=version)…) - Evaluates to true if the PCRE version is equal to the supplied version; otherwise, evaluates to false.
(?(VERSION>=version)…) - Evaluates to true if the PCRE version is greater than or equal to the supplied version; otherwise, evaluates to false.
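A sketch of the (?(n)…) condition, assuming Python's re module, which accepts conditions on a group number or name but none of the other condition types listed above. The pattern requires a closing > only when an opening < was captured.

```python
import re

# Group 1 is the optional "<"; the conditional demands ">" only if it matched
tag = re.compile(r"(<)?\w+(?(1)>)")

both_brackets = tag.fullmatch("<tag>") is not None
no_brackets = tag.fullmatch("tag") is not None
missing_close = tag.fullmatch("<tag") is not None
stray_close = tag.fullmatch("tag>") is not None
```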
A Subroutine is a pre-defined capture group or named capture group that can be reused in multiple places within the pattern. These capture groups can optionally be placed in a DEFINE condition.
Syntax
(?(DEFINE)…) - Defines a set of reusable capture groups that can be referenced elsewhere in the pattern.
(?n) - Where n is an integer >= 1. Evaluates the capture group whose offset is n.
(?-n) - Where n is an integer >= 1. Evaluates the capture group whose offset is the nth capture group declared to the left of the current Atom. Example: (?-1) would re-evaluate the last declared capture group.
(?+n) - Where n is an integer >= 1. Evaluates the capture group whose offset is the nth capture group declared to the right of the current Atom. Example: (?+1) would evaluate the next declared capture group.
(?&name) - Evaluates the named capture group with the provided name.
\g<name> - Evaluates the named capture group with the provided name.
Example (written across several lines, so it assumes the x flag):
(?(DEFINE)
(?<Year>\d{4}|[+-]\d{5,})
(?<Month>0[1-9]|1[0-2])
(?<Day>0[1-9]|[12][0-9]|3[01])
)
(?<Date>(?&Year)-(?&Month)-(?&Day)|(?&Year)(?&Month)(?&Day))
Feature: Callouts
A Callout is a user-defined function that can be evaluated while matching.
Syntax
(?C) - Invokes the user-defined function with the argument 0.
(?Cn) - Where n is an integer. Invokes the user-defined function with the argument n.
(?C`arg`) - Where arg is a string. To include a ` in arg, double it (``). Invokes the user-defined function with the argument arg.
(?C'arg') - Where arg is a string. To include a ' in arg, double it (''). Invokes the user-defined function with the argument arg.
(?C"arg") - Where arg is a string. To include a " in arg, double it (""). Invokes the user-defined function with the argument arg.
(?C^arg^) - Where arg is a string. To include a ^ in arg, double it (^^). Invokes the user-defined function with the argument arg.
(?C%arg%) - Where arg is a string. To include a % in arg, double it (%%). Invokes the user-defined function with the argument arg.
(?C#arg#) - Where arg is a string. To include a # in arg, double it (##). Invokes the user-defined function with the argument arg.
(?C$arg$) - Where arg is a string. To include a $ in arg, double it ($$). Invokes the user-defined function with the argument arg.
(?C{arg}) - Where arg is a string. To include a } in arg, double it (}}). Invokes the user-defined function with the argument arg.
A Backtracking Control Verb is a special pattern, usually in the form of (*VERB) or (*VERB:arg), that performs some special behavior with respect to backtracking.
Syntax
(*PRUNE), (*PRUNE:name) - Prunes the backtracking tree.
(*SKIP), (*SKIP:name) - Prunes the backtracking tree; in addition, the preceding text cannot be part of any match of the pattern.
(*MARK:name), (*:name) - Marks a point in the string where a certain part of the pattern has been matched.
(*THEN), (*THEN:name) - When backtracked into on failure, causes the engine to attempt the next alternative in the innermost enclosing group with alternatives.
(*COMMIT), (*COMMIT:arg) - When backtracked into on failure, causes the match to fail outright.
(*FAIL), (*F), (*FAIL:arg) - Matches nothing and always fails. Equivalent to (?!).
(*ACCEPT), (*ACCEPT:arg) - Ends matching successfully at the point where the verb is encountered.