pub struct Builder { }
Available on crate feature meta only.
A builder for configuring and constructing a Regex.
The builder permits configuring two different aspects of a Regex:
- Builder::configure will set high-level configuration options as described by a Config.
- Builder::syntax will set the syntax level configuration options as described by a util::syntax::Config. This only applies when building a Regex from pattern strings.
Once configured, the builder can then be used to construct a Regex from one of 4 different inputs:
- Builder::build creates a regex from a single pattern string.
- Builder::build_many creates a regex from many pattern strings.
- Builder::build_from_hir creates a regex from a regex_syntax::hir::Hir expression.
- Builder::build_many_from_hir creates a regex from many regex_syntax::hir::Hir expressions.
The latter two methods in particular provide a way to construct a fully featured regular expression matcher directly from an Hir expression without having to first convert it to a string. (This is in contrast to the top-level regex crate, which intentionally provides no such API in order to avoid making regex-syntax a public dependency.)
As a convenience, this builder may be created via Regex::builder, which may help avoid an extra import.
§Example: change the line terminator
This example shows how to enable multi-line mode by default and change the line terminator to the NUL byte:
use regex_automata::{meta::Regex, util::syntax, Match};

let re = Regex::builder()
    .syntax(syntax::Config::new().multi_line(true))
    .configure(Regex::config().line_terminator(b'\x00'))
    .build(r"^foo$")?;
let hay = "\x00foo\x00";
assert_eq!(Some(Match::must(0, 1..4)), re.find(hay));

Ok::<(), Box<dyn std::error::Error>>(())
§Example: disable UTF-8 requirement
By default, regex patterns are required to match UTF-8. This includes regex patterns that can produce matches of length zero. In the case of an empty match, by default, matches will not appear between the code units of a UTF-8 encoded codepoint.
However, it can be useful to disable this requirement, particularly if you’re searching things like &[u8] that are not known to be valid UTF-8.
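use regex_automata::{meta::Regex, util::syntax, Match};

let mut builder = Regex::builder();
// Disable the UTF-8 requirement for the syntax...
builder.syntax(syntax::Config::new().utf8(false));
// ...and permit empty matches to split a codepoint.
builder.configure(Regex::config().utf8_empty(false));

// The pattern may now match bytes that are not valid UTF-8.
let re = builder.build(r"(?-u:\xFF)foo(?-u:\xFF)")?;
let hay = b"\xFFfoo\xFF";
assert_eq!(Some(Match::must(0, 0..5)), re.find(hay));

// Empty matches may now fall inside the 3-byte encoding of '☃'.
let re = builder.build(r"")?;
let hay = "☃";
assert_eq!(re.find_iter(hay).collect::<Vec<Match>>(), vec![
    Match::must(0, 0..0),
    Match::must(0, 1..1),
    Match::must(0, 2..2),
    Match::must(0, 3..3),
]);

Ok::<(), Box<dyn std::error::Error>>(())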
impl Builder

pub fn new() -> Builder
Creates a new builder for configuring and constructing a Regex.

pub fn build(&self, pattern: &str) -> Result<Regex, BuildError>
Builds a Regex from a single pattern string.
If there was a problem parsing the pattern or a problem turning it into a regex matcher, then an error is returned.
§Example
This example shows how to configure syntax options.
use regex_automata::{meta::Regex, util::syntax, Match};

let re = Regex::builder()
    .syntax(syntax::Config::new().crlf(true).multi_line(true))
    .build(r"^foo$")?;
let hay = "\r\nfoo\r\n";
assert_eq!(Some(Match::must(0, 2..5)), re.find(hay));

Ok::<(), Box<dyn std::error::Error>>(())
pub fn build_many<P: AsRef<str>>(&self, patterns: &[P]) -> Result<Regex, BuildError>
Builds a Regex from many pattern strings.
If there was a problem parsing any of the patterns or a problem turning them into a regex matcher, then an error is returned.
§Example: finding the pattern that caused an error
When a syntax error occurs, it is possible to ask which pattern caused the syntax error.
use regex_automata::{meta::Regex, PatternID};

let err = Regex::builder()
    .build_many(&["a", "b", r"\p{Foo}", "c"])
    .unwrap_err();
assert_eq!(Some(PatternID::must(2)), err.pattern());
§Example: zero patterns is valid
Building a regex with zero patterns results in a regex that never matches anything. Because this routine is generic, passing an empty slice usually requires a turbo-fish (or something else to help type inference).
use regex_automata::meta::Regex;

let re = Regex::builder()
    .build_many::<&str>(&[])?;
assert_eq!(None, re.find(""));

Ok::<(), Box<dyn std::error::Error>>(())
pub fn build_from_hir(&self, hir: &Hir) -> Result<Regex, BuildError>
Builds a Regex directly from an Hir expression.
This is useful if you needed to parse a pattern string into an Hir for other reasons (such as analysis or transformations). This routine permits building a Regex directly from the Hir expression instead of first converting the Hir back to a pattern string.
When using this method, any options set via Builder::syntax are ignored. Namely, the syntax options only apply when parsing a pattern string, which isn’t relevant here.
If there was a problem building the underlying regex matcher for the given Hir, then an error is returned.
§Example
This example shows how one can hand-construct an Hir expression and build a regex from it without doing any parsing at all.
use {
    regex_automata::{meta::Regex, Match},
    regex_syntax::hir::{Hir, Look},
};

let hir = Hir::concat(vec![
    Hir::look(Look::StartCRLF),
    Hir::literal("foo".as_bytes()),
    Hir::look(Look::EndCRLF),
]);
let re = Regex::builder()
    .build_from_hir(&hir)?;
let hay = "\r\nfoo\r\n";
assert_eq!(Some(Match::must(0, 2..5)), re.find(hay));

Ok::<(), Box<dyn std::error::Error>>(())
pub fn build_many_from_hir<H: Borrow<Hir>>(&self, hirs: &[H]) -> Result<Regex, BuildError>
Builds a Regex directly from many Hir expressions.
This is useful if you needed to parse pattern strings into Hir expressions for other reasons (such as analysis or transformations). This routine permits building a Regex directly from the Hir expressions instead of first converting the Hir expressions back to pattern strings.
When using this method, any options set via Builder::syntax are ignored. Namely, the syntax options only apply when parsing a pattern string, which isn’t relevant here.
If there was a problem building the underlying regex matcher for the given Hir expressions, then an error is returned.
Note that unlike Builder::build_many, this can only fail as a result of building the underlying matcher. In that case, there is no single Hir expression that can be isolated as a reason for the failure. So if this routine fails, it’s not possible to determine which Hir expression caused the failure.
§Example
This example shows how one can hand-construct multiple Hir expressions and build a single regex from them without doing any parsing at all.
use {
    regex_automata::{meta::Regex, Match},
    regex_syntax::hir::{Hir, Look},
};

let hir1 = Hir::concat(vec![
    Hir::look(Look::StartCRLF),
    Hir::literal("foo".as_bytes()),
    Hir::look(Look::EndCRLF),
]);
let hir2 = Hir::concat(vec![
    Hir::look(Look::StartCRLF),
    Hir::literal("bar".as_bytes()),
    Hir::look(Look::EndCRLF),
]);
let re = Regex::builder()
    .build_many_from_hir(&[&hir1, &hir2])?;
let hay = "\r\nfoo\r\nbar";
let got: Vec<Match> = re.find_iter(hay).collect();
let expected = vec![
    Match::must(0, 2..5),
    Match::must(1, 7..10),
];
assert_eq!(expected, got);

Ok::<(), Box<dyn std::error::Error>>(())
pub fn configure(&mut self, config: Config) -> &mut Builder
Configure the behavior of a Regex.
This configuration controls non-syntax options related to the behavior of a Regex. This includes things like whether empty matches can split a codepoint, prefilters, line terminators and a long list of options for configuring which regex engines the meta regex engine will be able to use internally.
§Example
This example shows how to disable UTF-8 empty mode. This will permit empty matches to occur between the code units of the UTF-8 encoding of a codepoint.
use regex_automata::{meta::Regex, Match};

// By default, empty matches never split the 3-byte encoding of '☃'.
let re = Regex::new("")?;
let got: Vec<Match> = re.find_iter("☃").collect();
assert_eq!(got, vec![
    Match::must(0, 0..0),
    Match::must(0, 3..3),
]);

// With UTF-8 empty mode disabled, every byte offset is a match.
let re = Regex::builder()
    .configure(Regex::config().utf8_empty(false))
    .build("")?;
let got: Vec<Match> = re.find_iter("☃").collect();
assert_eq!(got, vec![
    Match::must(0, 0..0),
    Match::must(0, 1..1),
    Match::must(0, 2..2),
    Match::must(0, 3..3),
]);

Ok::<(), Box<dyn std::error::Error>>(())
pub fn syntax(&mut self, config: util::syntax::Config) -> &mut Builder
Configure the syntax options when parsing a pattern string while building a Regex.
These options only apply when Builder::build or Builder::build_many is used. The other build methods accept Hir values, which have already been parsed.
§Example
This example shows how to enable case insensitive mode.
use regex_automata::{meta::Regex, util::syntax, Match};

let re = Regex::builder()
    .syntax(syntax::Config::new().case_insensitive(true))
    .build(r"δ")?;
assert_eq!(Some(Match::must(0, 0..2)), re.find(r"Δ"));

Ok::<(), Box<dyn std::error::Error>>(())