Regular expression syntax

A regular expression is a way to match patterns in data using placeholder characters, called operators.

Elasticsearch supports regular expressions in the regexp and query_string queries.

Elasticsearch uses Apache Lucene's regular expression engine to parse these queries.

Lucene's regular expression engine supports all Unicode characters. However, the following characters are reserved as operators:

. ? + * | { } [ ] ( ) " \

Depending on the optional operators enabled, the following characters may also be reserved:

# @ & < > ~

To use one of these characters literally, escape it with a preceding backslash or surround it with double quotes. For example:

\@                  # renders as a literal '@'
\\                  # renders as a literal '\'
"john@smith.com"    # renders as 'john@smith.com'
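
When a literal value comes from user input, the backslash escaping described above can be applied programmatically. The following is a minimal Python sketch, assuming a hypothetical helper named escape_lucene_regexp (it is not part of any Elasticsearch client) and assuming the optional operators are enabled, so that # @ & < > ~ are escaped as well. It targets literals outside of bracket expressions.

```python
# Hypothetical helper for illustration; not an Elasticsearch or Lucene API.
# Contains the always-reserved operators plus the optional ones (# @ & < > ~).
RESERVED = set('.?+*|{}[]()"\\#@&<>~')


def escape_lucene_regexp(text: str) -> str:
    """Prefix every reserved character with a backslash."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in text)


print(escape_lucene_regexp("john@smith.com"))  # john\@smith\.com
```

The result is the regular expression itself. Sending it inside a JSON body still requires the JSON-level escaping covered in the note that follows.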

Note

The backslash is an escape character in both JSON strings and regular expressions. You need to escape both backslashes in a query, unless you use a language client, which takes care of this. For example, the string a\b needs to be indexed as "a\\b":

PUT my-index-000001/_doc/1
{
  "my_field": "a\\b"
}

This document matches the following regexp query:

GET my-index-000001/_search
{
  "query": {
    "regexp": {
      "my_field.keyword": "a\\\\.*"
    }
  }
}
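
The two escaping layers can be seen separately by building the same query body in Python and letting the standard json module serialize it. This is only a sketch of the escaping arithmetic, not a client call; the variable names are illustrative.

```python
import json

# Regex level: one literal backslash is written as \\ in the pattern.
# The raw string below holds five characters: a \ \ . *
pattern = r"a\\.*"

body = {"query": {"regexp": {"my_field.keyword": pattern}}}

# JSON level: json.dumps doubles every backslash again, giving "a\\\\.*".
print(json.dumps(body))
```

The printed body carries four backslashes, matching the request shown above.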

Lucene's regular expression engine does not use the Perl Compatible Regular Expressions (PCRE) library, but it does support the following standard operators.

.: Matches any character. For example:

ab.     # matches 'aba', 'abb', 'abz', etc.

?: Repeat the preceding character zero or one times. Often used to make the preceding character optional. For example:

abc?    # matches 'ab' and 'abc'

+: Repeat the preceding character one or more times. For example:

ab+     # matches 'ab', 'abb', 'abbb', etc.

*: Repeat the preceding character zero or more times. For example:

ab*     # matches 'a', 'ab', 'abb', 'abbb', etc.

{}: Minimum and maximum number of times the preceding character can repeat. For example:

a{2}    # matches 'aa'
a{2,4}  # matches 'aa', 'aaa', and 'aaaa'
a{2,}   # matches 'a' repeated two or more times

|: OR operator. The match will succeed if the longest pattern on either the left side OR the right side matches. For example:

abc|xyz  # matches 'abc' and 'xyz'

( … ): Forms a group. You can use a group to treat part of the expression as a single character. For example:

abc(def)?  # matches 'abc' and 'abcdef' but not 'abcd'

[ … ]: Match one of the characters in the brackets. For example:

[abc]   # matches 'a', 'b', 'c'

Inside the brackets, - indicates a range unless - is the first character or escaped. For example:

[a-c]   # matches 'a', 'b', or 'c'
[-abc]  # '-' is first character. Matches '-', 'a', 'b', or 'c'
[abc\-] # Escapes '-'. Matches 'a', 'b', 'c', or '-'

A ^ before a character in the brackets negates the character or range. For example:

[^abc]      # matches any character except 'a', 'b', or 'c'
[^a-c]      # matches any character except 'a', 'b', or 'c'
[^-abc]     # matches any character except '-', 'a', 'b', or 'c'
[^abc\-]    # matches any character except 'a', 'b', 'c', or '-'
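
The standard operators above can be tried out locally before running a query. The sketch below uses Python's re module as a stand-in, which is an assumption: re is a different engine from Lucene's, and it does not support double-quoted literals or the optional operators. For the operators in this section, re.fullmatch gives the same answers as the behavior described here, because it also requires the whole string to match.

```python
import re


def matches(pattern: str, term: str) -> bool:
    # fullmatch mirrors the rule that the pattern must match the entire term.
    return re.fullmatch(pattern, term) is not None


# (pattern, term, expected result) triples taken from the operator descriptions.
checks = [
    ("ab.", "abz", True),
    ("abc?", "ab", True),
    ("ab+", "a", False),
    ("ab*", "a", True),
    ("a{2,4}", "aaaaa", False),
    ("abc|xyz", "xyz", True),
    ("abc(def)?", "abcd", False),
    ("[-abc]", "-", True),
    ("[^a-c]", "b", False),
]

for pattern, term, expected in checks:
    assert matches(pattern, term) == expected, (pattern, term)
```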

Note

Character range classes such as [a-c] do not behave as expected when using case_insensitive: true; they remain case sensitive. For example, [a-c]+ with case_insensitive: true will match strings containing only the characters 'a', 'b', and 'c', but not 'A', 'B', or 'C'. Use [a-zA-Z] to match both uppercase and lowercase characters.

This is due to a known limitation in Lucene's regular expression engine. See Lucene issue #14378 for details.

You can use the flags parameter to enable more optional operators for Lucene's regular expression engine.

To enable multiple operators, use a | separator. For example, a flags value of COMPLEMENT|INTERVAL enables the COMPLEMENT and INTERVAL operators.
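
As an illustration of the | separator, the following Python sketch assembles a regexp query body with a flags value. The helper name regexp_query is hypothetical; the value and flags keys are the regexp query's parameters, and the field name is the one used earlier on this page.

```python
import json


def regexp_query(field: str, value: str, *flags: str) -> dict:
    """Build a regexp query body; multiple flags are joined with '|'."""
    params = {"value": value}
    if flags:
        params["flags"] = "|".join(flags)
    return {"query": {"regexp": {field: params}}}


# a~bc uses the ~ operator, which the COMPLEMENT flag enables.
body = regexp_query("my_field.keyword", "a~bc", "COMPLEMENT", "INTERVAL")
print(json.dumps(body, indent=2))
```

Omitting the flags entirely leaves the parameter out of the body, so the default (ALL) applies.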

ALL (Default): Enables all optional operators.
"" (empty string): Alias for the ALL value.
COMPLEMENT: Enables the ~ operator. You can use ~ to negate the shortest following pattern. For example:

a~bc   # matches 'adc' and 'aec' but not 'abc'

EMPTY: Enables the # (empty language) operator. The # operator doesn't match any string, not even an empty string.

If you create regular expressions by programmatically combining values, you can pass # to specify "no string." This lets you avoid accidentally matching empty strings or other unwanted strings. For example:

#|abc  # matches 'abc' but nothing else, not even an empty string

INTERVAL: Enables the <> operators. You can use <> to match a numeric range. For example:

foo<1-100>      # matches 'foo1', 'foo2' ... 'foo99', 'foo100'
foo<01-100>     # matches 'foo01', 'foo02' ... 'foo99', 'foo100'

INTERSECTION: Enables the & operator, which acts as an AND operator. The match will succeed if the patterns on both the left side AND the right side match. For example:

aaa.+&.+bbb  # matches 'aaabbb'

ANYSTRING: Enables the @ operator. You can use @ to match any entire string.

You can combine the @ operator with & and ~ operators to create an "everything except" logic. For example:

@&~(abc.+)  # matches everything except terms beginning with 'abc'

NONE: Disables all optional operators.
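
The programmatic use case mentioned for the EMPTY operator can be sketched as follows. The helpers escape and any_of are hypothetical names for illustration, and the sketch assumes the EMPTY flag is enabled (it is under the default ALL). Joining an empty list with | would produce an empty pattern, which matches the empty string; returning # instead matches nothing.

```python
# Hypothetical helpers for illustration; not an Elasticsearch or Lucene API.
RESERVED = set('.?+*|{}[]()"\\#@&<>~')


def escape(text: str) -> str:
    """Backslash-escape reserved characters so the value is matched literally."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in text)


def any_of(values: list) -> str:
    """Alternation of literal values, or '#' (match nothing) for an empty list."""
    if not values:
        return "#"  # relies on the EMPTY flag being enabled
    return "|".join(escape(v) for v in values)


print(any_of(["abc", "x.y"]))  # abc|x\.y
print(any_of([]))              # #
```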

Lucene's regular expression engine does not support anchor operators, such as ^ (beginning of line) or $ (end of line). To match a term, the regular expression must match the entire string.
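
Because the whole term must match, prefix or substring searches need explicit .* padding in place of anchors. The sketch below shows the idea, again using Python's re.fullmatch as an assumed stand-in for whole-term matching; the helper names contains and starts_with are hypothetical.

```python
import re


def contains(fragment: str) -> str:
    """Wrap an already-escaped fragment so it may appear anywhere in the term."""
    return ".*" + fragment + ".*"


def starts_with(fragment: str) -> str:
    """Allow any trailing characters after an already-escaped fragment."""
    return fragment + ".*"


# Without padding, the pattern has to cover the entire term.
assert re.fullmatch("bar", "foobarbaz") is None
assert re.fullmatch(contains("bar"), "foobarbaz") is not None
assert re.fullmatch(starts_with("foo"), "foobarbaz") is not None
```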
