The CirrusSearch extension implements searching for MediaWiki using Elasticsearch.
Elasticsearch is standalone third-party software that you must install separately; this extension requires it. It is a database system that provides search and indexing functionality: the current text of your wiki pages is indexed there for faster and better search results. MediaWiki and Elasticsearch communicate through web services.
See also the help page on using this extension.
Goals
In addition to the standard MediaWiki requirements for PHP, CirrusSearch requires PHP to be compiled with cURL support.
You must install Elasticsearch or OpenSearch.
Each version of Elasticsearch changes how its web services work, which causes compatibility problems. You must install the version of Elasticsearch that is compatible with the version of MediaWiki you are using:
- MediaWiki 1.39+ requires Elasticsearch 7.10.2 (6.8.23+ is possible using a compatibility layer). See this revision for compatibility information with earlier versions of MediaWiki.
- MediaWiki 1.44+ is compatible with OpenSearch 1.3.
- Elasticsearch versions before 6.8 are incompatible with PHP 8+.
Note that Elasticsearch also requires a Java installation such as OpenJDK. It's best to use the official Elasticsearch Docker image or a self-hosted version. A managed product like Amazon OpenSearch (formerly Amazon Elasticsearch) can work but may require additional configuration depending on its specifics. For example, Amazon OpenSearch only listens for Elasticsearch API requests over HTTPS on port 443 (i.e., it does not expose the default Elasticsearch port 9200), so a TLS-enabled proxy (e.g., Nginx) can enable CirrusSearch to communicate with an Amazon OpenSearch cluster.
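To confirm which version a running server reports before matching it against the list above, one option is to query the server's root endpoint, which returns JSON that includes a version number. The following is a minimal sketch, assuming the server listens on localhost on the default port 9200; the helper name es_version_from_json is hypothetical and not part of CirrusSearch or Elasticsearch.

```shell
# Hypothetical helper: read the JSON returned by the server's root endpoint
# on stdin and print the value of its "number" field (the version).
es_version_from_json() {
    sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage sketch (assumption: server reachable at localhost:9200):
#   curl -s http://localhost:9200 | es_version_from_json
```

The text-based extraction keeps the sketch free of extra dependencies; a JSON-aware tool would be more robust if one is available.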
Although the instructions below tell you to run Composer only when installing from Git, you may need to run it anyway to install all PHP dependencies.
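Install the Elastica extension:
1. Move the Elastica folder to your extensions/ directory. When installing from Git, run:
    cd extensions/
    git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/Elastica
2. Only when installing from Git, run composer install --no-dev in the extension directory. (See T173141 for potential complications.)
3. Add the following to your LocalSettings.php file:
    wfLoadExtension( 'Elastica' );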
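Install the CirrusSearch extension:
1. Move the CirrusSearch folder to your extensions/ directory. When installing from Git, run:
    cd extensions/
    git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/CirrusSearch
2. Only when installing from Git, run composer install --no-dev in the extension directory. (See T173141 for potential complications.)
3. Add the following to your LocalSettings.php file:
    wfLoadExtension( 'CirrusSearch' );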
See $IP/extensions/CirrusSearch/README. Note that not all of the information in it may apply to your version of the extension, especially the supported version of Elasticsearch.
The following step is optional and requires the search-extra plugin. Install and enable it by following these steps:
1. Install the plugin:
    /usr/share/elasticsearch/bin/elasticsearch-plugin install org.wikimedia.search:extra:7.10.2-wmf12
2. Add the following to your LocalSettings.php file:
    $wgCirrusSearchWikimediaExtraPlugin[ 'regex' ] = [ 'build', 'use', 'max_inspect' => 10000 ];
3. Restart Elasticsearch:
    systemctl restart elasticsearch
4. Recreate the search index configuration and reindex the wiki:
    php path/to/extensions/CirrusSearch/maintenance/UpdateSearchIndexConfig.php --startOver
    php path/to/extensions/CirrusSearch/maintenance/ForceSearchIndex.php
Please follow the upgrade instructions in the CirrusSearch UPGRADE file.
Configuration
The configuration parameters of CirrusSearch are documented in the "settings.txt" file. See also the documentation on CirrusSearch configuration profiles.
Elasticsearch will fail to index for CirrusSearch if the MySQL database name contains an uppercase letter, e.g., "MyWikiDatabaseName". To work around this, CirrusSearch provides the $wgCirrusSearchIndexBaseName configuration parameter, which you need to set to a lowercase value, e.g.:
    $wgCirrusSearchIndexBaseName = 'mywikidatabasename';
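As an illustration of the workaround, the sketch below derives a lowercase index base name from a database name and prints the matching LocalSettings.php line. The function name index_base_name_setting is hypothetical; only $wgCirrusSearchIndexBaseName comes from CirrusSearch.

```shell
# Hypothetical helper: lowercase a database name and print the
# LocalSettings.php line that sets $wgCirrusSearchIndexBaseName to it.
index_base_name_setting() {
    lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    printf "\$wgCirrusSearchIndexBaseName = '%s';\n" "$lower"
}

# Example with the database name used in the text above.
index_base_name_setting "MyWikiDatabaseName"
```

Simply lowercasing the name is the approach the example in the text takes; any other lowercase name should serve equally well as long as it is used consistently.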
The CirrusSearch extension defines a number of hooks that other extensions can use to extend the core schema and modify documents. The following hooks are available:
CirrusSearch features can be used in API queries. Searching happens via the normal search API, action=query&list=search, and you can also use CirrusSearch-specific features. For example, the morelike: special prefix finds pages related to Marie Curie and radium:
    api.php?action=query&list=search&srsearch=morelike:Marie_Curie%7Cradium&srlimit=10&srprop=size&formatversion=2
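The structure of that query can be sketched as a small helper that joins page titles with an encoded pipe (%7C) after the morelike: prefix. The function name morelike_query is hypothetical, and the sketch only replaces spaces with underscores; titles containing other special characters would need proper percent-encoding.

```shell
# Hypothetical helper: build the relative search API URL for a morelike:
# query from one or more page titles given as arguments.
morelike_query() {
    query=""
    for title in "$@"; do
        # Replace spaces with underscores, as in the example URL.
        t=$(printf '%s' "$title" | tr ' ' '_')
        if [ -z "$query" ]; then
            query="$t"
        else
            # %7C is the percent-encoded pipe that separates titles.
            query="${query}%7C${t}"
        fi
    done
    printf 'api.php?action=query&list=search&srsearch=morelike:%s&srlimit=10&srprop=size&formatversion=2\n' "$query"
}

# Reproduces the example URL from the text.
morelike_query "Marie Curie" "radium"
```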
Custom APIs and parameters are provided for querying CirrusSearch configuration and debug information:
- The action=cirrusdump module: 2014?action=cirrusdump
- The cirrusDumpQuery parameter to Special:Search or search API queries: https://en.wikipedia.org/wiki/Special:Search/cat%20dog%20chicken?cirrusDumpQuery
- The cirrusDumpResult parameter to Special:Search or search API queries: https://en.wikipedia.org/wiki/Special:Search/cat%20dog%20chicken?cirrusDumpResult
- The cirrusExplain parameter, which can be passed with cirrusDumpResult to include the Lucene explanation of the score in the result dump: https://en.wikipedia.org/wiki/Special:Search/cat%20dog%20chicken?cirrusDumpResult&cirrusExplain
  It can also return the explanation in a human-readable format when given one of the values verbose, pretty or hot, such as: https://en.wikipedia.org/wiki/Special:Search/cat%20dog%20chicken?cirrusDumpResult&cirrusExplain=pretty
- The cirrus-config-dump, cirrus-settings-dump, cirrus-mapping-dump, and cirrus-profiles-dump modules, which dump the CirrusSearch setup: api.php?action=cirrus-config-dump&formatversion=2
The Elasticsearch service can be run with the Vagrant role (cirrussearch) and MediaWiki Vagrant.
For Docker, you can use a command like:
    docker run -d --name elasticsearch -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch:6.8.2
Then follow the installation and configuration directions. If your web host runs in a container, make sure the Elasticsearch container is on the same network, and reference elasticsearch as the hostname in the LocalSettings.php file. This setup will not have the WMF plugins but can be sufficient for basic testing.
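One way to reference that hostname is through the $wgCirrusSearchServers setting of CirrusSearch. The sketch below prints the corresponding LocalSettings.php line, assuming the container is named elasticsearch as in the docker command above and is reachable from the web host under that name; the function name cirrus_servers_setting is hypothetical.

```shell
# Hypothetical helper: print a LocalSettings.php line that points
# CirrusSearch at the given Elasticsearch host.
cirrus_servers_setting() {
    printf "\$wgCirrusSearchServers = [ '%s' ];\n" "$1"
}

# Assumption: the container name from the docker command is the hostname.
cirrus_servers_setting elasticsearch
```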