Elasticsearch: Difference between revisions

From Freephile Wiki
No edit summary
describes a couple of major feature points of Elasticsearch
Line 10: Line 10:


== About ==
== About ==
Elasticsearch is a distributed RESTful search engine built for the cloud. Features include:
Elasticsearch is a distributed RESTful search engine built for the cloud. See https://www.elastic.co/about


* Distributed and Highly Available Search Engine.
== Features ==
** Each index is fully sharded with a configurable number of shards.
See [[mw:Help:CirrusSearch]] for help on how to best use the search functionality (including regex searches).
** Each shard can have one or more replicas.
 
** Read / Search operations are performed on any one of the replica shards.
# Different indexes are created for the entire contents of the wiki, and each index is weighted differently. For example, "lead-in" text is the wikitext between the top of the page and the first heading. Words found there are deemed more relevant to a user's search query than the same words found in the body text of an article. So, in this wiki, [{{fullurl:Special:Search|search=yaml|fulltext=Search}} searching for the word "YAML"] puts the [[Ansible]] article ahead of the [[Eclipse]] article in search results.
* Multi Tenant with Multi Types.
# Content as well as all files uploaded into the system are indexed. For example, [{{fullurl:Special:Search|search=fai|fulltext=Search|profile=all}} a search for "FAI"] lists both the [[Cloning]] article and the [[:File:Fai poster a4.pdf|PDF file]]. The file is listed not only because of its file name, but also because of its (indexed) content. [{{fullurl:Special:Search|search=ed%20roman|fulltext=Search|profile=all}} A search for "Ed Roman"] brings up the Enterprise Java Beans Design Patterns PDF file ([{{fullurl:File:Ejbdesignpatterns.pdf|page=13}} see p. 13, where Ed Roman is mentioned]).
** Support for more than one index.
** Support for more than one type per index.
** Index level configuration (number of shards, index storage, ...).
* Varied set of APIs
** HTTP RESTful API
** Native Java API.
** All APIs perform automatic node operation rerouting.
* Document oriented
** No need for upfront schema definition.
** Schema can be defined per type for customization of the indexing process.
* Reliable, Asynchronous Write Behind for long-term persistence.
* (Near) Real Time Search.
* Built on top of Lucene
** Each shard is a fully functional Lucene index
** All the power of Lucene easily exposed through simple configuration / plugins.
* Per operation consistency
** Single document level operations are atomic, consistent, isolated and durable.
* Open Source under the Apache License, version 2 ("ALv2")
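
The index-level configuration mentioned in the list above (number of shards, number of replicas) can be illustrated with a minimal sketch. It only builds the HTTP request for the RESTful API and does not send it. The host <code>http://localhost:9200</code>, the index name <code>wiki_content</code>, and the helper <code>build_create_index_request</code> are assumptions made for illustration and are not taken from this wiki's setup; <code>number_of_shards</code> and <code>number_of_replicas</code> are standard Elasticsearch index settings.

```python
import json
from urllib.request import Request


def build_create_index_request(base_url, index, shards, replicas):
    """Build (but do not send) a PUT request that would create an index.

    Hypothetical helper: base_url and index are placeholders chosen
    for this example, not values from any real deployment.
    """
    body = {
        "settings": {
            "number_of_shards": shards,      # how many shards the index is split into
            "number_of_replicas": replicas,  # replica copies kept per shard
        }
    }
    return Request(
        url=f"{base_url.rstrip('/')}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed local node and index name, for illustration only.
req = build_create_index_request("http://localhost:9200", "wiki_content", 3, 1)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to <code>urllib.request.urlopen</code> would actually send it; that step is left out here so the sketch runs without a live cluster.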


== Video ==
== Video ==
* [https://vimeo.com/136326424 Building Elasticsearch: From Idea to {code} to Adoption] The back side of a napkin, a pen, and a few beverages are often the ingredients that yield good ideas. Elasticsearch had a different origin. It started with a need for a simple search box for a collection of recipes. '''Shay Banon''', creator of Elasticsearch and CTO at Elastic, shares the history behind pushing the code for his first open source project that led to the creation of Elasticsearch and its rapid adoption by users worldwide. -- ''RISE | August 2015''
* [https://vimeo.com/136326424 Building Elasticsearch: From Idea to {code} to Adoption] The back side of a napkin, a pen, and a few beverages are often the ingredients that yield good ideas. Elasticsearch had a different origin. It started with a need for a simple search box for a collection of recipes. '''Shay Banon''', creator of Elasticsearch and CTO at Elastic, shares the history behind pushing the code for his first open source project that led to the creation of Elasticsearch and its rapid adoption by users worldwide. -- ''RISE | August 2015''
* https://www.elastic.co/about


== Elasticsearch for MediaWiki ==
== Elasticsearch for MediaWiki ==
Line 124: Line 105:


== Resources ==
== Resources ==
* See [[mw:Help:CirrusSearch]] for help on how to best use the search functionality (including regex searches). In a nutshell: the search index is faster and more complete.
* See [[mw:Help:CirrusSearch]] for help on how to best use the search functionality (including regex searches).
* https://phabricator.wikimedia.org/diffusion/ECIR/browse/master/CirrusSearch.php
* https://phabricator.wikimedia.org/diffusion/ECIR/browse/master/CirrusSearch.php
* https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FCirrusSearch.git/HEAD/README
* https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FCirrusSearch.git/HEAD/README

Revision as of 01:07, 31 January 2016