What is Elasticsearch and why is it used
Posted: Tue Jan 21, 2025 9:04 am
Elasticsearch is a high-performance search provider that supports replication and rebuilding indexes without downtime. It works well with CJK (Chinese, Japanese, Korean) text and supports many other languages. It was designed as a search engine for all types of data (text, numeric, geographic, structured, and unstructured). Elastic also supports many different types of queries, from Lucene syntax to SQL.
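To make the "Lucene to SQL" range a bit more concrete, here is a minimal sketch of the two request bodies involved. It only builds the JSON and sends nothing. The `query_string` query and the `_sql` endpoint are standard Elasticsearch features, but the index name `content_index` and the fields `nodeName` and `__IndexType` are assumptions chosen for illustration, not something taken from the package.

```python
import json


def lucene_search_body(lucene_query: str, size: int = 10) -> dict:
    """Body for POST /<index>/_search using a Lucene-syntax query string."""
    return {"size": size, "query": {"query_string": {"query": lucene_query}}}


def sql_search_body(sql: str, fetch_size: int = 10) -> dict:
    """Body for POST /_sql using Elasticsearch SQL."""
    return {"query": sql, "fetch_size": fetch_size}


# Hypothetical field and index names, for illustration only.
lucene_body = lucene_search_body('nodeName:"Home" AND __IndexType:content')
sql_body = sql_search_body("SELECT nodeName FROM content_index WHERE nodeName = 'Home'")

print(json.dumps(lucene_body, indent=2))
print(json.dumps(sql_body, indent=2))
```

Both bodies describe roughly the same lookup. Which form is more convenient probably depends on whether the query comes from existing Examine-style Lucene strings or from ad hoc analysis.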
Additionally, unlike Examine, Elastic provides great development tools, such as Kibana, that allow you to simulate queries, debug them, and analyze indexes. Elastic is also designed for high availability, which means load balancing, replication, zero-downtime reindexing, and more.
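One common way to get zero-downtime reindexing in Elasticsearch is to search through an alias and swap it to a freshly built index once the rebuild finishes. The sketch below builds the body for the `_aliases` endpoint, which applies all listed actions atomically. Whether the provider described here works exactly this way is an assumption, and the alias and index names are hypothetical.

```python
import json


def alias_swap_body(alias: str, old_index: str, new_index: str) -> dict:
    """Body for POST /_aliases that moves an alias from old_index to new_index.

    Elasticsearch applies both actions in one atomic step, so searches
    against the alias never see a moment with no index behind it.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names: the site searches "content" while "content_v2" is rebuilt.
swap = alias_swap_body("content", "content_v1", "content_v2")
print(json.dumps(swap, indent=2))
```

The old index can be deleted after the swap, or kept for a while as a quick rollback target.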
The story behind the project
The idea of building an Examine Elastic provider came when I saw a presentation about search in Umbraco v7 at the Umbraco Poland Festival 2018, where Ismail Mayat presented a POC of an indexer for v7 that used content crawling. After the presentation, I looked for a better way to do it, without using external processes to index the content.
I found a few useful sources. The first was a package called 'Umbraco.Elasticsearch', created by Phil Oyston, and the second was an article called 'Elasticating Examine - an experimental Examine provider', written by Tristan Thompson. Neither solution satisfied me: the first added unnecessary logic around Umbraco and reimplemented the Examine provider, while the second only handled custom indexes and required some changes to work with Umbraco.
The V7 package
For Umbraco v7, I reviewed all available Elastic packages and decided not to use them as a base for my project. In my opinion, they did not take a good approach, since they reimplemented indexing, management, and other features that I would rather simply replace in Umbraco, instead of still having indexes in two places (Umbraco Examine files and the Elastic instance).
At the time, the solution proposed by Tristan Thompson was closer to my idea, as it only created a translation layer between Examine and Elastic. I decided to continue my work based on what was already working in that experimental provider.
One of the first changes I made was to upgrade Elastic to 6.5 and start working on allowing indexing of all types of content such as Media, Content, and Members. At that point, everything was working and I decided to start replacing the internal index with Elastic. Here I encountered a few small issues:
Because we were only using published versions of the content, search results did not always reflect the actual content when it had unpublished changes.
The index could not display the health status in the Umbraco Backoffice.
All document properties were moved into the properties object, so every property had to use a prefix in its index field name.
If an implementation were based entirely on NEST, it would not support Umbraco Lucene queries.
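The prefix issue from the list above can be illustrated with a small sketch. It assumes the nested object is literally named `properties` and that fields are addressed by Elasticsearch's usual dotted path. The helper names and the sample document are hypothetical and are not taken from the provider's code.

```python
PROPERTIES_OBJECT = "properties"  # assumed name of the nested object


def wrap_document(node_id: int, values: dict) -> dict:
    """Move all property values under a single nested object."""
    return {"id": node_id, PROPERTIES_OBJECT: dict(values)}


def index_field_name(property_alias: str) -> str:
    """Field name a query must use once the property is nested."""
    return f"{PROPERTIES_OBJECT}.{property_alias}"


doc = wrap_document(1234, {"nodeName": "Home", "bodyText": "Welcome"})
print(doc)
print(index_field_name("nodeName"))
```

A Lucene query written for a flat Examine index, such as `nodeName:Home`, would then need to target `properties.nodeName` instead, which is why the prefix leaks into every query.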
I started by solving a problem with the implementation of NEST/Lucene queries, where I decided to expose two forms of query: