Implementing effective search is one of the harder tasks in development, but it is also key to the success of many websites and applications. Fast retrieval of relevant results directly affects the quality of customer service, because visitors can efficiently find the content they are looking for.
On high-traffic sites and sites with a complex internal structure, running search queries against the database can hurt overall performance. To resolve this problem and reduce the number of database queries, we can extend the standard Drupal functionality with different search platforms. These solutions also provide more flexible features, such as faceted, fuzzy, or reverse search.
In this article we describe the most useful search engines that are easy to integrate with Drupal, and review the strengths and weaknesses of each depending on project requirements.
At the site design stage we need to figure out what content the site's visitors should be able to find, and then choose the best tool to implement it based on our requirements.
Drupal comes with a built-in Search module, which may be enough for small websites or websites with a simple structure. However, it doesn't offer much customization (for example, you may want to display different kinds of products depending on the user's role, or introduce a faceted classification). It can also perform poorly on large, high-traffic websites, because every query runs against the database; on a site with lots of content this can eat up server resources and slow the website down.
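To make "faceted classification" concrete: a faceted search returns the matching items together with a count of how many matches fall under each value of a field, so visitors can narrow the results step by step. The following is only a minimal Python sketch of the idea. The product data, the field names, and the `search` and `facet_counts` helpers are all made up for illustration; this is not Drupal or Search API code.

```python
from collections import Counter

# Hypothetical catalogue; on a Drupal site these would be nodes or products.
PRODUCTS = [
    {"title": "Red running shoes", "category": "shoes", "brand": "Acme"},
    {"title": "Blue running shorts", "category": "clothing", "brand": "Acme"},
    {"title": "Trail running shoes", "category": "shoes", "brand": "Globex"},
    {"title": "Leather office shoes", "category": "shoes", "brand": "Globex"},
]


def search(items, keyword, filters=None):
    """Return items whose title contains the keyword and match every filter."""
    filters = filters or {}
    keyword = keyword.lower()
    return [
        item for item in items
        if keyword in item["title"].lower()
        and all(item.get(field) == value for field, value in filters.items())
    ]


def facet_counts(results, field):
    """Count how many results carry each value of the given field."""
    return dict(Counter(item[field] for item in results))


results = search(PRODUCTS, "running")
# Facet counts let the visitor see how the matches split by category.
print(facet_counts(results, "category"))

# Choosing a facet value simply adds a filter to the same search.
narrowed = search(PRODUCTS, "running", {"category": "shoes"})
print([item["title"] for item in narrowed])
```

Selecting a facet value is just the same search with one more filter, which is why facets combine naturally with keyword search.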
For more complicated projects there is one of the most powerful and indispensable modules for Drupal: Search API. It is a toolset for building searches on Drupal sites, and it can also integrate other search tools into your site. It supports a range of additional search services, such as Sphinx or Solr (a full list of supported projects can be found here), integrates with the Views module, and understands Drupal's content architecture well. The Drupal 8 version of the module comes with a lot of changes. First, Drupal 7 has two modules for integrating with Solr, one of the most popular search engines: Search API and Apache Solr Search. In Drupal 8, Search API is the only module for Solr integration. The new version also ships with an out-of-the-box search backend that uses Drupal's own database. At the moment the module is used by 7-8 thousand websites, but it is still in beta, so a stable release will take some time.
The simplest solution, and the least demanding in terms of hosting requirements, is Database Search API. As the module page puts it: ‘It is therefore a cheap and simple alternative to backends like Solr, but can also be a great option for larger sites if you know what you're doing’. The module is a backend built for Search API. It provides a much stronger search than Drupal core search offers out of the box, and it can be used on any Drupal website and hosting environment.
Google Custom Search
Another good option for a small website is a crawler-based search engine, for example Google Custom Search. It is an embedded search that relies on Google's third-party crawler and sitemap.xml data to crawl the website. This engine has several advantages.
First, you don't need to store a search index, because it is provided by the Google service. Second, you can integrate it with an XML sitemap for better control over the indexed content. Finally, searches tend to be fast.
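As an illustration of what the crawler reads, here is a small Python sketch that builds a sitemap.xml document in the standard sitemaps.org format. On a Drupal site a contributed module would normally generate this file for you; the `build_sitemap` helper and the example.com URLs below are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a sitemap.xml string from (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "lastmod").text = last_modified
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages; only URLs listed here are suggested to the crawler.
pages = [
    ("https://example.com/", "2016-05-01"),
    ("https://example.com/blog/drupal-search", "2016-05-10"),
]
print(build_sitemap(pages))
```

Leaving a page out of the sitemap, or updating its `lastmod` value, is the main lever you have over what a crawler-based engine indexes and when it revisits a page.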
The downside of this approach is that it gives you little control over the search process, for example over the order of the search results or how they are displayed, and it offers no faceted or field-based search.
Usually the search results appear in an iframe or on a separate page, but the paid version of Google Custom Search allows you to build your own user interface for the results.
External Search Platform
A search engine that runs outside of Drupal on your web server is one of the most advanced tools for site search.
With a couple of thousand records, database search is nothing to worry about. With a few hundred thousand or millions of records, a MySQL query will take a long time, and caching does not always save the situation. This is where a dedicated search engine shows itself in all its glory.
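One reason a dedicated engine copes better with large data sets is that Lucene-based engines answer queries from an inverted index (a map from each term to the documents that contain it) instead of inspecting every record. The toy Python sketch below contrasts the two approaches. The documents and helper names are invented for illustration, and real engines add ranking, stemming, and compressed on-disk structures on top of this idea.

```python
import re
from collections import defaultdict

# Hypothetical documents keyed by ID.
DOCUMENTS = {
    1: "Drupal search with Solr",
    2: "Faceted search in Drupal",
    3: "Scaling MySQL for large sites",
}


def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def scan_search(documents, term):
    """Naive approach: look at every document on every query."""
    term = term.lower()
    return sorted(
        doc_id for doc_id, text in documents.items() if term in tokenize(text)
    )


def build_index(documents):
    """Build the inverted index once, at indexing time."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def index_search(index, *terms):
    """Look up each term and intersect the resulting document sets."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []


index = build_index(DOCUMENTS)
print(scan_search(DOCUMENTS, "drupal"))
print(index_search(index, "drupal", "search"))
```

The scan touches every document on each query, while the index lookup only touches the entries for the query terms, so its cost grows with the number of matches rather than with the size of the whole data set.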
On the other hand, these engines may be difficult to deploy if your website is on standard shared PHP/MySQL hosting, and some of them require you to install additional libraries.
Solr is one of the best-known search engines. It is built on Lucene and significantly extends its capabilities. Solr runs as a separate, enterprise-level server that provides search as a web service. A standard Solr installation accepts documents over HTTP in XML format and returns results over HTTP (as XML, JSON, or another format). It fully supports clustering and replication across multiple servers, supports faceted search and filtering, and has advanced configuration tools.
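To show what "documents over HTTP in XML" looks like in practice, the Python sketch below builds a Solr XML update message and a query URL with faceting enabled, without actually contacting a server. The base URL, the core name `drupal`, and the field names are assumptions for illustration only; a real Drupal integration module defines its own schema and field names.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed values for illustration; adjust to your own Solr installation.
SOLR_BASE = "http://localhost:8983/solr/drupal"


def build_add_xml(documents):
    """Build the XML body that would be POSTed to Solr's /update handler."""
    add = ET.Element("add")
    for fields in documents:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            ET.SubElement(doc, "field", name=name).text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_select_url(query, facet_field=None, rows=10):
    """Build a GET URL for Solr's /select handler, asking for JSON output."""
    params = [("q", query), ("rows", rows), ("wt", "json")]
    if facet_field:
        params += [("facet", "true"), ("facet.field", facet_field)]
    return SOLR_BASE + "/select?" + urlencode(params)


# Hypothetical document with made-up field names.
print(build_add_xml([{"id": 1, "title": "Drupal search", "type": "article"}]))
print(build_select_url("title:drupal", facet_field="type"))
```

The XML body would be sent with an HTTP POST to the `/update` handler, and the second URL can be requested with any HTTP client. The `wt` parameter selects the response format, which is how the same query can return XML or JSON.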
Another option for the search server is Elasticsearch. It is a REST-based, distributed search engine that is also powered by the Lucene library, and it offers a built-in JSON over HTTP API.
Solr is more oriented toward text search, while Elasticsearch is often used for its better performance in analytical querying, filtering, and grouping. Comparing the two, Elasticsearch is the better choice for applications that need not only text search but also complex time-series search and aggregations.
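As a rough illustration of combining text search with aggregations, the Python sketch below builds the JSON body of an Elasticsearch search request: a `match` query plus a `terms` aggregation and a `date_histogram` aggregation. The field names `title`, `type`, and `created` are hypothetical, and the `calendar_interval` parameter assumes a recent Elasticsearch version (older releases use `interval` instead).

```python
import json


def build_search_body(text, date_field="created", interval="month"):
    """Build a search body with a full-text query and two aggregations."""
    return {
        # Full-text match on a hypothetical "title" field.
        "query": {"match": {"title": text}},
        "size": 10,
        "aggs": {
            # Grouping: number of hits per content type.
            "by_type": {"terms": {"field": "type"}},
            # Time series: number of hits per calendar interval.
            "over_time": {
                "date_histogram": {
                    "field": date_field,
                    "calendar_interval": interval,  # "interval" on older versions
                }
            },
        },
    }


body = build_search_body("drupal")
# This JSON would be sent in the body of a request to <index>/_search.
print(json.dumps(body, indent=2))
```

A single request returns both the top matching documents and the bucketed counts, which is the kind of analytical querying and grouping described above.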
Both search engines support clustering and a distributed architecture. Elasticsearch is simple to scale, which makes it especially useful when large clusters are required. It has a built-in component called Zen that uses its own internal coordination mechanism to manage the cluster state.
Solr supports a distributed deployment mode, SolrCloud, which depends on Apache ZooKeeper.
To sum up, in this article we covered several approaches to organizing search on a Drupal site.
For small websites we can use the built-in Drupal search or a combination of Search API and a database search backend, which gives us more features.
A crawler-based search engine may be another good, cheap solution that reduces the load on the database server.
For websites with a lot of content and visitors we can implement an external search backend. Although search engines may be difficult to deploy if your website is on standard shared PHP/MySQL hosting, this problem can be solved by running the search engine on an external host.
Solr is the most popular search engine in the Drupal community. It has a large number of additional modules, for example for integration with Commerce or Ubercart, autocomplete, geo search, and others. So if you are looking for a turnkey solution, Apache Solr may be a good choice for you.
Another good option is Elasticsearch, which is often used for analytical querying and grouping thanks to its aggregation and percolation features. The module for integrating it with Drupal, Elasticsearch Connector, supports faceted search, integration with watchdog, autocomplete, and location features.