The client is an international IT company that develops software for creating technical documentation.
There was so much information on the client’s website that it needed a site search. The client provided a detailed specification for developers, which covered a significant part of the requirements for formats, functioning, search results, and the Drupal CMS itself.
Search on a Drupal site
The Search API module is often used to configure search on a Drupal site. This module serves as a layer between Drupal and various search engines, including Apache Solr, which the ADCI Solutions team uses on projects most often. The Search API has many settings for both the Drupal site and Apache Solr. On the Solr side, the configuration takes into account which information needs to be stored for the search to work well.
In such cases, Drupal developers’ work comes down to installing the module and configuring standard parameters and the server. It looked like the work on this project would follow the same pattern. But our assumption was wrong.
The client gave us access to the development, stage, and production servers. To set up Apache Solr on all three servers from scratch, our team lead took on the role of a DevOps engineer. In addition, it was crucial to choose the right version of Solr: Solr 7 was more reliable, but Solr 8 worked best with Drupal. We eventually settled on Solr 8.
The brand new task and its initial conditions did not let us complete the job quickly, and the work dragged on for a couple of dozen hours. On the bright side, now we have the necessary skills to get it done in 2-3 hours on a similar project. But one way or another, the time spent on a task depends on the circumstances.
As we’ve mentioned, the spec explained how the search should work in great detail. But several points were still missing.
A site search consists of two major parts: the search box in the website header with a magnifying glass icon and the page with search results.
There are also search categories to filter general search results.
A search result includes a title, a snippet with the keyword and its derivatives in bold, and breadcrumbs.
The site has several language versions. The search results depend on the language version the person is using: searching for German words in the English version will not yield anything, and vice versa.
Breadcrumbs became the number one problem. In the design layout, they were very long and ran off the page. The client's idea was to hide the part that did not fit on the page behind an ellipsis. But each part of the breadcrumbs was supposed to be clickable: if we had hidden the part that didn't fit, users would not have understood where they were and would not have been able to go to the next section. We explained this to the client and set up line wrapping using CSS instead.
Search results page
According to the technical specifications, a search result summary had to include the keywords from the query, and the keywords had to be in bold. Initially it didn't work that way. The summary was generated by the Search API, which scanned the page from top to bottom and pulled out pieces of text containing the requested word. Sometimes the summary included the first part of a sentence containing the word, but the word itself was cut off and not visible. In the Search API, a PHP highlight processor is responsible for this. We developed a custom module, copied the processor's PHP file into it, and adjusted the highlighting mechanism to our needs.
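The snippet requirement can be illustrated with a minimal sketch. This is not the Search API highlight processor and not the custom module from the project. It is a plain-Python illustration of the idea, assuming the excerpt should be cut around the first match so the keyword is always visible. The function name `build_snippet` and the `radius` parameter are hypothetical.

```python
import html
import re


def build_snippet(text: str, keyword: str, radius: int = 40) -> str:
    """Return an excerpt around the first keyword match, with matches in <strong>.

    Hypothetical helper for illustration only; `radius` is the number of
    characters kept on each side of the first match.
    """
    if not keyword:
        return html.escape(text[: 2 * radius])

    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    first = pattern.search(text)
    if first is None:
        # No match: fall back to the beginning of the text.
        return html.escape(text[: 2 * radius])

    # Cut the excerpt around the match so the keyword cannot be truncated away.
    start = max(0, first.start() - radius)
    end = min(len(text), first.end() + radius)
    excerpt = text[start:end]

    # Escape the plain parts and wrap every match inside the excerpt in <strong>.
    parts = []
    last = 0
    for match in pattern.finditer(excerpt):
        parts.append(html.escape(excerpt[last:match.start()]))
        parts.append("<strong>" + html.escape(match.group(0)) + "</strong>")
        last = match.end()
    parts.append(html.escape(excerpt[last:]))

    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + "".join(parts) + suffix
```

Centering the excerpt on the match, rather than cutting a fixed-length piece from the start of a sentence, is what guarantees that the highlighted word appears in the summary.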
Search results sorting
We arranged for the search results to be sorted by relevance. Materials with the keyword in the heading ranked highest, followed by results with this word and its derivatives in the article body. For this sorting, we used Ngram, a plugin for Solr. It recognizes words derived from the query, which makes the search deeper and more complete.
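To show why n-grams help with derived words, here is a small sketch in plain Python. It is not Solr configuration and not the project's setup. It assumes an edge n-gram approach, where every indexed word is also stored under its prefixes, so a query such as "document" matches "documentation". The names `edge_ngrams`, `build_index`, and `search`, and the length limits, are hypothetical.

```python
from __future__ import annotations

import re


def edge_ngrams(token: str, min_len: int = 3, max_len: int = 15) -> set[str]:
    """Return the prefixes of a token between min_len and max_len characters."""
    token = token.lower()
    return {token[:n] for n in range(min_len, min(len(token), max_len) + 1)}


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map every edge n-gram to the ids of the documents that contain it."""
    index: dict[str, set[str]] = {}
    for doc_id, text in docs.items():
        for token in re.findall(r"\w+", text):
            for gram in edge_ngrams(token):
                index.setdefault(gram, set()).add(doc_id)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Look up a single-word query; it matches any word that starts with it."""
    return index.get(query.lower(), set())


docs = {
    "a": "Creating technical documentation",
    "b": "Document templates",
    "c": "Search settings",
}
index = build_index(docs)
matches = search(index, "document")  # finds both "documentation" and "Document"
```

The trade-off of this approach is a larger index in exchange for matching word forms that an exact-term lookup would miss.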
Boosting then determines each result's place in the search results. The final score is calculated by multiplying the boost factor, which is set by the developer, by the relevance score of the content piece, which is calculated by Solr. The higher the boost factor, the higher the search result ranks.
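The multiplication described above can be sketched as follows. The boost values, the `Hit` class, and the field names are assumptions made for illustration. The actual factors used on the project are not given here, and in a real setup the relevance score comes from Solr.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    title: str
    field: str        # field where the keyword matched: "title" or "body"
    relevance: float  # relevance score, as returned by the search engine


# Hypothetical boost factors: a match in the heading outweighs a match in the body.
BOOSTS = {"title": 5.0, "body": 1.0}


def final_score(hit: Hit) -> float:
    """Boost factor multiplied by the relevance score."""
    return BOOSTS.get(hit.field, 1.0) * hit.relevance


def rank(hits: list[Hit]) -> list[Hit]:
    """Order hits from the highest final score to the lowest."""
    return sorted(hits, key=final_score, reverse=True)


hits = [
    Hit("Body match", "body", 0.9),
    Hit("Title match", "title", 0.4),
]
ranked = rank(hits)
```

With these assumed factors, the title match scores 5.0 × 0.4 = 2.0 and outranks the body match at 1.0 × 0.9 = 0.9, even though its raw relevance is lower.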
With the exception of Solr issues that do not affect search results, our team carried out the work successfully.