What factors should you consider when selecting a hosted or open source site search solution?
Google may have set our expectations for how search works, but for site search there is a plethora of great options to choose from. In this buying guide, we explain some of the features to evaluate and what to look for when selecting a site search platform.
First, let’s define some terms and present some search data.
Site search (also called on-site search) is a search engine that is confined to a single site, connected sites, or domains. For example, you may want to only search a specific site, or you may have multiple subdomains and a parent www site to search across.
Google, Bing, DuckDuckGo and other search engines are ideal for driving visitors to your site, but for providing a great on-site experience, a site search engine is required.
Unlike general internet search engines, site search engines can do things like:
Here are just a few interesting facts about the importance of good site search:
Site search is a broad term that could refer to website search, on-site e-commerce search, intranet search, and more. For this guide, we are limiting an overview to features primarily for the following three use cases:
With many open source and hosted solutions to choose from, which site search platform is right for you? We explain the factors that go into selecting a site search product below.
What happens as users type in a query on your site search today? Do they get auto-suggestions (like Google and Amazon’s search) or do they get more visual results (like IMDB or AnnTaylor.com search)?
Does your site search know how to handle typos or misspelled words? Does it search only web pages (HTML) or does it also index documents (PDF, DOC, DOCX, etc)?
Site search user experience (UX) can vary a lot. By default, most content management systems (CMS) and legacy solutions offer very limited functionality and a poor user experience.
Today there are many features for site search to build a great end-user experience for your audience. Let's look at some of the most important ones.
How your site is indexed is important. Most sites can be indexed through search crawlers, but mobile apps or e-commerce sites typically require APIs to connect to SQL and NoSQL data stores. Without getting into all the details, some questions to ask include: How quickly will this solution index my site? When an edit or change is made, how quickly is it re-indexed? If a new page is added, how does it get added to the index? If the site search solutions that you’re evaluating cannot provide great answers to those questions, it may be time to look elsewhere.
What if your site search engine could connect outcomes like signups, conversions, and sales to the search queries that led to that result? Google optimizes ad results in precisely this fashion — displaying ads with the highest conversion rates. It’s available for site search as well, but to do that your search provider needs to be able to connect to business data (ideally in real time) to continually optimize results based on user behavior.
Nearly 1 in 10 search queries are misspelled. That’s a lot of misspelling! Your site search tool needs to know how to handle misspellings, or many of your customers will leave your site mistakenly believing you don’t offer what they need. Ideally, a search engine should include typo tolerance: a “did you mean?” feature, fuzzy search, spell checking, or at the very least multiple possible results based on similar keyword searches.
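As a rough illustration of a “did you mean?” feature, here is a minimal Python sketch that uses the standard library's difflib to find indexed terms that closely resemble a misspelled query. The vocabulary, the function name did_you_mean, and the 0.75 similarity cutoff are all assumptions made for the example; a production search engine would typically use its own index and a tuned edit-distance model.

```python
from difflib import get_close_matches

# Hypothetical vocabulary; in practice this would come from the terms in your search index.
INDEXED_TERMS = ["sneakers", "sandals", "running shoes", "raincoat", "backpack"]


def did_you_mean(query, vocabulary=INDEXED_TERMS, max_suggestions=3, cutoff=0.75):
    """Return indexed terms that closely resemble a possibly misspelled query."""
    normalized = query.strip().lower()
    if normalized in vocabulary:
        return []  # exact match, nothing to correct
    # get_close_matches ranks candidates by similarity ratio, best first.
    return get_close_matches(normalized, vocabulary, n=max_suggestions, cutoff=cutoff)


if __name__ == "__main__":
    print(did_you_mean("sneekers"))
```

Raising the cutoff makes suggestions stricter, and lowering it makes them more forgiving, at the cost of more irrelevant matches.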
In old-school search, users type in a query and hit return to see results. Consumers, however, expect at minimum a Google- or Amazon-like search with suggestions (also called autocomplete or autosuggest) as they type. Dynamic search suggestions not only provide instant gratification, but they also help sort out misspellings faster (see “spell checking” above).
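To make the idea of suggestions-as-you-type concrete, the following Python sketch does simple prefix matching against a sorted list of popular queries. The query list and the function name suggest are hypothetical; real autocomplete services usually also rank suggestions by popularity and tolerate typos, which this sketch does not attempt.

```python
from bisect import bisect_left

# Hypothetical list of popular queries; sorted so that prefix matches are contiguous.
POPULAR_QUERIES = sorted(
    ["running shoes", "running shorts", "rain jacket", "sandals", "sneakers"]
)


def suggest(prefix, queries=POPULAR_QUERIES, limit=5):
    """Return up to `limit` queries that start with the typed prefix."""
    prefix = prefix.lower()
    if not prefix:
        return []
    # Binary search jumps to the first query that could match the prefix.
    start = bisect_left(queries, prefix)
    matches = []
    for candidate in queries[start:]:
        if not candidate.startswith(prefix):
            break  # sorted order means no later query can match
        matches.append(candidate)
        if len(matches) == limit:
            break
    return matches


if __name__ == "__main__":
    print(suggest("run"))
```

Calling suggest on each keystroke is what gives users the feeling of instant feedback, so the lookup has to stay cheap even for large query lists.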
You say “sneakers,” I say “running shoes.” Your users are typing in search terms that may be different from how your site is optimized. At a minimum, a modern search engine should support synonyms and/or “fuzzy search.” Newer search engines are starting to use vectors, too, a mathematical approach to understanding language that will eventually supersede the need to build synonym libraries. However, until vectors are broadly available, your site search should support a synonym library.
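A synonym library can be as simple as groups of terms that should match one another. The Python sketch below expands a query into all of its known variants before searching; the synonym groups and the function name expand_query are invented for illustration and do not reflect any particular product's format.

```python
# Hypothetical synonym library; each group lists terms that should match one another.
SYNONYM_GROUPS = [
    {"sneakers", "running shoes", "trainers"},
    {"couch", "sofa"},
]


def expand_query(query, groups=SYNONYM_GROUPS):
    """Return the query plus any synonyms, so that all variants can be searched."""
    normalized = query.strip().lower()
    expanded = {normalized}
    for group in groups:
        if normalized in group:
            expanded |= group
    return sorted(expanded)


if __name__ == "__main__":
    print(expand_query("Sneakers"))
```

The expanded terms would then be sent to the search engine as alternatives, so a shopper typing “sneakers” still finds products labeled “running shoes.”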
Search suggestions (see above) are a very common search UI feature. Now many companies are using instant search on their sites, where search results change in real time as users type in their queries. Instant search works well for more “visual” product sites. It may be worth A/B testing your site search (more on that below) to see what works best for your click-through rate (CTR).
How do you know if your site visitors are finding what they need? Your site search solution needs to have metrics, or at least be able to work with third-party solutions like Google Analytics or another business intelligence tool, so you can analyze how users are interacting with search results. This can help you determine whether certain results need to be boosted, how search trends are changing over time, which queries are returning poor results, and more.
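As a sketch of the kind of metrics worth tracking, the Python below aggregates a search log into per-query search counts, click-through rate, and zero-result counts. The event format (the keys query, result_count, and clicked) is an assumption made for the example; your analytics tool will have its own schema.

```python
from collections import defaultdict


def summarize_search_log(events):
    """Aggregate per-query searches, click-through rate, and zero-result counts.

    Each event is a dict with hypothetical keys:
    'query' (str), 'result_count' (int), 'clicked' (bool).
    """
    stats = defaultdict(lambda: {"searches": 0, "clicks": 0, "zero_results": 0})
    for event in events:
        entry = stats[event["query"].strip().lower()]
        entry["searches"] += 1
        entry["clicks"] += 1 if event["clicked"] else 0
        entry["zero_results"] += 1 if event["result_count"] == 0 else 0
    # Add the click-through rate (clicks divided by searches) to each query's stats.
    return {
        query: {**entry, "ctr": entry["clicks"] / entry["searches"]}
        for query, entry in stats.items()
    }


if __name__ == "__main__":
    log = [
        {"query": "Shoes", "result_count": 12, "clicked": True},
        {"query": "shoes", "result_count": 12, "clicked": False},
        {"query": "sofa bed", "result_count": 0, "clicked": False},
    ]
    for query, summary in summarize_search_log(log).items():
        print(query, summary)
```

Queries with many searches but a low click-through rate, or with frequent zero results, are usually the first candidates for boosting, synonyms, or new content.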
What if, for a given query, you changed results? Would you have better or worse click-throughs, conversions, sales, or user satisfaction? A/B testing can help. Tests could be performed on everything from search terms to how your data has been indexed to the search results design. A modern search solution should feature search A/B testing and provide guidance on what search algorithm helps your company improve its bottom-line results based on whatever criteria you’ve established.
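One common building block of search A/B testing is assigning each visitor to a variant in a stable way, so the same person always sees the same ranking. The Python sketch below does this by hashing a user ID; the experiment name, variant labels, and function names are hypothetical, and a real test would also need a statistical significance check before declaring a winner.

```python
import hashlib


def assign_variant(user_id, experiment="ranking-test", variants=("control", "treatment")):
    """Deterministically assign a user to a variant so they see consistent results."""
    # Hashing the experiment name with the user ID keeps assignments independent
    # across experiments while staying stable for a given user.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def conversion_rate(conversions, searches):
    """Conversions per search; returns 0.0 when there is no traffic yet."""
    return conversions / searches if searches else 0.0


if __name__ == "__main__":
    print(assign_variant("user-42"))
    print(conversion_rate(5, 100))
```

Comparing the conversion rate of each variant over the same period shows which ranking performs better against the criteria you have chosen.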
Your site search should have a way to manage how search results are ranked. For example, you may want to exclude documents from being indexed. Or you may want to boost specific products or certain types of content (e.g., blogs) and lower ranking for other types of content (e.g., comments). These are powerful ways to impart business knowledge into your search results.
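The boosting and exclusion rules described above can be pictured as a small post-processing step on scored results. This Python sketch is only an illustration: the content types, multipliers, and the function name apply_rules are assumptions, and hosted search products generally expose such rules through their own dashboards or APIs.

```python
# Hypothetical rules: score multipliers per content type, plus types to exclude entirely.
BOOSTS = {"product": 2.0, "blog": 1.5, "comment": 0.5}
EXCLUDED_TYPES = {"internal"}


def apply_rules(results, boosts=BOOSTS, excluded=EXCLUDED_TYPES):
    """Drop excluded content types, scale scores by type, and re-sort."""
    ranked = []
    for result in results:
        if result["type"] in excluded:
            continue
        multiplier = boosts.get(result["type"], 1.0)  # unlisted types are left unchanged
        ranked.append({**result, "score": result["score"] * multiplier})
    return sorted(ranked, key=lambda r: r["score"], reverse=True)


if __name__ == "__main__":
    raw = [
        {"id": "c1", "type": "comment", "score": 1.0},
        {"id": "p1", "type": "product", "score": 0.6},
        {"id": "x1", "type": "internal", "score": 5.0},
    ]
    print([r["id"] for r in apply_rules(raw)])
```

In this example the product outranks the comment after boosting even though its raw relevance score was lower, and the internal document never appears.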
Speed is an invisible feature, but an incredibly important one. Amazon reportedly found that every additional 100 milliseconds of latency cost it about 1% in sales. That’s not just true for online stores. Your users are accustomed to Google-like speed, and anything less is likely to send them elsewhere. Your search tool or search provider should be returning results in milliseconds.
Documents (.doc, .docx, .pdf) available for download on your site, whether they’re gated or not, should be indexable as well. If you don’t want them indexed, you should be able to manage that through rules (see the ranking rules above).
If your search provider doesn’t support machine learning algorithm ranking out of the box, look elsewhere. It’s not a nice-to-have anymore. It’s a must-have, especially for sites with large indexes. Search engines can use reinforcement learning to score search results and user behavior to automatically improve results over time. A smart search engine should know when a user converts (signups, shopping cart, sale, download, or something else) on a search to boost results like that for future queries.
Built-in personalization features and/or connecting to third-party personalization solutions should be on the list as well. Search personalization creates contextual profiles for individual visitors to personalize their search results and display relevant content. Personalize results based on user preferences, location, gender, past purchase history, product type, and more.
Allowing your customers to filter results by price, content type, author, or other factors is very useful, especially for sites that have hundreds or thousands of records. Your search platform should support both search facets and filters to help customers narrow down results to find exactly what they need.
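To show how filters and facets differ, here is a minimal Python sketch: a filter narrows the result set, while facet counts tell the user how many results fall under each value. The record fields and the function names filter_results and facet_counts are made up for the example.

```python
from collections import Counter


def filter_results(results, **criteria):
    """Keep only results whose fields equal every given criterion."""
    return [r for r in results if all(r.get(k) == v for k, v in criteria.items())]


def facet_counts(results, field):
    """Count how many results fall under each value of a field."""
    return Counter(r[field] for r in results if field in r)


if __name__ == "__main__":
    records = [
        {"title": "Trail Runner", "type": "product", "brand": "Acme"},
        {"title": "Road Runner", "type": "product", "brand": "Zoom"},
        {"title": "How to pick running shoes", "type": "blog", "brand": "Acme"},
    ]
    print(facet_counts(records, "type"))
    print(filter_results(records, type="product", brand="Acme"))
```

A typical search UI shows the facet counts next to each filter option, then applies the filter when the user selects one.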
Does your search engine support Latin-script languages (think English, French, Spanish, etc.) and multi-byte, symbol-based queries (Chinese, Japanese, etc.)? Moreover, are the machine learning models your search provider offers multilingual? You may not need it today, but when you do, you’ll want a search engine that can scale to meet the needs of new audiences.
Companies like Amazon, Reddit, and AliExpress can afford to hire full-time search engineers and site reliability engineers (SREs) to run their site search (Amazon alone has over 1,000 people on its search team!). Most organizations don’t have the luxury of hiring a battalion of engineers. Many others would prefer to put their engineering talent to work in other parts of the business.
There’s a spectrum of needs, and where your company and use case fall on it can largely help make this decision. Every use case is different, so take this only as a way to frame your thinking.
Which is best for site search?
✅ SaaS: There’s nothing for you to manage. The service is fully managed so you can sleep better at night.
❌ On-Prem: You are responsible for upkeep. One or more engineers or IT staff are required. For search that directly drives revenue (e.g., e-commerce product pages), this also means you need people on call 24/7.
✅ SaaS: Your provider should be set up on elastic infrastructure that can expand or contract to meet demand.
❌ On-Prem: This can be set up on elastic infrastructure, but you’ll need to size servers and CPUs accordingly to manage data I/O. There’s more management overhead, but also more control.
✅ SaaS: Hosting company should have a service-level agreement to manage upgrades or outages with an agreed-upon time frame.
❌ On-Prem: Internal SLA plus SLA from your IaaS or PaaS provider.
🟨 SaaS: Depending on the security and privacy level required, many SaaS companies encrypt data (both in transit and at rest) and offer SSO and other security tooling.
🟨 On-Prem: You have a lot more control over how you want to secure your service, but there are extra costs to do so.
✅ SaaS: All-inclusive pricing.
❌ On-Prem: Infrastructure costs will be smaller, but that is the only saving, and infrastructure is arguably the smallest of all costs. Engineering, DevOps, monitoring, on-call staff, etc. will add up fast.
✅ SaaS: Generally speaking, this is where SaaS software shines. Your search provider manages upgrades and improvements.
❌ On-Prem: Your team will need to upgrade software during a scheduled maintenance window. For many businesses, this happens only once every 6 to 12 months.
🟨 SaaS: Managed for you and should be ready to scale with high availability (HA) and failover.
🟨 On-Prem: With a large enough team, you should be able to manage uptime and build high availability and failover into your architecture. Search is generally equal to or more complex than database management, so typically you should allocate resources as you would with DBAs.
❌ SaaS: If the hosting provider has a community, it can be incredibly valuable especially for new customers.
✅ On-Prem: If you’re using an open source solution, this is where it can really shine. Popular OSS projects have large communities that can provide support, best practices, and troubleshooting.
Related to the question above of hosting or going on-prem is whether the solution should be open source or not. If you choose to use a free open source search engine it typically means you’ll be going on-prem (see above). (To add to your choices, there are open source options, most notably Elastic, that offer SaaS services as well. But beware, many open source companies are rethinking their licenses.)
For some companies who want to use open source and have the capacity to manage it in-house, this is a small price to pay to support open source development. We love open source ourselves and have open sourced many modules.
For many businesses, however, the complexity of managing an open source search engine on premises, the need for high-availability, and the desire to receive enterprise-level support outweigh the value of using free open source software (FOSS) for site search. This is a business question and one that’s worth asking.
Most site search is priced on the number of monthly queries and/or the total number of records stored in the search index. There are a great many other features that may be marked up (for example, some search providers require a premium payment for machine learning). Questions you will want to answer before selecting a provider:
Who will be managing your site search software? Will it be developers only, or will your marketers, SEO specialists, product managers, and others also participate? Who will be responsible for ensuring it’s running the way you want and improving user satisfaction?
In-house configuration of open source software typically requires search engineering, which is difficult even for experienced engineers. For service-based delivery, the work of maintenance and optimization can potentially fall to employees without technical skills.
Even after site search is running and optimized, you can’t ignore it. It requires continual optimization to meet the demands of customers on your site. A best practice is to determine your long-term needs and find a solution that will work for your team.
As mentioned earlier, site search security is important if your index contains sensitive data or you’re limiting access to your content to select users. This point goes beyond the question of how to find the right search solution for your team, but you should determine how much it matters to your organization before selecting a search platform.
Whew, you made it all the way through. Congratulations. Was this guide helpful? Please drop us a line to let us know.
If you’re considering a new solution, be sure to try Sajari. We offer a fully-functional 14-day free trial, or we would be delighted to provide a live demo to your team.