Synonyms and spell correction

Autocorrection of spelling mistakes, synonym matching and contextual understanding are critical to our high-performance site search. How does your site search engine perform?

Introduction

Language is tricky and people often misspell words, so the literal meaning of a query can easily be misleading or match nothing at all. The human mind has an amazing ability to fill in the gaps, correct mistakes and understand things that can't even be written down (sarcasm, etc.). Search needs to try to do these things too, but that's not as easy as it sounds. Many search engines become terribly slow when features such as synonyms and "fuzzy search" (more on this below) are turned on, and others don't support these features at all.

Sajari offers a range of features that cover not only synonyms and fuzzy search, but also contextual forks (e.g. where the text components of a query have ambiguous meaning). Below is a brief outline of these search features.



Synonyms

You say "clever", I say "intelligent". You say "cost", I say "pricing". Language is full of words with similar or identical meaning. Synonyms help a search engine translate a query with this in mind. Unlike other search engines, our synonyms don't have to be equivalent: each one can be weighted as more or less important. This means you can control what people see in the results. If you want people searching for "surface pro" to see results containing "ipad" above those containing "surface pro", you can do that by setting a weight above 1.0, and vice versa. Note also that synonyms are not reversible by default, so you control whether each one applies in both directions.
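To make weighting and directionality concrete, here is a minimal Python sketch. It is not Sajari's API: the `SynonymTable` class, its `add` and `expand` methods, and the weights used are all hypothetical and chosen only for illustration. It assumes the original query term carries an implicit weight of 1.0.

```python
from collections import defaultdict


class SynonymTable:
    """Hypothetical in-memory synonym table (illustration only, not Sajari's API)."""

    def __init__(self):
        # Maps a query term to {synonym: weight}.
        self._rules = defaultdict(dict)

    def add(self, term, synonym, weight=1.0, bidirectional=False):
        # Rules are one-way unless bidirectional is requested explicitly.
        self._rules[term][synonym] = weight
        if bidirectional:
            self._rules[synonym][term] = weight

    def expand(self, query):
        """Return (term, weight) pairs, highest weight first.

        The original query is assumed to have weight 1.0.
        """
        expanded = {query: 1.0}
        for synonym, weight in self._rules.get(query, {}).items():
            expanded[synonym] = weight
        return sorted(expanded.items(), key=lambda pair: pair[1], reverse=True)


table = SynonymTable()
table.add("surface pro", "ipad", weight=1.5)                  # one-way, boosted
table.add("cost", "pricing", weight=0.8, bidirectional=True)  # two-way, demoted

print(table.expand("surface pro"))
print(table.expand("ipad"))
```

In this sketch a search for "surface pro" ranks "ipad" matches first because of the 1.5 weight, while a search for "ipad" is left untouched because the rule was not declared bidirectional.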

Synonyms are more than just a language construct; they're also a way to translate your visitors' queries to match your content. To help with this, we provide statistics on search queries so you can see which queries are performing poorly and create synonyms as required. This is incredibly useful when visitor queries don't match your content. Consider the search stats below: more searches were made for the term "mesahighdensity" than for the term "mesa high density", but the former has zero click-throughs. You might think spell checking and fuzzy query matching should fix this automatically, but tradeoffs between speed and coverage must be made, so a query like this may not be corrected. Keep in mind that on other occasions there may be much less overlap between two terms you want to use as synonyms.

[Image: search statistics showing queries that need synonyms]
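The idea of mining query statistics for synonym candidates can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `QueryStat` record, the `synonym_candidates` helper, its thresholds, and the search and click counts are invented and are not taken from the stats above or from any Sajari export format.

```python
from dataclasses import dataclass


@dataclass
class QueryStat:
    """Hypothetical per-query statistics record."""
    query: str
    searches: int
    clicks: int


def synonym_candidates(stats, min_searches=10, max_ctr=0.0):
    """Return frequently searched queries with a poor click-through rate.

    A query qualifies when it was searched at least `min_searches` times and
    its clicks/searches ratio is at or below `max_ctr`. Most searched first.
    """
    poor = [
        s for s in stats
        if s.searches > 0
        and s.searches >= min_searches
        and s.clicks / s.searches <= max_ctr
    ]
    return sorted(poor, key=lambda s: s.searches, reverse=True)


# Made-up numbers, for illustration only.
stats = [
    QueryStat("mesahighdensity", searches=40, clicks=0),
    QueryStat("mesa high density", searches=25, clicks=12),
    QueryStat("mesa", searches=3, clicks=0),
]

for stat in synonym_candidates(stats):
    print(stat.query, stat.searches, stat.clicks)
```

With these sample numbers only "mesahighdensity" is flagged: it is searched often and never clicked, which makes it a natural candidate for a synonym pointing at "mesa high density".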

Fuzzy spelling

Fuzzy spelling (or approximate string matching) deals with the way people misspell words and phrases, and with how to autocorrect them efficiently. There are many ways to do this, some more efficient than others, but all of them balance speed, memory overhead and correction efficiency. We've spent quite some time on this problem and have open sourced our fuzzy spelling package, which details the accuracy and speed.

Our fuzzy matching algorithm is very fast; in fact, its effect on search speed is negligible, even for larger sites or apps. It's also not a fixed model: it grows and adapts to your content as it is added. This means it handles any character sequence, including jargon and words from various languages. Popular words are more likely to be used as replacements. As soon as the number of occurrences of a given term exceeds a certain threshold, that term is automatically added to your fuzzy dictionary.
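The adaptive behaviour described above can be approximated with a short Python sketch. It is not the open-sourced package and makes several assumptions: the `FuzzyDictionary` class, its `threshold` and `max_distance` parameters, and the use of plain Levenshtein edit distance are illustrative choices, and the brute-force scan over all known terms is far slower than a production approach would be.

```python
from collections import Counter


def edit_distance(a, b):
    """Textbook Levenshtein distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete from a
                current[j - 1] + 1,            # insert into a
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


class FuzzyDictionary:
    """Hypothetical adaptive spelling dictionary (illustration only)."""

    def __init__(self, threshold=2, max_distance=2):
        self.threshold = threshold        # assumed value
        self.max_distance = max_distance  # assumed value
        self.counts = Counter()

    def index(self, text):
        # Count every whitespace-separated term as content is added.
        self.counts.update(text.lower().split())

    def known(self, term):
        # A term joins the dictionary once its count exceeds the threshold.
        return self.counts[term] > self.threshold

    def correct(self, term):
        term = term.lower()
        if self.known(term):
            return term
        candidates = [
            (edit_distance(term, word), -count, word)
            for word, count in self.counts.items()
            if count > self.threshold
        ]
        candidates = [c for c in candidates if c[0] <= self.max_distance]
        # Closest term wins; ties go to the more popular term.
        return min(candidates)[2] if candidates else term


fuzzy = FuzzyDictionary(threshold=2)
for _ in range(3):
    fuzzy.index("mesa high density storage")
fuzzy.index("mess")  # seen only once, so not in the dictionary yet

print(fuzzy.correct("densty"))
print(fuzzy.correct("mess"))
```

Because the dictionary is built from indexed content rather than a fixed word list, jargon and non-English terms are handled the same way as any other term once they occur often enough.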

Below is an example of a fuzzy match correction of three mistakes.

[Image: fuzzy match correction of three mistakes]


