Optimize Search Results

Preview Results

The Preview section enables you to check if specific records are indexed and test different search queries.

The query editor has two modes: simple and advanced. If you only want to run a search query and check whether the expected results are returned, the simple mode is all you need.

In advanced mode (⋯), you can run more complex queries and test different Pipelines and versions of those Pipelines.

Advanced query editor

Default Pipelines

When you create a Site Search collection, we create the following pipelines by default:

  • Raw: The Raw Pipeline ranks results without applying boost rules, synonyms, or any other optimizations. It ranks results by index score only, so the top result is the one that best matches the query based on content alone. We recommend the Raw Pipeline for testing purposes only. It can help you understand the impact of other Query Pipeline steps, such as boost rules or spelling correction.
  • Website: The Website Query Pipeline is the default Site Search Pipeline. It adds a few steps, including promotions, synonyms, and spelling correction. These steps improve the search experience for a standard website and allow you to promote specific pages.
  • Recent: The Recent Query Pipeline adds the same steps as the Website Pipeline with one addition: it ranks recent records higher using the datePublished field. This is useful for news sites or blogs, where current information is generally more relevant than older information. Note that the Recent Query Pipeline only works if the webpage contains the meta field datePublished as defined by schema.org.
  • Popular: The Popular Pipeline contains the same steps as the Website Pipeline with the addition of a Popular step. It ranks popular records higher using the page views data. Use this Pipeline for websites with evergreen content where popularity is relevant, such as a recipe website or a movie database. Note that the pingback code must be set up to record page views.

Each Query Pipeline is optimized for a different type of website. Depending on your use case, you might choose the standard Website Pipeline or the Recent Pipeline, which boosts recently published content.

Note: The fields (title, URL, etc.) of a search result may differ for each Pipeline depending on how it's configured. The Raw Pipeline displays all fields by default.

Query options

The advanced editor allows you to create more sophisticated queries using JSON syntax. Aside from the search term, you can specify the fields to be returned, filters to limit results, the number of results on each results page, and which page of results to return.

Property | Default | Description
q | "" | The search query.
filter | "" | Filter expression to limit results. See below for more details on filters.
resultsPerPage | "10" | Number of results to show per page.
page | "1" | The result page to be shown. For example, 3 would show the 3rd page of the results.
fields | "" | What fields are shown depends on the Pipeline configuration. Use "" to show all fields.
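If you prepare queries outside the editor, the properties above can be assembled and serialized to JSON before pasting them in. The following Python sketch is illustrative only: the helper name build_query is hypothetical and not part of any Sajari library, and it does nothing more than emit the properties listed in the table.

```python
import json

# Property names taken from the table above; anything else is rejected.
ALLOWED_OPTIONS = {"filter", "resultsPerPage", "page", "fields"}


def build_query(q, **options):
    """Return a JSON query body for the advanced editor (hypothetical helper)."""
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise ValueError(f"unknown query properties: {sorted(unknown)}")
    # Only the properties passed in are emitted; omitted ones use their defaults.
    return json.dumps({"q": q, **options}, indent=1)


print(build_query("mars", resultsPerPage=5, page=2))
```

Rejecting unknown property names is a design choice of this sketch, intended to catch typos such as "resultPerPage" before the query is run.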

Example - Running a blank query

{
 "q": "",
 "filter": "_id != ''"
}

The above example will return:

  • results for a blank query

For those results, the response will:

  • show only the fields specified by the Pipeline (e.g. the Website Pipeline returns title, description, url, and image)
  • show 10 results per page (the default)

Note: The filter "filter": "_id != ''" is required if you are running a blank query (i.e. "q":"").

Example - Filter records using operators

{
 "q": "mars",
 "filter": "domain='www.sajari.com' AND dir1='blog'",
 "fields": "",
 "resultsPerPage": 5,
 "page": 2
}

The above example will return:

  • results matching the search query "mars"
  • results filtered by the domain "www.sajari.com" AND the subdirectory "/blog"

For those results, the response will:

  • show all available fields (Click on "Expand All" to un-collapse)
  • with a limit of 5 results per page
  • starting on page 2 (results 6-10)
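The relationship between page, resultsPerPage, and the positions of the returned results is simple arithmetic. A minimal Python sketch, assuming 1-based pages as in the example above (the function name result_range is hypothetical):

```python
def result_range(page, results_per_page=10):
    """Return the 1-based (first, last) result positions shown on a page."""
    if page < 1 or results_per_page < 1:
        raise ValueError("page and results_per_page must be positive")
    first = (page - 1) * results_per_page + 1
    last = page * results_per_page
    return first, last


print(result_range(2, 5))  # (6, 10)
```

The last position is an upper bound: the final page of a result set may contain fewer records.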

Example - Filter records based on a timestamp field

{
 "q": "",
 "filter": "published_time>'2018-01-20'",
 "fields": "title,published_time"
}

The above example will return:

  • all results that were published after 20 January 2018.

For those results, the response will:

  • only show the title and published_time fields

Note: You can also specify the time as a Unix timestamp or in RFC 3339 format. An online epoch converter can translate a date and time into Unix format.
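If you prefer to compute these values yourself, the conversion can be sketched in a few lines of Python. This assumes the cutoff is midnight UTC, and that timestamp values are quoted in the filter the same way as the plain date in the example; neither assumption is confirmed here for your collection.

```python
from datetime import datetime, timezone

# Midnight UTC on 20 January 2018, the cutoff used in the example above.
cutoff = datetime(2018, 1, 20, tzinfo=timezone.utc)

unix_seconds = int(cutoff.timestamp())
rfc3339 = cutoff.strftime("%Y-%m-%dT%H:%M:%SZ")

# Assumption: values are quoted the same way as the plain date in the example.
print(f"published_time>'{unix_seconds}'")  # published_time>'1516406400'
print(f"published_time>'{rfc3339}'")       # published_time>'2018-01-20T00:00:00Z'
```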

Example - Check which records are affected by boost rules

{
 "q": "",
 "fields": "url, boost",
 "filter": "boost!='50'"
}

The above example will return:

  • all results that have a boost value other than 50.

For those results, the response will:

  • only show URL and boost values

Note: When a website is crawled, a boost value of 50 is assigned by default.

The properties used in the examples above are specific to a Site Search collection. These properties might be different if you are using a custom collection.

Filters

Filters limit the results that are returned by a search. In a search interface, filters commonly appear as tabs, checkboxes, or sliders at the side.

To make filtering easier, our crawler extracts common fields when it crawls web pages (such as the first and second directories of URLs).

Aside from commonly used fields like title, description, and og:image, our crawler also extracts other fields that can be useful for filtering. Here are some examples, assuming the page URL is https://www.sajari.com/blog/year-in-review:

  • url The full page URL: https://www.sajari.com/blog/year-in-review
  • dir1 The first directory of the page URL: blog
  • dir2 The second directory of the page URL: year-in-review
  • domain The domain of the page URL: www.sajari.com
  • lang The language of the page, extracted from the <html> element (if present).
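To make the mapping from URL to fields concrete, here is a small Python sketch that derives url, domain, dir1, and dir2 from a page URL. It only illustrates the field definitions above and is not the crawler's actual implementation; the function name derive_fields is hypothetical, and lang is omitted because it comes from the page's HTML rather than the URL.

```python
from urllib.parse import urlsplit


def derive_fields(url):
    """Derive URL-based filter fields as described in the list above."""
    parts = urlsplit(url)
    # Non-empty path segments, e.g. "/blog/year-in-review" -> ["blog", "year-in-review"]
    segments = [segment for segment in parts.path.split("/") if segment]
    return {
        "url": url,
        "domain": parts.hostname or "",
        "dir1": segments[0] if len(segments) > 0 else "",
        "dir2": segments[1] if len(segments) > 1 else "",
    }


print(derive_fields("https://www.sajari.com/blog/year-in-review"))
```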

Using Operators

When querying a field, you can use the operators listed below. Note that all values must be enclosed in single quotation marks. For example, "field boost must be greater than 10" is written as boost>'10'.

Operator | Description | Example
Equal To (=) | Field is equal to a value (numeric or string) | dir1='blog'
Not Equal To (!=) | Field is not equal to a value (numeric or string) | dir1!='blog'
Greater Than (>) | Field is greater than a numeric value | boost>'10'
Greater Than Or Equal To (>=) | Field is greater than or equal to a numeric value | boost>='10'
Less Than (<) | Field is less than a numeric value | boost<'50'
Less Than Or Equal To (<=) | Field is less than or equal to a numeric value | boost<='50'
Begins With (^) | Field begins with a string | dir1^'bl'
Ends With ($) | Field ends with a string | dir1$'og'
Contains (~) | Field contains a string | dir1~'blog'
Does Not Contain (!~) | Field does not contain a string | dir1!~'blog'

Combining expressions

It's also possible to build more complex filters by combining field filter expressions with AND/OR operators and brackets.

Operator | Description | Example
AND | Both expressions must match | dir1='blog' AND domain='www.sajari.com'
OR | One expression must match | dir1='blog' OR domain='blog.sajari.com'

For example, to match pages with language set to en on www.sajari.com or any page within the en.sajari.com domain:

(domain='www.sajari.com' AND lang='en') OR domain='en.sajari.com'
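When filters are generated by code rather than typed by hand, small helpers can keep the quoting and bracketing consistent. The Python sketch below is one possible approach, not an official client: the helper names expr, all_of, any_of, and group are hypothetical, and because quote escaping is not covered in this guide, values containing a single quote are rejected rather than escaped.

```python
OPERATORS = {"=", "!=", ">", ">=", "<", "<=", "^", "$", "~", "!~"}


def expr(field, operator, value):
    """Build one field expression with the value in single quotes."""
    if operator not in OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    text = str(value)
    if "'" in text:
        # Escaping rules are not covered in this guide, so refuse instead of guessing.
        raise ValueError("value must not contain a single quote")
    return f"{field}{operator}'{text}'"


def all_of(*expressions):
    """Join expressions so that every one must match."""
    return " AND ".join(expressions)


def any_of(*expressions):
    """Join expressions so that at least one must match."""
    return " OR ".join(expressions)


def group(expression):
    """Wrap an expression in brackets to control evaluation order."""
    return f"({expression})"


filter_expression = any_of(
    group(all_of(expr("domain", "=", "www.sajari.com"), expr("lang", "=", "en"))),
    expr("domain", "=", "en.sajari.com"),
)
print(filter_expression)
# (domain='www.sajari.com' AND lang='en') OR domain='en.sajari.com'
```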