Developer Guide

Get started

Welcome to Sajari. This guide helps you set up search for your app or e-commerce store. If you want to add search to a website and use a crawler to index its content, head over to our getting started guide for websites.

Pick your data set

First, you will need to decide what data you want to use. You can download an example data set provided by us or bring your own data.

Top 1000 movies from IMDB

It’s best to explore Sajari with a data set you are familiar with. If you aren’t bringing your own data, we are sure you will be familiar with some of the movies on the IMDB top 1000 list.

Download the example data

Bring your own

Alternatively, you can explore Sajari with your own data. The data must be available as a JSON file without nested fields.

For example, a record should look like this:

{
  "show_id": 80192018,
  "type": "Movie",
  "title": "Star Wars: Episode VIII: The Last Jedi",
  "director": "Rian Johnson",
  "cast": "Mark Hamill, Carrie Fisher, Adam Driver, Daisy Ridley, John Boyega, Oscar Isaac, Andy Serkis, Lupita Nyong'o, Domhnall Gleeson, Anthony Daniels, Gwendoline Christie, Kelly Marie Tran, Laura Dern, Frank Oz, Benicio Del Toro, Warwick Davis, Noah Segan, Jimmy Vee, Joonas Suotamo, Joseph Gordon-Levitt, Tim Rose, Paul Kasey, Matthew Sharp, Adrian Edmondson, Amanda Lawrence, Justin Theroux",
  "country": "United States",
  "date_added": "June 26, 2018",
  "release_year": 2017,
  "rating": "PG-13",
  "duration": "152 min",
  "listed_in": "Action & Adventure, Children & Family Movies, Sci-Fi & Fantasy",
  "description": "As the remnants of the Resistance flee Kylo Ren and the First Order, Rey seeks out Luke Skywalker – but he wants nothing more to do with the Force."
}
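If your own data contains nested objects, you will need to flatten it before uploading. The following is a minimal Python sketch of one way to do that. The function names, the underscore separator, and the sample "credits" field are assumptions made for illustration; they are not part of Sajari.

```python
import json


def flatten(record, parent_key="", sep="_"):
    """Return a copy of `record` with nested objects collapsed into top-level keys.

    Lists and scalar values are left unchanged. The "_" separator is an
    arbitrary choice for this sketch, not a Sajari requirement.
    """
    flat = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into the nested object and prefix its keys.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


def has_nested_fields(record):
    """Return True if any top-level value is itself an object."""
    return any(isinstance(value, dict) for value in record.values())


# Hypothetical record with one nested object.
nested = {
    "show_id": 80192018,
    "title": "Star Wars: Episode VIII: The Last Jedi",
    "credits": {"director": "Rian Johnson"},
}

flat = flatten(nested)
print(json.dumps(flat, indent=2))
```

Here the nested "credits" object becomes a single top-level "credits_director" field, so the record has the same flat shape as the example above.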

Set up your collection

Create your account

Head over to https://www.sajari.com and sign up. If you already have an account, sign in.

Create a new collection

Collections store the records that you want to search through.

They also contain the configuration associated with your data including pipelines, rules, synonyms, authorized domains, and analytics. Each collection has an associated schema that designates field names, field types, and whether a field's data is indexed for text search.

Click the Join Beta button to create a new App or Store collection.

Join beta

Uploading your data

After entering a name for your collection, you can upload your data set. Simply drag and drop the movie data set or your own file onto the screen to upload it.

Upload data

Click Generate schema. Once the file is uploaded, Sajari will infer the schema from the data you just provided.

Verify your schema

Although Sajari will do its best to identify field types, lists, and unique fields, it’s important that you verify that the fields are correct. If the fields don’t match the structure of the records you are uploading, the records will be rejected.

Verify schema
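Before uploading, it can help to check locally that your records have the field types you expect. The following Python sketch shows one way to do this. The schema here is a plain dictionary of Python types chosen for illustration; it does not reflect Sajari's schema format or type names, and whether Sajari rejects records with fields missing from the schema is an assumption.

```python
# Hypothetical expected types for a few fields of the movie record.
EXPECTED_SCHEMA = {"show_id": int, "title": str, "release_year": int}


def find_mismatches(record, schema):
    """Return a list of human-readable problems found in `record`."""
    problems = []
    for field, expected_type in schema.items():
        if field not in record:
            problems.append(f"{field}: missing")
        elif not isinstance(record[field], expected_type):
            actual = type(record[field]).__name__
            problems.append(
                f"{field}: expected {expected_type.__name__}, got {actual}"
            )
    # Assumption: fields that the schema does not know about are also a problem.
    for field in record:
        if field not in schema:
            problems.append(f"{field}: not in schema")
    return problems


good = {"show_id": 80192018, "title": "The Last Jedi", "release_year": 2017}
bad = {"show_id": 80192018, "title": "The Last Jedi", "release_year": "2017"}

print(find_mismatches(good, EXPECTED_SCHEMA))
print(find_mismatches(bad, EXPECTED_SCHEMA))
```

A common cause of mismatches is a number stored as a string, as in the second record.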

Select searchable fields and train query suggestions

Select which fields you want to use for searching, and rank them in order of priority. The order will determine the weight assigned to each of those fields for the initial configuration of the search algorithm.

Query suggestions are typically a subset of the searchable fields and will be used to train the suggestions Sajari makes when users type a search query into the search box. Common fields here are titles, product names, companies, brands, or categories.

Congratulations 🎉 You’ve completed the initial setup!

Optimize your search pipeline

In Sajari, you configure your search algorithm using pipelines. Pipelines are easily configurable YAML-based scripts that define a series of steps that are executed sequentially when indexing a record (record pipeline) or performing a query (query pipeline). The configuration of an intelligent search algorithm can be extremely complicated. Pipelines break down this problem into smaller pieces that can be easily mixed, matched, and combined to create an incredibly powerful search experience.
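To make the idea of sequential steps concrete, here is a toy Python model of a query pipeline. It is not how Sajari implements pipelines, and the step names, the synonym table, and the default page size are all hypothetical; the sketch only shows that each step receives the output of the previous one.

```python
# Hypothetical synonym table used only in this sketch.
SYNONYMS = {"film": "movie"}


def normalize(state):
    """Trim whitespace and lowercase the query text."""
    return {**state, "q": state["q"].strip().lower()}


def apply_synonyms(state):
    """Replace each query word that has an entry in SYNONYMS."""
    words = [SYNONYMS.get(word, word) for word in state["q"].split()]
    return {**state, "q": " ".join(words)}


def paginate(state):
    """Fill in default pagination values if the caller gave none."""
    return {
        **state,
        "page": state.get("page", 1),
        "results_per_page": state.get("results_per_page", 10),
    }


def run_pipeline(steps, state):
    """Run each step in order, feeding its output to the next step."""
    for step in steps:
        state = step(state)
    return state


result = run_pipeline(
    [normalize, apply_synonyms, paginate], {"q": "  Star Wars FILM "}
)
print(result)
```

Because the steps run in order, their sequence matters: in this toy model, running apply_synonyms before normalize would leave "FILM" unmatched, since the synonym table only contains lowercase words.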

Use the realtime relevance editor to edit the pipeline generated for the collection.

Realtime relevance editor

In the editor you can find the query pipeline that was generated by the onboarding process. A few things have already been set up for you automatically.

At the top of the file are a number of default steps including filters and pagination that allow you to customize those aspects of the search. Additionally, steps for spelling and synonyms have already been added and will work out of the box.

Lastly, a number of boosts have been added to account for the weighting you gave different fields in the searchable fields setup step.

Learn more about pipelines

Boosting popularity

Let’s use an example search with the term star wars (assuming you selected the movie data set above; if not, follow along with an example that matches your own data set).

The search preview will show the following results:

Results before boosting optimization

Let’s be honest, the likelihood that somebody searching for star wars is interested in “Star Wars: Episode I - The Phantom Menace” is quite low 😉.

We can fix that by taking popularity into account and giving more weight to movies that match the search term and have higher popularity.

Simply add the following step to your pipeline anywhere before the postSteps: section.

- id: range-boost
  consts:
    field:
    - value: popularity
    score:
    - value: "0.4"
    start:
    - value: "0"
    end:
    - value: "100"

The step above boosts records according to where a field's value falls within a range. In this case, the field is popularity and the maximum boost is 0.4. The boost scales linearly from the start value of 0 and reaches the full 0.4 only when popularity equals or exceeds the end value of 100.
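The linear scaling can be sketched in a few lines of Python. This is an illustration of the arithmetic only, under two assumptions: values below the start receive no boost, and the result is the boost contribution on its own. How Sajari combines that contribution with the rest of the relevance score is not covered here.

```python
def range_boost(value, start=0.0, end=100.0, score=0.4):
    """Return the boost for `value`, scaled linearly between `start` and `end`.

    Values at or above `end` get the full `score`. Values at or below
    `start` get 0 (an assumption of this sketch).
    """
    if end == start:
        raise ValueError("start and end must differ")
    fraction = (value - start) / (end - start)
    # Clamp to the range [0, 1] so the boost never exceeds `score`.
    fraction = max(0.0, min(1.0, fraction))
    return score * fraction


for popularity in (0, 25, 50, 100, 250):
    print(popularity, range_boost(popularity))
```

With the values from the step above, a popularity of 50 yields half of the maximum boost, and any popularity of 100 or more yields the full 0.4.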

Once we add this step, we can see how the results instantly change to the following.

Results after boosting optimization

Looking a lot better!

Range boost is only one of many powerful steps that can be used to improve results and ultimately create a better search experience for your customers. Using live relevance editing and seeing results in real-time makes it easy to explore the different steps and understand the impact a change will have on search results.

API Reference

The REST API enables you to sync your data continuously with Sajari.

https://sajari.com/docs/api-reference

If you want to explore and play with the API, download the OpenAPI Spec and import it into a tool like Postman.

SDKs

We are currently working on a Node SDK and plan to add support for .NET and PHP soon after. Please let us know if you require an SDK in a particular language.