How do I prevent pages from being crawled?

You can add the data-sj-noindex attribute anywhere in a page and it will not be indexed. Most commonly it is defined in the <head> of an HTML page, as follows:

  1. Locate the <head> tag of the page you want to prevent from being crawled.
  2. Add the following code within the <head>:
    <meta name="robots" content="noindex" data-sj-noindex="">
  3. Save the changes. The crawler will ignore this page next time it comes across it.
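
Putting the steps together, here is a minimal sketch of a page with the tag in place. Everything other than the meta tag itself is illustrative:

    <!DOCTYPE html>
    <html>
      <head>
        <!-- This meta tag tells the crawler not to index the page -->
        <meta name="robots" content="noindex" data-sj-noindex="">
        <title>Example page</title>
      </head>
      <body>
        <p>Page content…</p>
      </body>
    </html>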

Additionally, you can use crawling rules to programmatically exclude sections or specific pages of your website. You can also exclude individual pages from indexing via the data sources tab of the admin console.

How do I hide a field in a search interface?

Background

When you generate an interface via the console for a Site Search collection, we return title, description, URL, and (optionally) image in the search results. In some instances, you might want to hide the title, description, or URL.

Limitation

Our default interface uses the URL field for click-tracking, and it must be returned in the response; otherwise, click-tracking won’t function. Hence, if you try to remove the URL field, the interface returns an error:

    tracking field 'url' missing from result.
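
For example, restricting the returned fields to only ‘title’ and ‘description’ (a sketch using the “fields” parameter covered in the instructions below) would trigger that error, because ‘url’ is omitted:

    // Sketch: 'url' is missing from "fields", so the interface cannot
    // perform click-tracking and responds with the error above.
    values: { "q.override": true, "resultsPerPage": "10", "q": getUrlParam("q"), "fields": "title,description" }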

Instructions

To hide the ‘title’ or ‘description’ field from the search interface:

  1. Generate an interface from the Integrate section in the console.
  2. After choosing the relevant options and generating an interface, click on “View code”.
  3. Add the “fields” parameter to the values object. The example below will only return and render ‘title’ and ‘url’:

values: { "q.override": true, "resultsPerPage": "10", "q": getUrlParam("q"), "fields": "title,url" }
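
With this setting, ‘description’ is no longer returned or rendered, while ‘url’ remains available for click-tracking (see the limitation above). The same pattern applies to any combination: list only the fields you want to show, and always include ‘url’.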