Pipelines are configurable server-side templates that govern how queries and records are processed. Query pipelines can be used to change the ranking algorithm, while record pipelines allow records to be modified before they are saved. Both simplify complex functionality at the API level.
Combine any API functionality into a pipeline
Compare different algorithms to understand performance
Complexity transformed to a single pipeline function
Query pipelines take input information and process it through a series of actions to create a search query, which is then executed to return search results. The key actions available in a query pipeline are:
You can completely customise the ranking algorithm to use different boosts, make fields more important, add filters, control index learning influence and more.
You can control query settings that are not part of the ranking algorithm, such as pagination.
Take input data and transform it to edit or create additional data, such as whether and when to autocomplete a query. In general, any query understanding action is a transform of some sort, for example classifying or vectorising a query.
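To make the idea of a series of actions concrete, here is a minimal sketch of a query pipeline in Python. It is an illustration only, not the product's actual API: the `Query` class, the action functions, the parameter names (`q`, `page`, `per_page`, `in_stock_only`) and the filter syntax are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Query:
    """Hypothetical search query that pipeline actions build up step by step."""
    text: str
    boosts: dict[str, float] = field(default_factory=dict)
    filters: list[str] = field(default_factory=list)
    page: int = 1
    per_page: int = 10


# An action receives the raw input parameters and the query so far,
# and returns the (possibly modified) query.
Action = Callable[[dict, Query], Query]


def transform_normalise(params: dict, query: Query) -> Query:
    # Transform action: clean up the query text before it is used.
    query.text = query.text.strip().lower()
    return query


def ranking_boost_title(params: dict, query: Query) -> Query:
    # Ranking action: make the (assumed) "title" field more important.
    query.boosts["title"] = 2.0
    return query


def ranking_filter_in_stock(params: dict, query: Query) -> Query:
    # Ranking action: add a filter only when the caller asks for it.
    if params.get("in_stock_only"):
        query.filters.append("in_stock = true")
    return query


def settings_pagination(params: dict, query: Query) -> Query:
    # Settings action: non-ranking options such as pagination.
    query.page = int(params.get("page", 1))
    query.per_page = int(params.get("per_page", 10))
    return query


def run_query_pipeline(params: dict, actions: list[Action]) -> Query:
    """Apply each action in order to turn input parameters into a query."""
    query = Query(text=params.get("q", ""))
    for action in actions:
        query = action(params, query)
    return query


pipeline = [
    transform_normalise,
    ranking_boost_title,
    ranking_filter_in_stock,
    settings_pagination,
]
query = run_query_pipeline(
    {"q": "  Red Shoes ", "page": "2", "in_stock_only": True}, pipeline
)
```

The caller only supplies simple input parameters. The boosts, filters and settings live in the pipeline definition, which is where the simplification at the API level comes from.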
Optimising ranking algorithms is hard work. In many situations a small change can improve some queries while negatively impacting others. It's not always obvious what to change without a large set of performance data to learn from, and even with the data it's still not easy. Pipelines allow us to template some of the more complex functionality based on what we've learned. They are also designed to help simplify our API, which is fairly complex.
Record pipelines allow record information to be transformed (modified or added to) before the record is saved. These are incredibly useful for extracting information for later use in query ranking algorithms. Examples include classification, vectorisation, tagging, entity extraction and summarisation. Inbound records may also be used to teach spelling correction and more.
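As a rough sketch of the same idea for records, the example below runs a record through two transforms before it would be saved. The field names (`body`, `class`, `summary`), the keyword-based classifier and the first-sentence summary are toy stand-ins chosen for illustration, not the transforms the service actually uses.

```python
from typing import Callable

Record = dict[str, str]
# A transform reads the record and returns the fields to add or overwrite.
Transform = Callable[[Record], Record]

# Hypothetical keyword list standing in for a real classification model.
SUPPORT_WORDS = {"error", "refund", "install", "help"}


def add_class(record: Record) -> Record:
    # Classification: label the record as "support" or "product".
    words = set(record.get("body", "").lower().split())
    label = "support" if words & SUPPORT_WORDS else "product"
    return {"class": label}


def add_summary(record: Record) -> Record:
    # Summarisation: naively keep the first sentence of the body.
    first_sentence = record.get("body", "").split(".")[0].strip()
    return {"summary": first_sentence}


def run_record_pipeline(record: Record, transforms: list[Transform]) -> Record:
    """Return a copy of the record enriched by each transform in order."""
    enriched = dict(record)
    for transform in transforms:
        enriched.update(transform(enriched))
    return enriched


saved = run_record_pipeline(
    {
        "title": "Fixing install issues",
        "body": "Follow these steps if the install fails. Restart first.",
    },
    [add_class, add_summary],
)
```

The original fields are kept and the derived fields are stored alongside them, so they are available later when ranking queries.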
Some record pipelines are similar to query pipelines. By using similar transforms in both record and query pipelines, it is possible to compare the resulting values in the ranking algorithm and boost based on the result. For example, the classification (class) of a record and of a query will often match for good results and not match for bad ones. This makes it useful for extracting higher-level information for use in ranking results. A simple real-life example is to determine whether a query is product related or support related and to rank search results accordingly.
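The product versus support example can be sketched as follows. This is a simplified illustration under stated assumptions: the keyword classifier, the `class` and `score` fields and the 1.5 boost factor are all hypothetical, and a real ranking algorithm would combine many more signals.

```python
# Hypothetical keyword list standing in for a real classification model.
SUPPORT_WORDS = {"error", "refund", "install", "help"}


def classify(text: str) -> str:
    # The same transform is applied to queries here and to records at save time.
    words = set(text.lower().split())
    return "support" if words & SUPPORT_WORDS else "product"


def boosted_score(base_score: float, query_class: str, record_class: str,
                  boost: float = 1.5) -> float:
    # Boost records whose class matches the class of the query.
    if query_class == record_class:
        return base_score * boost
    return base_score


# Records as they might look after a record pipeline has added "class".
records = [
    {"title": "Trail running shoes", "class": "product", "score": 1.0},
    {"title": "How to request a refund", "class": "support", "score": 1.0},
]

query_class = classify("refund help")
ranked = sorted(
    records,
    key=lambda r: boosted_score(r["score"], query_class, r["class"]),
    reverse=True,
)
```

Because the query and the records were classified with the same transform, a simple equality check is enough to decide which results to boost.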