Top five Apache Solr alternatives for highly scalable site search.
As a legacy search solution, Solr is one of the top choices on the market. But newer technology is poised to challenge Solr's market share. In this article, we'll walk through why companies choose Solr and share some more modern search alternatives.
The Apache Solr project, based on Apache Lucene, was originally created by CNET to provide full-text search across the company’s massive media database. Since 2004, Solr has seen many iterations and improvements and has built an enormous community of contributors. It’s a powerful and scalable search engine written in Java with a full complement of libraries for C#, PHP, Python and other languages, and offers HTTP REST-like APIs with support for both XML and JSON.
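To make the HTTP API concrete, here is a minimal sketch of how a query URL for Solr's `/select` request handler might be assembled. The host and port (`localhost:8983`, Solr's default) and the collection name `articles` are assumptions for illustration; the `q`, `wt`, and `rows` parameters are standard Solr query parameters. The sketch only builds the URL and does not contact a server.

```python
from urllib.parse import urlencode


def build_solr_select_url(base_url, collection, query, fmt="json", rows=10):
    """Build a URL for Solr's /select request handler.

    `wt` chooses the response writer ("json" or "xml"); `rows` caps the
    number of documents returned.
    """
    params = {"q": query, "wt": fmt, "rows": rows}
    return f"{base_url}/solr/{collection}/select?{urlencode(params)}"


# Hypothetical local Solr instance and collection name.
url = build_solr_select_url("http://localhost:8983", "articles", "title:solr")
print(url)
```

Passing `fmt="xml"` instead would request the same results as XML.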
For larger use cases with a team of dedicated search engineers, Solr is a solid search software solution. But for businesses that want to allocate engineering resources differently, Solr’s strengths — scalability, configurability, extensibility — can also be liabilities. Even small projects can require days of engineering to get up and running. However, today, alternative search solutions built on cloud-native architecture can offer the same degree of configurability and scale in much less time.
Solr can be set up either in standalone mode or in SolrCloud mode. As the name implies, SolrCloud offers more robust features to help businesses scale, including index replication, load balancing, failover, and distributed queries, all coordinated with the help of ZooKeeper.
However, many organizations prefer to run Solr in standalone mode because they have neither heavy indexing needs nor high-volume use cases. Running Solr in standalone mode means manually configuring features such as failover, index replication, additional nodes for high availability, sharding, and more (how much you require depends on the use case, of course).
Companies with plenty of in-house engineers can afford to manage Solr infrastructure on-premises. Others have outsourced the problem by using Solr-as-a-service solutions or by partnering with cloud providers for managing Solr on shared resources.
Regardless of which direction you go, after you have determined the best Solr hosting solution for your business, the question remains whether Solr is the best search engine for your use case.
Solr was built for enterprise use cases with very specific edge cases. So, it makes perfect sense for a company like Morgan Stanley in a highly regulated space to use a tool like Solr for custom solutions. However, for most businesses, adding search can be much easier and less resource-intensive.
Solr is overly configurable for more mainstream use cases. Even for mission-critical search use cases, there are arguably better solutions today.
As stated at the beginning of this article, many of Solr’s strengths are also its weaknesses:
If you’re building the next DuckDuckGo or need to deploy on-prem for security, Solr is certainly worth a look. For most site search, ecommerce search, and app search use cases, there are many other options. If Solr has become too bulky, is too much to manage, or is showing its age, it’s time to look for alternatives.
And speaking of that, let’s have a look at some Apache Solr alternatives.
Sajari is a site search engine built from the ground up for developers. It offers tremendous flexibility and ease of configuration built on top of a cloud-native architecture for elastic scale. Projects that can take weeks or months with Solr can be accomplished in hours or days on Sajari without the need for a battalion of engineers. Because it is fully hosted and battle-tested with thousands of queries per second, you can spend more time working on your core business without having to manage search scale.
Sajari features include:
Best use cases:
Sajari approaches search differently from legacy search engines. Whereas legacy platforms like Lucene have built immutable search indexes, Sajari treats search more like a database, which offers some advantages in near real-time read/write speed and data synchronization. It also has built-in machine learning (more specifically, reinforcement learning) for continuous improvement of search performance.
Additionally, Sajari has taken a different approach to configuration and extensibility, moving configuration out of XML config files (such as Solr’s solrconfig.xml) and into a core, built-in feature called pipelines. Pipelines are YAML-based scripts that define a series of steps which are executed sequentially when indexing a record (record pipeline) or performing a query (query pipeline). With pipelines, you can configure the search algorithm to improve search relevance or even A/B test different algorithms to determine which one provides the best search experience.
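The idea of a pipeline as an ordered series of steps can be sketched in a few lines of Python. This is not Sajari's pipeline syntax or API; the step names (`lowercase_query`, `expand_synonyms`, `boost_field`) and the synonym table are hypothetical, chosen only to show how each step receives the output of the previous one.

```python
from typing import Callable, Dict, List

# A step takes the current query state and returns an updated copy.
Step = Callable[[Dict], Dict]


def lowercase_query(state: Dict) -> Dict:
    """Normalize the raw query text to lowercase."""
    return {**state, "q": state["q"].lower()}


def expand_synonyms(state: Dict) -> Dict:
    """Split the query into terms and append known synonyms (toy table)."""
    synonyms = {"tv": ["television"]}
    terms: List[str] = []
    for term in state["q"].split():
        terms.append(term)
        terms.extend(synonyms.get(term, []))
    return {**state, "terms": terms}


def boost_field(state: Dict) -> Dict:
    """Record a relevance boost for matches in the title field."""
    return {**state, "boosts": {"title": 2.0}}


def run_pipeline(steps: List[Step], state: Dict) -> Dict:
    """Execute the steps sequentially, threading the state through each."""
    for step in steps:
        state = step(state)
    return state


# Hypothetical query pipeline: the order of the list is the execution order.
QUERY_PIPELINE = [lowercase_query, expand_synonyms, boost_field]
result = run_pipeline(QUERY_PIPELINE, {"q": "Smart TV"})
print(result)
```

Under this model, A/B testing two algorithms amounts to defining two step lists and routing a share of queries to each.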
Perhaps no alternative is more similar to Solr than Elasticsearch. Like Solr, Elasticsearch is an API product built on the same Lucene core and is also available as an open source project (but be mindful of Elastic's open source license changes). Elasticsearch is a specialized search engine that has built a massive community around logging analytics projects with its popular ELK stack.
Like Solr, Elasticsearch offers tremendous flexibility. For best results, it requires teams of specialist engineers who have the time, resources, and capabilities to eke out higher performance or develop custom features. Elasticsearch is built for scale and is ideal for projects that generate massive amounts of data, such as log analysis. This is where Lucene-based search shines, because log data does not change once it is written. Compared with Solr, Elasticsearch is much easier to configure, and basic search projects can be up and running quickly.
Elasticsearch features include:
Best use cases:
Elasticsearch is available either as a free open source download or fully hosted through Elastic or other providers (including AWS), so there are plenty of options for getting a project started. It is a great fit for logging and analytics projects, but less so for a “pure” search engine use case such as site search or ecommerce search, where it’s overly complex and requires a tremendous amount of expertise.
Like Sajari, Algolia is a new search engine built from the ground up. Originally, Algolia was developed for mobile search use cases, but has since been extended to more traditional search projects. Algolia can boast about its retrieval speed; it’s milliseconds faster than the competition. Those few milliseconds won’t matter for most use cases, but if speed is important, Algolia is worth a look. As a fully-hosted product, Algolia also eliminates the need for cluster management.
Algolia features include:
Best use cases:
Algolia has quickly grown into a major player because of how simple and easy it is to get started. It’s a great general-purpose search engine. But it has its critics too, particularly around pricing and the complexity of managing custom rules and configurations. For example, any time Algolia re-indexes the database (such as for A/B testing), it counts against the monthly search query quota. Features such as machine learning are add-ons that also cost more. Its ranking algorithm is a simple tie-breaking algorithm, which is easier to understand but also less flexible and powerful than other solutions on the market.
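A tie-breaking ranking can be illustrated with a short sketch: results are ordered by the first criterion, and each later criterion is consulted only when all earlier ones are equal. The criteria used here (`typos`, `matched_words`, `popularity`) and their order are assumptions for illustration, not Algolia's actual configuration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    doc_id: str
    typos: int          # fewer typos ranks higher
    matched_words: int  # more matched query words ranks higher
    popularity: int     # business metric, used only as a final tie-breaker


def tie_break_rank(hits: List[Hit]) -> List[Hit]:
    """Sort by each criterion in turn; later ones only break ties."""
    return sorted(hits, key=lambda h: (h.typos, -h.matched_words, -h.popularity))


hits = [
    Hit("a", typos=1, matched_words=2, popularity=900),
    Hit("b", typos=0, matched_words=1, popularity=10),
    Hit("c", typos=0, matched_words=2, popularity=5),
    Hit("d", typos=0, matched_words=2, popularity=50),
]
ranked_ids = [h.doc_id for h in tie_break_rank(hits)]
print(ranked_ids)
```

Note that document `a` ranks last despite having by far the highest popularity, because it loses on the first criterion. That predictability is what makes the approach easy to reason about, and the strict ordering is also what limits its flexibility compared with a weighted score.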
The major cloud providers now offer many alternatives to Solr, including search engines, often built on Elastic or Lucene, such as Microsoft’s Azure Cognitive Search, and hosted open source search, such as Amazon Elasticsearch Service. If you go the cloud provider route, you’ll most likely select the provider you’re already working with.
Cloud providers offer both private and public hosted search solutions. If your app is hosted with one of these providers, then it might be worth considering them for your search service as well. Co-locating your search service with your app makes a lot of sense for reducing latency.
The pros and cons of each cloud service provider and software vary a lot. But they have some similarities:
If you want to build a search application on top of Solr and get additional tooling and support, there’s Lucidworks. Lucidworks offers enterprise support for Solr deployments on public and private cloud instances. If you’re sold on Solr and need support, Lucidworks is worth a look.
Lucidworks features include:
So, if Lucidworks is built on Solr, has a team that includes Solr contributors, and offers Solr solutions, why is it #5? Unlike all the other vendors on this list, Lucidworks does not have a self-service option. It’s a true enterprise platform. Its solution add-ons are customizable components that will still require a good deal of engineering or a professional services engagement to piece together. This will appeal to some organizations, especially those that are building on-premises applications, but the same results could be achieved more quickly and at less cost through some of the newer platforms mentioned above.