
Why ElasticSearch Outperforms Algolia for Precise and Scalable Data Queries

December 23, 2024 (updated January 9, 2025) · Application Development

Have you ever searched for something and felt like the results were just off – either missing the mark entirely or way too slow? Now, imagine that happening to your customers. That’s the nightmare no business wants, especially when it comes to choosing the right search engine for their platform.

Your customers expect lightning-fast search results that perfectly match what they’re looking for.

But they also need search results they can trust – accurate, relevant, and delivered without a hitch, even as your data grows and queries get more complex.

Here’s the deal: if you’re comparing ElasticSearch and Algolia, you’re likely juggling speed, precision, and scalability. Both are powerful tools, but when it comes to precision and scalability, one consistently comes out on top. (Spoiler alert: it’s ElasticSearch.)

In this article, we’ll unpack why ElasticSearch takes the lead, especially if your goal is to handle complex queries, massive datasets, or scale effortlessly without sacrificing search quality.

This article compares the two platforms and explores the challenges experienced with Algolia, followed by how ElasticSearch successfully addressed them. By the end, you’ll see why so many businesses are leaning toward ElasticSearch for their search and data analysis needs.

Algolia vs Elasticsearch

Algolia, a popular search-as-a-service platform, is widely recognized for its lightning-fast response times, making it a favorite for applications where real-time feedback is critical. Its architecture is optimized to prioritize speed, ensuring that users experience instantaneous search results even when handling substantial datasets.

This feature is particularly beneficial for customer-facing applications such as e-commerce platforms and mobile apps, where swift responses significantly enhance user satisfaction.

In contrast, ElasticSearch focuses on delivering a more balanced approach that combines speed with precision and flexibility. While it may not match Algolia’s real-time performance, it excels in scenarios requiring accurate and detailed data handling.

ElasticSearch’s ability to perform complex queries and its customizable indexing options make it a preferred choice for use cases that demand precise results and in-depth analytics, such as enterprise-level applications and backend systems.

Its scalability and adaptability further solidify its position as a comprehensive search solution for dynamic and data-intensive projects.

Both platforms offer distinct advantages, but the choice between Algolia and ElasticSearch ultimately depends on the specific priorities of a project. Algolia’s speed is unmatched for lightweight, interactive applications, while ElasticSearch’s precision and customization make it ideal for tasks requiring reliable data accuracy and advanced search capabilities.

Speed
  • Algolia: Known for exceptional speed and low-latency searches; optimized for instant-search scenarios, particularly useful in eCommerce and real-time applications.
  • Elasticsearch: Fast, but performance may vary with complex queries and larger datasets; speed can decrease under heavy aggregation or high concurrency.

Scalability
  • Algolia: Scalable for moderate-sized datasets; best suited for applications with predictable traffic patterns, like mobile apps and websites.
  • Elasticsearch: Highly scalable and designed for large datasets; suitable for big data applications, log analysis, and analytics platforms thanks to its distributed architecture.

Index Size Limitations
  • Algolia: Practical index size limits based on the chosen pricing plan; costs can rise significantly with larger indexes, potentially limiting scalability for high-traffic applications.
  • Elasticsearch: No hard limit on index size; can handle massive indexes effectively, provided the hardware and cluster are configured properly.

Search Capabilities
  • Algolia: Focuses on speed and simplicity; includes instant search, typo tolerance, and basic ranking customization, but lacks the advanced querying capabilities found in Elasticsearch.
  • Elasticsearch: Powerful search features: full-text search, complex queries, filters, aggregations, custom scoring, and relevance tuning, enabling nuanced search experiences.

Cost
  • Algolia: Paid service with tiered pricing based on usage, query volume, and index size; costs can increase significantly for large datasets (e.g., starting from $29/month for small projects).
  • Elasticsearch: Open-source and free to use, but self-hosting adds hardware and maintenance costs; cloud-based options may be priced by usage.

Ease of Setup and Maintenance
  • Algolia: Very easy to set up and use; fully managed SaaS solution with minimal maintenance, making it ideal for teams without DevOps resources.
  • Elasticsearch: Requires more manual setup, tuning, and maintenance (like cluster health monitoring and scaling), which can be challenging for teams without dedicated infrastructure support.

The Challenges with Algolia

Algolia has long been known for its emphasis on speed in search performance.

But according to their official documentation, this prioritization sometimes comes at the cost of accuracy, particularly when it comes to hit counts.

To enhance the speed of search queries, Algolia may halt the counting process after reaching a certain threshold and instead provide an estimated count for the remaining hits. This can lead to non-exhaustive search results, as the system sacrifices exactitude for performance.
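In practice, this trade-off shows up as a flag on the search response itself. A minimal sketch of checking it, assuming Algolia's `exhaustiveNbHits` response field (the response dicts below are simulated, not real API output):

```python
# Sketch: detect when Algolia has returned an estimated (non-exhaustive)
# hit count. Responses carry an `exhaustiveNbHits` flag that is False
# when counting was cut short in favor of speed.

def is_count_reliable(response: dict) -> bool:
    """Return True only when Algolia reports the hit count as exhaustive."""
    return bool(response.get("exhaustiveNbHits", False))

# Simulated responses (the real ones would come from index.search()):
fast_but_estimated = {"nbHits": 35000, "exhaustiveNbHits": False}
fully_counted = {"nbHits": 1200, "exhaustiveNbHits": True}

print(is_count_reliable(fast_but_estimated))  # False
print(is_count_reliable(fully_counted))       # True
```

A check like this at least tells you *when* a displayed count is an estimate, even if it cannot make the count exact.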

This inability to provide accurate counts for filtered queries was a significant challenge in our project.

The Core Problem: Hit Count Limitations

When we first implemented Algolia for the search functionality, we expected it to handle our dataset efficiently. However, we quickly noticed discrepancies between the actual number of records and the hit count that Algolia returned.

Algolia’s dashboard displayed accurate count values, but when we executed the same queries through the search API, the returned count was incorrect.

In one of our tabs, for example, we had around 70,000 to 80,000 records, but Algolia was only returning about 35,000 to 40,000 records.

After checking, we pinpointed the root cause: hit count limitations in Algolia’s search system. The issue stemmed from the pagination limit imposed by Algolia, which affected the accuracy of the hit count.

In other words, when we queried for large datasets, Algolia didn’t return the complete set of records, and consequently, the hit count was inaccurate.

Algolia, by default, imposes certain limits on the number of records it returns per query. The pagination limit in our case was set to 50,000 records, causing incorrect page counts and fewer total hits. When the number of records exceeded this limit, the search didn’t return the expected number of hits.

Tried Approaches

In an effort to resolve this, we explored multiple strategies:

Adjusting Pagination Limits

Our first approach was to adjust the pagination limit, anticipating that this would yield more accurate results for larger datasets.

Since we were getting fewer records than expected, we hypothesized that increasing the pagination limit would allow Algolia to return more records and, hopefully, accurate hit counts. We experimented by raising the pagination limit to values as high as 2.5 to 3 million records.
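The ceiling in question is controlled by Algolia's `paginationLimitedTo` index setting (the default is 1,000 records: `page * hitsPerPage` cannot reach past it). A hedged sketch of the settings payload, built without a live client; with a real client this dict would be passed to `index.set_settings(...)`:

```python
# Sketch: raising Algolia's pagination ceiling. paginationLimitedTo caps
# how deep into the result set pagination can reach; the payload below
# is what a set_settings call would send.

def pagination_settings(max_records: int) -> dict:
    """Build the index-settings payload that lifts the pagination limit."""
    if max_records <= 0:
        raise ValueError("max_records must be positive")
    return {"paginationLimitedTo": max_records}

settings = pagination_settings(50_000)
print(settings)  # {'paginationLimitedTo': 50000}
```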

However, this solution didn’t yield the expected results. Even with increased pagination limits, Algolia still failed to return the correct hit count, and we realized that pagination limits alone weren’t sufficient to address the core issue of count accuracy.

Shifting from Filter to Facet Filter

There are two major aspects of searching: one is exact data found, and the other is partial matches. For instance, if you’re searching for the term “smartphone,” but you only enter “phone,” partial results should still appear. On the other hand, if you search for the full term “smartphone,” only exact matches should be shown.

Algolia provides two functions for this purpose: filter and facet filter. The filter function performs searches based on exact matches, while the facet filter allows filtering based on predefined attributes.

Initially, we used Algolia’s filter function. However, this didn’t give us the accurate count we needed. More specifically, the count results were higher than expected.

Upon reviewing the issue, we hypothesized that the problem might arise from not using the correct filtering functions required to get precise counts.

The facet filter is similar to the regular filter but is designed for more precise filtering and counting: it targets specific attributes directly and reports exactly how many results match each filtered attribute value. We believed this would offer more precision, since facets focus on exact values within a dataset and are better suited to count aggregation.

We configured facet filter attributes in the Algolia dashboard and replaced our previous filtering methods with facet filters, hoping this would give us a more accurate count.
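For reference, the query shape this change produces looks roughly like the following sketch (the `store` attribute name is a placeholder, not from the project; in Algolia's `facetFilters` syntax, top-level entries are ANDed and a nested list means OR):

```python
# Sketch: an Algolia search-parameters payload that filters on a
# declared facet attribute and requests per-facet counts back.

def facet_query(text: str, store: str) -> dict:
    """Build search params filtering on an exact facet value."""
    return {
        "query": text,
        "facetFilters": [f"store:{store}"],  # exact match on the facet
        "facets": ["store"],                 # also return per-facet counts
    }

params = facet_query("smartphone", "Asus")
print(params["facetFilters"])  # ['store:Asus']
```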

Despite our efforts, facet filtering also didn’t resolve the issue. The count accuracy still wasn’t on par with our expectations, and the inconsistency persisted. We also encountered inconsistencies in how facet filters were applied, especially when aggregating large datasets.

Reducing the Index Size

With the filtering methods exhausted, we turned our focus to the index size, thinking that the sheer volume of data might be causing the inconsistencies in the hit count. Specifically, we believed that reducing the overall index size might help Algolia process queries more efficiently and return more accurate results.

To test this, we decided to take a methodical approach: removing one store at a time and observing the impact on the accuracy of the hit count.

At the time, the main store contained the highest volume of records, around 1.25 million out of a total of 2.2 million records across all 12 to 13 stores.

We hoped that by reducing the index to around 1.5 million records, we could resolve the count inaccuracies. We began by carefully removing the data from each store in the index, one by one.

As we removed store data incrementally, we ran the same query and observed at what point the hit count returned to a more accurate value. This allowed us to pinpoint which store or stores were contributing most significantly to the distortion of the hit count.

Unfortunately, even after reducing the index size, the issue persisted. Removing data from smaller stores did not seem to affect the hit count significantly. Upon further analysis, we realized that the issue was confined to the main store, which contained the highest number of records.

Creating a Separate Index for Crucial Data

To solve this, we considered another approach: creating a separate index or table specifically for the crucial data. The idea was that this dedicated index would allow for more precise handling of the large volume of data, potentially resolving the hit count issues in the process.

However, as we began implementing this approach, we encountered several unforeseen challenges that made the solution less feasible than initially anticipated.

One of the primary challenges that arose during this process was pagination. When fetching data from the primary index, we needed to first query the crucial individual index for the data and then merge the results. After merging, we would need to sort the data and add the count.

This approach worked fine for the first page of results, but as we moved to subsequent pages, the complexities of merging the results and maintaining proper pagination made the process unfeasible. This solution could not be implemented effectively.

Another challenge we faced was sorting the merged data correctly and accurately counting the number of results. The process of merging the data from separate indices added a layer of complexity that made both sorting and counting unreliable, especially when dealing with larger datasets.

The result was that we couldn’t guarantee accurate counts after merging data across different indices, especially when dealing with multiple pages.

The separate index solution introduced significant complexity to our query process. Instead of simplifying the system, it created more hurdles in terms of data fetching, merging, and pagination. The effort required to maintain this system made it clear that the approach was not as viable as we initially hoped.

After facing these challenges, we turned to Algolia’s documentation for further insights into the issue. What we discovered was that Algolia acknowledges this issue with hit count accuracy. According to their documentation, the platform occasionally prioritizes speed over accuracy in order to provide faster search results.

This trade-off means that, in some cases, Algolia may return inconsistent or inaccurate hit counts when making repeated API calls. The platform’s design focuses on delivering quick results rather than perfect count accuracy, and this approach can lead to differences in the results each time the API is called.

How Elasticsearch Solved the Problem for Our Client

Elasticsearch, which we run as a managed service on Amazon Web Services, is a powerful, distributed search and analytics engine. It’s known for its stronger consistency in count accuracy and better handling of large datasets, especially when it comes to more complex search operations involving pagination and sorting.

When our team initially adopted Algolia for our client’s project, we were impressed by its fast search capabilities, especially given the size of our dataset and the pricing plan we had.

However, over time, we began to encounter significant issues, particularly with the accuracy of search results. This prompted us to consider an alternative – ElasticSearch, which, although slightly slower, offered a much higher level of accuracy and flexibility, making it a better fit for our needs.

By transitioning to Elasticsearch, we hoped to overcome the limitations of Algolia and achieve more reliable search results, particularly for large and crucial datasets.

Why We Dropped Algolia

The primary reason for transitioning from Algolia to ElasticSearch stemmed from a critical issue: the count problem. In Algolia, we often got near-instant results that, while fast, lacked the precision required for our application.

The speed was certainly a benefit, but it came at the cost of accuracy – particularly in filtering search results based on specific attributes like store names, part numbers, or product details. This was unacceptable, as inaccurate search results directly impacted user experience and functionality.

ElasticSearch, on the other hand, operates with a slightly slower response time, but it offers significant improvements in accuracy.

For our use case, where precision and reliability are non-negotiable, the trade-off in speed was acceptable. Here’s why this decision mattered:

  • Focus on Accuracy: Elasticsearch delivers precise results, ensuring that our data retrieval meets the expected counts and aligns with the actual records.
  • Customizable Search Attributes: One standout feature is the ability to tailor search behaviors for individual attributes.

This level of flexibility allows us to define unique search behaviors tailored to the needs of each dataset.

While the initial transition required adjustments to our search infrastructure, we quickly realized that the trade-off was worth it.

Here’s why Elasticsearch became the better fit and how its features addressed the limitations we faced with Algolia.

Speed vs. Accuracy

With ElasticSearch, the number of incorrect results in filtering operations was dramatically reduced.

In Algolia, if we wanted to implement searches on specific attributes, we needed to configure the index settings to include or exclude certain attributes from searchability. While this approach worked to some extent, it lacked the flexibility offered by ElasticSearch.

With ElasticSearch, we could go beyond simply including or excluding attributes; we were able to assign custom search configurations like partial search, exact match, or both to individual attributes. This level of customization allowed for much more refined control over search behavior.

For example, when searching for a store name like “Asus,” we could be confident that only stores named exactly “Asus” would appear in the results, with no partial matches (e.g., “Asusa” or other variations).

Keyword vs. Text Search

A key feature of ElasticSearch that we leverage is the ability to define attribute types, such as “keyword” and “text,” which control how searches are executed.

For instance, if the store name attribute is set to “keyword,” ElasticSearch will only return exact matches for that field. This ensures that when we search for a specific store name like “Asus,” only results with that exact term appear.

On the other hand, if we set an attribute to “text,” ElasticSearch performs a partial search. This is particularly useful for fields like product descriptions or slugs, where we want to match substrings or variations of a term. By using the correct attribute type for each use case, we can balance between partial and exact matching depending on the nature of the data.
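As an illustration of these mapping types, here is a sketch of an index mapping expressed as a Python dict (field names like `store_name` are placeholders, not from the project; with the official Python client, a dict like this would be passed to `es.indices.create`):

```python
# Sketch: an Elasticsearch mapping mixing "keyword" (exact-match) and
# "text" (analyzed, partial-match) field types, plus a multi-field that
# supports both behaviors on the same attribute.

def build_mappings() -> dict:
    return {
        "properties": {
            "store_name": {"type": "keyword"},           # exact matches only
            "description": {"type": "text"},             # analyzed, partial
            "part_number": {
                "type": "text",
                "fields": {"raw": {"type": "keyword"}},  # both behaviors
            },
        }
    }
```

The `part_number.raw` multi-field is one common way to get exact matching and partial matching on the same attribute without re-indexing.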

For example, if we need to search for an optional part number and we want to ensure exact matches (e.g., searching for “32” or a hyphenated value), we set that attribute to “keyword” and apply the appropriate search configuration.

This setup ensures that users can find specific results without unwanted partial matches, making the search experience more precise and user-friendly.
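These attribute types map onto two different query shapes in Elasticsearch's query DSL. A sketch of both, expressed as Python dicts (field names are illustrative): a `term` query against a keyword field returns exact matches only, while a `match` query against a text field matches analyzed tokens, so variants and partial hits surface.

```python
# Sketch: exact-match vs analyzed-match query bodies.

def exact_store_query(name: str) -> dict:
    """Term query: only documents whose store_name is exactly `name`."""
    return {"query": {"term": {"store_name": name}}}

def partial_description_query(words: str) -> dict:
    """Match query: analyzed search over the description text field."""
    return {"query": {"match": {"description": words}}}

print(exact_store_query("Asus"))
```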

Flexibility in Future Implementations

Beyond accuracy, ElasticSearch offers a much more comprehensive and flexible solution than Algolia. As we continue to scale and refine our search needs, ElasticSearch provides the foundation for implementing more advanced search features in the future.

The ability to fine-tune search configurations, combine different types of searches, and easily expand the system as our data grows is a significant advantage.

One of the most compelling aspects of ElasticSearch is its well-established presence in the market. Unlike Algolia, which may not have the same brand recognition or robust ecosystem, ElasticSearch is widely adopted and trusted across industries for managing large-scale search applications.

This means we can rely on its long-term stability and scalability, knowing it will evolve to meet future needs without compromising on accuracy or speed.

Pagination and Data Integrity

Another key factor in our decision to move away from Algolia was its inability to provide the accurate record count needed for pagination.

In a typical search application, if the count of matching records is incorrect, pagination can break down, leading to poor user experience.

For instance, if there are only 22 relevant records, but the system reports 40, users could end up on the wrong page of search results, creating confusion and frustration.

ElasticSearch handles this challenge by providing precise record counts, which are fundamental for maintaining correct pagination. This means we can confidently paginate through results without worrying about discrepancies in the total number of records.
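One Elasticsearch-specific detail worth noting: by default the engine itself stops counting at 10,000 hits and reports the total with a "gte" relation, so exact counts must be requested explicitly via `track_total_hits`. A sketch of a count-safe paginated query (field names illustrative):

```python
# Sketch: a query body that forces an exact total-hit count, which is
# what offset-based pagination needs to compute page numbers correctly.

def paginated_query(text: str, page: int, per_page: int = 20) -> dict:
    return {
        "query": {"match": {"description": text}},
        "track_total_hits": True,   # exact total, not a 10,000 lower bound
        "from": page * per_page,    # offset of the first hit on this page
        "size": per_page,           # hits per page
    }

q = paginated_query("smartphone", page=3)
print(q["from"], q["size"])  # 60 20
```

For very deep pagination, Elasticsearch's `search_after` mechanism is generally preferred over large `from` offsets, but the exact count above is what keeps page math honest either way.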

The Transition Process

Transitioning from Algolia to ElasticSearch was not without its challenges, but the process was relatively smooth.

We used similar query configurations in ElasticSearch as we did in Algolia, ensuring a familiar setup for the development team.

Additionally, we created lambda functions to upload data from our primary database to both the Algolia and ElasticSearch databases, streamlining the process of migrating our search infrastructure.
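The dual-write idea can be sketched backend-agnostically; the callables below stand in for the real Algolia and Elasticsearch client calls, which are not reproduced here:

```python
# Sketch: fan each changed record out to every configured search backend,
# the pattern a migration-period lambda handler follows.
from typing import Callable

def sync_record(record: dict,
                writers: list[Callable[[dict], None]]) -> None:
    """Send one record to every search backend writer."""
    for write in writers:
        write(record)

# Usage with in-memory stand-ins for the two backends:
algolia_docs, es_docs = [], []
sync_record({"objectID": "123", "store_name": "Asus"},
            [algolia_docs.append, es_docs.append])
print(len(algolia_docs), len(es_docs))  # 1 1
```

Keeping both indexes written from the same code path is what makes side-by-side comparison (and an eventual cutover) low-risk.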

ElasticSearch also provided tools to manage data indexing and search configurations, allowing us to maintain consistent search quality during the transition. As a result, we experienced minimal disruption in search functionality and were able to improve the accuracy of our search results without significant downtime.

Final Thoughts

In our transition from Algolia to ElasticSearch, we made a clear decision to prioritize accuracy over speed, and it made all the difference.

To sum up, while Algolia offered fast search capabilities, its lack of accuracy made it unsuitable for our needs. ElasticSearch, with its focus on precise search results, flexibility, and scalability, proved to be the better solution for our application.

By leveraging ElasticSearch’s advanced search configurations, such as keyword and text attribute types, partial and exact match settings, and its robust filtering capabilities, we were able to deliver a more accurate, reliable, and user-friendly search experience.

By switching to ElasticSearch, we gained full control over how our client’s data is indexed and retrieved, which allowed us to overcome the limitations we faced with Algolia. This shift has enabled our client to not only meet our current needs but also prepare effectively for future growth.

For organizations dealing with large datasets and requiring precise search functionality, ElasticSearch is the clear winner. It’s flexible, scalable, and provides the kind of control needed for complex queries and growing data needs.

Whether you need to handle multi-faceted queries, custom relevance criteria, or specialized data types, ElasticSearch provides the flexibility to ensure accurate results every time.

On the other hand, Algolia left us feeling boxed in. Its limited flexibility created challenges:

  • We found ourselves constrained within the predefined framework of Algolia.
  • Adapting to new requirements or implementing complex functionality was either impossible or highly restrictive.

Switching to ElasticSearch removed these barriers entirely.

If you’re facing similar challenges and need a search engine that gives you full control over data indexing and querying, ElasticSearch might be the solution you’re looking for. Reach out today to discuss how we can help you implement a custom search solution that scales with your business needs.

Raj Sanghvi

Raj Sanghvi is a technologist and founder of BitCot, a full-service award-winning software development company. With over 15 years of innovative coding experience creating complex technology solutions for businesses like IBM, Sony, Nissan, Micron, Dick’s Sporting Goods, HDSupply, Bombardier and more, Sanghvi helps both major brands and entrepreneurs launch their own technology platforms. Visit Raj Sanghvi on LinkedIn and follow him on Twitter.