Based on our records, Vespa.ai appears to be more popular than Xapian. It has been mentioned 19 times since March 2021. We track product recommendations and mentions on various public social media platforms and blogs. These can help you identify which product is more popular and what people think of it.
Recoll is a free/open source (GPL) tool that can index PDFs and search them very quickly. It uses Xapian under the hood. I have over 165,000 documents indexed on an old laptop running Linux and can query them all in a split second. Source: 7 months ago
+ xapian which has been around a while, and while gpl licensed, is quite capable https://xapian.org/. - Source: Hacker News / over 1 year ago
Tangentially related: if you need search without the clustering and high availability story of Elasticsearch and friends, I highly recommend Xapian. It's like the SQLite of search. Single library that provides the basic set of features you would expect in a quality search experience: facets, ranked search, boolean operators, stemming etc etc. https://xapian.org/. - Source: Hacker News / over 1 year ago
For fast searching, it usually requires indexing the files in question. There are a number of text-file indexing solutions, many of which use xapian, sphinx, or lucene/solr under the hood. Based on conditions (watching files/directories, cron jobs, new-mail triggers, etc), they'll add/remove files to the index, and you can then use a corresponding command to compose queries across that data. If it's indexed, it... Source: about 2 years ago
There is also xapian/recoll https://xapian.org/ which works great for "desktop" search. - Source: Hacker News / over 2 years ago
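The index-then-query workflow described in the mentions above (add or remove files from an index, then run queries against it) can be illustrated with a toy inverted index. This is a minimal sketch in plain Python, not how Xapian or Recoll are implemented; the `TinyIndex` class, its method names, and the file names used below are hypothetical and chosen only for illustration. It supports only boolean AND matching, with no ranking, stemming, or facets.

```python
from collections import defaultdict
import re


class TinyIndex:
    """Toy in-memory inverted index (hypothetical): term -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.doc_terms = {}               # doc id -> terms, kept so removal is cheap

    @staticmethod
    def tokenize(text):
        # Lowercase and split on anything that is not a letter or digit.
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def add(self, doc_id, text):
        # Re-adding an existing document replaces its old entry.
        self.remove(doc_id)
        terms = self.tokenize(text)
        self.doc_terms[doc_id] = terms
        for term in terms:
            self.postings[term].add(doc_id)

    def remove(self, doc_id):
        # Drop the document from every posting list it appears in.
        for term in self.doc_terms.pop(doc_id, set()):
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]

    def search(self, query):
        # Boolean AND: return documents that contain every query term.
        terms = self.tokenize(query)
        if not terms:
            return set()
        results = None
        for term in terms:
            docs = self.postings.get(term, set())
            results = set(docs) if results is None else results & docs
        return results
```

A file watcher or cron job, as mentioned above, would call `add` when a file is created or changed and `remove` when it is deleted; `search` then only touches the posting lists for the query terms instead of rescanning every file.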
If you're serious about scaling up, definitely consider Vespa (https://vespa.ai). At serious scale, Vespa will likely knock all the other options out of the park. - Source: Hacker News / 24 days ago
Yahoo released their geographic data catalogue under an open license, and it still lives on as https://whosonfirst.org/. Afaik https://en.wikipedia.org/wiki/Apache_ZooKeeper started at Yahoo. https://vespa.ai/ was Yahoo's search engine for news and other content products, now spun off (https://techcrunch.com/2023/10/04/yahoo-spins-out-vespa-its-search-tech-into-an-independent-company/). - Source: Hacker News / 3 months ago
I think https://vespa.ai/ has the right approach in this space by focusing on being hybrid - vectors alone aren't great for production use cases, it's the combining of vectors+text that lets you use ranking to get meaningful result. (I'm an investor so I'm biased; but it's also the reason why I invested). - Source: Hacker News / 3 months ago
So what’s the catch? Why is this not everywhere? Because IR is not quite NLP — it hasn’t gone fully mainstream, and a lot of the IR frameworks are, quite frankly, a bit of a pain to work with in-production. Some solid efforts to bridge the gap like Vespa [1] are gathering steam, but it’s not quite there. [1] https://vespa.ai. - Source: Hacker News / 4 months ago
When it comes to search I cannot disagree more. https://vespa.ai is a purpose built search engine. If you start bolting search onto your database, your relevance will be terrible, you'll be rewriting a lot of table stakes tools/features from scratch, and your technical debt will skyrocket. - Source: Hacker News / 10 months ago
ElasticSearch - Elasticsearch is an open source, distributed, RESTful search engine.
Meilisearch - Ultra relevant, instant, and typo-tolerant full-text search API
ElasticHQ - Tool for ElasticSearch management and monitoring.
Typesense - Typo tolerant, delightfully simple, open source search 🔍
Kaizen - Kaizen is an ElasticSearch GUI for Windows, Mac and Linux, written in JavaFX as a cross-platform desktop application.
Algolia - Algolia's Search API makes it easy to deliver a great search experience in your apps & websites. Algolia Search provides hosted full-text, numerical, faceted and geolocalized search.