Sphinx Search appears to be slightly more popular than Xapian: we have tracked 10 links to it since March 2021, compared with 7 links to Xapian. We track product recommendations and mentions on public social media platforms and blogs. These mentions can help you see which product is more popular and what people think of it.
Sphinx is a search engine that can be integrated into a website to provide advanced search functionality such as full-text, Boolean, and faceted search. It is a powerful open-source search engine that can handle large amounts of data and quickly return results. - Source: dev.to / about 1 year ago
Have been using Sphinx. It does some processing around suffixes, tenses, and so on, and looks at word proximity (BM25), but is definitely limited. Source: about 1 year ago
Lucene is the thing you think you need. Elastic Search is a nice wrapper for it. But these are Java, so maybe you want Sphinx Search (C++) or MeiliSearch (Rust). Source: over 1 year ago
Using a natural language search will almost certainly be a better solution and PHP may not be the best tool for this task. Figure out how you are going to get the text out of the PDF and where you are going to put it. Look at things like sphinx and full text search in boolean mode for doing the keyword matching. Source: over 1 year ago
In practice, though, you don't do any of this; you get a library to do it for you. I've used Sphinx Search in the past for some fairly hefty (in the order of terabytes), and there's a good book covering how to get it all set up and started. Source: almost 2 years ago
Recoll is free/open-source (GPL) software that can index PDFs and search them very quickly. It uses Xapian under the hood. I have over 165,000 documents indexed on an old laptop running Linux and can query them all in a split second. Source: 7 months ago
+ xapian which has been around a while, and while gpl licensed, is quite capable https://xapian.org/. - Source: Hacker News / over 1 year ago
Tangentially related: if you need search without the clustering and high availability story of elastic search and friends, I highly recommend Xapian. It's like the SQLite of search. Single library that provides the basic set of features you would expect in a quality search experience: facets, ranked search, boolean operators, stemming, etc. https://xapian.org/. - Source: Hacker News / over 1 year ago
For fast searching, it usually requires indexing the files in question. There are a number of text-file indexing solutions, many of which use xapian, sphinx, or lucene/solr under the hood. Based on conditions (watching files/directories, cron jobs, new-mail triggers, etc), they'll add/remove files to the index, and you can then use a corresponding command to compose queries across that data. If it's indexed, it... Source: about 2 years ago
There is also xapian/recoll https://xapian.org/ which works great for "desktop" search. - Source: Hacker News / over 2 years ago
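The index-then-query workflow described in the mentions above (add or remove files in an index, then compose queries against it) can be illustrated with a toy example. This is only a minimal sketch of an in-memory inverted index in plain Python. It is not the Xapian, Sphinx, or Lucene API, and the names `TinyIndex`, `add`, `remove`, `search_all`, and `search_any` are hypothetical, chosen for illustration. Real engines add ranking, stemming, facets, and on-disk storage on top of this idea.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Toy inverted index (hypothetical, for illustration only)."""

    def __init__(self):
        self.postings = {}    # term -> set of document ids containing it
        self.doc_terms = {}   # document id -> set of its terms (used by remove)

    def add(self, doc_id, text):
        """Index a document, replacing any earlier version with the same id."""
        self.remove(doc_id)
        terms = set(tokenize(text))
        self.doc_terms[doc_id] = terms
        for term in terms:
            self.postings.setdefault(term, set()).add(doc_id)

    def remove(self, doc_id):
        """Drop a document from the index; unknown ids are ignored."""
        for term in self.doc_terms.pop(doc_id, set()):
            docs = self.postings[term]
            docs.discard(doc_id)
            if not docs:
                del self.postings[term]

    def search_all(self, query):
        """Boolean AND: documents that contain every query term."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result

    def search_any(self, query):
        """Boolean OR: documents that contain at least one query term."""
        result = set()
        for term in tokenize(query):
            result |= self.postings.get(term, set())
        return result


if __name__ == "__main__":
    index = TinyIndex()
    index.add("a.txt", "Xapian is a search library")
    index.add("b.txt", "Sphinx is a search server")
    print(index.search_all("search library"))
    print(index.search_any("xapian sphinx"))
```

The lookup cost depends on the number of matching terms rather than on scanning every file, which is why indexed search can answer queries quickly even over large collections.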
MkDocs - Project documentation with Markdown.
ElasticSearch - Elasticsearch is an open source, distributed, RESTful search engine.
GitBook - Modern Publishing, Simply taking your books from ideas to finished, polished books.
ElasticHQ - Tool for ElasticSearch management and monitoring.
Kaizen - Kaizen is an ElasticSearch GUI for Windows, Mac and Linux, written in JavaFX as a cross-platform desktop application.
Apache Solr - Solr is an open source enterprise search server based on Lucene search library, with XML/HTTP and...