-
An ML-powered cloud platform for text search
Hi Dmitry, I am a cofounder of ZIR AI (https://zir-ai.com/). I researched neural information retrieval at Google before starting ZIR in 2020. (Note: Vespa, who appear in your article, reference some of my work in [1].) To give you some historical perspective, embedding-based retrieval on large text corpora became viable only after the introduction of transformers in 2017. Google Talk to Books (https://books.google.com/talktobooks/) is the first such system I'm aware of; I designed the neural network that powers that system. It's tricky to compare with BM25 on extremely large datasets (like the BioASQ challenge), because the ground truth is collected using keyword search, which creates a chicken-and-egg problem. However, recent research (post-2018) has shown that neural retrieval can outperform BM25, as measured by mean average precision (MAP), by large margins in the following cases: 1. On natural language queries (e.g. longer than 5 words, spoken queries, etc.).
#Software Engineering #Developer Tools #AI 1 social mention
-
Browse passages from books using experimental AI
Hi Dmitry, I am a cofounder of ZIR AI (https://zir-ai.com/). I researched neural information retrieval at Google before starting ZIR in 2020. (Note: Vespa, who appear in your article, reference some of my work in [1].) To give you some historical perspective, embedding-based retrieval on large text corpora became viable only after the introduction of transformers in 2017. Google Talk to Books (https://books.google.com/talktobooks/) is the first such system I'm aware of; I designed the neural network that powers that system. It's tricky to compare with BM25 on extremely large datasets (like the BioASQ challenge), because the ground truth is collected using keyword search, which creates a chicken-and-egg problem. However, recent research (post-2018) has shown that neural retrieval can outperform BM25, as measured by mean average precision (MAP), by large margins in the following cases: 1. On natural language queries (e.g. longer than 5 words, spoken queries, etc.).
#Developer Tools #AI #Data Science And Machine Learning 9 social mentions