I've been playing around with different scraping tools over the past month, trying to find the best one to help with my research project, and I have to say this new auto-detection feature is a lifesaver. I only need to give the software the link, and it auto-detects the content and builds the crawler for me. I can even use it on just the free plan!
Based on our records, Vespa.ai should be more popular than Octoparse. It has been mentioned 19 times since March 2021. We track product recommendations and mentions on various public social media platforms and blogs. They can help you identify which product is more popular and what people think of it.
If you're serious about scaling up, definitely consider Vespa (https://vespa.ai). At serious scale, Vespa will likely knock all the other options out of the park. - Source: Hacker News / 2 months ago
Yahoo released their geographic data catalogue under an open license, and it still lives on as https://whosonfirst.org/. Afaik, https://en.wikipedia.org/wiki/Apache_ZooKeeper started at Yahoo. https://vespa.ai/ was Yahoo's search engine for news and other content products, now spun off (https://techcrunch.com/2023/10/04/yahoo-spins-out-vespa-its-search-tech-into-an-independent-company/). - Source: Hacker News / 4 months ago
I think https://vespa.ai/ has the right approach in this space by focusing on being hybrid - vectors alone aren't great for production use cases; it's the combining of vectors+text that lets you use ranking to get meaningful results. (I'm an investor so I'm biased, but it's also the reason why I invested.) - Source: Hacker News / 5 months ago
So what’s the catch? Why is this not everywhere? Because IR is not quite NLP — it hasn’t gone fully mainstream, and a lot of the IR frameworks are, quite frankly, a bit of a pain to work with in-production. Some solid efforts to bridge the gap like Vespa [1] are gathering steam, but it’s not quite there. [1] https://vespa.ai. - Source: Hacker News / 6 months ago
When it comes to search I cannot disagree more. https://vespa.ai is a purpose built search engine. If you start bolting search onto your database, your relevance will be terrible, you'll be rewriting a lot of table stakes tools/features from scratch, and your technical debt will skyrocket. - Source: Hacker News / 11 months ago
Octoparse.com might work; they have a very nice interactive tool and a 14-day free trial. Source: over 2 years ago
These are no-code solutions for scraping websites. You don’t need any technical knowledge to scrape Aliexpress using these tools. Using advanced AI-powered click and scrape tools, you can get started scraping within seconds either locally or in the cloud. Choosing a good scraping tool can save you lots of money and time as well. Source: almost 3 years ago
I have always been able to extract data without any problems with Octoparse. It is also a very easy to use tool. Source: almost 3 years ago
Meilisearch - Ultra relevant, instant, and typo-tolerant full-text search API
import.io - Import.io helps its users find the internet data they need, organize and store it, and transform it into a format that provides them with the context they need.
Typesense - Typo tolerant, delightfully simple, open source search 🔍
Apify - Apify is a web scraping and automation platform that can turn any website into an API.
Qdrant - Qdrant is a high-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
ParseHub - ParseHub is a free web scraping tool. With our advanced web scraper, extracting data is as easy as clicking the data you need.