Scrapy - A fast and powerful scraping and web crawling framework.
Apache Nutch - Apache Nutch is a highly extensible and scalable open source web crawler software project.
Mixnode - Turn the web into a database!
ACHE Crawler - ACHE is a web crawler for domain-specific search.
CommonCrawl - Common Crawl maintains a free, open repository of web crawl data.
Apache Solr - Solr is an open source enterprise search server based on the Lucene search library, with XML/HTTP and...