Scrapy - A fast and powerful scraping and web crawling framework.
StormCrawler - StormCrawler is an open source SDK for building distributed web crawlers with Apache Storm.
Apache Nutch - Apache Nutch is a highly extensible and scalable open source web crawler software project.
Datahut - Datahut is a web scraping service provider that offers web scraping, data scraping, web crawling, and web data extraction to help companies get structured data from websites.
ACHE Crawler - ACHE is a web crawler for domain-specific search.
Apache Solr - Solr is an open source enterprise search server based on the Lucene search library, with XML/HTTP and...