Software Alternatives & Reviews

Apache Nutch

Apache Nutch is a highly extensible and scalable open source web crawler software project.

Top 6 Open-Source Alternatives to Apache Nutch

Scrapy, StormCrawler, Mixnode, HTTrack, Scrapfly.io, ScrapeHero

Summary

The top open-source alternatives to Apache Nutch are Scrapy, StormCrawler, and Mixnode. One of the criteria for ordering this list is the number of mentions that products have on reliable external sources. You can suggest additional sources through the form here.
  1. Scrapy | A Fast and Powerful Scraping and Web Crawling Framework
    Pricing:
    • Open Source

    #Web Scraping #Data Extraction #Data 93 social mentions

  2. StormCrawler is an open source SDK for building distributed web crawlers with Apache Storm.
    Pricing:
    • Open Source

    #Web Scraping #Data Extraction #Data

  3. Mixnode | Turn the web into a database!
    Pricing:
    • Open Source

    #Web Scraping #Data Extraction #Data

  4. HTTrack is a free (GPL, libre/free software) and easy-to-use offline browser utility.
    Pricing:
    • Open Source

    #Utilities #Download Manager #Web Copier 2 social mentions

  5. Scrapfly.io | Simple but powerful web scraping API. We provide fully managed web scraping through a simple REST API. The promise is to turn any website into a database effortlessly in a unified tool.
    Pricing:
    • Open Source
    • Freemium
    • Free Trial
    • $15.0 / Monthly (all features)

    #Web Scraping #Data Extraction #Scraper 33 social mentions

  6. ScrapeHero | A web scraping service to collect data from websites, without any programming or DIY tools.
    Pricing:
    • Open Source
    • Freemium
    • Free Trial
    • $5.0 / Monthly

    #Web Scraping #Data Extraction #Data 1 social mention

