Software Alternatives & Reviews

Top 12 Open-Source Alternatives to Diffbot

ScrapeHero, Webhose.io, Zyte, Scrapy, CoffeeScript, Newscatcher News API, Dataflow Kit, Oxylabs, StormCrawler, Colly

Summary

The top open-source alternatives to Diffbot are ScrapeHero, Webhose.io, and Zyte. The list is ordered in part by the number of mentions each product has on reliable external sources. You can suggest additional sources through the form here.
  1. ScrapeHero
    A web scraping service that collects data from websites without any programming or DIY tools.
    Pricing:
    • Open Source
    • Freemium
    • Free Trial
    • $5.0 / Monthly

    #Web Scraping #Data Extraction #Data
    1 social mention

  2. Webhose.io
    Pricing:
    • Open Source

    #Web Scraping #Data Extraction #Data
    1 social mention

  3. Zyte
    We're Zyte (formerly Scrapinghub), the central point of entry for all your web data needs.
    Pricing:
    • Open Source
    • Freemium
    • Free Trial

    #Web Scraping #Data Extraction #Web Crawling
    1 social mention

  4. Scrapy
    A fast and powerful scraping and web crawling framework.
    Pricing:
    • Open Source

    #Web Scraping #Data Extraction #Data
    93 social mentions

  5. CoffeeScript
    Unfancy JavaScript
    Pricing:
    • Open Source

    #Web Scraping #Data Extraction #Data
    25 social mentions

  6. Newscatcher News API
    Get News Data with API
    Pricing:
    • Open Source

    #API Tools #APIs #News
    16 social mentions

  7. Dataflow Kit
    A cloud-based web scraping platform. Extract data from websites and automate workflows on the web.
    Pricing:
    • Open Source
    • Paid
    • Free Trial
    • $5.0 / Usage

    #Web Scraping #Web Scraping API #Data Extraction

  8. Oxylabs
    A web intelligence collection platform and premium proxy provider, enabling companies of all sizes to utilize the power of big data.
    Pricing:
    • Open Source
    • Paid
    • Free Trial
    • $8.0 (per GB)

    #Proxy #Residential Proxies #Private Proxy
    9 social mentions

  9. StormCrawler is an open source SDK for building distributed web crawlers with Apache Storm.
    Pricing:
    • Open Source

    #Web Scraping #Data Extraction #Data

  10. Colly
    Colly is a scraping framework to extract structured data from websites.
    Pricing:
    • Open Source

    #Web Scraping #Data Extraction #Data
    9 social mentions

  11. Apache Nutch is a highly extensible and scalable open source web crawler software project.
    Pricing:
    • Open Source

    #Web Scraping #Data Extraction #Utilities
    2 social mentions

  12. Serverless Node.js stack for API development
    Pricing:
    • Open Source

    #Web Scraping #Data Extraction #Developer Tools

Suggest an alternative
If you think we've missed something, please suggest an alternative to Diffbot.
Please use the Feedback button if you think any of the listed products shouldn't be regarded as open-source.
