Software Alternatives & Reviews


Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web...


Heritrix Alternatives

The best Heritrix alternatives based on verified products, community votes, reviews and other factors.

  1. Scrapy | A Fast and Powerful Scraping and Web Crawling Framework

  2. StormCrawler is an open source SDK for building distributed web crawlers with Apache Storm.

  3. Apache Nutch is a highly extensible and scalable open source web crawler software project.

  4. ACHE is a web crawler for domain-specific search.

  5. Turn the web into a database!

  6. grab-site is a crawler for archiving websites to WARC files.

  7. ProxyCrawl: Stay anonymous while crawling the web. Avoid captchas, blocks and proxies. Crawling and scraping protection.


  8. GNU Wget is a free software package for retrieving files using HTTP(S) and FTP, the most...

  9. HTTrack is a free (GPL, libre/free software) and easy-to-use offline browser utility.

  10. Solr is an open source enterprise search server based on Lucene search library, with XML/HTTP and...

  11. Algolia's Search API makes it easy to deliver a great search experience in your apps & websites. Algolia Search provides hosted full-text, numerical, faceted and geolocalized search.

  12. aria2 is a lightweight multi-protocol & multi-source command-line download utility. It supports HTTP/HTTPS, FTP, SFTP, BitTorrent and Metalink.




5 out of 8 people consider this list helpful.
This is equivalent to a 3.1 / 5 rating.

Publisher: SaaSHub
Categories: Web Scraping, Web Scraping API, Data Extraction