A company offering a cloud-based web scraping and data extraction platform that works not only with HTML pages as a data source but also with JS, JSON, XML, documents such as iCal, XLSX, XLS, and CSV, and images. Extracted data is kept in the database as a dataset, which can be downloaded in various formats, retrieved via an API, or pushed to any other destination upon completion. Integrates with services such as Zapier, Tableau, OSM, Luminati, and DeathByCaptcha.
Scrapy - A fast and powerful scraping and web crawling framework.
import.io - Import.io helps its users find the internet data they need, organize and store it, and transform it into a format that provides them with the context they need.
StormCrawler - StormCrawler is an open source SDK for building distributed web crawlers with Apache Storm.
Octoparse - Octoparse provides easy web scraping for anyone. Its advanced web crawler allows users to turn web pages into structured spreadsheets within a few clicks.
Apache Nutch - Apache Nutch is a highly extensible and scalable open source web crawler software project.
Content Grabber - Content Grabber is an automated web scraping tool.