A company offering a cloud-based web scraping and data extraction platform that works not only with HTML pages as a data source but also with JS, JSON, and XML, documents such as iCal, XLSX, XLS, and CSV, and images. Extracted data is kept in the database as a dataset, which can be downloaded in various formats, retrieved via API, or pushed to any other destination upon completion. It integrates with services such as Zapier, Tableau, OSM, Luminati, and DeathByCaptcha.
Portia is recommended for users who are looking for a relatively easy and intuitive way to scrape websites, especially those without advanced programming skills. It's suitable for projects that require structured data crawling and for users who appreciate community-driven tools.
Scrapy - A Fast and Powerful Scraping and Web Crawling Framework
import.io - Import.io helps its users find the internet data they need, organize and store it, and transform it into a format that provides them with the context they need.
Apify - Apify is a web scraping and automation platform that can turn any website into an API.
Octoparse - Octoparse provides easy web scraping for anyone. Our advanced web crawler allows users to turn web pages into structured spreadsheets within a few clicks.
artoo.js - Artoo.js provides a script that can be run from your browser’s bookmark bar to scrape a website and return the data in JSON format.
ParseHub - ParseHub is a free web scraping tool. With our advanced web scraper, extracting data is as easy as clicking the data you need.