Apify is a JavaScript & Node.js based data extraction tool for websites that crawls lists of URLs and automates workflows on the web. With Apify you can manage and automatically scale a pool of headless Chrome / Puppeteer instances, maintain queues of URLs to crawl, store crawling results locally or in the cloud, rotate proxies and much more.
Based on our records, Apify should be more popular than Xenu's Link Sleuth. It has been mentioned 26 times since March 2021. We track product recommendations and mentions on various public social media platforms and blogs. They can help you identify which product is more popular and what people think of it.
For deployment, we'll use the Apify platform. It's a simple and effective environment for cloud deployment, allowing efficient interaction with your crawler. Call it via API, schedule tasks, integrate with various services, and much more. - Source: dev.to / 28 days ago
We already have a fully functional implementation for local execution. Let us explore how to adapt it for running on the Apify Platform and transform it into an Apify Actor. - Source: dev.to / 2 months ago
We've had the best success by first converting the HTML to a simpler format (i.e. markdown) before passing it to the LLM. There are a few ways to do this that we've tried, namely Extractus[0] and dom-to-semantic-markdown[1]. Internally we use Apify[2] and Firecrawl[3] for Magic Loops[4] that run in the cloud, both of which have options for simplifying pages built-in, but for our Chrome Extension we use... - Source: Hacker News / 9 months ago
Developed by Apify, it is a Python adaptation of their famous JS framework crawlee, first released on Jul 9, 2019. - Source: dev.to / 9 months ago
Hey all, This is Jan, the founder of [Apify](https://apify.com/)—a full-stack web scraping platform. After the success of [Crawlee for JavaScript](https://github.com/apify/crawlee/) today! The main features are: - A unified programming interface for both HTTP (HTTPX with BeautifulSoup) & headless browser crawling (Playwright). - Source: Hacker News / 11 months ago
I still haven't found a good modern equivalent to http://home.snafu.de/tilman/xenulink.html. - Source: Hacker News / over 1 year ago
Good list of free and open source link checkers here: https://www.devopsschool.com/blog/list-of-free-open-source-self-hosted-application-to-check-broken-links-url-check-on-web-page/ I've been using Xenu's Link Sleuth (https://home.snafu.de/tilman/xenulink.html) forever, but I should probably try some others out and see if there's something better now. Xenu's is also a funny throwback to the old web, sort of like... - Source: Hacker News / over 1 year ago
Xenu’s Link Sleuth http://home.snafu.de/tilman/xenulink.html Diagnostic, Technical SEO Winner of the ugliest-SEO-tool-on-the-planet award, Xenu is also one of the most useful. Crawl entire sites, find broken links, create sitemaps, and more. Source: over 2 years ago
If you want a tool to crawl through your site looking for 404 or 500 errors, there are online tools (e.g. the W3C's online link checker), browser plugins for Firefox and Chrome, or Windows programs like Xenu's Link Sleuth. - Source: dev.to / almost 3 years ago
I used to use http://home.snafu.de/tilman/xenulink.html to check my bookmarks.html and personal web sites, but it's outdated and no longer supported. Is there a free replacement that works well to show broken links, redirected links, etc.? Source: about 3 years ago
import.io - Import.io helps its users find the internet data they need, organize and store it, and transform it into a format that provides them with the context they need.
Screaming Frog SEO Spider - The Screaming Frog SEO Spider is a small desktop program (PC or Mac) which crawls websites’ links...
Scrapy - Scrapy | A Fast and Powerful Scraping and Web Crawling Framework
Pulno - SEO Audit and Website Analysis. Pull up your SEO.
ParseHub - ParseHub is a free web scraping tool. With our advanced web scraper, extracting data is as easy as clicking the data you need.
Netpeak Spider - A desktop tool for fast and comprehensive technical audit of the entire website.