Software Alternatives, Accelerators & Startups

Hacker News Search VS CommonCrawl

Compare Hacker News Search vs CommonCrawl and see how they differ

Hacker News Search

a faster hnsearch

CommonCrawl

Common Crawl
  • Hacker News Search landing page, 2022-01-25
  • CommonCrawl landing page, 2023-10-16

Category Popularity

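0-100% (relative to Hacker News Search and CommonCrawl)
  • Search Engine: Hacker News Search 92%, CommonCrawl 8%
  • Web Search: Hacker News Search 100%, CommonCrawl 0%
  • Web Scraping: Hacker News Search 0%, CommonCrawl 100%
  • Hacker News: Hacker News Search 100%, CommonCrawl 0%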


Social recommendations and mentions

Based on our record, Hacker News Search seems to be a lot more popular than CommonCrawl. While we know about 1927 links to Hacker News Search, we've tracked only 91 mentions of CommonCrawl. We are tracking product recommendations and mentions on various public social media platforms and blogs. They can help you identify which product is more popular and what people think of it.

Hacker News Search mentions (1927)

  • Which cognitive psychology findings are solid that I can use to help students?
    The top answer is written by Justin Skycak (https://www.justinmath.com/) who works on Math Academy (https://www.mathacademy.com/). Math Academy is awesome. I am a happy customer. Previous HN comments about it: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&query=mathacademy&sort=byDate&type=comment. - Source: Hacker News / 1 day ago
  • Hypertext Version of a Pattern Language by Christopher Alexander
    Here's some other posts on Alexander's work: Beautiful Software: Christopher Alexander's research initiative on computing - https://news.ycombinator.com/item?id=34011469 Dec 2009 (30 comments) “A pattern language” explained (2016) - https://news.ycombinator.com/item?id=18644150 Jun 2021 (22 comments) Christopher Alexander: An Introduction for Object-Oriented Designers -... - Source: Hacker News / 2 days ago
  • eu/acc
    Note that my advice is more towards people who want to do an investment, is planning a startup, a company that might grow up, etc > Why is there a need to have specialists just to interface with one's local government True, in theory you shouldn't need it. And more than current officials, there's a lot of legislation that is to blame, but this is besides the point. You consult with specialists because they know... - Source: Hacker News / 2 days ago
  • Ask HN: What distributed file system would you use in 2024?
    What distributed file system would you use for a greenfield homelab project today? Requirements / desires: * Reliable * Performant * Easy to setup and operate Some options: SeaweedFS - https://github.com/seaweedfs/seaweedfs 289 hits: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&query=seaweedfs&sort=byPopularity&type=all JuiceFS - https://github.com/juicedata/juicefs 2047 hits:... - Source: Hacker News / 5 days ago
  • It's always TCP_NODELAY. Every damn time
    FYI the best way to filter by author is 'author:Animats' this will only show results from the user Animats and won't match animats inside the comment text. https://hn.algolia.com/?dateRange=all&page=0&prefix=true&query=%22delayed%20ack%22%20author%3AAnimats&sort=byDate&type=comment. - Source: Hacker News / 6 days ago
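
The author filter described in the last mention can be sketched in Python. This is a minimal illustration that only rebuilds the search URL quoted above; the function name build_hn_search_url and its parameters are hypothetical, and the query-string keys (dateRange, page, prefix, query, sort, type) are copied from the URLs shown in the mentions rather than taken from any official API reference.

```python
from urllib.parse import quote, urlencode

# Base address copied from the search links quoted in the mentions above.
HN_SEARCH_BASE = "https://hn.algolia.com/"


def build_hn_search_url(query, author=None, sort="byDate", item_type="comment"):
    """Build a Hacker News Search URL (hypothetical helper, for illustration).

    If `author` is given, an `author:<name>` filter is appended to the query,
    which restricts results to that user instead of matching the name in text.
    """
    full_query = query if author is None else f"{query} author:{author}"
    params = {
        "dateRange": "all",
        "page": "0",
        "prefix": "true",
        "query": full_query,
        "sort": sort,
        "type": item_type,
    }
    # quote_via=quote encodes spaces as %20, matching the URLs quoted above.
    return HN_SEARCH_BASE + "?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    print(build_hn_search_url('"delayed ack"', author="Animats"))
```

Called with the phrase "delayed ack" and the author Animats, the helper reproduces the link from the mention above.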

CommonCrawl mentions (91)

  • Ask HN: Who is hiring? (May 2024)
    Common Crawl Foundation | REMOTE | Full and part-time | https://commoncrawl.org/ | web datasets I'm the CTO at the Common Crawl Foundation, which has a 17 year old, 8. - Source: Hacker News / 14 days ago
  • Ask HN: How does one implement web plagiarism?
    https://commoncrawl.org/ is a non-profit which offers a pre-crawled dataset. The specifics of individual tools probably vary. I imagine most tools would be based on academic datasets. - Source: Hacker News / 4 months ago
  • Things are about to get a lot worse for Generative AI
    Should the NYT not sue https://commoncrawl.org/ ? OpenAI just used the data from commoncrawl for training. - Source: Hacker News / 5 months ago
  • Indexing a Billion Pages
    What you’re likely referring to is Common Crawl: https://commoncrawl.org. - Source: Hacker News / 5 months ago
  • Interview with Viktor Lofgren from Marginalia Search
    > ... a project called "Nutch" would allow web users to crawl the web themselves. Perhaps that promise is similar to the promises being made about "AI" today. The project did not turn out to be used in the way it was predicted (marketed), or even used by web users at all. Actually Nutch is used to produce the Common Crawl[0] and 60% of GPT-3's training data was Common Crawl[1], so in a way it is being used... - Source: Hacker News / 6 months ago

What are some alternatives?

When comparing Hacker News Search and CommonCrawl, you can also consider the following products:

DuckDuckGo - The Internet privacy company that empowers you to seamlessly take control of your personal information online, without any tradeoffs.

Scrapy - A Fast and Powerful Scraping and Web Crawling Framework

Medium - Welcome to Medium, a place to read, write, and interact with the stories that matter most to you.

StormCrawler - StormCrawler is an open source SDK for building distributed web crawlers with Apache Storm.

40 Hadiths - Hadith Nawawi is an Islamic Android App that is designed with the purpose to enlighten the heart and souls of Muslims around the globe with the authentic teachings of Prophet Muhammad (PBUH).

Apache Nutch - Apache Nutch is a highly extensible and scalable open source web crawler software project.