Software Alternatives & Reviews

Hadoop Reviews

Open-source software for reliable, scalable, distributed computing

Social recommendations and mentions

We have tracked the following product recommendations or mentions on Reddit and HackerNews. They can help you see what people think about Hadoop and what they use it for.
  • 5 Best Practices For Data Integration To Boost ROI And Efficiency
    There are different ways to implement parallel dataflows, such as using parallel data processing frameworks like Apache Hadoop, Apache Spark, and Apache Flink, or using cloud-based services like Amazon EMR and Google Cloud Dataflow. It is also possible to use parallel dataflow frameworks to handle big data and distributed computing, like Apache Nifi and Apache Kafka. - Source: Reddit / 20 days ago
  • Data Engineering and DataOps: A Beginner's Guide to Building Data Solutions and Solving Real-World Challenges
    There are several frameworks available for batch processing, such as Hadoop, Apache Storm, and DataTorrent RTS. - Source: / 2 months ago
  • Effortlessly Set Up a Hadoop Multi-Node Cluster on Windows Machines with Our Step-by-Step Guide
    A copy of Hadoop installed on each of these machines. You can download Hadoop from the Apache website, or you can use a distribution like Cloudera or Hortonworks. - Source: / 3 months ago
  • In One Minute : Hadoop
    The Apache™ Hadoop™ project develops open-source software for reliable, scalable, distributed computing. - Source: / 4 months ago
  • A peek into Location Data Science at Ola
    This requires the use of distributed computation tools such as Spark, Hadoop, Flink, and Kafka. But for occasional experimentation, Pandas, Geopandas and Dask are some of the commonly used tools. - Source: / 6 months ago
  • Big Data Processing, EMR with Spark and Hadoop | Python, PySpark
    Apache Hadoop is an open source framework that is used to efficiently store and process large datasets ranging in size from gigabytes to petabytes of data. Want to dig deeper? - Source: / about 1 year ago
  • Unknown Python.exe process taking 2% CPU
    There are a few related projects to it on the side of the page here that might be familiar. - Source: Reddit / about 1 year ago
  • How do I make multiple computers run as one?
    The computers that you have appear to use an x86 architecture. Therefore, you could most likely install a Linux distro on each one. Then, you could use something like Apache Hadoop to execute some sort of distributed process across each computer. - Source: Reddit / about 1 year ago
  • Spark for beginners - and you
    Hadoop is an ecosystem of tools for big data storage and data analysis. It is older than Spark and writes intermediate results to disk, whereas Spark tries to keep data in memory whenever possible, which makes it faster in many use cases. - Source: / over 1 year ago
  • Dreaming and Breaking Molds – Establishing Best Practices with Scott Haines
    So Yahoo bought that. I think it was 2013 or 2014. Timelines are hard. But I wanted to go join the Games team and start things back up. But that was also my first kind of experience in actually building recommendation engines or working with lots of data. And I think for me, like that was, at the time, we were using something called Apache Storm. We had Hadoop, which had been around for a while. And it... - Source: / over 1 year ago
  • Spark is lit once again
    Here at Exacaster Spark applications have been used extensively for years. We started using them on our Hadoop clusters with YARN as an application manager. However, with our recent product, we started moving towards a Cloud-based solution and decided to use Kubernetes for our infrastructure needs. - Source: / over 1 year ago
  • 5 Best Big Data Frameworks You Can Learn in 2021
    Both Fortune 500 and small companies are looking for competent people who can derive useful insight from their huge pile of data and that's where Big Data Framework like Apache Hadoop, Apache Spark, Flink, Storm, and Hive can help. - Source: / about 2 years ago
  • The Data Engineering Interview Study Guide
    Some positions require Hadoop, others SQL. Some roles require understanding statistics, while still others require heavy amounts of system design. - Source: / almost 2 years ago
  • Currently in Data Science. Should I make the move?
    It'd be best to clarify exactly what we mean by "Hadoop", but if we define it as the suite described here then the only components I still see being used for greenfield are HDFS - or, to be more specific, HDFS-compatible filesystems (AWS EMR and Azure Data Lake Storage both offer HDFS compatibility) - and maybe (Spark) YARN. - Source: Reddit / about 2 years ago
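The multi-node setup guide mentioned above assumes every machine knows where the NameNode lives. As a rough sketch (not a complete configuration), a minimal `core-site.xml` placed under `etc/hadoop/` on each node could point all of them at the same filesystem; `namenode-host` and the port are placeholders for your own cluster:

```xml
<!-- Sketch of etc/hadoop/core-site.xml; replace namenode-host:9000 with your NameNode address -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode-host:9000</value>
  </property>
</configuration>
```

`fs.defaultFS` is the standard property naming the default filesystem; a real cluster also needs `hdfs-site.xml`, worker lists, and matching Java/SSH setup per the official docs.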
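One concrete way to "execute some sort of distributed process", as the mention above puts it, is Hadoop Streaming, which lets any program that reads stdin and writes stdout act as a mapper or reducer. The sketch below models that contract in plain Python for a word count; the sample input is made up, and on a real cluster Hadoop would run the two phases on different nodes with a sort/shuffle in between:

```python
from itertools import groupby

def mapper(lines):
    """Map phase: emit one tab-separated (word, 1) pair per word, as Streaming expects."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

def reducer(pairs):
    """Reduce phase: sum counts per word; input is assumed sorted by key,
    which Hadoop guarantees between the map and reduce phases."""
    keyed = (p.split("\t") for p in pairs)
    for word, group in groupby(keyed, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"

# Local smoke test of the same map -> sort -> reduce pipeline Hadoop distributes:
mapped = sorted(mapper(["big data big cluster", "data data"]))
for out in reducer(mapped):
    print(out)  # big	2 / cluster	1 / data	3
```

On a cluster the same pair would typically be submitted with the streaming jar, roughly `hadoop jar hadoop-streaming.jar -mapper mapper.py -reducer reducer.py -input ... -output ...` (jar path and HDFS paths depend on your installation).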
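The disk-versus-memory contrast drawn in the Spark excerpt above can be illustrated with a toy model in plain Python (this is neither framework's API): a Hadoop-style pipeline materialises each stage to disk before the next stage reads it, while a Spark-style pipeline chains lazy in-memory transformations.

```python
import json
import os
import tempfile

def hadoop_style(records):
    """Toy model: each stage's full output hits disk, like MapReduce between jobs."""
    stage1 = [x * 2 for x in records]
    path = os.path.join(tempfile.mkdtemp(), "intermediate.json")
    with open(path, "w") as f:
        json.dump(stage1, f)          # intermediate results written out
    with open(path) as f:
        stage1 = json.load(f)         # ...and read back by the next stage
    return [x + 1 for x in stage1]

def spark_style(records):
    """Toy model: stages are chained lazily in memory; nothing is materialised
    until an action forces evaluation."""
    pipeline = (x + 1 for x in (x * 2 for x in records))
    return list(pipeline)

print(hadoop_style([1, 2, 3]))  # [3, 5, 7]
print(spark_style([1, 2, 3]))   # [3, 5, 7]
```

Both produce the same result; the difference the excerpt describes is where the intermediate data lives, which is why Spark is faster for iterative workloads that reuse intermediate results.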

External sources with reviews and comparisons of Hadoop

A List of The 16 Best ETL Tools And Why To Choose Them
Companies considering Hadoop should be aware of its costs. A significant portion of the cost of implementing Hadoop comes from the computing power required for processing and the expertise needed to maintain Hadoop ETL, rather than the tools or storage themselves.
16 Top Big Data Analytics Tools You Should Know About
Hadoop is an Apache open-source framework. Written in Java, Hadoop is an ecosystem of components that are primarily used to store, process, and analyze big data. The USP of Hadoop is that it enables multiple types of analytic workloads to run on the same data, at the same time, and at massive scale on industry-standard hardware.
5 Best-Performing Tools that Build Real-Time Data Pipeline
Hadoop is an open-source framework that allows you to store and process big data in a distributed environment across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than relying on hardware to deliver high availability, the library itself is designed to detect and handle failures at the...
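The failure-handling idea in that excerpt — detect a failed task in software and re-run it elsewhere rather than trusting the hardware — can be sketched as a toy scheduler (this is an illustration of the pattern, not Hadoop's actual API):

```python
def run_with_failover(task, workers):
    """Try each worker in turn; a worker that raises is treated as failed and the
    task is re-executed on the next one — the pattern Hadoop applies via HDFS
    block replicas and task re-execution."""
    last_error = None
    for worker in workers:
        try:
            return worker(task)
        except Exception as exc:   # in Hadoop, a timeout or lost heartbeat
            last_error = exc
    raise RuntimeError("all workers failed") from last_error

# One worker that simulates a crashed node, one healthy replica:
def crashed(task):
    raise OSError("node unreachable")

def healthy(task):
    return sum(task)

print(run_with_failover([1, 2, 3], [crashed, healthy]))  # 6
```

The caller never sees the first failure; the job simply completes on a replica, which is what "handle failures at the application layer" means in practice.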
