Software Alternatives & Reviews

In One Minute: Hadoop

  1. Apache ZooKeeper is an effort to develop and maintain an open-source server which enables highly reliable distributed coordination.
    Pricing:
    • Open Source
    ZooKeeper, a system for coordinating distributed nodes, similar to Google's Chubby.

    #Web And Application Servers #Web Servers #Application Server 29 social mentions

  2. Apache Tez is aimed at building an application framework which allows for a complex directed-acyclic-graph of tasks for processing data.
    Tez is an extensible framework for building high performance batch and interactive data processing applications, coordinated by YARN.

    #Stream Processing #Big Data #Databases 1 social mention

  3. Apache Storm is a free and open source distributed realtime computation system.
    Pricing:
    • Open Source
    Storm, a system for real-time and stream processing.

    #Big Data #Data Management #Databases 11 social mentions

  4. Apache Spark is an engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing.
    Pricing:
    • Open Source
    Spark, a fast and general engine for large-scale data processing.

    #Databases #Big Data #Big Data Analytics 56 social mentions

  5. Pig is a high-level platform for creating MapReduce programs used with Hadoop.
    Pig, a platform/programming language for authoring parallelizable jobs.

    #Data Dashboard #Database Tools #Data Science And Machine Learning 2 social mentions

  6. Apache Oozie is a workflow scheduler for Hadoop.
    Oozie, a workflow scheduler system to manage Apache Hadoop jobs.

    #IT Automation #Workflow Automation #Workload Automation 1 social mention

  7. Apache Mahout: Distributed Linear Algebra
    Pricing:
    • Open Source
    Mahout, a library of machine learning algorithms compatible with the MapReduce (M/R) paradigm.

    #Development #Data Science And Machine Learning #Data Dashboard 2 social mentions

  8. Apache Hive data warehouse software facilitates querying and managing large datasets residing in distributed storage.
    Pricing:
    • Open Source
    Hive, a data warehouse infrastructure that provides data summarization and ad hoc querying.

    #Databases #Big Data #Data Warehousing 8 social mentions

  9. Apache HBase
    Pricing:
    • Open Source
    HBase, a scalable, distributed database that supports structured data storage for large tables.

    #Databases #NoSQL Databases #Relational Databases 6 social mentions

  10. Apache Hadoop: open-source software for reliable, scalable, distributed computing
    Pricing:
    • Open Source
    The Apache™ Hadoop™ project develops open-source software for reliable, scalable, distributed computing.

    #Databases #NoSQL Databases #Big Data 15 social mentions

  11. Apache Giraph: Graph Databases
    Giraph is an iterative graph processing framework, built on top of Apache Hadoop.

    #Graph Databases #Databases #NoSQL Databases 1 social mention

  12. Apache Chukwa: Big Data Processing and Distribution
    Chukwa, a data collection system for managing large distributed systems.

    #Big Data #Databases #Development 1 social mention

  13. Apache Avro is a comprehensive data serialization system that also serves as a data exchange service for Apache Hadoop.
    Pricing:
    • Open Source
    Avro, a data serialization system based on JSON schemas.

    #Development #OS & Utilities #Tool 12 social mentions

  14. Ambari is aimed at making Hadoop management simpler by developing software for provisioning, managing, and monitoring Hadoop clusters.
    Ambari, a web-based tool for provisioning, managing, and monitoring Apache Hadoop clusters, with support for Hadoop HDFS, Hadoop MapReduce, Hive, HCatalog, HBase, ZooKeeper, Oozie, Pig and Sqoop. Ambari also provides a dashboard for viewing cluster health, such as heatmaps, and the ability to view MapReduce, Pig and Hive applications visually, along with features to diagnose their performance characteristics in a user-friendly manner.

    #Data Dashboard #Big Data #Development 1 social mention
