Software Alternatives & Reviews

Spark for beginners - and you

Apache Storm, Spark Streaming, Apache Spark, Apache Mesos, Hadoop, Apache Flink, Apache Apex
  1. Apache Storm is a free and open source distributed realtime computation system.
    Pricing:
    • Open Source
    Streaming: Spark Streaming's latency is at least 500 ms, since it operates on micro-batches of records instead of processing one record at a time. Native streaming tools like Storm, Apex or Flink might be better for low-latency applications.

    #Big Data #Data Management #Databases 11 social mentions

  2. Spark Streaming makes it easy to build scalable and fault-tolerant streaming applications.
    Spark is a big data framework and currently one of the most popular tools for big data analytics. It contains libraries for data analysis, machine learning, graph analysis and streaming live data. In general, Spark is faster than Hadoop, as it does not write intermediate results to disk. It is not a data storage system: we can use Spark on top of HDFS or read data from other sources such as Amazon S3. It is designed for data analytics, machine learning, streaming and graph analytics.

    #Big Data #Stream Processing #Data Management 3 social mentions

  3. Apache Spark is an engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing.
    Pricing:
    • Open Source

    #Databases #Big Data #Big Data Analytics 56 social mentions

  4. Apache Mesos abstracts resources away from machines, enabling fault-tolerant and elastic distributed systems to easily be built and run effectively.
    Pricing:
    • Open Source
    Cluster Modes: We can run Spark on a cluster in Standalone mode or via a cluster manager, either YARN or Mesos.

    #Developer Tools #Containers As A Service #DevOps Tools 7 social mentions

  5. Hadoop is open-source software for reliable, scalable, distributed computing.
    Pricing:
    • Open Source
    Hadoop is an ecosystem of tools for big data storage and data analysis. It is older than Spark and writes intermediate results to disk, whereas Spark tries to keep data in memory whenever possible, which makes Spark faster in many use cases.

    #Databases #NoSQL Databases #Big Data 15 social mentions

  6. Apache Apex is an enterprise-grade unified stream and batch processing engine.
    Streaming: Spark Streaming's latency is at least 500 ms, since it operates on micro-batches of records instead of processing one record at a time. Native streaming tools like Storm, Apex or Flink might be better for low-latency applications.

    #Big Data #Data Dashboard #Data Warehousing 1 social mention
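
The micro-batch point in the Streaming excerpts can be illustrated with a toy model. The sketch below is plain Python, not Spark code; the function name and the sample arrival times are hypothetical, and the 500 ms interval is taken from the excerpt as an assumed batch interval. It only computes how long each record waits until its batch closes; scheduling and processing time would come on top of that in a real system.

```python
def micro_batch_wait_ms(arrival_times_ms, batch_interval_ms=500):
    """Return how long each record waits before its micro-batch closes.

    Toy model (an assumption, not Spark's scheduler): batches close at
    fixed multiples of batch_interval_ms, and a record is handed to the
    engine only at the first boundary strictly after its arrival.
    """
    waits = []
    for t in arrival_times_ms:
        # Next batch boundary strictly after arrival time t.
        boundary = (t // batch_interval_ms + 1) * batch_interval_ms
        waits.append(boundary - t)
    return waits


# Hypothetical arrival times in milliseconds.
arrivals = [0, 100, 499, 500, 750]
waits = micro_batch_wait_ms(arrivals)
print(waits)
```

In this model a record-at-a-time engine would have a batching wait of zero for every record, which is why native streaming tools are suggested for low-latency applications.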
