Software Alternatives & Reviews

Spark is lit once again

Apache Spark Apache Pig Jupyter Hadoop
  1. Apache Spark is an engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing.
    Pricing:
    • Open Source
    Here at Exacaster, Spark applications have been used extensively for years. We started by running them on our Hadoop clusters, with YARN as the application manager. With our most recent product, however, we began moving towards a cloud-based solution and decided to use Kubernetes for our infrastructure needs.

    #Databases #Big Data #Big Data Analytics 56 social mentions

  2. Pig is a high-level platform for creating MapReduce programs used with Hadoop.
    In the early days of the Big Data era, when K8s hadn't even been born yet, the common open-source go-to solution was the Hadoop stack. We wrote several old-fashioned MapReduce jobs and Pig scripts until we came across Spark. Since then, Spark has become one of the most popular data processing engines. It is very easy to start using Lighter on YARN deployments: just run a Docker container with the proper configuration and mount the necessary configuration files at all the default paths.

    #Data Dashboard #Database Tools #Data Science And Machine Learning 2 social mentions

  3. Project Jupyter exists to develop open-source software, open-standards, and services for interactive computing across dozens of programming languages.
    For ad-hoc data analysis, JupyterLab on top of Spark is an elegant solution. On their own, however, these two great tools cannot communicate with each other, so Lighter together with SparkMagic acts as a bridge. You only need to provide the correct configuration to SparkMagic to get it working.

    #Data Science And Machine Learning #Data Science Tools #Data Science Notebooks 205 social mentions

  4. Apache Hadoop is open-source software for reliable, scalable, distributed computing.
    Pricing:
    • Open Source
    Here at Exacaster, Spark applications have been used extensively for years. We started by running them on our Hadoop clusters, with YARN as the application manager. With our most recent product, however, we began moving towards a cloud-based solution and decided to use Kubernetes for our infrastructure needs.

    #Databases #NoSQL Databases #Big Data 15 social mentions
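
The Jupyter excerpt above says SparkMagic only needs the correct configuration to reach Spark through Lighter. As a rough illustration, the sketch below writes a minimal SparkMagic config file. The endpoint URL is a hypothetical placeholder (the excerpt does not give one), and the helper names `build_sparkmagic_config` and `write_config` are made up for this example; the keys follow the layout of SparkMagic's example config, but check them against the version you run.

```python
import json
from pathlib import Path

# Hypothetical endpoint: replace with the address of your own Lighter deployment.
LIGHTER_URL = "http://lighter.example.internal:8080/lighter/api"


def build_sparkmagic_config(url: str) -> dict:
    """Return a minimal SparkMagic config that points both kernels at `url`."""
    credentials = {"username": "", "password": "", "url": url, "auth": "None"}
    return {
        "kernel_python_credentials": credentials,
        # Separate copy, so editing one kernel's settings does not change the other.
        "kernel_scala_credentials": dict(credentials),
        # Assumed value: allow time for the cluster to schedule the Spark driver.
        "livy_session_startup_timeout_seconds": 600,
    }


def write_config(config: dict, path: Path) -> Path:
    """Write `config` as JSON to `path`, creating parent directories if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2))
    return path
```

Calling `write_config(build_sparkmagic_config(LIGHTER_URL), Path.home() / ".sparkmagic" / "config.json")` would place the file where SparkMagic looks for it by default; restart the notebook kernel afterwards so the new settings are read.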
