Apache Spark
Apache Spark is an engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing.
- Open Source
Apache Spark Alternatives
The best Apache Spark alternatives based on verified products, community votes, reviews and other factors.
-
Flink is a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations.
-
Airflow is a platform to programmatically author, schedule and monitor data pipelines.
-
The most intuitive platform to manage projects and teamwork
-
Open-source software for reliable, scalable, distributed computing
-
Apache Kafka is an open-source message broker project, written in Scala and developed by the Apache Software Foundation.
-
Apache Storm is a free and open source distributed realtime computation system.
-
Google Cloud Dataflow is a fully-managed cloud service and programming model for batch and streaming big data processing.
-
Apache Hive data warehouse software facilitates querying and managing large datasets residing in distributed storage.
-
Databricks provides a Unified Analytics Platform that accelerates innovation by unifying data science, engineering and business.
-
Fast column-oriented distributed data store
-
Level up your Java code and explore what Spring can do for you.
-
Amazon Elastic MapReduce is a web service that makes it easy to quickly process vast amounts of data.
-
Amazon Kinesis services make it easy to work with real-time streaming data in the AWS cloud.
-
Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.