Software Alternatives & Reviews

Apache Hive VS Apache Spark

Compare Apache Hive VS Apache Spark and see how they differ


Apache Hive data warehouse software facilitates querying and managing large datasets residing in distributed storage.

Apache Spark is an engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing.

Apache Hive details

Categories
Big Data, Databases, Big Data Analytics
Website: hive.apache.org

Apache Spark details

Categories
Databases, Big Data, Big Data Analytics, Big Data Infrastructure
Website: spark.apache.org

Apache Hive videos

Hive vs Impala - Comparing Apache Hive vs Apache Impala

Apache Spark videos

Weekly Apache Spark live Code Review: a look at StringIndexer multi-col (Scala) & Python testing

More videos:

  • What's New in Apache Spark 3.0.0
  • Apache Spark for Data Engineering and Analysis - Overview

Category Popularity

0-100% (relative to Apache Hive and Apache Spark)
[Chart: four category bars with value pairs 23% / 77%, 23% / 77%, 19% / 81%, 100% / 0%]

Social recommendations and mentions

We have tracked the following product recommendations or mentions on Reddit and Hacker News. They can help you identify which product is more popular and what people think of it.

Apache Hive mentions

We have not tracked any mentions of Apache Hive yet. Tracking of Apache Hive recommendations started around Mar 2021.

Apache Spark mentions

  • Unit testing your PySpark library
    In software development we often unit test our code (hopefully). And code written for Spark is no different. So here I want to run through an example of building a small library using PySpark and unit testing it. I'm using Visual Studio Code as my editor here, mostly because I think it's brilliant, but other editors are available. - Source: dev.to / 17 days ago

What are some alternatives?

When comparing Apache Hive and Apache Spark, you can also consider the following products:

Amazon Redshift - Learn about Amazon Redshift cloud data warehouse.

Hadoop - Open-source software for reliable, scalable, distributed computing

Apache Kylin - OLAP Engine for Big Data

Hive - Seamless project management and collaboration for your team.

Apache Druid - Fast column-oriented distributed data store

Hortonworks - Hadoop-Related

User reviews

Share your experience using Apache Hive and Apache Spark. For example, how do they differ, and which one is better?
