Software Alternatives & Reviews

Top 9 Databases in Big Data Infrastructure

The best databases in the Big Data Infrastructure category, based on our collection of reviews and verified products.

Apache Spark, Hadoop, Amazon EMR, Snowflake, Impala, Apache ORC, Apache Kudu, Apache Flume

Summary

The top products on this list are Apache Spark, Hadoop, and Amazon EMR. All products here are categorized under Databases (software for creating, managing, and manipulating databases) and Big Data Infrastructure. One of the criteria for ordering this list is the number of mentions each product has on reliable external sources. You can suggest additional sources through the form here.
  1. Apache Spark is an engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing.
    Pricing:
    • Open Source

    #Databases #Big Data #Big Data Analytics 56 social mentions

  2. Hadoop is open-source software for reliable, scalable, distributed computing.
    Pricing:
    • Open Source

    #Databases #NoSQL Databases #Big Data 15 social mentions

  3. Amazon Elastic MapReduce is a web service that makes it easy to quickly process vast amounts of data.

    #Big Data #Big Data Tools #Big Data Infrastructure 10 social mentions

  4. Snowflake is the only data platform built for the cloud for all your data & all your users. Learn more about our purpose-built SQL cloud data warehouse.

    #Data Warehousing #Cloud Data #Data Dashboard 4 social mentions

  5. Impala is a modern, open-source, distributed SQL query engine for Apache Hadoop.
    Pricing:
    • Open Source

    #Big Data #Big Data Infrastructure #Databases

  6. Apache ORC is a columnar storage format for Hadoop workloads.
    Pricing:
    • Open Source

    #Big Data #Databases #Stream Processing 3 social mentions

  7. Apache Kudu is Hadoop's storage layer to enable fast analytics on fast data.

    #Business & Commerce #Data Dashboard #Office & Productivity

  8. Hadoop-Related

    #Big Data #Big Data Infrastructure #Databases

  9. Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data.

    #Big Data #Log Management #Databases 1 social mention

If you want to make changes to any of these products, go to its page and click the "Suggest Changes" link. Alternatively, if you work on one of these products, it's best to verify it and make the changes directly through the management page. Thanks!