Software Alternatives & Reviews

The Data Engineer Roadmap 🗺

Apache Storm, Apache Spark, Presto DB, Apache Pig, Apache Oozie, neo4j, Materialize, Apache Kafka, Apache Hive, Apache HBase
  1. Apache Storm is a free and open source distributed realtime computation system.
    Pricing:
    • Open Source

    #Big Data #Data Management #Databases 11 social mentions

  2. Apache Spark is an engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing.
    Pricing:
    • Open Source

    #Databases #Big Data #Big Data Analytics 56 social mentions

  3. Presto DB is a distributed SQL query engine for big data (by Facebook).
    Pricing:
    • Open Source

    #Database Tools #Data Dashboard #Big Data Analytics 6 social mentions

  4. Pig is a high-level platform for creating MapReduce programs used with Hadoop.

    #Data Dashboard #Database Tools #Data Science And Machine Learning 2 social mentions

  5. Apache Oozie is a workflow scheduler for Hadoop.

    #IT Automation #Workflow Automation #Workload Automation 1 social mention

  6. Meet Neo4j: The graph database platform powering today's mission-critical enterprise applications, including artificial intelligence, fraud detection and recommendations.

    #Graph Databases #Big Data #Databases 27 social mentions

  7. Materialize is a streaming database for real-time applications and analytics.

    #Database Tools #Databases #Relational Databases 65 social mentions

  8. Apache Kafka is an open-source message broker project developed by the Apache Software Foundation, written in Scala and Java.
    Pricing:
    • Open Source

    #Stream Processing #Data Integration #ETL 120 social mentions

  9. Apache Hive data warehouse software facilitates querying and managing large datasets residing in distributed storage.
    Pricing:
    • Open Source

    #Databases #Big Data #Data Warehousing 8 social mentions

  10. Apache HBase is the Hadoop database, a distributed, scalable big data store.
    Pricing:
    • Open Source

    #Databases #NoSQL Databases #Relational Databases 6 social mentions

  11. Open-source software for reliable, scalable, distributed computing
    Pricing:
    • Open Source

    #Databases #NoSQL Databases #Big Data 15 social mentions

  12. Automate your workflow from idea to production
    Pricing:
    • Open Source

    #DevOps Tools #Continuous Integration #Developer Tools 271 social mentions

  13. Google Cloud Storage offers developers and IT organizations durable and highly available object storage.

    #Cloud Storage #Cloud Computing #Object Storage 36 social mentions

  14. The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance.

    #Databases #NoSQL Databases #Relational Databases 39 social mentions

  15. Apache Beam provides an advanced unified programming model to implement batch and streaming data processing jobs.
    Pricing:
    • Open Source

    #Big Data #Data Dashboard #Data Warehousing 14 social mentions

  16. Amazon S3 is an object storage service where users can store their business data on a safe, cloud-based platform. Amazon S3 operates in 54 availability zones within 18 geographic regions and 1 local region.

    #Cloud Hosting #Object Storage #Cloud Storage 170 social mentions

  17. Apache Arrow is a cross-language development platform for in-memory data.
    Pricing:
    • Open Source

    #Databases #NoSQL Databases #Relational Databases 33 social mentions

  18. Airflow is a platform to programmatically author, schedule and monitor data pipelines.
    Pricing:
    • Open Source

    #Workflow #Workflow Automation #Data Pipelines 65 social mentions
