Software Alternatives, Accelerators & Startups
Table of contents
  1. Videos
  2. Social Mentions

Delta Lake

Application and Data, Data Stores, and Big Data Tools

Delta Lake Reviews and details

Screenshots and images

  • Delta Lake Landing page



Videos

A Thorough Comparison of Delta Lake, Iceberg and Hudi

Delta Lake for Apache Spark | How does it work | How to use Delta Lake | Delta Lake for Spark ACID

ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scale Storage and Analytics

Social recommendations and mentions

We have tracked the following product recommendations or mentions on various public social media platforms and blogs. They can help you see what people think about Delta Lake and what they use it for.
  • 25 Open Source AI Tools to Cut Your Development Time in Half
    Delta Lake is a storage layer framework that provides reliability to data lakes. It addresses the challenges of managing large-scale data in lakehouse architectures, where data is stored in an open format and used for various purposes, like machine learning (ML). Data engineers can build real-time pipelines or ML applications using Delta Lake because it supports both batch and streaming data processing. It also... - Source: / 1 day ago
  • Make Rust Object Oriented with the dual-trait pattern
    There is a neat example of how a third-party project belonging to the Linux Foundation implements UserDefinedLogicalNodeCore: MetricObserver in delta-rs. The developer only had to use #[derive(Debug, Hash, Eq, PartialEq)] to get dyn_eq and dyn_hash implemented. - Source: / 6 days ago
  • Delta Lake vs. Parquet: A Comparison
    Delta is pretty great; it lets you do upserts into tables in Databricks much more easily than without it. I think the website is here: - Source: Hacker News / 6 months ago
  • Getting Started with Flink SQL, Apache Iceberg and DynamoDB Catalog
    Apache Iceberg is one of the three major lakehouse table formats; the other two are Apache Hudi and Delta Lake. - Source: / 7 months ago
  • [D] Is there other better data format for LLM to generate structured data?
    The Apache Spark / Databricks community prefers Apache Parquet or the Linux Foundation's over JSON. Source: 7 months ago
  • Databricks Strikes $1.3B Deal for Generative AI Startup MosaicML
    Databricks provides JupyterLab-like notebooks for analysis and ETL pipelines using Spark through PySpark, Spark SQL or Scala. I think R is supported as well, but it doesn't interoperate with their newer features as well as Python and SQL do. It interfaces with cloud storage backends like S3 and offers some improvements over the Parquet format for data querying that allow for updating, ordering and merging through... - Source: Hacker News / about 1 year ago
  • The "Big Three's" Data Storage Offerings
    Structured, semi-structured and unstructured data can all be stored in one single lakehouse storage format like Delta, Iceberg or Hudi (assuming the workloads don't require low-latency SLAs like subsecond response). Source: about 1 year ago
  • Medallion/lakehouse architecture data modelling
    Take a look at Delta Lake, it enables a lot of database-like actions on files. Source: about 1 year ago
  • How to build a data pipeline using Delta Lake
    This sounds like a new trending destination to take selfies in front of, but it’s even better than that. Delta Lake is an “open-source storage layer designed to run on top of an existing data lake and improve its reliability, security, and performance.” (source). It lets you interact with an object storage system like you would with a database. - Source: / about 1 year ago
  • self hosting
    You are right, it is just a framework. Sorry for the unclear question. Another try: when you host Spark on your own with Delta as the table format, compared to using Databricks, what are the differences? Source: about 1 year ago
  • self hosting
    I mean the difference between using the framework and running it on your own machines/VMs vs. using Databricks with defined clusters. Source: about 1 year ago
  • self hosting
    Is there actually any company implementing self-hosting besides Microsoft/Synapse and Databricks? Would it be worth the effort compared to the features Microsoft/Databricks bring to the table? Source: about 1 year ago
  • Lightweight HTTP API for Big Data on S3
    We are happy to announce our third open-source project - Delta Fetch. Delta Fetch is a configurable HTTP API service for accessing Delta Lake tables. The service is highly configurable, with the ability to filter your Delta tables by selected columns. - Source: / over 1 year ago
  • Question about lambda architecture
    I’d suggest looking at the open table formats. Delta Lake does an excellent job at providing batch and streaming APIs for Spark. This would unify your workloads. It would follow the medallion architecture, which is a bit more popular lately. Aspects of the lambda architecture can still be present in the medallion model, especially when real-time requirements are present. Source: over 1 year ago
  • HDFS/Spark + Delta: Is this warning dangerous?
    I've installed the stack (Hadoop, Hive, Spark) into a CentOS VM, built everything from source to make sure it fits together. Then added Delta Lake ( from their Maven repo. Source: over 1 year ago
  • Confusion with Hadoop/Hive/Spark
    (I configured in $SPARK_HOME/conf/spark-defaults.xml so it's loaded & available). Source: over 1 year ago
  • 5 Reasons Your Data Lakehouse should Embrace Dremio Cloud
    You can query data organized in many open table formats like Apache Iceberg and Delta Lake. (Here is a good article on what is a table format and the differences between different ones). - Source: / almost 2 years ago
  • How do we bridge SQL and Python.
    Bit more specific: Source: almost 2 years ago
  • [D] How do you share big datasets with your team and others?
    Spark Delta Tables ( - that can persist their data on S3, Azure Blobs, or local files. Source: about 2 years ago
  • Databricks platform for small data, is it worth it?
    Currently the infrastructure we have is some custom made pipelines that load the data on S3, and I use Delta Tables here and there for its convenience: ACID, time travel, merges, CDC etc... Source: about 2 years ago
  • Data point versioning infrastructure for time traveling to a precise point in time?
    I've been playing around a bit with Delta (Table/Lake) whatever you want to call it. It has time travel so you can look back and see what the data looked like at a particular point in time. Source: about 2 years ago
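The spark-defaults mention above elides what was actually configured. For reference, wiring Delta Lake into a self-managed Spark install typically means registering Delta's SQL extension and catalog (usually in spark-defaults.conf). A sketch based on the Delta documentation; the delta-core version shown is illustrative and must match your Spark/Scala build:

```properties
# Pull the Delta Lake jars from Maven (version must match your Spark/Scala build)
spark.jars.packages              io.delta:delta-core_2.12:2.4.0
# Enable Delta's SQL syntax (MERGE, time travel, etc.)
spark.sql.extensions             io.delta.sql.DeltaSparkSessionExtension
# Route table operations through the Delta catalog
spark.sql.catalog.spark_catalog  org.apache.spark.sql.delta.catalog.DeltaCatalog
```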
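One mention above notes that Delta's upserts are much easier than with plain Parquet, whose files are immutable. Conceptually, an upsert (SQL MERGE) matches incoming rows to existing rows on a key, updates the matches, and inserts the rest; Delta coordinates the required file rewrites through its transaction log. A minimal, Delta-free Python sketch of just the merge semantics (the function name and row layout are illustrative, not Delta's API):

```python
# Toy model of "upsert" (MERGE) semantics: match on a key column,
# update matching rows, insert non-matching ones.

def upsert(target, updates, key):
    """Return a new table with `updates` merged into `target` by `key`."""
    merged = {row[key]: row for row in target}  # index existing rows by key
    for row in updates:
        merged[row[key]] = row                  # update-or-insert
    return sorted(merged.values(), key=lambda r: r[key])

target = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
updates = [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}]
print(upsert(target, updates, "id"))
# → [{'id': 1, 'v': 'a'}, {'id': 2, 'v': 'B'}, {'id': 3, 'v': 'c'}]
```

In Delta itself the equivalent is `MERGE INTO` in SQL or `DeltaTable.merge` in PySpark/Scala.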
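The time-travel mention above is worth unpacking: Delta keeps an ordered transaction log (`_delta_log`) recording which data files each commit adds and removes, so the table "as of" version N can be reconstructed by replaying entries 0..N. A toy Python sketch of that idea — the log schema here is deliberately simplified, not Delta's actual JSON format:

```python
# Simplified model of a Delta-style transaction log: each commit lists
# files added and removed; a snapshot at version N replays entries 0..N.

log = [
    {"version": 0, "add": ["part-0.parquet"], "remove": []},
    {"version": 1, "add": ["part-1.parquet"], "remove": []},
    {"version": 2, "add": ["part-2.parquet"], "remove": ["part-0.parquet"]},
]

def snapshot(log, version):
    """Set of data files visible at the given table version."""
    files = set()
    for entry in log:
        if entry["version"] > version:
            break
        files |= set(entry["add"])
        files -= set(entry["remove"])
    return files

print(sorted(snapshot(log, 1)))  # → ['part-0.parquet', 'part-1.parquet']
print(sorted(snapshot(log, 2)))  # → ['part-1.parquet', 'part-2.parquet']
```

In real PySpark the same query is expressed as `spark.read.format("delta").option("versionAsOf", 1).load(path)` (or `timestampAsOf` for a point in time).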


Delta Lake discussion


This is an informative page about Delta Lake. You can review and discuss the product here. The primary details have not been verified within the last quarter and may be outdated. If you think we are missing something, please use the means on this page to comment or suggest changes. All reviews and comments are highly encouraged and appreciated, as they help everyone in the community make an informed choice. Please always be kind and objective when evaluating a product and sharing your opinion.