Delta Lake Reviews and details

Screenshots and images

Landing page screenshot, 2023-08-26

Videos

A Thorough Comparison of Delta Lake, Iceberg and Hudi

Delta Lake for apache Spark | How does it work | How to use delta lake | Delta Lake for Spark ACID

ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scale Storage and Analytics

Social recommendations and mentions

We have tracked the following product recommendations or mentions on various public social media platforms and blogs. They can help you see what people think about Delta Lake and what they use it for.

Delta Lake vs. Parquet: A Comparison
Delta is pretty great, lets you do upserts into tables in Databricks much easier than without it. I think the website is here: https://delta.io. - Source: Hacker News / 3 months ago
Getting Started with Flink SQL, Apache Iceberg and DynamoDB Catalog
Apache Iceberg is one of the three types of lakehouse, the other two are Apache Hudi and Delta Lake. - Source: dev.to / 4 months ago
[D] Is there other better data format for LLM to generate structured data?
The Apache Spark / Databricks community prefers Apache Parquet or the Linux Foundation's delta.io over JSON. Source: 5 months ago
Databricks Strikes $1.3B Deal for Generative AI Startup MosaicML
Databricks provides JupyterLab-like notebooks for analysis and ETL pipelines using Spark through PySpark, Spark SQL or Scala. I think R is supported as well, but it doesn't interop with their newer features as well as Python and SQL do. It interfaces with cloud storage backends like S3 and offers some improvements to the Parquet format of data querying that allows for updating, ordering and merged through... - Source: Hacker News / 10 months ago
The "Big Three's" Data Storage Offerings
Structured, Semi-structured and Unstructured can be stored in one single format, a lakehouse storage format like Delta, Iceberg or Hudi (assuming those don't require low-latency SLAs like subsecond). Source: 11 months ago
Medallion/lakehouse architecture data modelling
Take a look at Delta Lake https://delta.io, it enables a lot of database-like actions on files. Source: 11 months ago
How to build a data pipeline using Delta Lake
This sounds like a new trending destination to take selfies in front of, but it’s even better than that. Delta Lake is an “open-source storage layer designed to run on top of an existing data lake and improve its reliability, security, and performance.” (source). It lets you interact with an object storage system like you would with a database. - Source: dev.to / 11 months ago
Delta.io/deltalake self hosting
You are right, delta.io is just a framework. Sorry for the unclear question. Another try: when you host spark on your own with delta as table format compared to usage of Databricks, what are the differences? Source: about 1 year ago
Delta.io/deltalake self hosting
I mean the difference between using the delta.io framework to let it run on your own machines/VMs vs using Databricks and having clusters defined. Source: about 1 year ago
Delta.io/deltalake self hosting
Is there actually any company implementing delta.io self hosted beside microsoft/synapse and databricks? Would it be worth the effort compared to the features microsoft/databricks bring to the table? Source: about 1 year ago
Lightweight HTTP API for Big Data on S3
We are happy to announce our third opensource project - Delta Fetch. Delta Fetch is a configurable HTTP API service for accessing Delta Lake tables. Service is highly configurable, with possibility to filter your Delta tables by selected columns. - Source: dev.to / about 1 year ago
Question about lambda architecture
I’d suggest looking at the open table formats. Delta Lake does an excellent job at providing batch and streaming APIs for Spark. This would unify your workloads. It would follow the medallion architecture, which is a bit more popular lately. Aspects of the lambda architecture can still be present in the medallion model, especially when real-time requirements are present. Source: about 1 year ago
HDFS/Spark + Delta: Is this warning dangerous?
I've installed the stack (Hadoop, Hive, Spark) into a Centos VM, built everything from sources to make sure it fits together. Then added Delta Lake (delta.io) from their maven repo. Source: about 1 year ago
Confusion with Hadoop/Hive/Spark
(I configured delta.io in $SPARK_HOME/conf/spark-defaults.xml so it's loaded & available). Source: about 1 year ago
5 Reasons Your Data Lakehouse should Embrace Dremio Cloud
You can query data organized in many open table formats like Apache Iceberg and Delta Lake. (Here is a good article on what is a table format and the differences between different ones). - Source: dev.to / over 1 year ago
How do we bridge SQL and Python.
Bit more specific: https://delta.io/. Source: over 1 year ago
[D] How do you share big datasets with your team and others?
Spark Delta Tables (https://delta.io) - that can persist their data on S3, Azure Blobs, or local files. Source: almost 2 years ago
Databricks platform for small data, is it worth it?
Currently the infrastructure we have is some custom made pipelines that load the data on S3, and I use Delta Tables here and there for its convenience: ACID, time travel, merges, CDC etc... Source: almost 2 years ago
Data point versioning infrastructure for time traveling to a precise point in time?
I've been playing around a bit with Delta (Table/Lake) whatever you want to call it. It has time travel so you can look back and see what the data looked like at a particular point in time. https://delta.io/. Source: almost 2 years ago
What is a Delta Table?
It is a specific table format: https://delta.io/. It’s an open source project; just read their website, it will have way more info than these comments. Source: almost 2 years ago
CDC with spark
You may want to look at Delta Lake https://delta.io/. Source: almost 2 years ago

Generic Delta Lake discussion

This is an informative page about Delta Lake. You can review and discuss the product here. The primary details have not been verified within the last quarter, and they might be outdated. If you think we are missing something, please use the means on this page to comment or suggest changes. All reviews and comments are highly encouraged and appreciated, as they help everyone in the community make an informed choice. Please always be kind and objective when evaluating a product and sharing your opinion.

Delta Lake

Application and Data, Data Stores, and Big Data Tools