Delta is pretty great; it lets you do upserts into tables in Databricks much more easily than without it. I think the website is here: https://delta.io. - Source: Hacker News / 3 months ago
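For readers unfamiliar with the term, an upsert (exposed in Delta Lake as a MERGE operation) updates rows that match on a key and inserts the rows that do not. The following is a minimal plain-Python sketch of that semantics only; it does not use the Delta Lake API, and the `upsert` helper and the sample rows are hypothetical names made up for illustration.

```python
from typing import Dict, Iterable, List

Row = Dict[str, object]


def upsert(target: List[Row], updates: Iterable[Row], key: str) -> List[Row]:
    """Return a new table: rows matching on `key` are replaced, others are appended.

    Illustrative only; this models the update-or-insert idea, not Delta's engine.
    """
    # Index the existing rows by key (dicts keep insertion order).
    merged = {row[key]: dict(row) for row in target}
    for row in updates:
        # Matched key: overwrite the whole row. Unmatched key: insert a new row.
        merged[row[key]] = dict(row)
    return list(merged.values())


# Hypothetical sample data.
target = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
updates = [{"id": 2, "name": "robert"}, {"id": 3, "name": "carol"}]

result = upsert(target, updates, key="id")
print(result)
```

Here the row with `id` 2 is updated in place and the row with `id` 3 is inserted, while `target` itself is left unmodified.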
Apache Iceberg is one of the three main lakehouse table formats; the other two are Apache Hudi and Delta Lake. - Source: dev.to / 4 months ago
The Apache Spark / Databricks community prefers Apache Parquet or the Linux Foundation's delta.io over JSON. Source: 5 months ago
Databricks provides JupyterLab-like notebooks for analysis and ETL pipelines using Spark through PySpark, Spark SQL, or Scala. I think R is supported as well, but it doesn't interoperate with their newer features as well as Python and SQL do. It interfaces with cloud storage backends like S3 and offers some improvements to the Parquet format of data querying that allows for updating, ordering and merged through... - Source: Hacker News / 10 months ago
Structured, semi-structured, and unstructured data can be stored in a single lakehouse storage format like Delta, Iceberg, or Hudi (assuming the workloads don't require low-latency SLAs, such as sub-second). Source: 11 months ago
Take a look at Delta Lake (https://delta.io); it enables a lot of database-like actions on files. Source: 11 months ago
This sounds like a new trending destination to take selfies in front of, but it’s even better than that. Delta Lake is an “open-source storage layer designed to run on top of an existing data lake and improve its reliability, security, and performance.” (source). It lets you interact with an object storage system like you would with a database. - Source: dev.to / 11 months ago
You are right, delta.io is just a framework. Sorry for the unclear question. Another try: when you host Spark on your own with Delta as the table format, compared to using Databricks, what are the differences? Source: about 1 year ago
I mean the difference between running the delta.io framework on your own machines/VMs vs. using Databricks and having clusters defined. Source: about 1 year ago
Is there actually any company implementing delta.io self-hosted besides Microsoft/Synapse and Databricks? Would it be worth the effort compared to the features Microsoft/Databricks bring to the table? Source: about 1 year ago
We are happy to announce our third open-source project - Delta Fetch. Delta Fetch is a configurable HTTP API service for accessing Delta Lake tables. The service is highly configurable, with the possibility to filter your Delta tables by selected columns. - Source: dev.to / about 1 year ago
I’d suggest looking at the open table formats. Delta Lake does an excellent job at providing batch and streaming APIs for Spark. This would unify your workloads. It would follow the medallion architecture, which is a bit more popular lately. Aspects of the lambda architecture can still be present in the medallion model, especially when real-time requirements are present. Source: about 1 year ago
I've installed the stack (Hadoop, Hive, Spark) into a CentOS VM and built everything from source to make sure it fits together. Then I added Delta Lake (delta.io) from their Maven repo. Source: about 1 year ago
(I configured delta.io in $SPARK_HOME/conf/spark-defaults.conf so it's loaded & available). Source: about 1 year ago
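As a rough illustration of what such a configuration could look like: Spark reads default properties from conf/spark-defaults.conf as whitespace-separated key/value lines, and Delta Lake's documented setup enables its SQL extension and catalog through two properties. The sketch below only renders those two lines as text. The `render_spark_defaults` helper is a hypothetical name, the commenter's actual file is not shown in the source, and the Delta jars still have to be made available to Spark separately.

```python
# Properties documented by Delta Lake for enabling it in a Spark session.
# Assumption: the commenter's real configuration may have contained more or other entries.
DELTA_SPARK_PROPERTIES = {
    "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension",
    "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog",
}


def render_spark_defaults(properties: dict) -> str:
    """Render properties as aligned 'key  value' lines, one per property."""
    width = max(len(key) for key in properties)
    lines = [f"{key.ljust(width)}  {value}" for key, value in properties.items()]
    return "\n".join(lines) + "\n"


conf_text = render_spark_defaults(DELTA_SPARK_PROPERTIES)
print(conf_text, end="")
```

The printed lines could be appended to an existing spark-defaults.conf; the same two properties can also be passed per session instead of being set globally.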
You can query data organized in many open table formats like Apache Iceberg and Delta Lake. (Here is a good article on what is a table format and the differences between different ones). - Source: dev.to / over 1 year ago
Bit more specific: https://delta.io/. Source: over 1 year ago
Spark Delta Tables (https://delta.io) - that can persist their data on S3, Azure Blobs, or local files. Source: almost 2 years ago
Currently the infrastructure we have is some custom made pipelines that load the data on S3, and I use Delta Tables here and there for its convenience: ACID, time travel, merges, CDC etc... Source: almost 2 years ago
I've been playing around a bit with Delta (Table/Lake), whatever you want to call it. It has time travel, so you can look back and see what the data looked like at a particular point in time. https://delta.io/. Source: almost 2 years ago
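To make the time-travel idea concrete: Delta Lake keeps an ordered transaction log of commits, and an older version of a table can be reconstructed by replaying the log only up to that commit. The toy model below sketches that idea in plain Python. It is not Delta's implementation or API, and `VersionedTable`, `commit`, and `read` are hypothetical names chosen for the example.

```python
from typing import List, Optional


class VersionedTable:
    """Toy append-only commit log; each commit records rows added and removed."""

    def __init__(self) -> None:
        self._log: List[dict] = []

    def commit(self, add=(), remove=()) -> int:
        """Append a commit and return its version number (0-based)."""
        self._log.append({"add": list(add), "remove": list(remove)})
        return len(self._log) - 1

    def read(self, version: Optional[int] = None) -> list:
        """Replay commits up to `version` (latest if omitted) and return the rows."""
        latest = len(self._log) - 1
        if version is None:
            version = latest
        elif not 0 <= version <= latest:
            raise ValueError(f"unknown version {version}")
        rows: list = []
        for entry in self._log[: version + 1]:
            # Apply removals first, then additions, in commit order.
            rows = [r for r in rows if r not in entry["remove"]]
            rows.extend(entry["add"])
        return rows


table = VersionedTable()
v0 = table.commit(add=["a", "b"])
v1 = table.commit(add=["c"], remove=["a"])

print(table.read(v0))  # state as of the first commit
print(table.read())    # latest state
```

Because commits are never rewritten, reading version `v0` after later commits still yields the rows as they were at that point.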
It is a specific table format: https://delta.io/. It’s an open-source project; just read their website, which will have way more info than these comments. Source: almost 2 years ago
You may want to look at Delta Lake https://delta.io/. Source: almost 2 years ago