
What is the separation of storage and compute in data platforms and why does it matter?

Apache Spark, Apache Parquet, Amazon S3
  1. Apache Spark is an engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing.
    Pricing:
    • Open Source
    However, once your data reaches a certain size, or you hit the limits of vertical scaling, it may be necessary to distribute your queries across a cluster, i.e. to scale horizontally. This is where distributed query engines like Trino and Spark come in. A distributed query engine uses a coordinator to plan the query and multiple worker nodes to execute its stages in parallel.

    #Databases #Big Data #Big Data Analytics 56 social mentions
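The coordinator/worker split described above can be sketched in a few lines of Python. This is only an illustration: threads stand in for worker nodes, and the partition data and function names (`worker_partial_sum`, `coordinator_query`) are invented here, not part of Spark or Trino. A real engine plans the query, ships tasks to remote workers, and merges their partial results in much the same scatter-gather shape.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitions of a dataset; in a real cluster these would
# live on separate worker nodes or in object storage.
PARTITIONS = [
    [("us", 3), ("eu", 1), ("us", 2)],
    [("eu", 4), ("ap", 5)],
    [("us", 1), ("ap", 2), ("eu", 3)],
]

def worker_partial_sum(partition):
    """Worker task: aggregate only its own partition (SUM ... GROUP BY region)."""
    totals = {}
    for region, value in partition:
        totals[region] = totals.get(region, 0) + value
    return totals

def coordinator_query(partitions):
    """Coordinator: fan the partial aggregation out, then merge the results."""
    merged = {}
    with ThreadPoolExecutor() as pool:  # threads stand in for worker nodes
        for partial in pool.map(worker_partial_sum, partitions):
            for region, value in partial.items():
                merged[region] = merged.get(region, 0) + value
    return merged

result = coordinator_query(PARTITIONS)  # {'us': 6, 'eu': 8, 'ap': 7}
```

Because each worker only ever sees its own partition, adding more workers (or more partitions) scales the aggregation horizontally; the coordinator's merge step is the only serial part.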

  2. Apache Parquet is a columnar storage format available to any project in the Hadoop ecosystem.
    Pricing:
    • Open Source
    Apache Parquet is a file format that stores data in a columnar layout, which enables faster analytical reads and smaller files on disk. Not only is the Parquet format open source, but it also has an entire ecosystem of tools to help you create, read and transform data.

    #Databases #Big Data #Relational Databases 19 social mentions
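A minimal, pure-Python sketch of why a columnar layout helps. The sample records are invented, and real Parquet adds row groups, per-column encodings and compression on top of this idea; the point here is just that grouping a column's values together lets a query touch only the columns it needs.

```python
import json

# Hypothetical records, as a row-oriented store would lay them out.
rows = [
    {"id": 1, "city": "Paris",  "temp": 21},
    {"id": 2, "city": "Oslo",   "temp": 9},
    {"id": 3, "city": "Lisbon", "temp": 26},
]

# Columnar layout: each column's values are stored contiguously,
# as Parquet does (minus the encodings and compression we skip here).
columns = {key: [row[key] for row in rows] for key in rows[0]}

# Reading one column touches only that column's data, not whole rows.
avg_temp = sum(columns["temp"]) / len(columns["temp"])

# Runs of similar values also compress well; as a crude size proxy,
# the columnar form avoids repeating the field names per record.
row_size = len(json.dumps(rows))
col_size = len(json.dumps(columns))
```

A column scan like `avg_temp` is the access pattern analytical queries are dominated by, which is why engines such as Spark read Parquet so efficiently.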

  3. Amazon S3 is an object storage service where users can store business data on a safe, cloud-based platform. Amazon S3 operates in 54 availability zones within 18 geographic regions and 1 local region.
    Object Storage: Services like AWS S3 can be used to store files on demand with effectively "infinite" storage. There is no need to provision or manage capacity, and the storage is completely decoupled from the instance or serverless function that is doing the processing.

    #Cloud Hosting #Object Storage #Cloud Storage 172 social mentions
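The decoupling above is the crux of storage/compute separation, and it can be sketched without any cloud dependency. The `ObjectStore` class below is a toy in-memory stand-in for S3 (its `put_object`/`get_object` names merely echo common object-store APIs; real S3 access would go through a client library such as boto3). The jobs and data are invented for illustration: each "compute" function is ephemeral and stateless, so workers can be replaced or scaled to zero while the data persists.

```python
# Toy object store: a flat namespace of (bucket, key) -> bytes/str.
class ObjectStore:
    def __init__(self):
        self._objects = {}

    def put_object(self, bucket, key, body):
        self._objects[(bucket, key)] = body

    def get_object(self, bucket, key):
        return self._objects[(bucket, key)]

# Ephemeral "compute": each function runs, touches the store, and exits.
# No state lives on the compute side.
def ingest(store):
    store.put_object("sales", "2024/01.csv", "region,amount\nus,3\neu,1\n")

def report(store):
    body = store.get_object("sales", "2024/01.csv")
    lines = body.strip().splitlines()[1:]  # skip the CSV header
    return sum(int(line.split(",")[1]) for line in lines)

store = ObjectStore()
ingest(store)          # one short-lived job writes
total = report(store)  # a different job, later, reads the same data
```

Because the report job never depends on the ingest job's machine still existing, the two can run on different clusters, at different scales, at different times, which is exactly what S3-backed data platforms exploit.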
