- Apache Spark is an engine for big data processing, with built-in modules for streaming, SQL, machine learning, and graph processing.
  Pricing: Open Source
  "I would guess that Apache Spark would be an okay choice, with the data stored locally in Avro or Parquet files. Just processing the data in Python would also work, IMO."
  #Databases #Big Data #Big Data Analytics (56 social mentions)
- Apache Parquet is a columnar storage format available to any project in the Hadoop ecosystem.
  Pricing: Open Source
  #Databases #Big Data #Relational Databases (19 social mentions)