Software Alternatives & Reviews

Perform computation over 500 million vectors

  1. Apache Spark is an engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing.
    Pricing:
    • Open Source
    I would guess that Apache Spark would be an okay choice, with the data stored locally in Avro or Parquet files. Just processing the data in Python would also work, IMO.

    #Databases #Big Data #Big Data Analytics 56 social mentions

  2. Apache Parquet is a columnar storage format available to any project in the Hadoop ecosystem.
    Pricing:
    • Open Source
    I would guess that Apache Spark would be an okay choice, with the data stored locally in Avro or Parquet files. Just processing the data in Python would also work, IMO.

    #Databases #Big Data #Relational Databases 19 social mentions
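For the "just process it in Python" route mentioned above, a single streaming pass with O(dim) memory is often enough for simple aggregates, even at the 500-million-vector scale. A minimal sketch, using synthetic data as a stand-in for an Avro/Parquet read (`stream_vectors` and `mean_vector` are hypothetical names, not from any library):

```python
import random
from typing import Iterable, Iterator, List

def stream_vectors(n: int, dim: int, seed: int = 0) -> Iterator[List[float]]:
    # Stand-in for iterating over vectors read from Avro/Parquet files;
    # yields synthetic uniform[0, 1) vectors for illustration.
    rng = random.Random(seed)
    for _ in range(n):
        yield [rng.random() for _ in range(dim)]

def mean_vector(vectors: Iterable[List[float]], dim: int) -> List[float]:
    # One pass over the stream, keeping only running component sums
    # and a count, so memory stays constant regardless of n.
    sums = [0.0] * dim
    count = 0
    for v in vectors:
        for i, x in enumerate(v):
            sums[i] += x
        count += 1
    return [s / count for s in sums]

# Small demo run; for 500M vectors the same loop applies, just slower.
print(mean_vector(stream_vectors(10_000, 3), 3))
```

Spark buys you parallelism and fault tolerance on top of this pattern; plain Python trades those away for simplicity when one pass on one machine is fast enough.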
