Software Alternatives & Reviews

15 data science tools to consider using in 2021

  1. Apache Spark is an engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing.
    Pricing:
    • Open Source
    Apache Spark is an open source data processing and analytics engine that can handle large amounts of data -- upward of several petabytes, according to proponents. Spark's ability to rapidly process data has fueled significant growth in the use of the platform since it was created in 2009, helping to make the Spark project one of the largest open source communities among big data technologies.

    #Databases #Big Data #Big Data Analytics 56 social mentions

  2. D3.js is a JavaScript library for manipulating documents based on data. D3 helps you bring data to life using HTML, SVG, and CSS.
    Pricing:
    • Open Source
    Another open source tool, D3.js is a JavaScript library for creating custom data visualizations in a web browser. Commonly known as D3, which stands for Data-Driven Documents, it uses web standards, such as HTML, Scalable Vector Graphics and CSS, instead of its own graphical vocabulary. D3's developers describe it as a dynamic and flexible tool that requires a minimum amount of effort to generate visual representations of data.

    #Javascript UI Libraries #Charting Libraries #Data Visualization 159 social mentions

  3. NOTE: IBM SPSS has been discontinued.
    IBM SPSS is a predictive analytics suite for enterprises.
    Created by SPSS Inc. in 1968, initially with the name Statistical Package for the Social Sciences, the statistical analysis software was acquired by IBM in 2009, along with the SPSS Modeler predictive modeling platform, which SPSS had previously bought. While the product family is officially called IBM SPSS, the software is still usually known simply as SPSS.

    #Business & Commerce #Development #Technical Computing

  4. Julia is a sophisticated programming language designed especially for numerical computing, with specializations in analysis and computational science. It is also suited to web use and general programming, and can be used as a specification language.
    Pricing:
    • Open Source
    Julia 1.0 became available in 2018, nine years after work began on the language; the latest version is 1.6, released in March 2021. The documentation for Julia notes that, because its compiler differs from the interpreters in data science languages like Python and R, new users "may find that Julia's performance is unintuitive at first." But, it claims, "once you understand how Julia works, it's easy to write code that's nearly as fast as C."

    #Programming Language #Technical Computing #OOP 114 social mentions

  5. Project Jupyter exists to develop open source software, open standards, and services for interactive computing across dozens of programming languages.
    Jupyter Notebook's roots are in the programming language Python -- it originally was part of the IPython interactive toolkit open source project before being split off in 2014. The loose combination of Julia, Python and R gave Jupyter its name; along with supporting those three languages, Jupyter has modular kernels for dozens of others.

    #Data Science And Machine Learning #Data Science Tools #Data Science Notebooks 204 social mentions

  6. Keras is a minimalist, modular neural networks library, written in Python and capable of running on top of either TensorFlow or Theano.
    Pricing:
    • Open Source
    Keras is a programming interface that enables data scientists to more easily access and use the TensorFlow machine learning platform. It's an open source deep learning API and framework written in Python that runs on top of TensorFlow and is now integrated into that platform. Keras previously supported multiple back ends but was tied exclusively to TensorFlow starting with its 2.4.0 release in June 2020.

    #Data Science And Machine Learning #Data Science Tools #OCR 31 social mentions

  7. MATLAB is a high-level language and interactive environment for numerical computation, visualization, and programming.
    Developed and sold by software vendor MathWorks since 1984, Matlab is a high-level programming language and analytics environment for numerical computing, mathematical modeling and data visualization. It's primarily used by conventional engineers and scientists to analyze data, design algorithms and develop embedded systems for wireless communications, industrial control, signal processing and other applications, often in concert with a companion Simulink tool that offers model-based design and simulation capabilities.

    #Technical Computing #Numerical Computation #Data Visualization

  8. Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety...
    Pricing:
    • Open Source
    Matplotlib is an open source Python plotting library that's used to read, import and visualize data in analytics applications. Data scientists and other users can create static, animated and interactive data visualizations with Matplotlib, using it in Python scripts, the Python and IPython shells, Jupyter Notebook, web application servers and various GUI toolkits.

    #Development #Data Visualization #Technical Computing 98 social mentions
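
As a rough illustration of the static plotting described above, the following minimal sketch draws one line chart and renders it to PNG in memory. The data points and labels are hypothetical, and the non-interactive Agg backend is chosen here only so the script runs without a display; none of this comes from the original article.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# Hypothetical data points, invented for this illustration.
x = [0, 1, 2, 3, 4]
y = [value ** 2 for value in x]

# Create a figure with a single set of axes and plot one labeled line.
fig, ax = plt.subplots()
(line,) = ax.plot(x, y, marker="o", label="y = x^2")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()

# Render the figure as PNG into an in-memory buffer instead of a file.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
```

In a Jupyter notebook or an interactive shell, the same `fig` would typically be displayed directly rather than written to a buffer.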

  9. Python is a clear and powerful object-oriented programming language, comparable to Perl, Ruby, Scheme, or Java.
    Pricing:
    • Open Source
    Python is the most widely used programming language for data science and machine learning and one of the most popular languages overall. The Python open source project's website describes it as "an interpreted, object-oriented, high-level programming language with dynamic semantics," with built-in data structures and dynamic typing and binding capabilities. The site also touts Python's simple syntax, saying that it is easy to learn and that its emphasis on readability reduces the cost of program maintenance.

    #Programming Language #OOP #Generic Programming Language 280 social mentions
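
To make "built-in data structures" and "dynamic typing" a little more concrete, here is a small sketch using only the standard library. The sample records are hypothetical and exist purely for illustration.

```python
from collections import Counter

# Hypothetical sample records, invented for this illustration.
records = [
    {"tool": "Spark", "language": "Scala"},
    {"tool": "Keras", "language": "Python"},
    {"tool": "PyTorch", "language": "Python"},
]

# Built-in data structures: a list of dicts, summarized with a Counter
# (a dict subclass from the standard library).
language_counts = Counter(record["language"] for record in records)

# Dynamic typing: the same name can be rebound to a value of another type.
value = 42
first_type = type(value).__name__
value = "forty-two"
second_type = type(value).__name__

print(language_counts.most_common(1))
print(first_type, second_type)
```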

  10. PyTorch is an open source deep learning platform that provides a seamless path from research prototyping to...
    Pricing:
    • Open Source
    First released publicly in 2017, PyTorch uses array-like tensors to encode model inputs, outputs and parameters. Its tensors are similar to the multidimensional arrays supported by NumPy, another Python library for scientific computing, but PyTorch adds built-in support for running models on GPUs. NumPy arrays can be converted into tensors for processing in PyTorch, and vice versa.

    #Data Science And Machine Learning #Data Science Tools #AI 106 social mentions
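
The conversion between NumPy arrays and PyTorch tensors mentioned above might look like the following sketch. The array values are hypothetical, and the device selection simply falls back to the CPU when no GPU is available.

```python
import numpy as np
import torch

# Hypothetical input data, invented for this illustration.
array = np.array([[1.0, 2.0], [3.0, 4.0]])

# NumPy array to tensor: from_numpy shares memory with the source array.
tensor = torch.from_numpy(array)

# An ordinary tensor operation, then tensor back to a NumPy array.
doubled = tensor * 2
back = doubled.numpy()

# Move the tensor to a GPU if one is present, otherwise keep it on the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
on_device = tensor.to(device)
```

Note that `.numpy()` only works on CPU tensors; a tensor that lives on a GPU would need `.cpu()` first.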

  11. R Programming Language
    The R programming language is an open source environment designed for statistical computing and graphics applications, as well as data manipulation, analysis and visualization. Many data scientists, academic researchers and statisticians use R to retrieve, cleanse, analyze and present data, making it one of the most popular languages for data science and advanced analytics.

  12. SAS
    SAS is an integrated software suite for statistical analysis, advanced analytics, BI and data management. Developed and sold by software vendor SAS Institute Inc., the platform enables users to integrate, cleanse, prepare and manipulate data, and then analyze it using different statistical and data science techniques. SAS can be used for various tasks, from basic BI and data visualization to risk management, operational analytics, data mining, predictive analytics and machine learning.

    #Data Dashboard #Business Intelligence #Data Visualization

  13. scikit-learn (formerly scikits.learn) is an open source machine learning library for the Python programming language.
    Pricing:
    • Open Source
    Scikit-learn is an open source machine learning library for Python that's built on the SciPy and NumPy scientific computing libraries, plus Matplotlib for plotting data. It supports both supervised and unsupervised machine learning and includes numerous algorithms and models, called estimators in scikit-learn parlance. Additionally, it provides functionality for model fitting, selection and evaluation, and data preprocessing and transformation.

    #Data Science And Machine Learning #Data Science Tools #Python Tools 27 social mentions
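
The estimator workflow described above (preprocessing, fitting and evaluation) can be sketched as follows. The choice of the bundled iris dataset, logistic regression, the 25% test split and the fixed random seed are all assumptions made for this illustration, not recommendations from the original article.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load a small dataset that ships with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Chain a preprocessing transformer and a supervised estimator.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
model.fit(X_train, y_train)

# Evaluate on the held-out data; score() returns mean accuracy for classifiers.
accuracy = model.score(X_test, y_test)
predictions = model.predict(X_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Because every estimator exposes the same `fit`, `predict` and `score` methods, swapping `LogisticRegression` for another classifier should require changing only that one line.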

  14. TensorFlow is an open-source machine learning framework designed and published by Google. It tracks data flow graphs over time. Nodes in the data flow graphs represent machine learning algorithms.
    Pricing:
    • Open Source

    #Data Science And Machine Learning #Data Science Tools #AI 7 social mentions

  15. WEKA is a set of powerful data mining tools that run on Java.
    Weka is free software licensed under the GNU General Public License. It was developed at the University of Waikato in New Zealand starting in 1992; an initial version was rewritten in Java to create the current workbench, which was first released in 1999. Weka stands for the Waikato Environment for Knowledge Analysis and is also the name of a flightless bird native to New Zealand that the technology's developers say has "an inquisitive nature."

    #Data Science And Machine Learning #Data Science Tools #Python Tools

  16. Analytic Process Automation (APA) delivers automation of analytics, machine learning and data science processes, enabling the agility needed to accelerate digital transformation.

    #Data Science And Machine Learning #Data Science #Machine Learning

  17. Amazon SageMaker provides every developer and data scientist with the ability to build, train, and deploy machine learning models quickly.

    #Data Science And Machine Learning #Data Science Tools #Machine Learning 35 social mentions

  18. Azure Machine Learning Studio is a GUI-based integrated development environment for constructing and operationalizing machine learning workflows on Azure.

    #Machine Learning #AI #Data Science And Machine Learning 2 social mentions

  19. One platform for accelerating data-driven innovation across data engineering, data science & business analytics

    #Office & Productivity #Development #Data Science And Machine Learning 1 social mentions

  20. Dataiku is the developer of DSS, the integrated development platform for data professionals to turn raw data into predictions.
    Some platforms are also available in free open source or community editions -- examples include Dataiku and H2O. Knime combines an open source analytics platform with a commercial Knime Server software package that supports team-based collaboration and workflow automation, deployment and management.

    #Data Science And Machine Learning #Data Science Tools #Python Tools

  21. Become an AI-Driven Enterprise with Automated Machine Learning
    Pricing:
    • Open Source

    #Business & Commerce #Data Science And Machine Learning #Technical Computing 1 social mentions

  22. Domino is a data science platform that enables collaborative and reusable analysis of data.

    #Business & Commerce #Development #Data Dashboard

  23. Google Cloud Machine Learning is a service that enables users to easily build machine learning models that work on any type of data, of any size.
    Pricing:
    • Open Source

    #Data Science And Machine Learning #Data Science Tools #Python Tools 21 social mentions

  24. Democratizing Generative AI. Own your models: generative and predictive. We bring both super powers together with h2oGPT.
    Pricing:
    • Open Source

    #Data Science And Machine Learning #AI #Machine Learning 1 social mentions

  25. Increase productivity by giving your team a single environment to work with the best of open source and IBM software, to build and deploy an AI solution.

    #Machine Learning #AI #Technical Computing

  26. KNIME, the open platform for your data.

    #Business & Commerce #Development #Data Science And Machine Learning 2 social mentions

  27. RapidMiner is a software platform for data science teams that unites data prep, machine learning, and predictive model deployment.

    #Data Science And Machine Learning #Data Science Tools #Python Tools 3 social mentions

  28. Data science is a team sport. Data scientists, citizen data scientists, business users, and developers need flexible and extensible tools that promote collaboration, automation, and...
    The development of SAS started in 1966 at North Carolina State University; use of the technology began to grow in the early 1970s, and SAS Institute was founded in 1976 as an independent company. The software was initially built for use by statisticians -- SAS was short for Statistical Analysis System. But, over time, it was expanded to include a broad set of functionality and became one of the most widely used analytics suites in both commercial enterprises and academia.

    #Business & Commerce #Technical Computing #Development
