In software development we often unit test our code (hopefully). And code written for Spark is no different. So here I want to run through an example of building a small library using PySpark and unit testing it. I'm using Visual Studio Code as my editor here, mostly because I think it's brilliant, but other editors are available. - Source: dev.to / 21 days ago
Azure Data Factory - A cloud-based hybrid data integration service for enterprise-scale workloads. Build data factories without the need to code.
Hadoop - Open-source software for reliable, scalable, distributed computing
Apache NiFi - An easy to use, powerful, and reliable system to process and distribute data.
Hive - Seamless project management and collaboration for your team.
Talend Big Data Platform - Talend Big Data Platform is a data integration and data quality platform built on Spark for cloud and on-premises.
Hortonworks - Enterprise distribution of Apache Hadoop (Hortonworks Data Platform).