Based on our records, Apache Airflow appears to be far more popular than Cloud Dataprep: we know of about 65 links to Apache Airflow but have tracked only 3 mentions of Cloud Dataprep. We track product recommendations and mentions on public social media platforms and blogs, which can help you see which product is more popular and what people think of it.
Check Google Cloud Dataprep – requires no coding, you can normalize & clean up the data as well. I've done this many times, saved me headaches from dirty data in Excel files. Source: almost 2 years ago
Not sure if I understand the request but a commercial tool I know of is https://cloud.google.com/dataprep - it sounds like that could be helpful but I am not sure. Source: over 2 years ago
If you need to adjust the underlying data, you can use Cloud Dataprep to manipulate it. Source: about 3 years ago
For the third, examples here might be analytics plugins in specialized databases like ClickHouse, data transformations in places like your ETL pipeline using Airflow or Fivetran, or special integrations in your authentication workflow with Auth0 hooks and rules. - Source: dev.to / 3 months ago
Apache Airflow is an open-source platform to programmatically author, schedule, and monitor workflows. The platform features a web-based user interface and a command-line interface for managing and triggering workflows. Source: 6 months ago
Airflow is the most widely used and well-known tool for orchestrating data workflows. It allows for efficient pipeline construction, scheduling, and monitoring. - Source: dev.to / 6 months ago
AIRFLOW: This is more of a library in my opinion, but Airflow has become an essential tool for scheduling in my work. All our ML training pipelines are ordered and scheduled with Airflow and it works seamlessly. The dashboard provided is also fantastic! Source: 7 months ago
I agree there are many options in this space. Two others to consider: https://airflow.apache.org/ and https://github.com/spotify/luigi. There are also many Kubernetes-based options out there. For the specific use case you specified, you might even consider a plain old Makefile and incrond if you expect these all to run on a single host and be triggered by a new file... - Source: Hacker News / 8 months ago
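Several of the mentions above describe Airflow as a way to author and order pipeline steps in code. As a rough illustration of that idea only, the sketch below uses Python's standard-library `graphlib` rather than Airflow itself. The task names, the `DEPENDENCIES` mapping, and the `run_pipeline` helper are all hypothetical and are not part of any Airflow API.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


# Hypothetical pipeline steps; each one just records that it ran.
def extract(log):
    log.append("extract")


def transform(log):
    log.append("transform")


def load(log):
    log.append("load")


# Map each task name to the set of tasks that must finish before it.
DEPENDENCIES = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

TASKS = {"extract": extract, "transform": transform, "load": load}


def run_pipeline(dependencies, tasks):
    """Run every task once, after all of its upstream tasks have run."""
    log = []
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name](log)
    return log


if __name__ == "__main__":
    print(run_pipeline(DEPENDENCIES, TASKS))
```

A real orchestrator adds scheduling, retries, and monitoring on top of this ordering step; the sketch covers only the dependency-ordering part.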
Amazon SageMaker - Amazon SageMaker provides every developer and data scientist with the ability to build, train, and deploy machine learning models quickly.
IFTTT - IFTTT puts the internet to work for you. Create simple connections between the products you use every day.
GeoSpock - GeoSpock is the platform for data lake management, providing a unified view of the data assets within an organization and making it easily accessible.
Microsoft Power Automate - Microsoft Power Automate is an automation platform that integrates DPA, RPA, and process mining. It lets you automate your organization at scale using low-code and AI.
Delta Lake - Application and Data, Data Stores, and Big Data Tools
Make.com - Tool for workflow automation (formerly Integromat)