Level 1 of MLOps is when you've put each lifecycle stage and the interfaces between them into an automated pipeline. The pipeline could be a Python or Bash script, or it could be a directed acyclic graph run by some orchestration framework like Airflow, Dagster, or one of the cloud-provider offerings. AI- or data-specific platforms like MLflow, ClearML and DVC also feature pipeline capabilities. - Source: dev.to / 2 days ago
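A minimal sketch of the "plain Python script" option the quote above mentions. All stage names and the toy data are invented for illustration; a real pipeline would persist artifacts between stages rather than pass them in memory:

```python
# "Level 1" in its simplest form: a script that runs each lifecycle
# stage in order, passing artifacts across the interfaces between them.

def ingest():
    # Pull raw data (stubbed with an inline list here).
    return [3, 1, 2]

def transform(raw):
    # Clean / feature-engineer; here just sort.
    return sorted(raw)

def train(features):
    # "Train" a trivial model: the mean of the features.
    return sum(features) / len(features)

def run_pipeline():
    # The script itself encodes the DAG: each stage's output is the
    # next stage's input — the same dependency an orchestrator like
    # Airflow would express as task >> task edges.
    raw = ingest()
    features = transform(raw)
    return train(features)

print(run_pipeline())  # -> 2.0
```

The same three stages map one-to-one onto tasks in an Airflow DAG once you outgrow the single script.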
For the third, examples here might be analytics plugins in specialized databases like ClickHouse, data transformations in places like your ETL pipeline using Airflow or Fivetran, or special integrations in your authentication workflow with Auth0 hooks and rules. - Source: dev.to / 3 months ago
Apache Airflow is an open-source platform to programmatically author, schedule, and monitor workflows. The platform features a web-based user interface and a command-line interface for managing and triggering workflows. Source: 6 months ago
Airflow is the most widely used and well-known tool for orchestrating data workflows. It allows for efficient pipeline construction, scheduling, and monitoring. - Source: dev.to / 6 months ago
Airflow: This is more of a library in my opinion, but Airflow has become an essential tool for scheduling in my work. All our ML training pipelines are ordered and scheduled with Airflow and it works seamlessly. The dashboard provided is also fantastic! Source: 8 months ago
I agree there are many options in this space. Two others to consider: - https://airflow.apache.org/ - https://github.com/spotify/luigi There are also many Kubernetes based options out there. For the specific use case you specified, you might even consider a plain old Makefile and incrond if you expect these all to run on a single host and be triggered by a new file... - Source: Hacker News / 8 months ago
Folks who have used Python-based orchestration tools such as Apache Airflow, Luigi and Mage will be familiar with the concepts and the API of PyJaws. Source: almost 1 year ago
There are a range of solutions available to help make this process easier. Some of these options include automation tools such as Apache Airflow and Azure Data Factory, specialized libraries that focus on fine-tuning deep learning models like FinetunerPlus, or machine learning platforms that provide end-to-end solutions like Amazon SageMaker and Google Cloud ML Engine. Source: about 1 year ago
Looks interesting as a light-weight alternative to https://www.prefect.io/ (which itself is a lighter-weight / more modern alternative to https://airflow.apache.org/ ). Source: about 1 year ago
A few years ago, I opened a GitHub issue with Microsoft telling them that I think the .NET ecosystem needs its own equivalent of Apache Airflow or Prefect. Fast forward 'til now, and I still don't think we have anything close to these frameworks. Source: about 1 year ago
If you have the spare capacity Apache Airflow is great for this. Source: over 1 year ago
It's a bit overkill, but I use Airflow with the local executor. Source: over 1 year ago
To learn more about it, I built a Data Pipeline that uses Apache Airflow to pull Elon Musk tweets using the Twitter API and store the result as a CSV in a MinIO (OSS alternative to AWS S3) Object Storage bucket. - Source: dev.to / over 1 year ago
Airflow, that's it https://airflow.apache.org/. Source: over 1 year ago
If you are fixed on developing a solution in house, you may have options that don't require many additional tools. Aurora Postgres already supports exporting data to S3. So use an orchestration tool like AWS ECS Scheduled Tasks, Airflow, Prefect, etc to run a script (probably Python). That script can ask for all the distinct tenant ids "SELECT distinct tenant_id FROM...". Then iterate through them and run a query... Source: over 1 year ago
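A runnable sketch of the per-tenant export pattern the quote above describes. It uses the stdlib `sqlite3` module purely so the example is self-contained; against Aurora Postgres you would use a driver such as psycopg2, and the inner step would hand each query to the S3 export rather than collect rows locally. Table and column names are invented for illustration:

```python
import sqlite3

def export_per_tenant(conn):
    """Run one export query per tenant.

    sqlite3 stands in for a real Postgres connection here; with Aurora
    you would export each tenant's result set to S3 instead of
    returning it.
    """
    cur = conn.cursor()
    # Step 1: ask for all the distinct tenant ids.
    tenant_ids = [row[0] for row in
                  cur.execute("SELECT DISTINCT tenant_id FROM events")]
    exports = {}
    # Step 2: iterate through them and run a per-tenant query.
    for tid in tenant_ids:
        rows = cur.execute(
            "SELECT * FROM events WHERE tenant_id = ?", (tid,)
        ).fetchall()
        exports[tid] = rows
    return exports

# Tiny in-memory fixture so the sketch runs end to end.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (tenant_id TEXT, payload TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [("a", "x"), ("a", "y"), ("b", "z")])
result = export_per_tenant(conn)
print(sorted(result))    # -> ['a', 'b']
```

An orchestrator (ECS Scheduled Tasks, Airflow, Prefect) would simply run this script on a schedule, or fan the per-tenant loop out into parallel tasks.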
Airflow is really popular, started at Airbnb. Pros: huge community, super mature. Cons: generic workflow orchestration, not the best for handling only data, hard to scale and maintain. Source: over 1 year ago
Seems like you want documentation. Look into plantuml or something like that? For reference tools like airflow provide this stuff too. Source: over 1 year ago
Airflow might also be a good option for you. Essentially DAGs of cronjobs. We like it a lot. Source: over 1 year ago
$ helm upgrade --install airflow apache-airflow/airflow --namespace airflow --create-namespace
Release "airflow" does not exist. Installing it now.
NAME: airflow
LAST DEPLOYED: Sun Nov 6 02:06:55 2022
NAMESPACE: airflow
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Thank you for installing Apache Airflow 2.4.1!
Your release is named airflow.
You can now access your dashboard(s) by executing the following... - Source: dev.to / over 1 year ago
https://airflow.apache.org/ might be worth looking into. Source: over 1 year ago
I gotta admit, my first thought was "Duct Size" is a weird name for a distributed work-flow tool[1]. [1] https://airflow.apache.org/. - Source: Hacker News / over 1 year ago
This is an informative page about Apache Airflow. You can review and discuss the product here. The primary details have not been verified within the last quarter, and they might be outdated. If you think we are missing something, please use the options on this page to comment or suggest changes. All reviews and comments are highly encouraged and appreciated, as they help everyone in the community make an informed choice. Please always be kind and objective when evaluating a product and sharing your opinion.