-
spaCy is a library for advanced natural language processing in Python and Cython.
Pricing:
- Open Source
We have tasks that require many different spaCy language models to be loaded at once, and we load them in many processes simultaneously.
#Natural Language Processing #NLP And Text Analytics #Spreadsheets 58 social mentions
-
Luigi is a Python module that helps you build complex pipelines of batch jobs.
At Wonderflow we do a lot of ML/NLP in Python, and recently we have been enjoying writing data pipelines with Spotify's Luigi.
#DevOps Tools #Workflow Automation #Workflows 9 social mentions
-
Dask natively scales Python. Dask provides advanced parallelism for analytics, enabling performance at scale for the tools you love.
Pricing:
- Open Source
To do that, we use Dask, simply creating on-demand local (or remote) clusters in the task's run() method.
#Workflows #Databases #Software Development 16 social mentions
-
Airflow is a platform to programmatically author, schedule, and monitor data pipelines.
Pricing:
- Open Source
Moreover, configuring and deploying Luigi's scheduler on a server or pod for production use is easy, which may not be the case for similar tools such as Apache Airflow.
#Workflows #Workflow Automation #Data Pipelines 65 social mentions