Software Alternatives & Reviews

Top 10 Popular Open-Source ETL Tools for 2021

Recommended and mentioned products

  1. Apache Camel is a versatile open-source integration framework based on known enterprise integration patterns.

    Can I continuously write to a CSV file with a python script... about 8 days ago:

    Since you're writing a Java app to consume this, I highly recommend Apache Camel to do the consuming of messages for it. You can trivially aim it at file systems, message queues, databases, web services and all manner of other sources to grab your data for you, and you can change your mind about what that source is, without having to rewrite most of your client code.
  2. Airbyte: Replicate data in minutes with prebuilt & custom connectors

    Data Pipeline: From ETL to EL plus T about 1 month ago:

    Yes, absolutely, Airbyte, and there are many similar solutions, but Airbyte is open source and relatively easy to use.
  3. Apache Kafka is an open-source message broker project developed by the Apache Software Foundation written in Scala.

    Horizontally scaling Kafka consumers with rendezvous hashing about 15 days ago:

    Apache Kafka is the foundational architecture most developers choose when building streaming applications. It’s incredibly scalable, fault-tolerant, and dependable. And its popularity has yielded vast knowledge with an active community to support developers when they build applications. Naturally, Kafka is now a table-stakes data platform component for nearly every major enterprise that deals with events data.
  4. Logstash is a tool for managing events and logs.

  5. Pentaho Data Integration (ETL), a.k.a. Kettle

  6. Talend Open Studio: Connect to any data source in batch or real-time, across any platform. Download Talend Open Studio today to start working with Hadoop and NoSQL.

  7. Singer: Simple, Composable, Open Source ETL

    CDC (Change Data Capture) with 3rd party APIs about 5 months ago:

    Or you could build your own such system and run it on Airflow, Prefect, Dagster, etc. Check out the Singer project for a suite of Python packages designed for such a task. Quality varies greatly, though.
  8. KETL

  9. Apache NiFi: An easy to use, powerful, and reliable system to process and distribute data.

    S3 to S3 transform about 19 days ago:

    For a simple sequential pipeline, my go-to would be Apache Camel. As soon as you want complexity, it's either Apache NiFi or a microservice architecture.
  10. CloverDX is a data integration platform for designing, automating and operating data jobs at scale.
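
The Kafka mention in item 3 refers to rendezvous hashing for spreading work across consumers. As a rough illustration of that idea only (not the code from the linked post, and not using any Kafka client API), here is a minimal Python sketch. The function names `score`, `owner`, and `assign`, and the consumer and partition labels, are all hypothetical. Each partition goes to whichever consumer gets the highest hash weight for it, so removing a consumer should only move the partitions that consumer owned.

```python
import hashlib


def score(consumer: str, partition: str) -> int:
    """Deterministic weight for a (consumer, partition) pair."""
    digest = hashlib.sha256(f"{consumer}|{partition}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def owner(partition: str, consumers: list[str]) -> str:
    """Return the consumer with the highest weight for this partition."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    return max(consumers, key=lambda c: score(c, partition))


def assign(partitions: list[str], consumers: list[str]) -> dict[str, str]:
    """Map every partition to its owning consumer."""
    return {p: owner(p, consumers) for p in partitions}


if __name__ == "__main__":
    # Hypothetical partition and consumer names, for illustration only.
    partitions = [f"events-{i}" for i in range(12)]
    before = assign(partitions, ["consumer-a", "consumer-b", "consumer-c"])
    after = assign(partitions, ["consumer-a", "consumer-b"])
    # Partitions whose owner changed after consumer-c was removed.
    moved = [p for p in partitions if before[p] != after[p]]
    print(moved)
```

Because the weight depends only on the consumer and partition names, every instance can compute the same assignment independently, without a shared coordinator, as long as all instances agree on the current consumer list.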