Analysing Github Stars - Extracting and analyzing data from Github using Apache NiFi®, Apache Kafka® and Apache Druid®

Apache ZooKeeper, Apache NiFi, Apache Kafka, Apache Druid
  1. Apache ZooKeeper is an effort to develop and maintain an open-source server which enables highly reliable distributed coordination.
    Pricing:
    • Open Source
    You can install Kafka from https://kafka.apache.org/quickstart. Because Druid and Kafka both use Apache ZooKeeper, I opted to use the ZooKeeper deployment that comes with Druid, so I didn’t start the one that ships with Kafka. Once Kafka was running, I created two topics for me to post the data into and for Druid to ingest from.

    #Web And Application Servers #Web Servers #Application Server 32 social mentions

  2. Apache NiFi is an easy to use, powerful, and reliable system to process and distribute data.
    Pricing:
    • Open Source
    Spencer Kimball (now CEO at CockroachDB) wrote an interesting article on this topic in 2021, in which they created spencerkimball/stargazers based on a Python script. So I started thinking: could I create a data pipeline using NiFi and Kafka (two OSS tools often used with Druid) to get the API data into Druid, and then use SQL to do the analytics? The answer was yes, and I have documented the outcome below. Here’s my analytical pipeline for GitHub stars data using NiFi, Kafka and Druid.

    #Analytics #Web Analytics #Mobile Analytics 18 social mentions

  3. Apache Kafka is an open-source message broker project developed by the Apache Software Foundation written in Scala.
    Pricing:
    • Open Source

    #Stream Processing #Data Integration #ETL 144 social mentions

  4. Apache Druid is a fast column-oriented distributed data store.
    Pricing:
    • Open Source

    #Databases #Data Analysis #Relational Databases 10 social mentions
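
The topic-creation step mentioned in the Kafka excerpt above could look roughly like the sketch below. It only builds the `kafka-topics.sh` command lines that the Kafka quickstart uses, and prints them unless told to run them. The topic names, the `localhost:9092` broker address, the `bin/kafka-topics.sh` path, and the helper names `create_topic_command` and `create_topics` are all assumptions for illustration; the original excerpt does not show the actual topic names or settings.

```python
import shlex
import subprocess

# Assumption: the broker listens on the Kafka quickstart default address.
BOOTSTRAP_SERVER = "localhost:9092"

# Hypothetical topic names; the original post's names are not shown here.
TOPICS = ["example-topic-1", "example-topic-2"]


def create_topic_command(topic, bootstrap_server=BOOTSTRAP_SERVER,
                         partitions=1, replication_factor=1,
                         script="bin/kafka-topics.sh"):
    """Build the argument list for creating one topic with kafka-topics.sh."""
    return [
        script, "--create",
        "--topic", topic,
        "--bootstrap-server", bootstrap_server,
        "--partitions", str(partitions),
        "--replication-factor", str(replication_factor),
    ]


def create_topics(topics, run=False):
    """Print the commands, or execute them when run=True.

    Executing requires a running broker and the Kafka scripts on disk.
    """
    commands = [create_topic_command(topic) for topic in topics]
    for command in commands:
        if run:
            subprocess.run(command, check=True)
        else:
            print(shlex.join(command))
    return commands


if __name__ == "__main__":
    create_topics(TOPICS)
```

With `run=False` (the default) this is a dry run, which makes it easy to check the commands before pointing them at a real broker. A single partition and a replication factor of 1 are chosen here only because they suit a single-node local setup.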
