Apache Flink

Flink is a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations.

Apache Flink Reviews and details

Videos

GOTO 2019 • Introduction to Stateful Stream Processing with Apache Flink • Robert Metzger

Apache Flink Tutorial | Flink vs Spark | Real Time Analytics Using Flink | Apache Flink Training

How to build a modern stream processor: The science behind Apache Flink - Stefan Richter

Social recommendations and mentions

We have tracked the following product recommendations or mentions on various public social media platforms and blogs. They can help you see what people think about Apache Flink and what they use it for.
  • Show HN: Restate, low-latency durable workflows for JavaScript/Java, in Rust
    Restate is built as a sharded replicated state machine similar to how TiKV (https://tikv.org/), Kudu (https://kudu.apache.org/kudu.pdf) or CockroachDB (https://github.com/cockroachdb/cockroach) since it makes it possible to tune the system more easily for different deployment scenarios (on-prem, cloud, cost-effective blob storage). Moreover, it allows for some other cool things like seamlessly moving from one log... - Source: Hacker News / about 1 month ago
  • Array Expansion in Flink SQL
    I’ve recently started my journey with Apache Flink. As I learn certain concepts, I’d like to share them. One such "learning" is the expansion of array type columns in Flink SQL. Having used ksqlDB in a previous life, I was looking for functionality similar to the EXPLODE function to "flatten" a collection type column into a row per element of the collection. Because Flink SQL is ANSI compliant, it’s no surprise... - Source: dev.to / about 2 months ago
  • Show HN: An SQS Alternative on Postgres
    You should let the Apache Flink team know, they mention exactly-once processing on their home page (under "correctness guarantees") and in their list of features. [0] https://flink.apache.org/ [1] https://flink.apache.org/what-is-flink/flink-applications/#building-blocks-for-streaming-applications. - Source: Hacker News / 2 months ago
  • Top 10 Common Data Engineers and Scientists Pain Points in 2024
    Data scientists often prefer Python for its simplicity and powerful libraries like Pandas or SciPy. However, many real-time data processing tools are Java-based. Take the example of Kafka, Flink, or Spark streaming. While these tools have their Python API/wrapper libraries, they introduce increased latency, and data scientists need to manage dependencies for both Python and JVM environments. For example,... - Source: dev.to / 3 months ago
  • Choosing Between a Streaming Database and a Stream Processing Framework in Python
    Other stream processing engines (such as Flink and Spark Streaming) provide SQL interfaces too, but the key difference is a streaming database has its storage. Stream processing engines require a dedicated database to store input and output data. On the other hand, streaming databases utilize cloud-native storage to maintain materialized views and states, allowing data replication and independent storage scaling. - Source: dev.to / 5 months ago
  • Go concurrency simplified. Part 4: Post office as a data pipeline
    Also, this knowledge applies to learning more about data engineering, as this field of software engineering relies heavily on the event-driven approach via tools like Spark, Flink, Kafka, etc. - Source: dev.to / 7 months ago
  • Five Apache projects you probably didn't know about
    Apache SeaTunnel is a data integration platform that offers the three pillars of data pipelines: sources, transforms, and sinks. It offers an abstract API over three possible engines: the Zeta engine from SeaTunnel or a wrapper around Apache Spark or Apache Flink. Be careful, as each engine comes with its own set of features. - Source: dev.to / 7 months ago
  • Getting Started with Flink SQL, Apache Iceberg and DynamoDB Catalog
    Due to the technology transformation we want to do recently, we started to investigate Apache Iceberg. In addition, the data processing engine we use in house is Apache Flink, so it's only fair to look for an experimental environment that integrates Flink and Iceberg. - Source: dev.to / 7 months ago
  • Snowflake - what are the streaming capabilities it provides?
    When low latency matters you should always consider an ETL approach rather than ELT, e.g. Collect data in Kafka and process using Kafka Streams/Flink in Java or Quix Streams/Bytewax in Python, then sink it to Snowflake where you can handle non-critical workloads (as is the case for 99% of BI/analytics). This way you can choose the right path for your data depending on how quickly it needs to be served. Source: about 1 year ago
  • JR, quality Random Data from the Command line, part I
    Sometimes we may need to generate random data of type 2 in different streams, so the "coherency" must also spread across different entities; think, for example, of referential integrity in databases. If I am generating users, products and orders to three different Kafka topics and I want to create a streaming application with Apache Flink, I definitely need data to be coherent across topics. - Source: dev.to / about 1 year ago
  • Brand Lift Studies on Reddit
    The Treatment and Control audiences need to be stored for future low-latency, high-reliability retrieval. Retrieval happens when we are delivering the survey, and informs the system which users to send surveys to. How is this achieved at Reddit’s scale? Users interact with ads, which generate events that are sent to our downstream systems for processing. At the output, these interactions are stored in DynamoDB as... Source: over 1 year ago
  • Query Real Time Data in Kafka Using SQL
    Most streaming database technologies use SQL for these reasons: RisingWave, Materialize, ksqlDB, Apache Flink, and others all offer SQL interfaces. This post explains how to choose the right streaming database. - Source: dev.to / over 1 year ago
  • 5 Best Practices For Data Integration To Boost ROI And Efficiency
    There are different ways to implement parallel dataflows, such as using parallel data processing frameworks like Apache Hadoop, Apache Spark, and Apache Flink, or using cloud-based services like Amazon EMR and Google Cloud Dataflow. It is also possible to use parallel dataflow frameworks to handle big data and distributed computing, like Apache Nifi and Apache Kafka. Source: over 1 year ago
  • Forward Compatible Enum Values in API with Java Jackson
    We’re not discussing the technical details behind the deduplication process. It could be Apache Flink, Apache Spark, or Kafka Streams. Anyway, it’s out of the scope of this article. - Source: dev.to / over 1 year ago
  • Which MQTT (or similar protocol) broker for a few 10k IoT devices with quite a lot of traffic?
    One can also consider https://flink.apache.org/ instead of Kafka for connecting a large number of devices. Source: over 1 year ago
  • Apache Pulsar vs Apache Kafka - How to choose a data streaming platform
    Both Kafka and Pulsar provide some kind of stream processing capability, but Kafka is much further along in that regard. Pulsar stream processing relies on the Pulsar Functions interface which is only suited for simple callbacks. On the other hand, Kafka Streams and ksqlDB are more complete solutions that could be considered replacements for Apache Spark or Apache Flink, state-of-the-art stream-processing... - Source: dev.to / over 1 year ago
  • Real Time Data Infra Stack
    The Apache Flink, which is often mentioned, is one of these options, and there are many others. - Source: dev.to / over 1 year ago
  • In One Minute : Hadoop
    Flink, a fast and reliable large-scale data processing engine. - Source: dev.to / over 1 year ago
  • A peek into Location Data Science at Ola
    This requires the use of distributed computation tools such as Spark, Hadoop, Flink and Kafka. But for occasional experimentation, Pandas, Geopandas and Dask are some of the commonly used tools. - Source: dev.to / almost 2 years ago
  • Evolutionary Data Infrastructure
    Therefore, I still recommend using a streaming framework such as Apache Flink or Apache Kafka Streams. - Source: dev.to / almost 2 years ago
  • Headless BI with streaming data
    In the last few years, streaming SQL technologies such as ksqlDB, Materialize, and Apache Flink have significantly progressed. These technologies enable us to process streaming data and run analysis with SQL—without needing to learn a new language or build specific language-unique integrations. - Source: dev.to / almost 2 years ago
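
One of the mentions above describes "flattening" a collection type column into a row per element. As a rough illustration of that idea only, here is a minimal Python sketch. It is not Flink code and does not show Flink SQL syntax; the function name `explode`, the sample `orders` rows, and the choice to drop rows with an empty array are all assumptions made for this example.

```python
from typing import Any, Dict, Iterable, Iterator, List

Row = Dict[str, Any]


def explode(rows: Iterable[Row], array_col: str, element_col: str) -> Iterator[Row]:
    """Yield one output row per element of each row's array column.

    Hypothetical helper for illustration. Rows whose array is empty or
    missing produce no output (an assumed, inner-join-like behavior).
    """
    for row in rows:
        # Copy every column except the array column being expanded.
        base = {key: value for key, value in row.items() if key != array_col}
        for element in row.get(array_col) or []:
            yield {**base, element_col: element}


# Made-up sample data: each order carries an array of item names.
orders: List[Row] = [
    {"order_id": 1, "items": ["apple", "pear"]},
    {"order_id": 2, "items": ["plum"]},
    {"order_id": 3, "items": []},
]

flattened = list(explode(orders, "items", "item"))

for row in flattened:
    print(row)
```

Whether rows with an empty array should be dropped or kept with a null element is a design choice; this sketch drops them, so order 3 does not appear in the result.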

Apache Flink discussion

This is an informative page about Apache Flink. You can review and discuss the product here. The primary details have not been verified within the last quarter, and they might be outdated. If you think we are missing something, please use the means on this page to comment or suggest changes. All reviews and comments are highly encouraged and appreciated as they help everyone in the community to make an informed choice. Please always be kind and objective when evaluating a product and sharing your opinion.