July 30, 2020 -
Alexander Fedulov
(@alex_fedulov)
Introduction
In the previous articles of the series, we described how you can achieve flexible stream partitioning based on dynamically updated configurations (a set of fraud-detection rules) and how you can use Flink's Broadcast mechanism to distribute processing configuration at runtime among the relevant operators. Picking up directly where we left the discussion of the end-to-end solution last time, this article describes how you can use the "Swiss knife" of Flink, the Process Function, to create an implementation that is tailor-made to match your streaming business logic requirements.
...
Continue reading »
July 29, 2020 -
Marta Paes
(@morsapaes)
As July draws to an end, we look back at a monthful of activity in the Flink community, including two releases (!) and some work around improving the first-time contribution experience in the project.
Also, events are starting to pick up again, so we’ve put together a list of some great ones you can (virtually) attend in August!
The Past Month in Flink
Flink Releases
Flink 1.11
A couple of weeks ago, Flink 1.
...
Continue reading »
July 28, 2020 -
Jark Wu
(@JarkWu)
Apache Flink 1.11 has released many exciting new features, including many developments in Flink SQL, which is evolving at a fast pace. This article takes a closer look at how to quickly build streaming applications with Flink SQL from a practical point of view.
In the following sections, we describe how to integrate Kafka, MySQL, Elasticsearch, and Kibana with Flink SQL to analyze e-commerce user behavior in real time. All exercises in this blog post are performed in the Flink SQL CLI, and the entire process uses standard SQL syntax, without a single line of Java/Scala code or IDE installation.
...
Continue reading »
July 23, 2020 -
Dawid Wysakowicz
(@dwysakowicz)
With an ever-growing number of people working with data, it’s common practice for companies to build self-service platforms with the goal of democratizing data access across different teams and, especially, enabling users from any background to be independent in their data needs. In such environments, metadata management becomes a crucial aspect. Without it, users often work blindly, spending too much time searching for datasets and their location, figuring out data formats, and performing similar cumbersome tasks.
...
Continue reading »
July 21, 2020 -
Dian Fu
(@DianFu11)
The Apache Flink community released the first bugfix version of the Apache Flink 1.11 series.
This release includes 44 fixes and minor improvements for Flink 1.11.0. The list below details all fixes and improvements.
We highly recommend that all users upgrade to Flink 1.11.1.
Updated Maven dependencies:
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-java</artifactId>
  <version>1.11.1</version>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-java_2.11</artifactId>
  <version>1.11.1</version>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-clients_2.11</artifactId>
  <version>1.11.1</version>
</dependency>
You can find the binaries on the updated Downloads page.
...
Continue reading »
July 14, 2020 -
Kostas Kloudas
(@kkloudas)
With the rise of stream processing and real-time analytics as a critical tool for modern businesses, an increasing number of organizations build platforms with Apache Flink at their core and offer it internally as a service. Many talks with related topics from companies like Uber, Netflix and Alibaba in the latest editions of Flink Forward further illustrate this trend.
These platforms aim at simplifying application submission internally by lifting all the operational burden from the end user.
...
Continue reading »
July 6, 2020 -
Marta Paes
(@morsapaes)
The Apache Flink community is proud to announce the release of Flink 1.11.0! More than 200 contributors worked on over 1.3k issues to bring significant improvements to usability as well as new features to Flink users across the whole API stack. Some highlights that we’re particularly excited about are:
The core engine is introducing unaligned checkpoints, a major change to Flink’s fault tolerance mechanism that improves checkpointing performance under heavy backpressure.
...
Continue reading »
June 23, 2020 -
Jeff Zhang
(@zjffdu)
In a previous post, we introduced the basics of Flink on Zeppelin and how to do Streaming ETL. In this second part of the “Flink on Zeppelin” series of posts, I will share how to perform streaming data visualization via Flink on Zeppelin and how to use Apache Flink UDFs in Zeppelin.
Streaming Data Visualization
With Zeppelin, you can build a real-time streaming dashboard without writing a single line of JavaScript/HTML/CSS code.
...
Continue reading »
June 15, 2020 -
Jeff Zhang
(@zjffdu)
The latest release of Apache Zeppelin comes with a redesigned interpreter for Apache Flink (only Flink 1.10+ is supported moving forward) that allows developers to use Flink directly in Zeppelin notebooks for interactive data analysis. I wrote two posts about how to use Flink in Zeppelin. This is part 1, where I explain how the Flink interpreter in Zeppelin works and provide a tutorial for running Streaming ETL with Flink on Zeppelin.
...
Continue reading »
June 10, 2020 -
Marta Paes
(@morsapaes)
And suddenly it’s June. The previous month has been calm on the surface, but quite hectic underneath — the final testing phase for Flink 1.11 is moving at full speed, Stateful Functions 2.1 is out in the wild and Flink has made it into Google Season of Docs 2020.
To top it off, a piece of good news: Flink Forward is back on October 19-22 as a free virtual event!
...
Continue reading »