June 5, 2019 -
Nico Kruber
Flink’s network stack is one of the core components that make up the flink-runtime module and sit at the heart of every Flink job. It connects individual work units (subtasks) from all TaskManagers. This is where your streamed-in data flows through and it is therefore crucial to the performance of your Flink job for both the throughput as well as latency you observe. In contrast to the coordination channels between TaskManagers and JobManagers which are using RPCs via Akka, the network stack between TaskManagers relies on a much lower-level API using Netty.
...
Continue reading »
May 17, 2019 -
Fabian Hueske
(@fhueske)
Andrey Zagrebin
A common requirement for many stateful streaming applications is to automatically cleanup application state for effective management of your state size, or to control how long the application state can be accessed (e.g. due to legal regulations like the GDPR). The state time-to-live (TTL) feature was initiated in Flink 1.6.0 and enabled application state cleanup and efficient state size management in Apache Flink.
In this post, we motivate the State TTL feature and discuss its use cases.
...
Continue reading »
May 14, 2019 -
Marta Paes
(@morsapaes)
Figuring out how to manage and model temporal data for effective point-in-time analysis was a longstanding battle, dating as far back as the early 80’s, that culminated with the introduction of temporal tables in the SQL standard in 2011. Up to that point, users were doomed to implement this as part of the application logic, often hurting the length of the development lifecycle as well as the maintainability of the code.
...
Continue reading »
May 3, 2019 -
Sijie Guo
(@sijieg)
The open source data technology frameworks Apache Flink and Apache Pulsar can integrate in different ways to provide elastic data processing at large scale. I recently gave a talk at Flink Forward San Francisco 2019 and presented some of the integrations between the two frameworks for batch and streaming applications. In this post, I will give a short introduction to Apache Pulsar and its differentiating elements from other messaging systems and describe the ways that Pulsar and Flink can work together to provide a seamless developer experience for elastic data processing at scale.
...
Continue reading »
April 17, 2019 -
Konstantin Knauf
(@snntrable)
The Apache Flink community is happy to announce its application to the first edition of Season of Docs by Google. The program is bringing together Open Source projects and technical writers to raise awareness for and improve documentation of Open Source projects. While the community is continuously looking for new contributors to collaborate on our documentation, we would like to take this chance to work with one or two technical writers to extend and restructure parts of our documentation (details below).
...
Continue reading »
April 9, 2019 -
Aljoscha Krettek
(@aljoscha)
The Apache Flink community is pleased to announce Apache Flink 1.8.0. The latest release includes more than 420 resolved issues and some exciting additions to Flink that we describe in the following sections of this post. Please check the complete changelog for more details.
Flink 1.8.0 is API-compatible with previous 1.x.y releases for APIs annotated with the @Public annotation. The release is available now and we encourage everyone to download the release and check out the updated documentation.
...
Continue reading »
March 11, 2019 -
Maximilian Bode, TNG Technology Consulting
(@mxpbode)
This blog post describes how developers can leverage Apache Flink’s built-in metrics system together with Prometheus to observe and monitor streaming applications in an effective way. This is a follow-up post from my Flink Forward Berlin 2018 talk (slides, video). We will cover some basic Prometheus concepts and why it is a great fit for monitoring Apache Flink stream processing jobs. There is also an example to showcase how you can utilize Prometheus with Flink to gain insights into your applications and be alerted on potential degradations of your Flink jobs.
...
Continue reading »
March 6, 2019 -
Fabian Hueske
(@fhueske)
The third annual Flink Forward San Francisco is just a few weeks away! As always, Flink Forward will be the right place to meet and mingle with experienced Flink users, contributors, and committers. Attendees will hear and chat about the latest developments around Flink and learn from technical deep-dive sessions and exciting use cases that were put into production with Flink. The event will take place on April 1-2, 2019 at Hotel Nikko in San Francisco.
...
Continue reading »
February 25, 2019 -
The Apache Flink community released the fourth bugfix version of the Apache Flink 1.6 series.
This release includes more than 25 fixes and minor improvements for Flink 1.6.3. The list below includes a detailed list of all fixes.
We highly recommend all users to upgrade to Flink 1.6.4.
Updated Maven dependencies:
<dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-java</artifactId> <version>1.6.4</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-streaming-java_2.11</artifactId> <version>1.6.4</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-clients_2.11</artifactId> <version>1.6.4</version> </dependency> You can find the binaries on the updated Downloads page.
...
Continue reading »
February 21, 2019 -
Konstantin Knauf
(@snntrable)
This blog post provides an introduction to Apache Flink’s built-in monitoring and metrics system, that allows developers to effectively monitor their Flink jobs. Oftentimes, the task of picking the relevant metrics to monitor a Flink application can be overwhelming for a DevOps team that is just starting with stream processing and Apache Flink. Having worked with many organizations that deploy Flink at scale, I would like to share my experience and some best practice with the community.
...
Continue reading »