Stateful Functions Internals: Behind the scenes of Stateful Serverless

October 13, 2020 - Tzu-Li (Gordon) Tai (@tzulitai)

Stateful Functions (StateFun) simplifies the building of distributed stateful applications by combining the best of two worlds: the strong messaging and state consistency guarantees of stateful stream processing, and the elasticity and serverless experience of today’s cloud-native architectures and popular event-driven FaaS platforms. Typical StateFun applications consist of functions deployed behind simple services using these modern platforms, with a separate StateFun cluster playing the role of an “event-driven database” that provides consistency and fault-tolerance for the functions’ state and messaging. ...

Stateful Functions 2.2.0 Release Announcement

September 28, 2020 - Tzu-Li (Gordon) Tai (@tzulitai) Igal Shilman (@IgalShilman)

The Apache Flink community is happy to announce the release of Stateful Functions (StateFun) 2.2.0! This release introduces major features that extend the SDKs, such as support for asynchronous functions in the Python SDK, new persisted state constructs, and a new SDK that allows embedding StateFun functions within a Flink DataStream job. Moreover, we’ve also included important changes that improve out-of-the-box stability for common workloads, as well as increased observability for operational purposes. ...

Apache Flink 1.11.2 Released

September 17, 2020 - Zhu Zhu (@zhuzhv)

The Apache Flink community released the second bugfix version of the Apache Flink 1.11 series. This release includes 96 fixes and minor improvements for Flink 1.11.1. The list below details all fixes and improvements. We highly recommend that all users upgrade to Flink 1.11.2.

Updated Maven dependencies:

    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-java</artifactId>
      <version>1.11.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-streaming-java_2.11</artifactId>
      <version>1.11.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-clients_2.11</artifactId>
      <version>1.11.2</version>
    </dependency>

You can find the binaries on the updated Downloads page. ...

Flink Community Update - August'20

September 4, 2020 - Marta Paes (@morsapaes)

Ah, so much for a quiet August month. This time around, we bring you some new Flink Improvement Proposals (FLIPs), a preview of the upcoming Flink Stateful Functions 2.2 release and a look into how far Flink has come in comparison to 2019.

The Past Month in Flink
Flink Releases
Getting Ready for Flink Stateful Functions 2.2

The details of the next release of Stateful Functions are under discussion in this @dev mailing list thread, and the feature freeze is set for September 10th, so you can expect Stateful Functions 2. ...

Memory Management improvements for Flink’s JobManager in Apache Flink 1.11

September 1, 2020 - Andrey Zagrebin

Apache Flink 1.11 comes with significant changes to the memory model of Flink’s JobManager and configuration options for your Flink clusters. These recently introduced changes make Flink adaptable to all kinds of deployment environments (e.g. Kubernetes, Yarn, Mesos), providing better control over its memory consumption. The previous blog post focused on the memory model of the TaskManagers and how it was improved in Flink 1.10. This post addresses the same topic for the JobManager. ...

Apache Flink 1.10.2 Released

August 25, 2020 - Zhu Zhu (@zhuzhv)

The Apache Flink community released the second bugfix version of the Apache Flink 1.10 series. This release includes 73 fixes and minor improvements for Flink 1.10.1. The list below details all fixes and improvements. We highly recommend that all users upgrade to Flink 1.10.2.

Note: After FLINK-18242, the deprecated `OptionsFactory` and `ConfigurableOptionsFactory` classes are removed (not applicable for release-1.10); please use `RocksDBOptionsFactory` and `ConfigurableRocksDBOptionsFactory` instead. Please also recompile your application code if any class extends `DefaultConfigurableOptionsFactory`.

Note: After FLINK-17800, `setTotalOrderSeek` is set to true by default for RocksDB's `ReadOptions`, to prevent users from misusing `optimizeForPointLookup`. ...

The State of Flink on Docker

August 20, 2020 - Robert Metzger (@rmetzger_)

With over 50 million downloads from Docker Hub, the Flink docker images are a very popular deployment option. The Flink community recently put some effort into improving the Docker experience for our users, with the goal of reducing confusion and improving usability. Let’s quickly break down the recent improvements: Reduce confusion: Flink used to have 2 Dockerfiles and a 3rd file maintained outside of the official repository, all with different features and varying stability. ...

Monitoring and Controlling Networks of IoT Devices with Flink Stateful Functions

August 18, 2020 - Igal Shilman (@IgalShilman)

In this blog post, we’ll take a look at a class of use cases that is a natural fit for Flink Stateful Functions: monitoring and controlling networks of connected devices, often called the “Internet of Things” (IoT). IoT networks are composed of many individual but interconnected components, which makes it nontrivial to get high-level insight into the status, problems, or optimization opportunities in these networks. Each individual device “sees” only its own state, which means that the status of groups of devices, or even the network as a whole, is often a complex aggregation of the individual devices’ state. ...

Accelerating your workload with GPU and other external resources

August 6, 2020 - Yangze Guo

Apache Flink 1.11 introduces a new External Resource Framework, which allows you to request external resources from the underlying resource management systems (e.g., Kubernetes) and accelerate your workload with those resources. As Flink provides a first-party GPU plugin at the moment, we will take GPU as an example and show how it affects Flink applications in the AI field. Other external resources (e.g. RDMA and SSD) can also be supported in a pluggable manner. ...

PyFlink: The integration of Pandas into PyFlink

August 4, 2020 - Jincheng Sun (@sunjincheng121) Markos Sfikas (@MarkSfik)

Python has evolved into one of the most important programming languages for many fields of data processing. Python has become so popular that it is now pretty much the default data processing language for data scientists. On top of that, there is a plethora of Python-based data processing tools such as NumPy, Pandas, and Scikit-learn that have gained additional popularity due to their flexibility or powerful functionalities. Pic source: VanderPlas 2017, slide 52. ...
