11 Nov 2020 Tzu-Li (Gordon) Tai (@tzulitai)
The Apache Flink community released the first bugfix release of the Stateful Functions (StateFun) 2.2 series, version 2.2.1.
This release fixes a critical bug that causes restoring the Stateful Functions cluster from snapshots (checkpoints or savepoints) to fail under certain conditions. Starting from this release, StateFun creates snapshots in a more robust format that allows them to be restored safely going forward.
We strongly recommend that all users upgrade to 2.2.1. Please see the following sections for instructions and things to keep in mind for this upgrade.
For new users just starting out with Stateful Functions
We strongly recommend skipping all previous versions and starting with StateFun version 2.2.1. This guarantees that failure recovery from checkpoints, and application upgrades using savepoints, will work as expected for you.
For existing users on versions <= 2.2.0
Users who are currently on older versions of StateFun may or may not be able to upgrade directly to 2.2.1 using savepoints taken with those versions. The Flink community is working hard on a follow-up hotfix release, 2.2.2, that will guarantee a smooth upgrade. In the meantime, you may still try upgrading to 2.2.1 first, but you may encounter FLINK-19741 or FLINK-19748. If you do, do not worry about data loss; it simply means that the restore failed, and you will have to wait until 2.2.2 is out in order to upgrade.
The follow-up hotfix release 2.2.2 is expected to be ready within another 2 to 3 weeks, as it requires a new hotfix release from Flink core and, ultimately, an upgrade of the Flink dependency in StateFun. We’ll update the community via the Flink mailing lists as soon as it is ready, so please subscribe to the mailing lists for important updates on this!
You can find the binaries on the updated Downloads page.
This release includes 6 fixes and minor improvements since StateFun 2.2.0. Below is a detailed list of all fixes and improvements:
- [FLINK-19515] - Async RequestReply handler concurrency bug
- [FLINK-19692] - Can't restore feedback channel from savepoint
- [FLINK-19866] - FunctionsStateBootstrapOperator.createStateAccessor fails due to uninitialized runtimeContext