Stateful Functions — Event-driven Applications on Apache Flink®
Stateful Functions is an API that simplifies building distributed stateful applications. It’s based on functions with persistent state that can interact dynamically with strong consistency guarantees.
A stateful function is a small piece of logic/code that exists in multiple instances, each representing an entity, similar to actors. Functions are invoked through messages and are:
Stateful
Functions have embedded, fault-tolerant state, accessed locally like a variable.
Virtual
Much like FaaS, functions don't reserve resources: inactive functions don't consume CPU or memory.
Applications are composed of modules of multiple functions that can interact arbitrarily with:
Exactly-once Semantics
State and messaging go hand-in-hand, providing exactly-once message/state semantics.
Logical Addressing
Functions message each other by logical addresses. No service discovery needed.
Dynamic and Cyclic Messaging
Messaging patterns don't need to be pre-defined as dataflows (dynamic) and are also not restricted to DAGs (cyclic).
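The properties listed above can be illustrated with a toy, single-process dispatcher. This is a minimal sketch and not the Stateful Functions SDK: `ToyRuntime`, `Context`, `Address`, and the `ping`/`pong` functions are hypothetical names invented for illustration, and nothing here is persisted or distributed. It only shows the shape of the idea: state keyed by a logical address, instances created lazily on first message, and messages that are free to form a cycle.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Callable, Deque, Dict, List, Tuple

# Hypothetical logical address: (function type, entity id).
Address = Tuple[str, str]


@dataclass
class Context:
    """Passed to a function on every invocation (illustrative only)."""
    self_address: Address
    state: Dict[str, Any]                 # private state of this instance
    outbox: List[Tuple[Address, Any]]     # messages to deliver after the call

    def send(self, target: Address, message: Any) -> None:
        self.outbox.append((target, message))


class ToyRuntime:
    """Single-process stand-in for a runtime; no persistence or distribution."""

    def __init__(self) -> None:
        self.functions: Dict[str, Callable[[Context, Any], None]] = {}
        self.state: Dict[Address, Dict[str, Any]] = {}
        self.queue: Deque[Tuple[Address, Any]] = deque()

    def register(self, function_type: str, fn: Callable[[Context, Any], None]) -> None:
        self.functions[function_type] = fn

    def send(self, target: Address, message: Any) -> None:
        self.queue.append((target, message))

    def run(self) -> None:
        while self.queue:
            target, message = self.queue.popleft()
            # An instance "exists" only as state keyed by its address; it is
            # created lazily when the first message for that address arrives.
            state = self.state.setdefault(target, {})
            ctx = Context(target, state, [])
            self.functions[target[0]](ctx, message)
            self.queue.extend(ctx.outbox)


def ping(ctx: Context, remaining: int) -> None:
    ctx.state["seen"] = ctx.state.get("seen", 0) + 1   # state used like a variable
    if remaining > 0:
        ctx.send(("pong", ctx.self_address[1]), remaining - 1)


def pong(ctx: Context, remaining: int) -> None:
    ctx.state["seen"] = ctx.state.get("seen", 0) + 1
    if remaining > 0:
        ctx.send(("ping", ctx.self_address[1]), remaining - 1)  # cycle back


runtime = ToyRuntime()
runtime.register("ping", ping)
runtime.register("pong", pong)
runtime.send(("ping", "a"), 3)
runtime.run()
```

After `run()` returns, `runtime.state` holds one entry per address that received a message, and the `ping` and `pong` instances with id `"a"` have each seen two messages. No topology was declared up front; the cycle exists only because the functions chose to message each other.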
The Stateful Functions runtime is designed to provide a set of properties similar to what characterizes serverless functions, but applied to stateful problems.
The runtime is built on Apache Flink®, with the following design principles:
Logical Compute/State Co-location:
Messaging, state access/updates and function invocations are managed tightly together. This ensures a high level of consistency out-of-the-box.
Physical Compute/State Separation:
Functions can be executed remotely, with message and state access provided as part of the invocation request. This way, functions can be managed like stateless processes and support rapid scaling, rolling upgrades and other common operational patterns.
Function invocations use a simple HTTP/gRPC-based protocol so that functions can be easily implemented in various languages.
This makes it possible to execute functions on a Kubernetes deployment, a FaaS platform or behind a (micro)service, while providing consistent state and lightweight messaging between functions.
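To make the idea of shipping state with the invocation more concrete, here is a minimal sketch of a handler that keeps nothing in memory between calls. The JSON request and response layout, the `greeter` handler, and the `invoke` helper are all assumptions made for illustration; they are not the actual wire format or API of the Stateful Functions protocol.

```python
import json
from typing import Any, Callable, Dict, List, Tuple


def greeter(request_json: str) -> str:
    """Stateless handler: state arrives with the request, and state changes
    leave with the response (hypothetical message layout)."""
    request = json.loads(request_json)
    state = request["state"]
    name = request["message"]["name"]
    seen = state.get("seen", 0) + 1
    response = {
        "state_updates": {"seen": seen},
        "outgoing": [
            {
                "target": {"type": "inbox", "id": name},
                "message": f"Hello {name}, visit number {seen}",
            }
        ],
    }
    return json.dumps(response)


def invoke(
    handler: Callable[[str], str],
    stored_state: Dict[str, Any],
    message: Dict[str, Any],
) -> Tuple[Dict[str, Any], List[Dict[str, Any]]]:
    """What a runtime might do around each call: attach the stored state,
    call the (possibly remote) handler, then merge the returned updates."""
    request = json.dumps({"state": stored_state, "message": message})
    response = json.loads(handler(request))
    new_state = {**stored_state, **response["state_updates"]}
    return new_state, response["outgoing"]


state, outgoing = invoke(greeter, {}, {"name": "Ada"})
state, outgoing = invoke(greeter, state, {"name": "Ada"})
```

Because `greeter` only reads and writes what is in the request and response, any replica of it could serve the second call. That is the property that lets such functions be scaled or upgraded like stateless processes. In a real deployment the call to `handler` would be a network request rather than a local function call.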
The API allows you to build and compose functions that communicate dynamically and arbitrarily with each other. This gives you much more flexibility compared to the acyclic nature of classical stream processing topologies.
Functions can keep local state that is persistent and integrated with the messaging between functions. This gives you the effect of exactly-once state access/updates and guaranteed efficient messaging out-of-the-box.
State durability and fault tolerance build on Apache Flink’s robust distributed snapshots model. This requires nothing but a simple blob storage tier (e.g. S3, GCS, HDFS) to store the state snapshots.
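The following sketch shows, in a heavily simplified single-process form, why snapshotting state together with in-flight messages gives an exactly-once effect after a failure. `CountingApp` and `demo` are hypothetical names, the "blob storage" is just an in-memory copy, and the sketch does not model how Apache Flink coordinates snapshots across machines.

```python
import copy
from collections import deque
from typing import Any, Deque, Dict


class CountingApp:
    """Counts messages per key; state and pending messages are snapshotted together."""

    def __init__(self) -> None:
        self.counts: Dict[str, int] = {}
        self.pending: Deque[str] = deque()

    def enqueue(self, key: str) -> None:
        self.pending.append(key)

    def process_one(self) -> None:
        key = self.pending.popleft()
        self.counts[key] = self.counts.get(key, 0) + 1

    def snapshot(self) -> Dict[str, Any]:
        # Stand-in for writing a snapshot to blob storage.
        return copy.deepcopy({"counts": self.counts, "pending": list(self.pending)})

    def restore(self, snap: Dict[str, Any]) -> None:
        self.counts = copy.deepcopy(snap["counts"])
        self.pending = deque(snap["pending"])


def demo() -> Dict[str, int]:
    app = CountingApp()
    for key in ["a", "a", "b"]:
        app.enqueue(key)
    app.process_one()        # first "a" is applied
    snap = app.snapshot()    # consistent cut: state plus unprocessed messages
    app.process_one()        # second "a" is applied, then the process "crashes"
    app.restore(snap)        # roll back to the snapshot
    while app.pending:
        app.process_one()    # replay the messages that were pending at the cut
    return app.counts
```

The second `"a"` is processed twice in wall-clock terms (once before the simulated crash, once after the restore), but its first effect is discarded with the rollback, so the final counts reflect each message exactly once.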
The Stateful Functions approach to state and composition can be combined with the capabilities of modern serverless platforms like Kubernetes, Knative and AWS Lambda.
Because state access is part of the function invocation, Stateful Functions applications behave like stateless processes and can be managed with the same simplicity and benefits, such as rapid scalability, scale-to-zero and rolling/zero-downtime upgrades.
Imagine an application that receives financial information and emits an alert for every transaction whose fraud score exceeds a given threshold (i.e. the transaction is considered fraudulent). To build this example with Stateful Functions, you can define four different functions, each tracking its own state:
Fraud Count: tracks the total number of fraudulent transactions reported against an account over a rolling 30-day period.
Merchant Scorer: returns a trustworthiness score for each merchant, relying on a third party service.
Transaction Manager: enriches transaction records to create feature vectors for scoring and emits fraud alert events.
Model: scores transactions based on input feature vectors from the Transaction Manager.
Keeping track of fraudulent reports
The entry points to the application are the “Fraud Confirmation” and “Transactions” ingresses (e.g. Kafka Topics). As events flow in from “Fraud Confirmation”, the “Fraud Count” function increments its internal counter and sets a 30-day expiration timer on this state. Here, multiple instances of “Fraud Count” will exist — for example, one per customer account. After 30 days, the “Fraud Count” function will receive an expiration message (from itself) and clear its state.
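The timer behaviour described above can be simulated with a logical clock and a queue of delayed self-messages. This is a sketch under stated assumptions, not the SDK: `FraudCountSim` and its methods are hypothetical, time is measured in whole days, and it assumes one possible reading of the expiration step, namely that each expiration message retires one report and the state is cleared once the count reaches zero, which yields a rolling window.

```python
import heapq
from itertools import count
from typing import Dict, List, Tuple

THIRTY_DAYS = 30  # time is measured in whole days for simplicity


class FraudCountSim:
    """Simulates per-account fraud counters with delayed self-messages."""

    def __init__(self) -> None:
        self.now = 0
        self.counts: Dict[str, int] = {}                  # account id -> count
        self._timers: List[Tuple[int, int, str, str]] = []
        self._seq = count()                               # tie-breaker for the heap

    def _send_after(self, delay: int, account: str, message: str) -> None:
        due = self.now + delay
        heapq.heappush(self._timers, (due, next(self._seq), account, message))

    def on_message(self, account: str, message: str) -> None:
        if message == "fraud_confirmed":
            self.counts[account] = self.counts.get(account, 0) + 1
            # Delayed message to the same account's instance.
            self._send_after(THIRTY_DAYS, account, "expire")
        elif message == "expire":
            remaining = self.counts.get(account, 0) - 1
            if remaining <= 0:
                self.counts.pop(account, None)            # clear the state
            else:
                self.counts[account] = remaining

    def advance_to(self, day: int) -> None:
        """Move the clock forward, delivering every timer that is due."""
        while self._timers and self._timers[0][0] <= day:
            due, _, account, message = heapq.heappop(self._timers)
            self.now = due
            self.on_message(account, message)
        self.now = day


sim = FraudCountSim()
sim.on_message("acct-1", "fraud_confirmed")   # day 0
sim.advance_to(10)
sim.on_message("acct-1", "fraud_confirmed")   # day 10
sim.advance_to(40)                            # both reports have expired
```

Each account id has its own counter, mirroring the "one instance per customer account" idea, and an account with no live reports holds no state at all.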
Enriching and scoring transactions
On receiving events from the “Transactions” ingress, the “Transaction Manager” function messages “Fraud Count” to get the current count of fraud cases reported for the customer account; it also messages the “Merchant Scorer” for the trustworthiness score of the transaction merchant. “Transaction Manager” then creates a feature vector from the count of fraud cases reported for the customer account and the merchant score, and sends it to the “Model” function for scoring.
The “Model” function sends the score back to “Transaction Manager”, which emits an alert event to the “Alert User” egress if the score exceeds a given threshold.
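The enrichment and scoring flow can be sketched as a small message-driven state machine. Everything named here is hypothetical (`TransactionManager`, its handler methods, the target names in the outbox, and the `THRESHOLD` value); the sketch only shows how one function instance can hold a pending transaction, wait for two replies that may arrive in either order, and then decide whether to alert.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple

THRESHOLD = 0.8  # hypothetical alert threshold


@dataclass
class PendingTransaction:
    account: str
    merchant: str
    fraud_count: Optional[int] = None
    merchant_score: Optional[float] = None


class TransactionManager:
    """State machine for a single in-flight transaction (illustrative only)."""

    def __init__(self) -> None:
        self.pending: Optional[PendingTransaction] = None
        self.outbox: List[Tuple[str, Any]] = []   # (target, payload) to be sent

    def on_transaction(self, account: str, merchant: str) -> None:
        self.pending = PendingTransaction(account, merchant)
        self.outbox.append(("fraud-count", account))        # ask for the count
        self.outbox.append(("merchant-scorer", merchant))   # ask for the score

    def on_fraud_count(self, fraud_count: int) -> None:
        self.pending.fraud_count = fraud_count
        self._maybe_score()

    def on_merchant_score(self, score: float) -> None:
        self.pending.merchant_score = score
        self._maybe_score()

    def _maybe_score(self) -> None:
        p = self.pending
        # Only build the feature vector once both replies have arrived.
        if p.fraud_count is not None and p.merchant_score is not None:
            self.outbox.append(("model", [p.fraud_count, p.merchant_score]))

    def on_model_score(self, score: float) -> None:
        if score > THRESHOLD:
            self.outbox.append(("alert-user", self.pending.account))
        self.pending = None   # the transaction is finished; drop its state


tm = TransactionManager()
tm.on_transaction("acct-1", "shop-9")
tm.on_merchant_score(0.2)    # replies may arrive in either order
tm.on_fraud_count(3)
tm.on_model_score(0.95)      # above the threshold, so an alert is queued
```

The pending transaction plays the role of the function's persisted state: it survives between the separate invocations triggered by each reply, and it is cleared once the model's score has been handled.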
If you find these ideas interesting, give Stateful Functions a try and get involved! Check out the Getting Started section for introduction walkthroughs and the documentation for a deeper look into the internals of Stateful Functions.