Flink 2015: A year in review, and a lookout to 2016

December 18, 2015 -

With 2015 ending, we thought that this would be a good time to reflect on the amazing work done by the Flink community over this past year, and how much this community has grown. Overall, we have seen Flink grow in terms of functionality from an engine to one of the most complete open-source stream processing frameworks available. The community grew from a relatively small and geographically focused team to a truly global one, and one of the largest big data communities in the Apache Software Foundation. ...

Continue reading »

Storm Compatibility in Apache Flink: How to run existing Storm topologies on Flink

December 11, 2015 -

Apache Storm was one of the first distributed and scalable stream processing systems available in the open source space, offering (near) real-time tuple-by-tuple processing semantics. Initially released by the developers at Backtype in 2011 under the Eclipse open-source license, it became popular very quickly. Only shortly afterwards, Twitter acquired Backtype. Since then, Storm has been growing in popularity, is used in production at many big companies, and is the de facto industry standard for big data stream processing. ...

Continue reading »

Introducing Stream Windows in Apache Flink

December 4, 2015 -

The data analysis space is witnessing an evolution from batch to stream processing for many use cases. Although batch can be handled as a special case of stream processing, analyzing never-ending streaming data often requires a shift in mindset and comes with its own terminology (for example, “windowing” and “at-least-once”/“exactly-once” processing). This shift and the new terminology can be quite confusing for people who are new to stream processing. ...

Continue reading »

Flink 0.10.1 released

November 27, 2015 -

Today, the Flink community released the first bugfix release of the 0.10 series of Flink. We recommend that all users update to this release by bumping the version of their Flink dependencies and updating the binaries on the server.

Issues fixed:

- [FLINK-2879] - Links in documentation are broken
- [FLINK-2938] - Streaming docs not in sync with latest state changes
- [FLINK-2942] - Dangling operators in web UI's program visualization (non-deterministic)
- [FLINK-2967] - TM address detection might not always detect the right interface on slow networks / overloaded JMs
- [FLINK-2977] - Cannot access HBase in a Kerberos secured Yarn cluster
- [FLINK-2987] - Flink 0. ...

Continue reading »

Announcing Apache Flink 0.10.0

November 16, 2015 -

The Apache Flink community is pleased to announce the availability of the 0.10.0 release. The community put significant effort into improving and extending Apache Flink since the last release, focusing on data stream processing and operational features. About 80 contributors provided bug fixes, improvements, and new features, resolving more than 400 JIRA issues in total. For Flink 0.10.0, the focus of the community was to graduate the DataStream API from beta and to evolve Apache Flink into a production-ready stream data processor with a competitive feature set. ...

Continue reading »

Off-heap Memory in Apache Flink and the curious JIT compiler

September 16, 2015 -

Running data-intensive code in the JVM and making it well-behaved is tricky. Systems that naively put billions of data objects onto the JVM heap face unpredictable OutOfMemoryErrors and garbage collection stalls. Of course, you still want to keep your data in memory as much as possible, for speed and responsiveness of the processing applications. In that context, “off-heap” has become something like a magic word for solving these problems. ...

Continue reading »

Announcing Flink Forward 2015

September 3, 2015 -

Flink Forward 2015 is the first conference with Flink at its center, and it aims to bring together the Apache Flink community in a single place. The conference takes place on October 12 and 13 in Berlin, the place where Apache Flink started. The conference program has been announced by the organizers and a program committee consisting of Flink PMC members. The agenda contains talks from industry and academia as well as a dedicated session on hands-on Flink training. ...

Continue reading »

Apache Flink 0.9.1 available

September 1, 2015 -

The Flink community is happy to announce that Flink 0.9.1 is now available. 0.9.1 is a maintenance release, which includes a lot of minor fixes across several parts of the system. We suggest that all users of Flink work with this latest stable version. Download the release and check out the documentation. Feedback through the Flink mailing lists is, as always, very welcome! The following issues were fixed for this release: ...

Continue reading »

Introducing Gelly: Graph Processing with Apache Flink

August 24, 2015 -

This blog post introduces Gelly, Apache Flink’s graph-processing API and library. Flink’s native support for iterations makes it a suitable platform for large-scale graph analytics. By leveraging delta iterations, Gelly is able to map various graph processing models such as vertex-centric or gather-sum-apply to Flink dataflows. Gelly allows Flink users to perform end-to-end data analysis in a single system. Gelly can be seamlessly used with Flink’s DataSet API, which means that pre-processing, graph creation, analysis, and post-processing can be done in the same application. ...

Continue reading »

Announcing Apache Flink 0.9.0

June 24, 2015 -

The Apache Flink community is pleased to announce the availability of the 0.9.0 release. The release is the result of many months of hard work within the Flink community. It contains many new features and improvements which were previewed in the 0.9.0-milestone1 release and have been polished since then. This is the largest Flink release so far. Download the release and check out the documentation. Feedback through the Flink mailing lists is, as always, very welcome! ...

Continue reading »