Apache Flink is an open source platform for distributed stream and batch data processing.
Flink’s core is a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams.
Flink includes several APIs for creating applications that use the Flink engine:
- DataStream API for unbounded streams, embedded in Java and Scala,
- DataSet API for static data, embedded in Java, Scala, and Python, and
- Table API with a SQL-like expression language, embedded in Java and Scala.
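
The distinction between unbounded streams and static data can be illustrated with a small sketch. This is plain Python using only the standard library, not Flink's API; the names `running_word_count` and `endless_source` are hypothetical and exist only for this illustration. It shows one operator applied both to a finite input and to a source that never ends, which is roughly the idea behind running batch and stream programs on a single streaming engine.

```python
from collections import Counter
from itertools import count, islice
from typing import Iterable, Iterator, Tuple


def running_word_count(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit an updated (word, count) pair for every word as it arrives."""
    counts = Counter()
    for line in lines:
        for word in line.lower().split():
            counts[word] += 1
            yield word, counts[word]


def endless_source() -> Iterator[str]:
    """Hypothetical unbounded source: it never terminates on its own."""
    for i in count():
        yield "to be" if i % 2 == 0 else "not to be"


# Static (bounded) input: the whole input can be consumed, so we keep
# only the last count emitted for each word.
batch_result = dict(running_word_count(["to be", "or not to be"]))

# Unbounded input: results arrive incrementally, so we can only ever
# inspect a prefix of the output (here, the first five updates).
stream_prefix = list(islice(running_word_count(endless_source()), 5))
```

The operator itself does not know whether its input is finite. Only the caller decides whether to wait for a final result or to read updates as they are produced.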
Flink also bundles libraries for domain-specific use cases:
- CEP, a complex event processing library,
- Machine Learning library, and
- Gelly, a graph processing API and library.
You can easily integrate Flink with other well-known open source systems, both for data input and output and for deployment.