Apache Flink Training System Overview
Apache Flink
- A stream processor with many applications
- Streaming dataflow runtime
1 year of Flink - code
[Diagram: growth of the Flink codebase from April 2014 (DataSet API (Java/Scala), Flink core, Local/Remote/YARN) to April 2015 (DataSet and DataStream APIs, streaming dataflow runtime, libraries such as Hadoop M/R, Python, Gelly, Table, ML, Dataflow, MRQL, SAMOA, and Local/Remote/YARN/Tez/Embedded deployment)]
Flink Community
[Chart: unique contributor ids by git commits, Aug 2010 - Jul 2015, growing from 0 to about 120]
In top 5 of Apache's big data projects after one year in the Apache Software Foundation
The Apache Way
- Flink is an Apache top-level project
- Independent, non-profit organization
- Community-driven open source software development approach
- Public communication and open to new contributors
What is Apache Flink?
[Diagram: the Flink stack - libraries (Hadoop M/R, Table, Gelly, ML, Dataflow/Beam, MRQL, Cascading, SAMOA) on top of the DataSet (Java/Scala/Python) and DataStream (Java/Scala) APIs, the streaming dataflow runtime, and Local/Remote/YARN/Tez/Embedded deployment]
Native workload support
- Heavy batch jobs
- Machine learning at scale
- Streaming topologies
How can an engine natively support all these workloads? And what does "native" mean?
E.g.: Non-native iterations
Each iteration is driven by the client and submitted as a separate job:

    for (int i = 0; i < maxIterations; i++) {
        // Execute MapReduce job
    }

[Diagram: client launching one step job per iteration]
E.g.: Non-native streaming
A stream discretizer cuts the stream into small batches and submits one batch job per batch:

    while (true) {
        // get next few records
        // issue batch job
    }

[Diagram: stream discretizer issuing a sequence of batch jobs]
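The while-loop above can be fleshed out as a plain-Java sketch; all names here (MicroBatcher, discretize, runBatchJob) are hypothetical, not Flink or Hadoop API. A discretizer cuts the record stream into micro-batches and a separate "batch job" runs per batch:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of non-native streaming: the "stream discretizer" cuts the
// input into micro-batches and runs one batch job per batch.
public class MicroBatcher {

    // Split the record stream into batches of at most batchSize records.
    public static List<List<String>> discretize(List<String> records, int batchSize) {
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < records.size(); i += batchSize) {
            batches.add(records.subList(i, Math.min(i + batchSize, records.size())));
        }
        return batches;
    }

    // Stand-in for "issue batch job": here the job just counts its records.
    public static int runBatchJob(List<String> batch) {
        return batch.size();
    }

    public static void main(String[] args) {
        List<String> stream = List.of("a", "b", "c", "d", "e");
        for (List<String> batch : discretize(stream, 2)) {
            System.out.println("job processed " + runBatchJob(batch) + " records");
        }
    }
}
```

The per-batch job-submission overhead is exactly what a natively streaming engine avoids.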
Native workload support
- Batch processing
- Machine learning at scale
- Stream processing
- Graph analysis
How can an engine natively support all these workloads? And what does "native" mean?
Flink Engine
1. Execute everything as streams
2. Iterative (cyclic) dataflows
3. Mutable state (state + computation)
4. Operate on managed memory
5. Special code paths for batch
[Diagram: optimized batch execution plan - sources for orders.tbl and lineitem.tbl, Map and Filter operators, hash partitioning, a hybrid hash join (build HT / probe), and a sort-based GroupReduce]
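The execution plan above names a hybrid hash join with "build HT" and "probe" phases. The sketch below shows a plain in-memory hash join with those two phases; the "hybrid" part (spilling partitions to disk under memory pressure) is omitted, and all names and data shapes are illustrative:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified in-memory hash join illustrating the "build HT" / "probe"
// phases from the plan. A real hybrid hash join also spills partitions
// to disk when the build side exceeds memory; that is omitted here.
public class HashJoin {

    // orders: build side (orderId -> order); lineitems: probe side {orderId, qty}.
    public static List<String> join(Map<Integer, String> orders, List<int[]> lineitems) {
        // Build phase: hash table over the (smaller) build side.
        Map<Integer, String> ht = new HashMap<>(orders);
        // Probe phase: stream the probe side, look up join partners.
        List<String> result = new ArrayList<>();
        for (int[] li : lineitems) {
            String order = ht.get(li[0]);
            if (order != null) {
                result.add(order + ":" + li[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, String> orders = Map.of(1, "o1", 2, "o2");
        List<int[]> lineitems = List.of(new int[]{1, 10}, new int[]{3, 7}, new int[]{2, 5});
        System.out.println(join(orders, lineitems)); // [o1:10, o2:5]
    }
}
```

The hash-part [0] steps in the plan ensure that rows with the same join key land on the same parallel instance before the build/probe runs.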
What is a Flink Program?
Flink stack
[Diagram: Hadoop M/R, Table, Gelly, ML, Dataflow (WiP), MRQL, Cascading (WiP), SAMOA on top of the DataSet (Java/Scala/Python) and DataStream (Java/Scala) APIs, the streaming dataflow runtime, and Local/Remote/YARN/Tez/Embedded deployment]
Basic API Concept
Source → DataStream → Operation → DataStream → Sink
Source → DataSet → Operation → DataSet → Sink
How do I write a Flink program?
1. Bootstrap sources
2. Apply operations
3. Output to sinks
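The three steps can be sketched with plain java.util.stream as an analogy (not the Flink API): a collection plays the source, map/filter play the operations, and collecting the result plays the sink.

```java
import java.util.List;
import java.util.stream.Collectors;

// Analogy for the source -> operation -> sink pipeline using plain
// java.util.stream; illustrative only, not Flink code.
public class PipelineSketch {

    public static List<Integer> run(List<String> source) {
        return source.stream()                 // 1. bootstrap the source
                .map(String::length)           // 2. apply operations: map
                .filter(len -> len > 3)        //    and filter
                .collect(Collectors.toList()); // 3. output to the sink
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("flink", "is", "streaming"))); // [5, 9]
    }
}
```

In Flink the same three-step shape applies, but the pipeline is distributed and, for DataStreams, runs continuously.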
Batch & Stream Processing
DataSet API example: the Map/Reduce paradigm (map over the input elements, then reduce them to aggregates).
DataStream API example: a live stock feed of (name, price) events such as Google 516, Microsoft 124, Apple 235, driving operations like
- alert if Microsoft > 120
- write event to database
- sum every 10 seconds
- alert if sum > 10000
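A minimal sketch of the "sum every 10 seconds" operation, assuming tumbling (non-overlapping) windows keyed by event timestamp; plain Java, not the Flink windowing API, and all names are illustrative:

```java
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Group timestamped prices into tumbling 10-second windows and sum
// each window, mimicking "sum every 10 seconds" over the stock feed.
public class WindowSum {

    // events: {timestampSeconds, price}; returns windowStart -> summed price.
    public static SortedMap<Long, Long> tumblingSum(List<long[]> events, long windowSeconds) {
        SortedMap<Long, Long> sums = new TreeMap<>();
        for (long[] e : events) {
            long windowStart = (e[0] / windowSeconds) * windowSeconds; // window the event falls into
            sums.merge(windowStart, e[1], Long::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<long[]> feed = List.of(
                new long[]{1, 516},  // Google at t=1s
                new long[]{3, 124},  // Microsoft at t=3s
                new long[]{12, 235}  // Apple at t=12s
        );
        System.out.println(tumblingSum(feed, 10)); // {0=640, 10=235}
    }
}
```

The downstream "alert if sum > 10000" operation would simply filter this windowed output.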
Streaming & Batch

              Streaming     Batch
  Input       infinite      finite
  Transfer    pipelined     blocking or pipelined
  Latency     low           high
Scaling out
[Diagram: many parallel Source → DataSet → Operation → DataSet → Sink pipelines running side by side]
Scaling up
Sources (selection)
- Collection-based: fromCollection, fromElements
- File-based: TextInputFormat, CsvInputFormat
- Other: SocketInputFormat, KafkaInputFormat, databases
Sinks (selection)
- File-based: TextOutputFormat, CsvOutputFormat, PrintOutput
- Others: SocketOutputFormat, KafkaOutputFormat, databases
Hadoop Integration
Out of the box:
- Access HDFS
- YARN execution (covered later)
- Reuse data types (Writables)
With a thin wrapper:
- Reuse Hadoop input and output formats
- Reuse functions like Map and Reduce
What's the Lifecycle of a Program?
From Program to Dataflow

    case class Path(from: Long, to: Long)
    val tc = edges.iterate(10) { paths: DataSet[Path] =>
      val next = paths
        .join(edges)
        .where("to")
        .equalTo("from") { (path, edge) => Path(path.from, edge.to) }
        .union(paths)
        .distinct()
      next
    }

Pre-flight (client): type extraction and the optimizer turn the program into a dataflow graph.
Master: receives the dataflow graph and metadata, deploys operators, and schedules tasks.
Workers: execute the tasks and track intermediate results.
[Diagram: optimized plan with sources orders.tbl and lineitem.tbl, Map/Filter, hash partitioning, a hybrid hash join (build HT / probe), and a sort-based GroupReduce]
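The transitive-closure program above can be mimicked in plain Java to show what the iteration computes: repeatedly join paths with edges on to == from, union with the existing paths, and deduplicate (a Set gives us union + distinct for free). This is an illustrative sketch, not how Flink executes the iteration.

```java
import java.util.HashSet;
import java.util.Set;

// Plain-Java version of the transitive-closure iteration: join paths
// with edges, union, deduplicate, up to maxIterations times (the Scala
// program bounds this at 10), stopping early at a fixed point.
public class TransitiveClosure {

    record Path(long from, long to) {}

    public static Set<Path> closure(Set<Path> edges, int maxIterations) {
        Set<Path> paths = new HashSet<>(edges);
        for (int i = 0; i < maxIterations; i++) {
            Set<Path> next = new HashSet<>(paths);   // union(paths)
            for (Path p : paths) {                   // join paths with edges
                for (Path e : edges) {               // where "to" equalTo "from"
                    if (p.to() == e.from()) {
                        next.add(new Path(p.from(), e.to()));
                    }
                }
            }
            if (next.equals(paths)) break;           // fixed point reached
            paths = next;                            // distinct() via Set semantics
        }
        return paths;
    }

    public static void main(String[] args) {
        Set<Path> edges = Set.of(new Path(1, 2), new Path(2, 3), new Path(3, 4));
        System.out.println(closure(edges, 10).size()); // 6: 1→2, 1→3, 1→4, 2→3, 2→4, 3→4
    }
}
```

Flink's native iterations keep this loop inside the dataflow instead of resubmitting a job per round, which is the point of the "non-native iterations" slide earlier.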
Architecture Overview
- Client
- Master (Job Manager)
- Workers (Task Managers)
[Diagram: Client → Job Manager → multiple Task Managers]
Client
- Optimize
- Construct job graph
- Pass job graph to job manager
- Retrieve job results
[Diagram: the transitive-closure program passing through type extraction and the optimizer into a job graph handed to the Job Manager]
Job Manager
- Parallelization: create execution graph
- Scheduling: assign tasks to task managers
- State: supervise the execution
[Diagram: the Job Manager expanding the optimized plan into parallel operator instances and assigning them to several Task Managers]
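The parallelization step can be sketched as expanding each operator of the job graph into one execution vertex per parallel instance; the names below are illustrative, not Flink internals:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the Job Manager's parallelization step: each job-graph
// operator becomes `parallelism` execution vertices, which the
// scheduler then assigns to task managers.
public class ExecutionGraphSketch {

    public static List<String> expand(List<String> jobGraph, int parallelism) {
        List<String> vertices = new ArrayList<>();
        for (String op : jobGraph) {
            for (int i = 0; i < parallelism; i++) {
                vertices.add(op + "[" + i + "]"); // i-th parallel instance of op
            }
        }
        return vertices;
    }

    public static void main(String[] args) {
        System.out.println(expand(List.of("Source", "Map", "Sink"), 2));
        // [Source[0], Source[1], Map[0], Map[1], Sink[0], Sink[1]]
    }
}
```

Scheduling then maps these execution vertices onto the task slots of the available task managers.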
Task Manager
- Operations are split up into tasks depending on the specified parallelism
- Each parallel instance of an operation runs in a separate task slot
- The scheduler may run several tasks from different operators in one task slot
[Diagram: Task Managers, each with several task slots]
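The last point can be illustrated with a back-of-the-envelope helper, assuming slot sharing lets one task from each operator of a pipeline occupy the same slot (the helper and its names are hypothetical):

```java
// If one slot may hold one task from each operator, a pipeline of k
// operators at parallelism p needs only p slots instead of k * p.
public class SlotSharingSketch {

    public static int slotsNeeded(int operators, int parallelism, boolean slotSharing) {
        return slotSharing ? parallelism : operators * parallelism;
    }

    public static void main(String[] args) {
        // 3 operators (Source, Map, Sink) at parallelism 4:
        System.out.println(slotsNeeded(3, 4, false)); // 12 slots without sharing
        System.out.println(slotsNeeded(3, 4, true));  // 4 slots with sharing
    }
}
```

This is why the number of required slots tracks the job's parallelism rather than its operator count.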
Execution Setups
Ways to Run a Flink Program
[Diagram: the Flink stack with its deployment options highlighted - Local, Remote, YARN, Tez, Embedded]
Local Execution
- Starts a local Flink cluster
- All processes run in the same JVM
- Behaves just like a regular cluster
- Very useful for developing and debugging
[Diagram: Job Manager and Task Managers inside a single JVM]
Embedded Execution
- Runs operators on simple Java collections
- Lower overhead
- Does not use memory management
- Useful for testing and debugging
Remote Execution
- Submit a job remotely
- Monitor the status of a job
[Diagram: Client submitting a job to the Job Manager of a cluster of Task Managers]
YARN Execution
- Multi-user scenario
- Resource sharing
- Uses YARN containers to run a Flink cluster
- Easy to set up
[Diagram: Client, Resource Manager, and Node Managers hosting the Job Manager, Task Managers, and other applications in a YARN cluster]
Tez Execution
- Leverages Apache Tez's runtime
- Built on top of YARN
- Good YARN citizen
- Fast path to elastic deployments
- Slower than native Flink
Flink compared to other projects
Batch & Streaming projects
[Diagram: project logos grouped into batch only, streaming only, and hybrid]
Batch comparison

                       Hadoop MapReduce     Spark                 Flink
  API                  low-level            high-level            high-level
  Transfer             batch                batch                 pipelined & batch
  Memory management    disk-based           JVM-managed           active managed
  Iterations           file system cached   in-memory cached      streamed
  Fault tolerance      task level           task level            job level
  Good at              massive scale out    data exploration      heavy backend & iterative jobs
  Libraries            many external        built-in & external   evolving built-in & external
Streaming comparison

                     Storm              Spark Streaming       Flink
  Streaming          true               mini batches          true
  API                low-level          high-level            high-level
  Fault tolerance    tuple-level ACKs   RDD-based (lineage)   coarse checkpointing
  State              not built-in       external              internal
  Guarantees         at least once      exactly once          exactly once
  Windowing          not built-in       restricted            flexible
  Latency            low                medium                low
  Throughput         medium             high                  high
Thank you for listening!