Main Memory Map Reduce (M3R)
PL Day, September 23, 2013
Avi Shinnar, David Cunningham, Ben Herta, Vijay Saraswat, Wei Zhang
In collaboration with: Yan Li*, Dave Grove, Mikio Takeuchi**, Salikh Zakirov**, Juemin Zhang
* IBM China Research Lab
** IBM Tokyo Research Lab
Otherwise IBM TJ Watson Research Lab, New York
M3R (engine)                 Hadoop
performance, low latency     resilience, scalability
X10                          Java
in-memory                    out-of-core
X10
Java-like language designed for performance and productivity at scale
Asynchronous Partitioned Global Address Space programming model
    async S: run S as a separate activity
    at (P) S: run S at place P
    finish S: wait for termination of children activities
    MPI style barriers, local atomic synchronization
Advanced type system
    reified generics, closures, dependent types
Multiple backends: Java, C++, CUDA
http://x10-lang.org/
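The async/finish pairing can be approximated outside X10. The following is a minimal Python sketch, not X10 code: the helper names `finish` and `square` are hypothetical, threads stand in for activities, and there is no counterpart to `at (P)` or to places. Unlike X10's `finish`, it waits only for the tasks passed in directly, not for activities those tasks might spawn themselves.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def finish(*tasks):
    """Run each task as a separate activity and block until all terminate.

    Loose analogue of X10's `finish { async S1; async S2; ... }`.
    Hypothetical helper for illustration only.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        # Each submit plays the role of `async S`.
        futures = [pool.submit(task) for task in tasks]
        # Waiting on all futures plays the role of `finish`.
        wait(futures)
        # Results are returned in submission order.
        return [f.result() for f in futures]


def square(n):
    """Build a zero-argument task that squares n (hypothetical example task)."""
    return lambda: n * n


results = finish(square(2), square(3), square(4))
print(results)  # [4, 9, 16]
```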
Main-Memory Map Reduce in X10
    sorting (keys)
    multi-threading
    secondary sorting (values)
    iterative jobs
    combiners
    debugging
    map-only jobs
    out-of-core shuffle
    profiling
    user-controlled serialization
M3R/Hadoop Architecture
[Diagram. Components: Java Hadoop app (multiple jobs), X10 M3R jobs, Java M3R jobs, M3R/Hadoop adaptor, Hadoop Map Reduce Engine (JVM only), M3R Engine (JVM/Native), HDFS data. Legend: X10, Java.]
Compatibility M3R/Hadoop Performance
Hadoop Job
[Diagram: job data flow. Storage: File System (HDFS). Stages: Input (InputFormat/RecordReader/InputSplit), Map (Mapper), Shuffle, Reduce (Reducer), Output (OutputFormat/RecordWriter/OutputCommitter).]
2013 IBM Corporation
M3R/Hadoop Job: cache
[Diagram: job data flow with a Cache added. Storage: File System (HDFS), Cache. Stages: Input (InputFormat/RecordReader/InputSplit), Map (Mapper), Shuffle, Reduce (Reducer), Output (OutputFormat/RecordWriter/OutputCommitter).]
M3R/Hadoop Job: in-memory
[Diagram: same components as the cache slide.]
M3R/Hadoop Job: co-location
[Diagram: same components as the cache slide.]
Iterated Matrix Vector multiplication
Algorithm ("standard HPC")
    Row block partition G.
    Replicate V.
    In parallel, at each place, multiply each row of G with V.
    In parallel, each place broadcasts its segment of V to all others. This reassembles V for the next phase.
Performance key
    Read the appropriate part of G once, never communicate it.
    Reassembly is local.
[Figure: G and V]
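The algorithm on this slide can be sketched in a few lines. This is a sequential Python simulation under stated assumptions, not M3R or X10 code: "places" are just list indices, the broadcast is modelled by copying the reassembled vector to every place, G is a square dense list of lists, and the function names `row_blocks` and `iterate` are hypothetical.

```python
def row_blocks(n_rows, n_places):
    """Split row indices 0..n_rows-1 into contiguous blocks, one per place."""
    base, extra = divmod(n_rows, n_places)
    blocks, start = [], 0
    for p in range(n_places):
        size = base + (1 if p < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def iterate(G, V, n_places, n_iters):
    """Repeat V <- G * V for n_iters rounds using a row block partition of G."""
    blocks = row_blocks(len(G), n_places)
    # Each place reads its row block of G once and keeps it for all iterations.
    local_G = [[G[i] for i in block] for block in blocks]
    # V starts out replicated at every place.
    replicas = [list(V) for _ in range(n_places)]
    for _ in range(n_iters):
        # Local multiply: place p computes its own segment of the new V.
        segments = [
            [sum(g * v for g, v in zip(row, replicas[p])) for row in local_G[p]]
            for p in range(n_places)
        ]
        # Simulated broadcast: every place receives all segments
        # and reassembles the full V locally.
        new_V = [x for seg in segments for x in seg]
        replicas = [list(new_V) for _ in range(n_places)]
    return replicas[0]


G = [[1, 1],
     [0, 1]]
print(iterate(G, [1, 1], n_places=2, n_iters=2))  # [3, 1]
```

Only the segments of V move between places in each round; the blocks of G stay where they were first read, which is the point the slide calls the performance key.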
M3R/Hadoop Job: locality
[Diagram: job data flow with a Partitioner added. Storage: File System (HDFS), Cache. Stages: Input (InputFormat/RecordReader/InputSplit), Map (Mapper), Partitioner, Shuffle, Reduce (Reducer), Output (OutputFormat/RecordWriter/OutputCommitter).]
Partition Stability in M3R
The reducer associated with a given partition number will always be run at the same place, assuming the number of reducers and the number of places remain the same.
    The number of reducers is determined by the application.
    The number of places is fixed for the duration of the M3R server.
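What partition stability buys an application can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the slide does not say how M3R maps partition numbers to places, so `place_of` uses a simple modulo, and `partition_of` is a hypothetical deterministic partitioner standing in for an application's Hadoop Partitioner.

```python
import zlib


def partition_of(key, n_reducers):
    """Hypothetical deterministic partitioner (CRC32 of the key, modulo reducers)."""
    return zlib.crc32(key.encode("utf-8")) % n_reducers


def place_of(partition, n_places):
    """Assumed partition-to-place mapping; the real M3R mapping may differ."""
    return partition % n_places


def run_job(keys, n_reducers, n_places):
    """Return {key: place} describing where each key's reducer would run."""
    return {k: place_of(partition_of(k, n_reducers), n_places) for k in keys}


keys = ["row-0", "row-1", "row-2", "row-3"]
first = run_job(keys, n_reducers=8, n_places=4)
second = run_job(keys, n_reducers=8, n_places=4)
print(first == second)  # True
```

With the reducer and place counts held fixed, a key handled at some place in one job is handled at the same place in the next, so data a job left in memory there can be reused without moving it.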
Sparse Matrix Vector Multiplication
[Figure: two plots of Time (s) against Size M (G is an MxM matrix with sparsity 0.001), for M from 0 to 1,600,000. First plot: M3R/Hadoop and Hadoop, each with an exponential trend line, time axis 0 to 2000 s. Second plot: M3R/Hadoop alone with an exponential trend line, time axis 0 to 45 s. Annotation: 50X.]
JobTracker
DML results (Nov. 2011)

GNNMF                Hadoop    M3R     Speedup
100K                 1489s     115s    13x
200K                 1492s     185s    8.1x
400K                 1481s     300s    4.9x

Linear Regression    Hadoop    M3R     Speedup
1000K                1272s     120s    10.6x
3000K                1438s     185s    7.8x
5000K                1473s     275s    5.4x

PageRank             Hadoop    M3R     Speedup
100K                 880s      452s    1.9x
200K                 885s      574s    1.5x
400K                 872s      530s    1.7x
Pig unit tests
Current Status / Future Work
VLDB '12
    Avraham Shinnar, David Cunningham, Vijay Saraswat, and Benjamin Herta. 2012. M3R: increased performance for in-memory Hadoop jobs. Proc. VLDB Endow. 5, 12 (August 2012), 1736-1747.
Things generally work quite well
Working on out-of-core shuffle
    Performance degradation instead of crashing
Working on dynamic class loading