Mixing Hadoop and HPC Workloads on Parallel Filesystems

Size: px

Start display at page:

Download "Mixing Hadoop and HPC Workloads on Parallel Filesystems"

Kerry Norman
10 years ago
Views:

1 Mixing Hadoop and HPC Workloads on Parallel Filesystems Esteban Molina-Estolano *, Maya Gokhale, Carlos Maltzahn *, John May, John Bent, Scott Brandt * * UC Santa Cruz, ISSDM, PDSI Lawrence Livermore National Laboratory Los Alamos National Laboratory

John May, John Bent, Scott Brandt * * UC Santa Cruz, ISSDM,

2 Motivation Strong interest in running both HPC and large-scale data mining workloads on the same infrastructure Hadoop-tailored filesystems (e.g. CloudStore) and highperformance computing filesystems (e.g. PVFS) are tailored to considerably different workloads Existing investments in HPC systems and Hadoop systems should be usable for both workloads Goal: Examine the performance of both types of workloads running concurrently on the same filesystem Goal: collect I/O traces from concurrent workload runs, for parallel filesystem simulator work

CloudStore) and highperformance computing PVFS) are tailored to considerably different workloads Existing investments in HPC systems

3 MapReduce-oriented filesystems Large-scale batch data processing and analysis Single cluster of unreliable commodity machines for both storage and computation Data locality is important for performance Examples: Google FS, Hadoop DFS, CloudStore

commodity machines for both storage and computation Data

4 Hadoop DFS architecture 2"3,,4(567("$'8).%'.9$%!"#$%&'%(!)*%$+,$%(-".),&"/(!"0,$".,$1 "##$%&&"'())$*'$'+",*)-.!

5 High-Performance Computing filesystems 2)3456%$7,$+"&'%(8,+9:.)&3(7)/%;1;.%+; High-throughput, lowlatency workloads!!"#$%&$'()#$*)&+,-(.% -/&0123,.('4-(/56! 7'2$"&02&)'08,60*/'/&0, 2(9*)&0,/15,6&('/#0, 2-)6&0'6+,$"#$%6*005, :'"5#0,:0&.001,&$09! ;3*"2/-,.('4-(/58, 6"9)-/&"(1,2$024*("1&"1#! Architecture: separate compute and storage clusters, high-speed bridge between them Typical workload: simulation checkpointing.$/01#()2314#(% 56'7840((9):%69'( Examples: PVFS, Lustre, PanFS, Ceph!"#$%&'%(!)*%$+,$%(-".),&"/(!"0,$".,$1 "#$%&'()*%(&)+(#,$%-!

('4-(/58, 6"9)-/&"(1,2$024*("1&"1#! <=/9*-068,>?

6 Running each workload on the non-native filesystem Two-sided problem: running HPC workloads on a Hadoop filesystem, and Hadoop workloads on an HPC filesystem Different interfaces: HPC workloads need a POSIX-like interface and shared writes Hadoop is write-once-read-many Different data layout policies

HPC filesystem Different interfaces: HPC workloads need a POSIX-like

7 Running HPC workloads on a Hadoop filesystem Chosen filesystem: CloudStore Downside of Hadoop s HDFS: no support for shared writes (needed for HPC N-1 workloads) Cloudstore has HDFS-like architecture, and shared write support

support for shared writes (needed for HPC N-1 workloads)

8 Running Hadoop workloads on an HPC filesystem Chosen HPC filesystem: PVFS PVFS is open-source and easy to configure Tantisiriroj et al. at CMU have created a shim to run Hadoop on PVFS Shim also adds prefetching, buffering, exposes data layout

9 The two concurrent workloads IOR checkpointing workload writes large amounts of data to disk from many clients N-1 and N-N write patterns Hadoop MapReduce HTTP attack classifier (TFIDF) Using a pre-generated attack model, classify HTTP headers as normal traffic or attack traffic

Hadoop MapReduce HTTP attack classifier (TFIDF) Using a pre-generated

12 Experimental Setup System: 19 nodes, 2-core 2.4 GHz Xeon, 120 GB disks IOR baseline: N-1 strided workload, 64 MB chunks IOR baseline: N-N workload, 64 MB chunks TFIDF baseline: classify 7.2 GB of HTTP headers Mixed workloads: IOR N-1 and TFIDF, IOR N-N and TFIDF Checkpoint size adjusted to make IOR and TFIDF take the same amount of time

baseline: N-N workload, 64 MB chunks TFIDF baseline: classify 7.

13 Performance metrics Throughputs are not comparable between workloads Per-workload throughput: measure how much each job is slowed down by the mixed workload Runtime: compare the runtime of the mixed workload with the runtime of the same jobs run sequentially

slowed down by the mixed workload Runtime: compare the runtime

14 Hadoop performance results Classification throughput (MB/s) TFIDF classification throughput, standalone and with IOR CloudStore Baseline with IOR N-1 with IOR N-N PVFS

15 IOR performance results Write throughput (MB/s) IOR checkpointing on CloudStore N-1 N-N IOR checkpointing on PVFS Standalone Mixed N-1 N-N

16 Runtime (seconds) Runtime results Runtime comparison of mixed vs. serial workloads Serial runtime Mixed runtime PVFS N-1 PVFS N-N CloudStore N-1 CloudStore N-N

17 Tracing infrastructure We gather traces to use for our parallel filesystem simulator Existing tracing mechanisms (e.g. strace, Pianola, Darshan) don t work well with Java or CloudStore Solution: our own tracing mechanisms for IOR and Hadoop

(e.g. strace, Pianola, Darshan) don t work well with Java

18 Tracing IOR workloads 2$"')&3(456(#,$7/,"89 Trace shim intercepts I/O calls, sends to stdio!!"#$%&'()*&$#+,-"%'&./0&$#11'2&'%34'&,5&',4)5 #$%&'()*&.$/#' #$%&'()* #$%&'()*&+*,-) #$%&'()*&0.##$ <((8)9>?8;G)>?)A:&9;=) #$%&'()*&1234 #$%&'()*&560.# 89:;< #$%&'()*&73/!"#$%&'%(!)*%$+,$%(-".),&"/(!"0,$".,$1!"

$/#' #$%&'()* #$%&'()*&+*,-) #$%&'()*&0.##$ 89,*9&9;=)>?@*<-)88&AB=>?

19 2$"')&3(4"5,,6(7"68%59'% Tracing Hadoop #$%&'$(+0%.% $'%/ #$%&'$()*'+,-.'/ 56(+7()*'+,-.'/ :6%& ;$%".%!!"#$$%&'(")*+,&-.*/0&1(")2(3*4256-'2/! Tracing shim wraps filesystem interfaces, sends I/O calls! C"-2&9*42-6-'2/-&/$#*9*2#&'$&D("%&E+%>':F>'%>'5'(2"/-& to Hadoop logs! E:F&$%2("'*$+-&4$,,2#&'$&!"#$$%&%2(='"-G&4$,- 56(+70%.% $'%/ 9%:;;3<*;=- #$%&'$(+0%.% $'%/ 56(+70%.% $'%/

C"-2&9*42-6-'2/-&/$#*9*2#&'$&D("%&E+%>':F>'%>'5'(2"/-& to Hadoop logs! E:F&$%2("'*$+-&4$,,2#&'$&!"#$$%&%2(='"-G&4$,- 56(+70%.%1234.+.$'%/ 9%:;;3<*;=- #$%&'$(+0%.

20 Tracing overhead Trace data goes to NFS-mounted share (no disk overhead) Small Hadoop reads caused huge tracing overhead Solution: record traces behind read-ahead buffers Overhead (throughput slowdown): IOR checkpointing: 1% TFIDF Hadoop: 5% Mixed workloads: 10%

Solution: record traces behind read-ahead buffers Overhead

21 Conclusions Each mixed workload component is noticeably slowed, but... If only total runtime matters, the mixed workloads are faster PVFS shows different slowdowns for N-N vs. N-1 workloads Tracing infrastructure: buffering required for small I/O tracing Future work: Run experiments at a larger scale Use experimental results to improve parallel filesystem simulator Investigate scheduling strategies for mixed workloads

22 Questions? Esteban Molina-Estolano:

Performance Analysis of Mixed Distributed Filesystem Workloads

Performance Analysis of Mixed Distributed Filesystem Workloads Esteban Molina-Estolano, Maya Gokhale, Carlos Maltzahn, John May, John Bent, Scott Brandt Motivation Hadoop-tailored filesystems (e.g. CloudStore)