Life sciences big data e-infrastructure concepts
Maarten Kooyman
2014-11-13 Tue
About me Knowing each other helps to communicate
Education Knowing each other helps to communicate
Work Knowing each other helps to communicate
Current Work Knowing each other helps to communicate
Cartesius the national supercomputer
Lisa the national compute cluster
Grid The grid infrastructure
Hadoop Big Data analytics framework
HPC Cloud The cloud computing infrastructure
Discussion And will continue during coffee
Credits
  Profile by Marek Polakovic from The Noun Project
  Graduate by Wilson Joseph from The Noun Project
  Graduate by Wilson Joseph from The Noun Project
  User by Wilson Joseph from The Noun Project
  Superhero by Moriah Rich from The Noun Project
  Cow by Alessandro Suraci from The Noun Project
  Grid by Sblendone from The Noun Project
  divide by Lorena Salagre from The Noun Project
  Adventure by Ben Markoch from The Noun Project
  Cloud by Lil Squid from The Noun Project
  All icons on this list are licensed under Creative Commons Attribution
Cartesius the national supercomputer
  Communicate fast between jobs
    15008 cores
    132 GPUs
    2.6 or * GB/core
    low latency network
  Coming soon!
    GPU2GPU direct communication
    More nodes with AVX2 instructions
    System grows over time
Lisa the national compute cluster
  Large simple cluster:
    8960 cores
    3.5 GB/core
    NFS home drive
  Coming soon!
    Intel Xeon Phi
    unified data storage with most SURFsara systems
Grid The grid infrastructure
  Massive embarrassingly parallel calculations.
  Under SURFsara administration:
    cluster  cores       memory
    LSG      1350 cores  4 GB
    Gina     3024 cores  4 or 8 GB
  Filesystem:
    fast local storage (few TB)
    gigantic global storage (5.5 PB used)
  Available cores on the lsgrid VO: 17590 (SURFsara / NIKHEF / RUG-CIT)
  Upscaling to the European Grid Infrastructure (EGI) possible
  Coming soon!
    Newer nodes (AVX2, 8 GB memory/core)
    New documentation
    CernVM-FS
    Research: Docker on grid
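To make "embarrassingly parallel" concrete: each task depends only on its own input, so tasks can run anywhere, in any order, with no communication between them. The following is a minimal local sketch of that pattern, not a grid job description. The function `gc_content` and the example sequences are hypothetical, and a thread pool stands in for the independent grid jobs.

```python
from concurrent.futures import ThreadPoolExecutor


def gc_content(sequence):
    """Fraction of G and C bases in one sequence (hypothetical example task)."""
    if not sequence:
        return 0.0
    gc = sum(1 for base in sequence if base in "GC")
    return gc / len(sequence)


def run_independent_tasks(sequences, workers=4):
    """Run one task per input; no task reads or waits for another task's result."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, regardless of completion order.
        return list(pool.map(gc_content, sequences))


sequences = ["GGCC", "ATAT", "ACGT"]
results = run_independent_tasks(sequences)
```

Because the tasks share nothing, the same workload scales by simply submitting more jobs, which is the kind of calculation the slide attributes to the grid.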
Hadoop Big Data analytics framework
  Divide and Conquer: distributed data and calculations
    720 cores
    8 GB/core
    600 TB cluster filesystem
  Coming soon!
    Roughly double the number of nodes
    More than double the storage
    HBase as production service
    Hortonworks 1.3
    ElasticSearch
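The "divide and conquer" idea can be sketched in plain Python: split the data into partitions, compute a partial result on each partition (map), then merge the partial results (reduce). This is an illustration under stated assumptions only. It does not use the Hadoop API, and the helper names and the example reads are hypothetical.

```python
from collections import Counter
from functools import reduce


def split_into_chunks(records, n_chunks):
    """Distribute records round-robin over n_chunks partitions."""
    return [records[i::n_chunks] for i in range(n_chunks)]


def map_count(chunk):
    """Map step: count bases using only the data in one partition."""
    counts = Counter()
    for sequence in chunk:
        counts.update(sequence)
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


def count_bases(records, n_chunks=3):
    """Count bases over all records by combining per-partition counts."""
    partials = [map_count(chunk) for chunk in split_into_chunks(records, n_chunks)]
    return reduce(reduce_counts, partials, Counter())


reads = ["ACGT", "AACC", "GGTT", "ACAC"]
totals = count_bases(reads)
```

The total does not depend on how the data is partitioned, which is what allows a framework to place both the data and the map step on many nodes.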
HPC Cloud The cloud computing infrastructure
  Maximum flexibility: be your own admin!
    1280 cores
    8 GB memory
    filesystem: NFS
  Coming soon!
    Big Memory node (2 TB)
    GPUs
    OpenNebula 4.x
    3 different filesystems (NFS, Ceph, local)
    Beefier NFS servers
    SSD for some local space
    Research: optimise virtualisation switches for KVM