- Introduction
- History
- Design
- Blue Gene/Q
- Job Scheduler
- Filesystem
- Power usage
- Performance
- Summary
Sequoia is a petascale Blue Gene/Q supercomputer.
- Being constructed by IBM for the National Nuclear Security Administration as part of the Advanced Simulation and Computing Program (ASC)
- Primarily intended for nuclear weapons simulation at Lawrence Livermore National Laboratory
- Also intended for scientific purposes such as astronomy, energy, the study of the human genome, and climate change
Blue Gene is an IBM project aimed at designing supercomputers.
- IBM has created three generations of supercomputers: Blue Gene/L, Blue Gene/P, and Blue Gene/Q
- Blue Gene/L (November 2004): 16-rack system, 1,024 compute nodes per rack, first place in the TOP500 list with a performance of 70.72 TFLOPS
- Blue Gene/P (November 2009): 2-rack system, 2,048 compute nodes in total, 8th place in the TOP500 list
Node Architecture:
- IBM Blue Gene/Q design
- 98,304 compute nodes
- Total of 1.6 million processor cores and 1.6 PB of memory
- 96 racks covering an area of about 3,000 square feet
Job Scheduler:
- Simple Linux Utility for Resource Management (SLURM) job scheduler
- Also used by the Dawn prototype and China's Tianhe-1A
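The node, core, and memory totals above can be cross-checked with a little arithmetic. This is only a sketch: the per-rack node count is an assumption derived from 98,304 / 96, and the 16 user cores and 16 GB per node are taken from the Blue Gene/Q chip slides.

```python
# Cross-check of the Sequoia totals quoted on the design slide.
RACKS = 96
NODES_PER_RACK = 1024   # assumption: 98,304 nodes / 96 racks
CORES_PER_NODE = 16     # user cores per Blue Gene/Q chip
MEM_PER_NODE_GIB = 16   # external DDR3 per node, treated here as GiB

nodes = RACKS * NODES_PER_RACK
cores = nodes * CORES_PER_NODE
memory_pib = nodes * MEM_PER_NODE_GIB / 1024**2          # binary petabytes
memory_pb = nodes * MEM_PER_NODE_GIB * 2**30 / 1e15      # decimal petabytes

print(f"nodes:  {nodes:,}")
print(f"cores:  {cores:,}")
print(f"memory: {memory_pib:.2f} PiB ({memory_pb:.2f} PB)")
```

The core count comes out slightly under 1.6 million, and the memory lands between 1.5 and 1.7 PB depending on the unit convention, which is consistent with the rounded figures on the slide.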
Filesystem:
- Lustre parallel file system
- Uses a ported ZFS management system
Power Usage:
- Low power consumption
- Estimated to beat the current (2011) TOP500 leaders with three times the power efficiency
- 360 mm², Cu-45 technology (SOI)
- ~1.47 billion transistors
- 16 user + 1 service processors
  - All processors are symmetric
  - Each is 4-way multi-threaded
  - 64-bit Power ISA
  - 1.6 GHz
  - L1 I/D cache = 16 kB/16 kB
  - L1 prefetch engines
  - Each processor has a quad FPU (4-wide double precision, SIMD)
- Peak performance of 204.8 GFLOPS at 55 W
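The 204.8 GFLOPS peak figure can be reproduced from the other numbers on this slide. The sketch below assumes that each SIMD lane completes one fused multiply-add (counted as 2 floating-point operations) per cycle; that factor is not stated on the slide and is labeled as an assumption in the code.

```python
# Reconstructing the per-chip peak from the slide's figures.
USER_CORES = 16
CLOCK_GHZ = 1.6
SIMD_WIDTH = 4            # quad FPU, 4-wide double precision
FLOPS_PER_LANE_CYCLE = 2  # assumption: one fused multiply-add per lane per cycle
CHIP_POWER_W = 55

peak_gflops = USER_CORES * CLOCK_GHZ * SIMD_WIDTH * FLOPS_PER_LANE_CYCLE
gflops_per_watt = peak_gflops / CHIP_POWER_W

print(f"peak per chip: {peak_gflops:.1f} GFLOPS")
print(f"chip-level efficiency: {gflops_per_watt:.2f} GFLOPS/W")
```

Under that assumption the product matches the quoted 204.8 GFLOPS. The chip-level efficiency is higher than any system-level figure, since it ignores memory, network, and cooling power.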
- Central shared L2 cache: 32 MB eDRAM
  - Multiversioned cache supports transactional memory and speculative execution
  - Supports atomic operations
- Dual memory controller
  - 16 GB external DDR3 memory
  - 1.33 Gb/s
  - 2 × 16-byte-wide interface (+ECC)
- Chip-to-chip networking
  - Router logic integrated into the BQC chip
- External I/O
  - PCIe Gen2 interface
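A rough peak memory bandwidth follows from the memory controller figures. This sketch assumes that "1.33 Gb/s" is the per-pin data rate (1.33 billion transfers per second) and that both 16-byte-wide interfaces transfer on every beat; ECC bits are not counted.

```python
# Back-of-the-envelope memory bandwidth for one Blue Gene/Q chip.
CONTROLLERS = 2
BYTES_PER_TRANSFER = 16   # width of each interface, excluding ECC
TRANSFERS_PER_NS = 1.33   # assumption: 1.33 Gb/s per pin = 1.33 GT/s
PEAK_GFLOPS = 204.8       # per-chip peak from the chip slide

bandwidth_gb_s = CONTROLLERS * BYTES_PER_TRANSFER * TRANSFERS_PER_NS
bytes_per_flop = bandwidth_gb_s / PEAK_GFLOPS

print(f"peak memory bandwidth: {bandwidth_gb_s:.2f} GB/s")
print(f"bytes per flop at peak: {bytes_per_flop:.3f}")
```

Under these assumptions the chip can move roughly 0.2 bytes from memory for each peak floating-point operation, which is why the large shared L2 cache matters for sustained performance.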
Service processor: assistant to the 16 user cores
- Offloads interrupt handling
- Handles asynchronous I/O completion
- Provides messaging assist, e.g. MPI pacing
- Offloads RAS event handling
Simple Linux Utility for Resource Management (or simply SLURM)
- An open source job scheduler
- Used by many supercomputers and computer clusters
- It performs three major jobs:
  - Allocates exclusive and/or non-exclusive access to resources
  - Provides a framework for starting, executing, and monitoring work, especially MPI jobs
  - Arbitrates contention for resources by managing a queue of pending jobs
- Designed to handle thousands of nodes in a single cluster, and can sustain a throughput of 120,000 jobs per hour
- Very modular design with dozens of optional plugins
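The idea of arbitrating contention through a queue of pending jobs can be illustrated with a toy simulation. This is not SLURM's algorithm or API: the `Job` class and `simulate` function are hypothetical names, and the policy is plain first-in-first-out with exclusive node allocation.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int    # nodes requested (allocated exclusively)
    runtime: int  # time steps the job runs once started


def simulate(jobs, total_nodes):
    """Strict FIFO scheduling; returns {job name: start time}."""
    pending = deque(jobs)
    running = []  # (end_time, nodes) for each running job
    free = total_nodes
    starts = {}
    t = 0
    while pending or running:
        # Release the nodes of every job that has finished by time t.
        still_running = []
        for end, n in running:
            if end <= t:
                free += n
            else:
                still_running.append((end, n))
        running = still_running
        # Start jobs from the head of the queue while they fit.
        while pending and pending[0].nodes <= free:
            job = pending.popleft()
            free -= job.nodes
            running.append((t + job.runtime, job.nodes))
            starts[job.name] = t
        if running:
            t = min(end for end, _ in running)  # jump to next completion
        elif pending:
            raise ValueError("head job requests more nodes than exist")
    return starts


queue = [Job("A", 3, 2), Job("B", 2, 1), Job("C", 1, 1)]
print(simulate(queue, total_nodes=4))
```

In this run job C would fit beside job A at time 0, but strict FIFO makes it wait behind job B. Production schedulers reduce that kind of idle capacity with policies such as backfill, which this sketch deliberately leaves out.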
Lustre is a parallel distributed file system.
- Generally used for large-scale cluster computing
- The name Lustre is derived from Linux and cluster
- Lustre file systems are scalable
- Can support tens of thousands of client systems, tens of petabytes (PB) of storage, and hundreds of gigabytes per second (GB/s) of aggregate I/O throughput
ZFS is a volume manager designed by Sun Microsystems.
Features:
- Verification against data corruption
- Support for high storage capacities
- Continuous integrity checking and automatic repair
- Draws about 6 MW of power
- Projected to have an unprecedented efficiency in performance per watt
- 3,000 MFLOPS/W, 7 times as efficient as the Blue Gene/P design
- Estimated to beat the TOP500 leaders (2011) with three times the power efficiency
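The efficiency figure can be sanity-checked against the power draw. This sketch combines the roughly 6 MW quoted here with the 20 PFLOPS peak quoted in the summary slide; the implied Blue Gene/P and 2011-leader values are simply the slide's ratios applied in reverse, not measured numbers.

```python
# Sanity check of the performance-per-watt claims.
PEAK_FLOPS = 20e15          # 20 PFLOPS peak, from the summary slide
POWER_W = 6e6               # about 6 MW
QUOTED_MFLOPS_PER_W = 3000  # figure quoted on this slide

computed_mflops_per_w = PEAK_FLOPS / POWER_W / 1e6
implied_bgp = QUOTED_MFLOPS_PER_W / 7           # "7 times as efficient as Blue Gene/P"
implied_2011_leaders = QUOTED_MFLOPS_PER_W / 3  # "three times the power efficiency"

print(f"computed from peak and power: {computed_mflops_per_w:.0f} MFLOPS/W")
print(f"implied Blue Gene/P:          {implied_bgp:.0f} MFLOPS/W")
print(f"implied 2011 leaders:         {implied_2011_leaders:.0f} MFLOPS/W")
```

Dividing peak performance by power gives a value a little above 3,000 MFLOPS/W, so the quoted figure is consistent with the other numbers in the deck if it refers to peak rather than sustained performance.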
In November 2011, an initial 4-rack Blue Gene/Q system with 4,096 nodes and 65,536 user processor cores:
- Ranked #17 in the TOP500 list
- Achieved the top position in the Graph500 list
Blue Gene/Q systems also topped the Green500 list of the most energy-efficient supercomputers, with about 2 GFLOPS/W.
Blue Gene/Q is being deployed at Lawrence Livermore National Laboratory as Sequoia, which is expected to achieve 20 PFLOPS at peak performance, approximately twice that of the current leader, the K computer.
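The 20 PFLOPS figure follows from the per-chip peak and the node count given earlier in the deck. A minimal sketch, assuming one Blue Gene/Q chip per compute node and counting peak rather than sustained performance:

```python
# System peak derived from per-node peak.
PEAK_PER_NODE_GFLOPS = 204.8  # per-chip peak from the chip slide
SEQUOIA_NODES = 98_304        # full 96-rack system
INITIAL_NODES = 4_096         # 4-rack system from November 2011

sequoia_pflops = SEQUOIA_NODES * PEAK_PER_NODE_GFLOPS / 1e6
initial_pflops = INITIAL_NODES * PEAK_PER_NODE_GFLOPS / 1e6

print(f"Sequoia peak:       {sequoia_pflops:.2f} PFLOPS")
print(f"4-rack system peak: {initial_pflops:.3f} PFLOPS")
```

The full system works out to just over 20 PFLOPS, and the 4-rack system to a little under 1 PFLOPS, in line with the rack counts scaling by a factor of 24.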