The K computer: Project overview




The Next-Generation Supercomputer
The K computer: Project overview
SHOJI, Fumiyoshi
Next-Generation Supercomputer R&D Center, RIKEN

The K computer: Outline
- Project Overview
- System Configuration of the K computer
- Facilities for the system


The K computer: What is the K computer?
- 京 (Kei) is the nickname of the Next-Generation Supercomputer system. The name was chosen from public submissions this July.
- 京 is a Japanese numeral unit, analogous to prefixes such as mega, tera, and peta; it means 10^16, or 10 peta. Related units: 万 (man) = 10^4, 億 (oku) = 10^8, 兆 (cho) = 10^12, 垓 (gai) = 10^20.
- The name "K computer" is used in English documents.
- A logo was also chosen. It was written by the famous Japanese calligrapher Souun TAKEDA.
- 京 also means a big gate: a new era of computational science is coming through the gate 京, expressing the hope for future success.
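As a quick check of the arithmetic behind the name, the sketch below encodes the units listed on this slide and confirms that 京 equals 10 peta, which matches the project's 10 petaflops target. The dictionary, constant, and function names are illustrative and do not come from the original slides.

```python
# Japanese numeral units named on the slide, written as powers of ten.
JAPANESE_UNITS = {
    "man": 10**4,   # 万
    "oku": 10**8,   # 億
    "cho": 10**12,  # 兆
    "kei": 10**16,  # 京
    "gai": 10**20,  # 垓
}

PETA = 10**15  # SI prefix peta


def in_peta(value: int) -> int:
    """Express a value as a whole number of peta (10^15), rounding down."""
    return value // PETA


if __name__ == "__main__":
    # 10^16 / 10^15 = 10, i.e. one kei is 10 peta.
    print(in_peta(JAPANESE_UNITS["kei"]))  # 10
```

Each unit in the table is 10^4 times the previous one, unlike SI prefixes, which step by 10^3.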

The K computer: Goals of the Next-Generation Supercomputer project
- Develop and install the most advanced high-performance supercomputer system, with LINPACK performance of 10 petaflops.
- Develop and deploy application software in various science and engineering fields that exploits the full capability of the system.
- Establish the Advanced Institute for Computational Science (AICS) as a Center of Excellence built around the supercomputing facilities. AICS was established in Kobe in October 2010.

The K computer: Schedule of the project (chart spanning FY2006 to FY2012, with a "We are here." marker)
- System: conceptual design; detailed design; prototype and evaluation; production, installation, and adjustment; tuning and improvement
- Applications (Next-Generation Integrated Nanoscience Simulation, Next-Generation Integrated Life Simulation): development, production, and evaluation; verification
- Buildings (computer building, research building): design; construction

First installation of the K computer
- The first eight racks were installed on October 1, 2010.
- First LINPACK result on a part of the system: Rmax 48.03 TFLOPS (Rpeak 52.22 TFLOPS), power 57.96 kW.
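The three figures on this slide allow a couple of derived metrics to be worked out. The sketch below computes the LINPACK efficiency (Rmax/Rpeak), the power efficiency, and, as an inference only, the number of CPUs implied by Rpeak when divided by the 128 GFLOPS per-CPU peak quoted on the CPU features slide. The function names are hypothetical, and the CPU count is not stated in the slides.

```python
# Figures quoted on the slide.
RMAX_TFLOPS = 48.03
RPEAK_TFLOPS = 52.22
POWER_KW = 57.96

# Per-CPU peak taken from the CPU features slide (128 GFLOPS/CPU).
CPU_PEAK_GFLOPS = 128.0


def linpack_efficiency(rmax_tflops: float, rpeak_tflops: float) -> float:
    """Fraction of theoretical peak achieved by the LINPACK run."""
    return rmax_tflops / rpeak_tflops


def mflops_per_watt(rmax_tflops: float, power_kw: float) -> float:
    """Power efficiency in MFLOPS/W (TFLOPS -> MFLOPS, kW -> W)."""
    return (rmax_tflops * 1e6) / (power_kw * 1e3)


def implied_cpu_count(rpeak_tflops: float, cpu_peak_gflops: float) -> float:
    """CPUs implied by Rpeak; an inference, not a number from the slides."""
    return (rpeak_tflops * 1e3) / cpu_peak_gflops


if __name__ == "__main__":
    print(f"Efficiency: {linpack_efficiency(RMAX_TFLOPS, RPEAK_TFLOPS):.1%}")
    print(f"MFLOPS/W:   {mflops_per_watt(RMAX_TFLOPS, POWER_KW):.1f}")
    print(f"CPUs (inferred): {implied_cpu_count(RPEAK_TFLOPS, CPU_PEAK_GFLOPS):.0f}")
```

With the slide's numbers this works out to about 92% of peak and roughly 829 MFLOPS/W.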


The K computer: System configuration (diagram components)
- Users, Internet
- Frontend Servers
- Control Servers
- Management Servers
- Job & User Management
- Compute Nodes: number of CPUs > 80K, number of cores > 640K, total memory capacity > 1 PB
- Interconnect Network: multi-dimensional mesh/torus
- Local File System
- Global File System
- Global I/O Networks
- Networks for Control and Management

The K computer: CPU features (Fujitsu SPARC64™ VIIIfx)
- 8 cores
- 2 SIMD operation units per core
- 2 multiply-and-add floating-point operations (SP or DP) are executed in one SIMD instruction
- 256 FP registers per core (double precision)
- Performance: 16 GFLOPS/core, 128 GFLOPS/CPU
- 2.2 GFLOPS/W (58 W at 30°C with water cooling)
- Hardware barrier among cores
- Prefetch instruction
- Shared 6 MB L2 cache (12-way)
- Software-controllable cache (sectored cache)
- 45 nm CMOS process, 2 GHz
- Die size 22.7 mm x 22.6 mm, 760 M transistors
Reference: SPARC64™ VIIIfx Extensions, http://img.jp.fujitsu.com/downloads/jp/jhpc/sparc64viiifx-extensions.pdf
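The per-core and per-CPU peak figures follow from the other numbers on this slide. The sketch below reproduces them under two stated assumptions that are not in the slides: each SIMD unit issues one SIMD instruction per clock cycle, and each multiply-and-add counts as two floating-point operations. All names are illustrative.

```python
CLOCK_HZ = 2.0e9               # 2 GHz clock
SIMD_UNITS_PER_CORE = 2        # from the slide
FMA_PER_SIMD_INSTRUCTION = 2   # from the slide
FLOPS_PER_FMA = 2              # assumption: one multiply plus one add
CORES_PER_CPU = 8              # from the slide
POWER_W = 58.0                 # 58 W at 30°C with water cooling


def peak_flops_per_core() -> float:
    """Peak FLOPS of one core, assuming one SIMD instruction per unit per cycle."""
    return (CLOCK_HZ * SIMD_UNITS_PER_CORE
            * FMA_PER_SIMD_INSTRUCTION * FLOPS_PER_FMA)


def peak_flops_per_cpu() -> float:
    """Peak FLOPS of the whole 8-core CPU."""
    return peak_flops_per_core() * CORES_PER_CPU


def gflops_per_watt() -> float:
    """Peak power efficiency of one CPU in GFLOPS/W."""
    return peak_flops_per_cpu() / 1e9 / POWER_W


if __name__ == "__main__":
    print(peak_flops_per_core() / 1e9)   # 16.0 GFLOPS per core
    print(peak_flops_per_cpu() / 1e9)    # 128.0 GFLOPS per CPU
    print(round(gflops_per_watt(), 1))   # 2.2 GFLOPS/W
```

Under these assumptions the product 2 GHz x 2 units x 2 FMA x 2 flops gives the quoted 16 GFLOPS per core, and 128 GFLOPS at 58 W gives the quoted 2.2 GFLOPS/W.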

Compute nodes and network
- Compute nodes (CPUs): > 80,000
- Number of cores: > 640,000
- Peak performance: > 10 PFLOPS
- Memory: > 1 PB (16 GB/node)
- 6-dimensional mesh/torus network: Tofu
- 10 connections from each node to adjacent nodes
- Peak bandwidth: 5 GB/s x 2 for each connection
- Logically a 3-dimensional torus network (axes x, y, z)
- Compute node: one SPARC64™ VIIIfx CPU, 128 GFLOPS (8 cores, each with SIMD (4 FMA) at 16 GFLOPS); L2 cache: 6 MB; memory bandwidth: 64 GB/s; memory: 16 GB
Courtesy of FUJITSU Ltd.
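The system-level lower bounds on this slide can be cross-checked against the per-node figures. The sketch below multiplies out the node count, using 80,000 as the stated lower bound and decimal units (1 PB = 10^6 GB). It also sums the link bandwidth per node, assuming that "x 2" means one 5 GB/s stream in each direction; that reading is an assumption, and all names are illustrative.

```python
NODES = 80_000                 # lower bound from the slide (> 80,000)
CORES_PER_NODE = 8
PEAK_GFLOPS_PER_NODE = 128
MEMORY_GB_PER_NODE = 16
LINKS_PER_NODE = 10
LINK_GBPS_ONE_WAY = 5          # GB/s peak per direction
DIRECTIONS = 2                 # assumption: "x 2" means bidirectional


def total_cores() -> int:
    return NODES * CORES_PER_NODE


def peak_pflops() -> float:
    """System peak in PFLOPS (1 PFLOPS = 10^6 GFLOPS)."""
    return NODES * PEAK_GFLOPS_PER_NODE / 1e6


def total_memory_pb() -> float:
    """Total memory in PB, using decimal units (1 PB = 10^6 GB)."""
    return NODES * MEMORY_GB_PER_NODE / 1e6


def link_bandwidth_per_node_gbps() -> int:
    """Sum of peak link bandwidth over all links of one node, both directions."""
    return LINKS_PER_NODE * LINK_GBPS_ONE_WAY * DIRECTIONS


if __name__ == "__main__":
    print(total_cores())                   # 640000
    print(peak_pflops())                   # about 10.24 PFLOPS
    print(total_memory_pb())               # about 1.28 PB
    print(link_bandwidth_per_node_gbps())  # 100 GB/s
```

At exactly 80,000 nodes the totals come to 640,000 cores, about 10.24 PFLOPS, and about 1.28 PB, consistent with the "greater than" figures on the slide.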

The K computer: Packaging of the system
- A rack consists of 24 system boards, 6 I/O boards, power supply units, system storage, and diagnostic processors.
- A hose connects the rack to the water loop under the floor.
- Figure labels: rack, 796 mm and 2060 mm; system board, approx. 560 mm and approx. 460 mm, carrying CPUs and ICCs (LSI for interconnect).


AICS: location of the K computer in Kobe
- AICS (Advanced Institute for Computational Science) was established at RIKEN last July.
- Kobe is about 450 km (280 miles) west of Tokyo.
- Map labels: Kyoto, Kobe, Tokyo.

Layout of the buildings (figure labels): Research building, Computer building, Chillers, Substation Supply

Image of the K computer: More than 800 racks will be housed. The full system will be operational in 2012.

The K computer: Summary
- The Next-Generation Supercomputer: 京 (kei), the K computer.
- The facilities for the K computer are complete.
- Installation of the K computer has started.
- The first LINPACK measurement on a part of the system has been done.
- The full K computer system will be operational in 2012.

Thank you for your attention! A photo in the early evening