IS-ENES/PRACE Meeting EC-EARTH 3: A High-resolution Configuration


Motivation
Generate a high-resolution configuration of EC-EARTH in order to:
- Prepare studies of high-resolution ESMs in climate mode
- Prove and improve EC-EARTH 3 capabilities as a scientific tool
- Study and improve the scalability of EC-EARTH 3
- Work towards PRACE systems
These activities were boosted by the OASIS Dedicated User Support 2010. Moreover, the National Supercomputing Centre (NSC) of Sweden supported the work within IS-ENES. Acknowledgements to Eric Maisonnave (CERFACS) and Chandan Basu (NSC).

Configuration: Component models
- Atmosphere: IFS
  ECMWF's forecasting system, cycle 36R1
  EC-EARTH changes regarding long integrations, ocean coupling, aerosols, ...
- Ocean: NEMO + LIM2
  Release 3.3beta (continuously updated)
- Coupler: OASIS3
  Development trunk (continuously updated)
- Packaging
  New development for EC-EARTH 3, focused on portability, consistency, and easy configuration

Configuration: Grids and coupling
- Atmosphere: T799/N800 grid with 62 levels, approx. 0.25° horizontal resolution
- Ocean: ORCA025 grid, approx. 0.25° horizontal resolution
- Coupling setup:
  EC-EARTH-specific implementation in IFS (ongoing development)
  NEMO/LIM coupling interface with minor EC-EARTH changes
  OASIS3 pseudo-parallel mode using 10 instances
  Number of coupling fields: 20 (16+4), shared out between the coupler instances (see the sketch below)
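OASIS3's pseudo-parallel mode gains parallelism by letting each coupler instance handle a subset of the coupling fields. As a rough illustration only (the field names are placeholders, not the actual EC-EARTH coupling field list or namcouple setup), the following sketch distributes 20 fields over 10 instances round-robin:

```python
# Illustrative only: how 20 coupling fields might map onto 10 OASIS3
# pseudo-parallel instances (round-robin). Field names are placeholders,
# not the actual EC-EARTH coupling field list.
N_INSTANCES = 10
fields = [f"FIELD_{i:02d}" for i in range(1, 21)]   # 20 fields (16+4)

assignment = {inst: [] for inst in range(N_INSTANCES)}
for i, field in enumerate(fields):
    assignment[i % N_INSTANCES].append(field)

for inst, flds in assignment.items():
    print(f"oasis3 instance {inst}: {', '.join(flds)}")
```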

Configuration: Test platform
- Distributed-memory cluster (Dell PowerEdge)
- 1268 compute nodes, each with 2x quad-core AMD Opteron and 2x 8 GB DDR2
- 10144 cores in total (of which we were using just over 1600)
- Full-bisection-bandwidth InfiniBand fabric (2 GB/s per link)
- Scali/Platform MPI, OpenMPI
- Intel Compiler Suite 10.1

Results: Load balancing
- The OASIS tool was used to study the load balance between IFS and NEMO
- Coupling overhead evaluated
- Disadvantage: works with sequential OASIS only
- Balance ratio of roughly 1:4 cores for ocean:atmosphere; it varies with the overall processor count (see the core-split sketch below)
- Performance numbers (not so much the performance itself!) depend very much on load balancing!
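To make the ratio concrete, here is a small sketch of how a 1:4 ocean:atmosphere balance could be turned into per-component core counts and an Open MPI MPMD launch line. The executable names, the total core budget and the exact rounding are assumptions for illustration, not the launch setup actually used:

```python
# Illustrative only: turn a total core budget and a 1:4 ocean:atmosphere
# balance ratio into per-component core counts, plus a hypothetical
# Open MPI MPMD launch line. Executable names and the core total are
# placeholders, not the actual EC-EARTH run configuration.
def split_cores(total, ocean_share=1, atmos_share=4, n_oasis=10):
    compute = total - n_oasis                   # cores left for IFS + NEMO
    ocean = round(compute * ocean_share / (ocean_share + atmos_share))
    atmos = compute - ocean
    return atmos, ocean, n_oasis

atmos, ocean, oasis = split_cores(total=1600)
print(f"IFS: {atmos} cores, NEMO: {ocean} cores, OASIS3: {oasis} instances")
print(f"mpirun -np {atmos} ./ifs.exe : -np {ocean} ./nemo.exe : "
      f"-np {oasis} ./oasis3.exe")
```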

Results: Scaling analysis
- Scalability is crucial for targeting large systems; however, it is a tricky business
- For coupled systems, scalability is multi-dimensional
- Scalability is evaluated by:
  Assuming the balance ratio provided by a sequential OASIS run
  Guessing/trying otherwise
  Starting from a multi-core run
  Comparing the results to standalone IFS runs (speedup and parallel efficiency, defined below)
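For reference, the usual definitions of speedup and parallel efficiency relative to a baseline run on $p_0$ cores (the specific baseline chosen for the plots is not stated in the slides):

```latex
% Speedup and parallel efficiency relative to a baseline run on p_0 cores
% with runtime T(p_0); p_0 = 1 recovers the classical definitions.
S(p) = \frac{T(p_0)}{T(p)}, \qquad
E(p) = \frac{p_0 \, S(p)}{p} = \frac{p_0 \, T(p_0)}{p \, T(p)}
```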

Results: Scalability

Results: Scalability (IFS only)

Results: Parallel efficiency

Results: Parallel efficiency (cont.)

Results: Scalability (cont.)

Results: Data issues
- Mandatory to have realistic output activity, even at high resolution
- Output data had a manageable size during the tests (see the back-of-the-envelope estimate below)
- Open questions: preprocessing? data transfer? long runs? ensembles?
- Initial tests of NEMO's new I/O system (IOM):
  Apparently an appropriate architecture for massively parallel systems
  Experimental; no satisfying results (yet?!)
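To get a feel for the volumes involved, a back-of-the-envelope estimate of the size of a single ORCA025 output snapshot; the number of vertical levels, the field counts and the output frequency below are assumed values for illustration, not the settings used in these tests:

```python
# Illustrative only: rough size of ORCA025 model output. The vertical
# level count, field counts and precision are assumptions, not the
# output configuration actually used in the EC-EARTH 3 tests.
nx, ny = 1442, 1021          # ORCA025 horizontal grid points
nz = 46                      # assumed number of vertical levels
bytes_per_value = 4          # assumed single precision

field_3d = nx * ny * nz * bytes_per_value / 1e9   # GB per 3D field snapshot
field_2d = nx * ny * bytes_per_value / 1e9        # GB per 2D field snapshot

# Assume, say, 5 three-dimensional and 10 two-dimensional fields per write
snapshot = 5 * field_3d + 10 * field_2d
print(f"3D field: {field_3d:.2f} GB, 2D field: {field_2d * 1e3:.1f} MB")
print(f"Assumed snapshot: {snapshot:.2f} GB per output step")
```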

Conclusions
- The EC-EARTH 3 high-resolution configuration was up and running surprisingly fast and with few problems
- Portability is extremely important (and not for free)
- Scalability does not degrade seriously (if at all) compared to standalone IFS runs. Is this sufficient? Does it scale to O(10'000) cores? Hard to tell.
- Beyond component scalability, the coupling setup is (not surprisingly) crucial for performance
- OASIS3 did not prove to be a bottleneck in the chosen configuration. It might later.
- The work raised many interesting questions...