Making a small GeForce GTX 750 Ti GPU cluster with less than $5000

K. Hoshina
Sep. 15, 2014, IceCube Collaboration Meeting
(Title image from NVidia's website.)

Motivation
- All simulation production uses GPUs.
- Currently we have ~200 GPU nodes, but...
  - A dataset usually uses fewer than 30 GPUs at a time, because it has to compete with other datasets.
  - If your job priority is lower than others', your jobs may be stuck at the GPU task for weeks.
- It is essential to keep processing GPU tasks in order not to waste time: CPU jobs don't start until the GPU jobs are finished, even if we have plenty of CPU nodes.
- Simulation chain: NuGen -> Photon Propagation -> DetectorSim -> L1 processing -> L2 processing

We have $5000, then...
- That is a little short of the price of a blade server.
- The main issue is the power supply: most GPU cards (e.g. GTX 680) require ~200 W per card, so the power unit must supply more than ~800 W (for two cards).
- It's not easy to find a server with two PCIe Gen3 x16 slots and a high-capacity power unit. Only SunMicro (Oracle) provides them at a reasonable price, but they are still expensive in Japan.
- The ERI group has 6 old Dell PowerEdge T410 machines with 48 CPU cores in total. They are good enough for CPU jobs and each has one PCIe Gen3 x16 port; however, the power unit supplies only 450 W.
- Does having 10 private GPU cores improve the situation? Yes! 10 GPU cores are already ~30% of the IceCube public GPUs available to you at a time, and they won't stop even if your priority is low.
- So, can we use low-power GPUs with cheap PCs?

GeForce GTX 750 Ti
- Uses only 75 W and works without an extra power cable
- ZOTAC GeForce GTX 750 Ti 2GB (~$140)
- EVGA GeForce GTX 750 Ti 2GB Superclocked (~$150)
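The "no extra power cable" point can be illustrated with a quick check. This is a minimal sketch, not part of the original slides: the wattages are the approximate figures quoted in this talk, the 75 W limit is the power a PCIe x16 graphics slot can deliver on its own, and the helper name `needs_extra_power_cable` is hypothetical.

```python
# Approximate board power in watts, as quoted on these slides.
CARD_POWER_W = {
    "GTX 680": 200,
    "GTX 660": 140,
    "GTX 750 Ti": 75,
}

# Power a PCIe x16 graphics slot can deliver without auxiliary connectors.
PCIE_SLOT_LIMIT_W = 75


def needs_extra_power_cable(card: str) -> bool:
    """Return True if the card draws more than the slot alone can supply."""
    return CARD_POWER_W[card] > PCIE_SLOT_LIMIT_W


for card in CARD_POWER_W:
    print(card, "needs extra cable:", needs_extra_power_cable(card))
```

Under these figures only the 750 Ti fits within the slot budget, which is why it can go into a machine with a small power unit.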

Tested Machines
- DELL PowerEdge T420 & T410
- Homebuild machine ($580), built following Nvidia's guide: http://www.geforce.com/whats-new/guides/geforce-gtx-750-ti-mini-itx-pc-build-guide

Homebuild PC
*1 Now hard to obtain. You may buy the SG05B-Lite and add a power unit as an option.
*2 Since we already have a CPU cluster, we didn't spend much on the CPU. However, if you want to use the machine for CPU jobs too, you may use a quad-core CPU. You also need to increase the memory to ~4 GB per core if you want to use it for photonics-table-based reconstructions. Note that the GPU occupies one CPU core.

Performance (ppc, 1e+11 photons)

Machine       System           GPU device                               Running on            Time [ms]  Ratio
CobaltGPU     SL6.4, cuda 5.5  EVGA NVidia GeForce 680                  8 MPs x 1024 threads  3689659    1.0
homebuild PC  SL6.5, cuda 6.0  EVGA NVidia GeForce 750 Ti 2GB SC (75W)  5 MPs x 1024 threads  4987016    1.35
homebuild PC  SL6.5, cuda 6.0  Zotac NVidia GeForce 750 Ti 2GB (75W)    5 MPs x 1024 threads  5658848    1.53
DELL T420     SL6.3, cuda 6.0  EVGA NVidia GeForce 750 Ti 2GB SC (75W)  5 MPs x 1024 threads  4986203    1.35
DELL T420     SL6.3, cuda 6.0  Zotac NVidia GeForce 750 Ti 2GB (75W)    5 MPs x 1024 threads  5659683    1.53
DELL T410     SL6.3, cuda      EVGA NVidia GeForce 660 (140W)           5 MPs x 1024 threads  5603139    1.52

Ratio is the run time relative to the GeForce 680 on CobaltGPU (lower is faster).

* Nvidia states that the GeForce 750 Ti performs below the 660, but the two look comparable for our simulation, and the EVGA Superclocked card is even faster.
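The ratio column can be reproduced from the measured times. The following is a small sketch using the timings listed on this slide; the dictionary keys and the helper name `ratio_to_reference` are made up for illustration.

```python
# Measured ppc run times in milliseconds (1e+11 photons), from the slide.
TIMES_MS = {
    ("CobaltGPU", "GeForce 680"): 3689659,
    ("homebuild PC", "EVGA 750 Ti SC"): 4987016,
    ("homebuild PC", "Zotac 750 Ti"): 5658848,
    ("DELL T420", "EVGA 750 Ti SC"): 4986203,
    ("DELL T420", "Zotac 750 Ti"): 5659683,
    ("DELL T410", "EVGA 660"): 5603139,
}

REFERENCE = ("CobaltGPU", "GeForce 680")


def ratio_to_reference(key):
    """Run time relative to the GeForce 680 reference, rounded to 2 digits."""
    return round(TIMES_MS[key] / TIMES_MS[REFERENCE], 2)


for key in TIMES_MS:
    print(key, ratio_to_reference(key))
```

The host machine makes almost no difference: the same card gives the same ratio on the homebuild PC and on the DELL T420.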

ERI01 CPU + GPU mini cluster
- 6 DELL T410 (8 CPU cores + 1 GPU core per machine)
- 6 HomePC (2 CPU cores + 1 GPU core per machine)
- On Condor: 45 CPU cores + 12 GPU cores
- IceProd support
- Total cost for upgrading the ERI01 cluster: ~$4500
- ERI01 is primarily for EarthCore-related simulation, but feel free to use it when the grid is not busy :)

Summary
- We compared the performance of NVidia GTX 750 Ti cards. The EVGA GeForce 750 Ti 2GB SC (Superclocked) model was remarkably fast in the PPC test.
- The EVGA GTX 750 Ti SC now costs ~$140, uses only 75 W and does not require additional power. If your PC has a PCIe x16 port, it is the most economical way to get a GPU test machine.
- We ran a simple stress test. No fatal errors were observed, and the temperature stayed stable and low enough throughout the test. Power consumption and heat are unlikely to be an issue for the 750 Ti.
- A homebuild PC costs less than $600 for 1 GPU core + 2 CPU cores. With $5000 you can build a test cluster with 8 CPU cores + 8 GPU cores. This is almost comparable to buying a blade server if you can pay another $1000+ (and a blade server with a high-end GPU card performs better).
- Still, the homebuild cluster is a good fit for most institutes that have no computer specialist, because it is easy to fix or replace parts when technical problems happen.
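The cluster size quoted in the summary follows from simple arithmetic. The sketch below is one possible reading, not the author's own calculation: it assumes the $600 upper bound per homebuild PC, and it assumes that "8 CPU cores" counts only the cores left free after each GPU occupies one CPU core (as noted on the Homebuild PC slide). The function name `cluster_size` is hypothetical.

```python
BUDGET_USD = 5000
NODE_COST_USD = 600        # upper bound per homebuild PC, from the summary
CPU_CORES_PER_NODE = 2
GPUS_PER_NODE = 1
CPU_CORES_USED_PER_GPU = 1  # assumption: each GPU job ties up one CPU core


def cluster_size(budget_usd, node_cost_usd):
    """Return (nodes, gpu_cores, free_cpu_cores) affordable within the budget."""
    nodes = budget_usd // node_cost_usd
    gpu_cores = nodes * GPUS_PER_NODE
    free_cpu_per_node = CPU_CORES_PER_NODE - GPUS_PER_NODE * CPU_CORES_USED_PER_GPU
    return nodes, gpu_cores, nodes * free_cpu_per_node


nodes, gpu_cores, free_cpu_cores = cluster_size(BUDGET_USD, NODE_COST_USD)
print(nodes, "nodes:", gpu_cores, "GPU cores,", free_cpu_cores, "free CPU cores")
```

With these assumptions the budget covers 8 machines, giving 8 GPU cores and 8 CPU cores that remain free for CPU jobs.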

Setup Procedures
(Go to http://www.icecube.wisc.edu/~hoshina/ and click "Blog".)
- How to install a GPU card: http://icecube.wisc.edu/~hoshina/blog/special_blog?cmd=post&id=8
- How to install condor + iceprod with GPU setting: http://icecube.wisc.edu/~hoshina/blog/special_blog?cmd=post&id=9
- How to install Scientific Linux 6.5 on the homebuild machine: http://icecube.wisc.edu/~hoshina/blog/special_blog?cmd=post&id=10