Hardware Acceleration for CST MICROWAVE STUDIO



Similar documents
High Performance Computing in CST STUDIO SUITE

SUBJECT: SOLIDWORKS HARDWARE RECOMMENDATIONS UPDATE

Simulation of Mobile Phone Antenna Performance

High Performance. CAEA elearning Series. Jonathan G. Dudley, Ph.D. 06/09/ CAE Associates

Today s handsets have to meet tough ELECTROMAGNETIC SIMULATION OF MOBILE PHONE ANTENNA PERFORMANCE

QuickSpecs. NVIDIA Quadro K5200 8GB Graphics INTRODUCTION. NVIDIA Quadro K5200 8GB Graphics. Technical Specifications

HP Workstations graphics card options

QuickSpecs. NVIDIA Quadro K5200 8GB Graphics INTRODUCTION. NVIDIA Quadro K5200 8GB Graphics. Overview. NVIDIA Quadro K5200 8GB Graphics J3G90AA

Accelerating CFD using OpenFOAM with GPUs

How To Simulate An Antenna On A Computer Or Cell Phone

Maximize Performance and Scalability of RADIOSS* Structural Analysis Software on Intel Xeon Processor E7 v2 Family-Based Platforms

Introduction to GPGPU. Tiziano Diamanti

GPU System Architecture. Alan Gray EPCC The University of Edinburgh

GPU Hardware and Programming Models. Jeremy Appleyard, September 2015

The Bus (PCI and PCI-Express)

QCD as a Video Game?

Forum R.F.& Wireless, Milano il 14 Febbraio 2008 Dr. Emmanuel Leroux Technical Sales Manager for Italy

Welcome to Pericom s PCIe and USB3 ReDriver/Repeater Product Training Module.

Power Rating Simulation of the new QNS connector generation

How to choose a suitable computer

Forum R.F.& Wireless, Roma il 21 Ottobre 2008 Dr. Emmanuel Leroux Country Manager for Italy

USB 3.0 Bandwidth, High Definition Performance

Hardware-Aware Analysis and. Presentation Date: Sep 15 th 2009 Chrissie C. Cui

QuickSpecs. NVIDIA Quadro M GB Graphics INTRODUCTION. NVIDIA Quadro M GB Graphics. Overview

QuickSpecs. NVIDIA Quadro K1200 4GB Graphics INTRODUCTION PERFORMANCE AND FEATURES. Overview

Installation Guide. (Version ) Midland Valley Exploration Ltd 144 West George Street Glasgow G2 2HG United Kingdom

Simulation électromagnétique 3D et câblage, exemples dans l'automobile et aéronautique :

QuickSpecs. NVIDIA Quadro K4200 4GB Graphics INTRODUCTION. NVIDIA Quadro K4200 4GB Graphics. Overview

NVIDIA workstation 3D graphics card upgrade options deliver productivity improvements and superior image quality

OpenPOWER Outlook AXEL KOEHLER SR. SOLUTION ARCHITECT HPC

USB 3.0 to HDMI External Multi Monitor Graphics Adapter with 3-Port USB Hub HDMI and USB 3.0 Mini Dock 1920x1200 / 1080p

Technical bulletin: Remote visualisation with VirtualGL

A general-purpose virtualization service for HPC on cloud computing: an application to GPUs

HP ProLiant SL270s Gen8 Server. Evaluation Report

Several tips on how to choose a suitable computer

Accelerating Simulation & Analysis with Hybrid GPU Parallelization and Cloud Computing

How To Use An Amd Ramfire R7 With A 4Gb Memory Card With A 2Gb Memory Chip With A 3D Graphics Card With An 8Gb Card With 2Gb Graphics Card (With 2D) And A 2D Video Card With

NVIDIA Quadro M4000 Sync PNY Part Number: VCQM4000SYNC-PB. User Guide

GPGPU accelerated Computational Fluid Dynamics

HP Blade Workstation Solution FAQ

PCIe Over Cable Provides Greater Performance for Less Cost for High Performance Computing (HPC) Clusters. from One Stop Systems (OSS)

Icepak High-Performance Computing at Rockwell Automation: Benefits and Benchmarks

LBM BASED FLOW SIMULATION USING GPU COMPUTING PROCESSOR

Three Paths to Faster Simulations Using ANSYS Mechanical 16.0 and Intel Architecture

How To Build An Ark Processor With An Nvidia Gpu And An African Processor

Scaling from Workstation to Cluster for Compute-Intensive Applications

An 8-channel RF Transceive Coil For Parallel Imaging at 7T/298MHz

An Introduction to High-Frequency Circuits and Signal Integrity

Recent Advances in HPC for Structural Mechanics Simulations

Introduction to Cloud Computing

GPUs for Scientific Computing

QuickSpecs. NVIDIA Quadro K2200 4GB Graphics INTRODUCTION. NVIDIA Quadro K2200 4GB Graphics. Technical Specifications

White Paper. Intel Sandy Bridge Brings Many Benefits to the PC/104 Form Factor

What s New in Mike Bailey LabVIEW Technical Evangelist. uk.ni.com

HETEROGENEOUS HPC, ARCHITECTURE OPTIMIZATION, AND NVLINK

HP Workstations graphics card options

USB 3.0 to VGA External Multi Monitor Graphics Adapter with 3-Port USB Hub VGA and USB 3.0 Mini Dock 1920x1200 / 1080p

ALPS Supercomputing System A Scalable Supercomputer with Flexible Services

Computer Graphics Hardware An Overview

COMP/CS 605: Intro to Parallel Computing Lecture 01: Parallel Computing Overview (Part 1)

HP reference configuration for entry-level SAS Grid Manager solutions

NVIDIA GRID OVERVIEW SERVER POWERED BY NVIDIA GRID. WHY GPUs FOR VIRTUAL DESKTOPS AND APPLICATIONS? WHAT IS A VIRTUAL DESKTOP?

Version 3.0 Technical Brief

NVIDIA CUDA Software and GPU Parallel Computing Architecture. David B. Kirk, Chief Scientist

Choosing a Computer for Running SLX, P3D, and P5

A New, High-Performance, Low-Power, Floating-Point Embedded Processor for Scientific Computing and DSP Applications

System Requirements. SAS Profitability Management Deployment

Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage

S b at 1.6 Gigabit/Second Bandwidth Encourages Industrial Imaging and Instrumentation Applications Growth

Mixed Precision Iterative Refinement Methods Energy Efficiency on Hybrid Hardware Platforms

Clusters: Mainstream Technology for CAE

Several tips on how to choose a suitable computer

Cray Gemini Interconnect. Technical University of Munich Parallel Programming Class of SS14 Denys Sobchyshak

A FULL-WAVE ANALYSIS OF HIGH GAIN RADIAL LINE SLOT ANTENNAS USING CST STUDIO SUITE

Complete Technology and RFID

PVTC Technical Requirements

Introduction to GP-GPUs. Advanced Computer Architectures, Cristina Silvano, Politecnico di Milano 1

Main Memory Data Warehouses

The Evolution of Computer Graphics. SVP, Content & Technology, NVIDIA

按 一 下 以 編 輯 母 片 標 題 樣 式

THE AMD MISSION 2 AN INTRODUCTION TO AMD NOVEMBER 2014

Deploying and managing a Visualization Onera

IOS110. Virtualization 5/27/2014 1

IC-EMC Simulation of Electromagnetic Compatibility of Integrated Circuits

STORAGE HIGH SPEED INTERCONNECTS HIGH PERFORMANCE COMPUTING VISUALISATION GPU COMPUTING

Next Generation GPU Architecture Code-named Fermi

Visualization Cluster Getting Started

GeoImaging Accelerator Pansharp Test Results

Using GPUs in the Cloud for Scalable HPC in Engineering and Manufacturing March 26, 2014

FPGA Acceleration using OpenCL & PCIe Accelerators MEW 25

Removing Performance Bottlenecks in Databases with Red Hat Enterprise Linux and Violin Memory Flash Storage Arrays. Red Hat Performance Engineering

Improved LS-DYNA Performance on Sun Servers

Efficient Parallel Graph Exploration on Multi-Core CPU and GPU

CFD Implementation with In-Socket FPGA Accelerators

SeaMicro SM Server

Lytec 2014 Hardware and Software Requirements. Lytec Single-User Hardware and Software Requirements

Next Generation Operating Systems

Transcription:

Hardware Acceleration for CST MICROWAVE STUDIO Chris Mason Product Manager Amy Dewis Channel Manager

Agenda 1. Introduction 2. Why use Hardware Acceleration? 3. Hardware Acceleration Technologies 4. Current Features and Product Family 5. Benchmarks 6. Upcoming Features

Quick Acceleware History Highlights Feb 04 Founded Oct 05 Shipped First Product Jan 07 NVIDIA Invests Jun 07 Announced third generation of products Feb 08 71 Employees Feb 08 Acceleware turns 4 years old HQ: Calgary, Canada Publicly-Traded: Toronto Venture Exchange (TSX), Symbol: AXE EM Market: 8 announced software partners Seismic Market: Field trials completed; launch January 30 Image Reconstruction: Customer trials successful; launch Q1 2008

Processing Evolution Antenna Cellphone Head/Cellphone Full Body/Object 10-15 Years Ago 5-10 Years Ago Today Tomorrow Supercomputer Cluster PC Hardware Option Application Size & Complexity Application Example

Acceleware What Do We Do? Leverage the Massive Parallel Computational Capabilities of the GPU Result is a Supercomputer

Why Acceleration? Performance Size of problem Time to solution Cost Infinite appetite for increased performance Today users Compromise Larger and more complex problems Currently taking days, weeks, or longer to run Computer-aided optimization full body Increased performance results in Faster time-to-market = Increased revenues Safer, better products = Reduced cost Better, more-informed decisions in vehicle

Options for Acceleration Multi -Core CPU GPU FPGA Cell - microprocessor architecture designed by Sony, Toshiba and IBM (ex. Playstation 3) Clearspeed proprietary microprocessor architecture ASICs Others

Performance Comparison Maximum Practical Speed Maximum Size of Application Price Multi-core FPGA GPU Cell ClearSpeed

GPU Technology NVIDIA Quadro FX chipset 128 cores, 1.35 GHz, ~500 Gflops 76.8 GB/s memory bandwidth 1.5 GB GDDR3 PCI Express x16 interface

Accuracy and Stability of Hardware Acceleration Both GPU and CPU are 32-bit floating point precision Order of operations and rounding are slightly different between the two However, time domain results are virtually identical The calculations normally done on the CPU are transferred to the Accelerator A30 this is handled within CST MICROWAVE STUDIO

Product Family for CST MICROWAVE STUDIO Accelerator ClusterInABox Installs in your existing workstation (PCI-Express x16 slot) Two Accelerator cards Installs using interface card into your existing workstation

Current Performance Accelerator A30 Performance 1M 24M meshcells fully accelerated >24M meshcells via Soft Memory Limit* ~7x speed up (referenced against a 4-core Woodcrest system running in software only) * Soft Memory Limit extends the maximum simulation size beyond that of the memory on the Accelerator by sharing memory with the host computer. Performance will depend on the proportion of memory shared.

Current Performance ClusterInABox Dual D30 Performance Contains two Accelerator A30 cards 1M 48M meshcells supported on hardware >48M meshcells supported in Soft Memory Limit ~12x speed up (vs. typical software only run on a 4-core Woodcrest system)

Ideal Simulations for Hardware Acceleration Within the Time Domain Solver of CST MICROWAVE STUDIO, the following types of simulations up to 24M meshcells (for one Accelerator A30) or up to 48M meshcells (for a ClusterInABox D30) will be best suited to hardware acceleration: 1. Antennas, cell phones. 2. PCB, signal integrity and package analysis. 3. Cables, connectors, waveguides, RF components.

Currently Supported Features Features Supported by hardware acceleration: PBA (Perfect Boundary Approximation) Open and closed boundaries TST (Thin Sheet Technique) Lossy metal model Lossless and lossy heterogeneous dielectric materials Far field monitors

Operating Systems Currently Supported: Windows XP (32-bit) Windows XP 64-bit (strongly recommended) Future Supported: Red Hat Enterprise Linux WS 4 Windows Vista

IC Package Benchmark Speed Up Factor 14 13 12 10 11 9 8 7 6 5 4 3 2 1 0 0 20 40 60 Simulation Size (Mcells) ClusterInABox D30 Accelerator A30 Software Only Note: Above speed up factors are determined by comparison to a 4-core Woodcrest system.

IBM Challenge Problem Size of Model: 27 million meshcells Shape: X: 677 Y: 1159 Z: 36 Calculation times: Model run in Software Only Model run on a ClusterInABox D30 Speed Up Factor Solver Time (hrs) 273.8 18.4 14.9 Total Run Time (hrs) 276.2 20.0 13.8 Note: All 36 ports were run in this simulation. Above speed up factors are determined by comparison to a 4-core Woodcrest system.

Cell Phone Head Benchmark Speed Up Factor 14 13 12 10 11 9 8 7 6 5 4 3 2 1 0 0 10 20 30 40 Simulation Size (Mcells) Note: Above speed up factors are determined by comparison to a 4-core Woodcrest system. ClusterInABox D30 Accelerator A30 Software

Features Coming Soon Hardware Acceleration is supported in CST MICROWAVE STUDIO 2008 and beyond. Future Developments: Red Hat Enterprise Linux Windows Vista ClusterInABox Quad Q30 Hardware Dispersive Materials Periodic Boundaries And More

Thank You

Contact Details