First In Vivo Medical Images Using Photon-Counting, Real-Time GPU Reconstruction

1 First In Vivo Medical Images Using Photon-Counting, Real-Time GPU Reconstruction
A.P. Lowell, P. Kahn, J. Ku
25 March 2014

2 Overview Application Algorithms History and Limitations of Traditional Processors GPU Solution

4 Application General: Cardiac Fluoroscopy System

5 Application Makes Live Video of a Beating Heart

6 Application General: Overview
Used for non-surgical cardiac procedures: assessment of stenosis, angioplasty, stent placement.
Real-time digital X-ray imaging.
High data throughput and processing.
Multiple equipment racks and custom enclosures across several rooms.

7 Application Functional Blocks
[Block diagram. Areas: Control Room; Equipment Space; Exam Area. Labeled blocks: Facility Installations; Image Displays; Display Processor; User Controls; Equipment Racks; Imaging Chassis; I/O; Gantry (C-Arm, X-ray Detector, X-ray Tube, Gantry Pedestal, Motion Control); Heat Exchanger; HVPS; PDU; UPS; Patient Table (Motion Control).]

8 Application Background
Triple Ring Technologies is a contract R&D firm specializing in sensor-based systems.
Project was funded by: NovaRay, Inc.; National Institutes of Health.
Clinical work at University of Wisconsin, Madison.
Initial implementation by MultiCoreWare, Inc.

9 Application Performance Summary
Real-time tomosynthesis.
Input: continuous sensor images; ~123 billion rays/second; ~40 Gbps sensor downlink rate; 640x320 photon-counting sensor array; 10,000 scanned source locations; 1.28 μs/snapshot.
Output: live video; x1000-pixel focal planes internally; x1000-pixel best-focus image output; 30 frames/second.
~1.4 trillion mathematical operations per second.
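The figures on this slide can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope calculation: it assumes one snapshot per source location per pass, which the slide does not state, and all variable names are illustrative.

```python
# Figures quoted on the performance-summary slide.
RAYS_PER_SECOND = 123e9      # ~123 billion rays/second
OPS_PER_SECOND = 1.4e12      # ~1.4 trillion mathematical operations/second
SOURCE_LOCATIONS = 10_000    # scanned source locations
SNAPSHOT_SECONDS = 1.28e-6   # 1.28 microseconds per snapshot
FRAME_RATE_HZ = 30           # output frames per second

# Assumption (not from the slide): one snapshot per source location per pass.
pass_seconds = SOURCE_LOCATIONS * SNAPSHOT_SECONDS
frame_seconds = 1.0 / FRAME_RATE_HZ
passes_per_frame = frame_seconds / pass_seconds

# Average arithmetic budget available for each ray.
ops_per_ray = OPS_PER_SECOND / RAYS_PER_SECOND

print(f"one pass over all source locations: {pass_seconds * 1e3:.1f} ms")
print(f"passes that fit in one video frame: {passes_per_frame:.2f}")
print(f"operations available per ray:       {ops_per_ray:.1f}")
```

Under that assumption a full pass takes about 12.8 ms, so roughly 2.6 passes fit in each 33 ms frame, and the quoted operation rate works out to about 11 operations per ray.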

10 Application Technological Novelty: Scanning-Beam Digital X-ray
Comparison of traditional X-ray and SBDX:
Geometry. Traditional X-ray: point X-ray source; large-area detector close to patient. SBDX: large-area X-ray source; small-area detector far from patient.
Imaging information. Traditional X-ray: flat projection. SBDX: 3-D.
Dose. Traditional X-ray: acceptable. SBDX: 80% to 90% less.

11 Application Technological Novelty: Reverse Geometry Standard Geometry Reverse Geometry

12 Application Technological Novelty: Reverse Geometry Scattered x-rays miss the detector: less noise! Standard Geometry Reverse Geometry

13 Application Technological Novelty: Reverse Geometry Multiple source perspectives: 3D tomography Standard Geometry Reverse Geometry

14 Overview General Application Algorithms History and Limitations of Traditional Processors GPU Solution

15 Algorithms Tomosynthesis: Focal Planes
[Figure labels: D1, D2, D3; High Plane; Focal-Plane; Low Plane; Detector Plane; Source Locations S1, S2, S3]
Images must be reconstructed.
Within a focal plane: rays from a set of source/detector combinations converge to the same pixel, giving constructive reinforcement.
Outside the focal plane: rays from the same set of source/detector combinations diverge into different pixels; the result is blurring.
Rate of divergence defines depth-of-field.
Requires multiple focal planes to image the full volume.
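The convergence and divergence described on this slide can be illustrated with a one-dimensional ray model. This is a minimal sketch rather than the system's reconstruction code: the source positions, object location, and plane heights are hypothetical, and heights are measured from the source plane.

```python
def backprojected_positions(sources_x, point_x, point_z, plane_z):
    """Lateral positions where rays through one point object cross a plane.

    Each ray starts at (source_x, 0) on the source plane and passes through
    the object at (point_x, point_z); plane_z is the height of the plane
    being reconstructed.
    """
    return [sx + (point_x - sx) * plane_z / point_z for sx in sources_x]


def spread(positions):
    """Lateral extent over which the rays are smeared in a given plane."""
    return max(positions) - min(positions)


sources_x = [-20.0, 0.0, 20.0]   # three source locations (hypothetical, mm)
point_x, point_z = 5.0, 400.0    # object location (hypothetical, mm)

in_focus = spread(backprojected_positions(sources_x, point_x, point_z, 400.0))
off_focus_far = spread(backprojected_positions(sources_x, point_x, point_z, 440.0))
off_focus_near = spread(backprojected_positions(sources_x, point_x, point_z, 360.0))

print(in_focus, off_focus_far, off_focus_near)
```

In this model the spread grows as `abs(1 - plane_z / point_z)` times the width of the source aperture, which is one way to read the statement that the rate of divergence defines the depth-of-field.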

16 Algorithms Tomosynthesis: Digital Lens
[Figure labels: A, B; Virtual Image Plane; Virtual Lens; Detector Plane; Focal-Plane; Source Plane]
Mapping of rays to image pixels is the virtual equivalent of having a physical lens at the detector plane to bend the rays onto a focal plane.
Changing the bending characteristic of the virtual lens (i.e., the mapping function) creates different focal planes.

17 Algorithms Tomosynthesis: Digital Lens
[Figure, two panels. Left: Focal Plane A in focus, with A on the Virtual Image Plane. Right: Focal Plane B in focus, with B on the Virtual Image Plane. Each panel is labeled with Source Plane, Detector Plane, Focal-Plane A, and Focal-Plane B.]

18 Application Tomosynthesis: Focal Plane Example

19 Algorithms Reconstruction By Projection
Geometric projection based on ray-tracing.
Both projection coefficients and extent vary with focal plane.

20 Algorithms Reconstruction by Projection: Basic Geometry
[Figure labels: Detector Elements; Focal-Plane Pixels; Source Locations]
Rays from each source location to each detector element intersect the focal plane within some window that spans a (typically) non-integer number of pixels.
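A small one-dimensional sketch shows how such a window and its per-pixel weights might be computed. The geometry values and function names are assumptions made for illustration, and area-overlap weighting is one plausible choice of projection kernel, not necessarily the one used in the system.

```python
import math


def projection_window(source_x, det_lo, det_hi, plane_z, detector_z, pixel_pitch):
    """Footprint of one detector element on the focal plane, in pixel units.

    The source sits at height 0 and the detector at height detector_z, so a
    ray to detector position d crosses the plane at
    source_x + (d - source_x) * plane_z / detector_z.
    """
    scale = plane_z / detector_z
    lo = (source_x + (det_lo - source_x) * scale) / pixel_pitch
    hi = (source_x + (det_hi - source_x) * scale) / pixel_pitch
    return lo, hi


def pixel_weights(lo, hi):
    """Fraction of the window [lo, hi) that lands in each integer pixel."""
    width = hi - lo
    weights = {}
    for pixel in range(math.floor(lo), math.ceil(hi)):
        overlap = min(hi, pixel + 1) - max(lo, pixel)
        if overlap > 0:
            weights[pixel] = overlap / width
    return weights


# Hypothetical geometry: lengths in mm, focal plane halfway to the detector.
lo, hi = projection_window(source_x=0.0, det_lo=10.5, det_hi=11.75,
                           plane_z=750.0, detector_z=1500.0, pixel_pitch=0.25)
weights = pixel_weights(lo, hi)
print(lo, hi, weights)
```

With these numbers the window covers 2.5 pixels: two pixels are fully covered and a third is half covered, so a single detector sample contributes to three image pixels with unequal weights.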

21 Algorithms Tomosynthesis: Basic Geometry
[Figure labels: Detector Elements; Focal-Plane Pixels; Source Locations]
Windows from adjacent detector elements will (in general) overlap at the boundary pixels.
Overlap is not constant: the projection kernel varies between detector elements.

22 Algorithms Reconstruction by Projection: Basic Geometry
[Figure labels: Detector Elements; Focal-Plane Pixels; Source Locations]
Windows from adjacent source locations will overlap.
Multiple detector samples for each reconstructed image pixel.

23 Algorithms Reconstruction By Projection: Rotated Detector Rotation of detector improves sampling as projection advances across the image

24 Algorithms Reconstruction By Projection: Rotated Detector
Rotation of detector improves sampling as projection advances across the image.
However, a given detector row or column now does not map consistently onto a pixel row or column: the pixel row indices change with detector column, and vice versa.
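The index behavior described here can be seen in a toy mapping from detector elements to image pixels. This is a sketch under stated assumptions: the rotation angle and detector pitch are hypothetical, and the perspective scaling of the real projection is omitted.

```python
import math


def pixel_index(det_row, det_col, angle_deg, pitch_px):
    """Image (row, col) hit by the centre of a detector element.

    The detector grid is rotated by angle_deg about its origin; pitch_px is
    the detector element spacing expressed in image pixels.
    """
    theta = math.radians(angle_deg)
    u = det_col * pitch_px
    v = det_row * pitch_px
    x = u * math.cos(theta) - v * math.sin(theta)
    y = u * math.sin(theta) + v * math.cos(theta)
    return math.floor(y), math.floor(x)


def pixel_rows_along_detector_row(det_row, n_cols, angle_deg, pitch_px):
    """Pixel row index reached by each column of one detector row."""
    return [pixel_index(det_row, c, angle_deg, pitch_px)[0] for c in range(n_cols)]


aligned = pixel_rows_along_detector_row(0, 8, angle_deg=0.0, pitch_px=2.5)
rotated = pixel_rows_along_detector_row(0, 8, angle_deg=10.0, pitch_px=2.5)
print(aligned)
print(rotated)
```

Without rotation every element of a detector row lands in the same pixel row; with rotation the pixel row index drifts as the detector column increases, so the row and column loops can no longer be handled independently.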

25 Algorithms Tomosynthesis = CT? No
Comparison of CT and SBDX:
Perspective. CT: parallel to rays. SBDX: perpendicular to rays.
Sample rate. CT: <~500 Msps (high-end). SBDX: 7.7 Gsps.
Response time. CT: ASAP. SBDX: 30 fps, < 100 ms latency.
Projection geometry. CT: irregular; varies with rotation angle. SBDX: regular; integer source step-size.
Reconstruction. CT: correct geometric distortion; filtered back-projection. SBDX: allow geometric distortion; unfiltered back-projection.

26 Algorithms Plane-Selection Single Focal-Plane Best Focus

27 Algorithms Plane-Selection
Detect features of interest (things "in focus") in each focal plane. Algorithms may include matched filters, gradient estimation, topological operators, ...
Calculate figures of merit. Major impediments: high levels of Poisson noise in dark regions; low contrast for small features.
Select which plane to display in the final image on a pixel-by-pixel basis: plane-to-plane comparison over a large number of planes.
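The pixel-by-pixel selection step can be sketched with a simple gradient-based figure of merit. Gradient estimation is one of the options the slide lists; the specific central-difference measure, the function names, and the tiny test planes below are assumptions for illustration only.

```python
def gradient_magnitude(image, r, c):
    """Central-difference gradient magnitude with clamped borders."""
    rows, cols = len(image), len(image[0])
    dx = image[r][min(c + 1, cols - 1)] - image[r][max(c - 1, 0)]
    dy = image[min(r + 1, rows - 1)][c] - image[max(r - 1, 0)][c]
    return (dx * dx + dy * dy) ** 0.5


def select_planes(planes):
    """Pick, for every pixel, the focal plane with the largest figure of merit.

    Returns the per-pixel plane index map and the composite image.
    """
    rows, cols = len(planes[0]), len(planes[0][0])
    index_map = [[0] * cols for _ in range(rows)]
    composite = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            merits = [gradient_magnitude(p, r, c) for p in planes]
            best = max(range(len(planes)), key=merits.__getitem__)
            index_map[r][c] = best
            composite[r][c] = planes[best][r][c]
    return index_map, composite


# Two hypothetical focal planes: a sharp edge and a blurred ramp.
plane_sharp = [[0, 0, 10, 10] for _ in range(3)]
plane_blurred = [[4, 5, 5, 6] for _ in range(3)]
index_map, composite = select_planes([plane_sharp, plane_blurred])
print(index_map[0], composite[0])
```

Around the edge the sharp plane wins, but in the flat regions the blurred plane is chosen because the raw gradient carries no evidence there. That weakness hints at why noise and low contrast are listed as major impediments; a practical figure of merit would need to be more robust than this one.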

28 Application Live Image from GPU system

29 Algorithms Other Processing
Artifact removal (per focal plane). Residue of reconstruction methods: pattern noise, gain corrections.
Dynamic range adjustment (per focal plane). Typical image dynamic range is far in excess of display capabilities and of the human visual system.
Noise management. Noise is dominated by photon statistics rather than by scatter.
User-applied filters: temporal averaging with motion detection; edge enhancement; contrast enhancement.
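As an illustration of the first user-applied filter, temporal averaging with motion detection can be sketched as a recursive average that is bypassed wherever a pixel changes sharply between frames. The slide gives no details of the actual filter, so the per-pixel threshold test, the parameter values, and the function name are all assumptions.

```python
def temporal_filter(previous, current, alpha=0.25, motion_threshold=20.0):
    """Recursive temporal average that falls back to the new sample on motion.

    previous is the last filtered frame and current is the new frame, both
    given as lists of rows. Pixels whose change exceeds motion_threshold are
    passed through unfiltered so that moving structures are not smeared.
    """
    output = []
    for prev_row, cur_row in zip(previous, current):
        out_row = []
        for prev, cur in zip(prev_row, cur_row):
            if abs(cur - prev) > motion_threshold:
                out_row.append(float(cur))                   # motion detected
            else:
                out_row.append(prev + alpha * (cur - prev))  # noise averaging
        output.append(out_row)
    return output


filtered = temporal_filter([[100.0, 100.0]], [[104.0, 200.0]])
print(filtered)
```

The first pixel changes only slightly and is averaged toward the new value, while the second changes by far more than the threshold and is taken directly from the new frame.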

30 Algorithms Temporal Constraints
Thermal loading of the x-ray target mandates re-scan of source locations. Previously reconstructed pixels must be revisited, which requires a large fraction of the final image to remain resident in memory for re-scanning.
Real-time feedback of physical manipulations (hand-eye coordination for the surgeon) imposes a maximum latency requirement of < ~100 ms along with a sustained 30 Hz frame rate.

31 Overview Application Algorithms History and Limitations of Traditional Processors GPU Solution

32 History Previous Implementations
~10x increase in resolution/calculations per generation.
1st and 2nd generations: FPGAs. Fully custom parallel pipelines; > $15k/focal-plane x 16 focal-planes = > $240k/system; memory-constrained; development and maintenance difficult.
3rd generation: MPPA (Ambric/Nethra). 336 processors with local memory and flexible data distribution mesh; obsolete architecture; still used FPGAs for input formatting/post-processing; ~$1500/focal-plane x 32 focal-planes = ~$48k/system; proprietary development environment.

33 History Generation 2
Blue: FPGAs, 1 focal plane/board.
Green: FPGAs for artifact removal and dynamic range management.
Separate board for plane selection.

34 History Generation 3
Blue: MPPAs, 1 focal plane/chip.
Green: FPGAs for data input/format, artifact removal, and dynamic range management.
Same board used for plane selection.

35 Traditional Processors: History
Previous attempts to map the algorithms to common commercial processors (DSP, Cell, GPU) failed.
Limitations:
I/O: bandwidth.
Memory: available resources (buffer results for many focal planes).
Memory: cache sizing (fall off the cache).
Memory: burst optimization. 2-D array access has adjacent accesses in one dimension but not in the other.
Degree of management required by slower host processors.
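The burst-optimization point follows from how a 2-D array is laid out in linear memory. The short sketch below assumes a row-major layout and a hypothetical image width; it shows only the address arithmetic, not any particular processor's memory system.

```python
def flat_index(row, col, width):
    """Row-major offset of element (row, col) in a 2-D array of given width."""
    return row * width + col


WIDTH = 1000  # hypothetical image width in pixels

# Stepping to the next column moves to the adjacent memory location...
step_along_row = flat_index(5, 8, WIDTH) - flat_index(5, 7, WIDTH)
# ...while stepping to the next row jumps a whole row ahead.
step_along_col = flat_index(6, 7, WIDTH) - flat_index(5, 7, WIDTH)

print(step_along_row, step_along_col)
```

Accesses along a row are contiguous and can be served by a single burst, whereas accesses down a column are a full row apart, so each one touches a different burst or cache line.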

36 Overview Application Algorithms History and Limitations of Traditional Processors GPU Solution

37 What is our configuration? GPU Solution

38 GPU Solution What is our configuration?
9x K20: ~$850/focal-plane x 32 planes = ~$27k/system.
1x GTX680 (for managing displays).
PCIe 2.0 backplane.
Red Hat on Supermicro.
CUDA 5.
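The per-plane costs quoted for the FPGA, MPPA, and GPU implementations can be compared directly. The snippet below only repeats the multiplication on the slides; the dictionary layout and labels are illustrative, and the FPGA figure is a lower bound because the slide gives it as "> $15k".

```python
# (approximate cost per focal plane in USD, number of focal planes)
generations = {
    "Generations 1-2 (FPGA)": (15_000, 16),   # quoted as "> $15k": lower bound
    "Generation 3 (MPPA)": (1_500, 32),
    "GPU (K20)": (850, 32),
}

system_cost = {
    name: per_plane * planes
    for name, (per_plane, planes) in generations.items()
}

for name, cost in system_cost.items():
    print(f"{name}: ${cost:,}")
```

The GPU total comes to $27,200, consistent with the quoted ~$27k, which is a little over half the MPPA system cost for the same number of focal planes.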

39 GPU Solution Logical Configuration and Data Flow
[Diagram. Labeled blocks: X-ray Detector; X-ray Source; Re-scan Aggregator; Image Reconstruction (8x K20); K20 (Artifact Removal, Dynamic Range Management, Plane-Selection); GTX680; Supermicro; System Controller (Mediation); Ethernet Switch; Disk Array; Display. Labeled links: Fiber; Framing; PCIe; Multi-Cast, RDMA; 1000-base-T; HDMI; (GigE Vision) To External System.]

40 GPU Solution Physical Configuration
[Diagram. Labeled blocks: X-ray Detector; X-ray Source; Re-scan Aggregator; Image Reconstruction; PCIe Chassis (Cubix 8), labeled twice; PCIe Switch (Gen 2), labeled four times; eight K20 boards; Host Computer with µP, PCIe Switch (Gen 3), K20 (Artifact Removal, Dynamic Range Management, Plane-Selection), and GTX 680 (Display); System Controller; Ethernet Switch; Disk Array; GigE Vision Display. Labeled links: Multi-Cast Sensor Data; x8 (Gen 1) Interconnect; x16; 1000-base-T.]

41 GPU Solution What is new now that allows it to work?
Gen 2 PCIe interface with multi-cast: high-enough bandwidth; all planes use the same data set and must receive the same data stream.
GPUDirect or Remote DMA (RDMA): allows source data streaming directly to GPUs, bypassing the host.
Dynamic Parallelism: decreases latency by allowing management of parallel operations without host intervention.
Significant increase in fast shared memory.
Significant increase in core density.
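A rough bandwidth estimate suggests why multi-cast matters here. The calculation below is a sketch under stated assumptions: it takes the ~40 Gbps sensor downlink rate from the performance summary, assumes the full stream must reach each of the eight reconstruction K20s shown in the configuration diagrams, and ignores packet and protocol overhead. The PCIe 2.0 figures (5 GT/s per lane, 8b/10b encoding) come from the PCIe specification.

```python
SENSOR_GBPS = 40.0            # ~40 Gbps sensor downlink rate (performance slide)
RECON_GPUS = 8                # reconstruction K20s, as read from the diagrams
PCIE2_GT_PER_LANE = 5.0       # PCIe 2.0 signalling rate per lane, GT/s
ENCODING_EFFICIENCY = 8 / 10  # 8b/10b line coding
LANES = 16

# Usable bit rate of one x16 Gen 2 link in one direction, before overhead.
link_gbps = PCIE2_GT_PER_LANE * ENCODING_EFFICIENCY * LANES

# Without multi-cast the sender would transmit one copy per GPU.
unicast_gbps = SENSOR_GBPS * RECON_GPUS
# With multi-cast the switches replicate a single copy.
multicast_gbps = SENSOR_GBPS

print(f"x16 Gen 2 link:        {link_gbps:.0f} Gbps")
print(f"upstream, unicast:     {unicast_gbps:.0f} Gbps")
print(f"upstream, multi-cast:  {multicast_gbps:.0f} Gbps")
```

Under these assumptions a single copy of the stream fits within one x16 Gen 2 link, while eight separate copies would need several times the capacity of that link.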

42 Application Live Image from GPU system

43 GPU Solution What would make it better?
More shared memory: we are still bandwidth-limited!
RDMA improvements. Bidirectional: we have to get the images out as well as getting the data in. Peer-to-peer (GPU-to-GPU) communication/coordination without host intervention.
Better support for real-time operations: timeouts/host waits; support for code executing on streaming multiprocessor; CUDA API is optimized for batch operations, not streaming operations.
Better debugging support for multi-GPU systems: ability to isolate reporting to subsets.

44 Reference
S4363: Accelerated X-ray Imaging: Real-Time Multi-Plane Image Reconstruction with CUDA discusses an alternate implementation of the reconstruction algorithm.

45 Thanks To
Paul Kahn, Jamie Ku, and the rest of the TRT team; NovaRay, Inc.; NIH; University of Wisconsin at Madison; MultiCoreWare.


More information

Alberto Corrales-García, Rafael Rodríguez-Sánchez, José Luis Martínez, Gerardo Fernández-Escribano, José M. Claver and José Luis Sánchez

Alberto Corrales-García, Rafael Rodríguez-Sánchez, José Luis Martínez, Gerardo Fernández-Escribano, José M. Claver and José Luis Sánchez Alberto Corrales-García, Rafael Rodríguez-Sánchez, José Luis artínez, Gerardo Fernández-Escribano, José. Claver and José Luis Sánchez 1. Introduction 2. Technical Background 3. Proposed DVC to H.264/AVC

More information

Rackspace Cloud Databases and Container-based Virtualization

Rackspace Cloud Databases and Container-based Virtualization Rackspace Cloud Databases and Container-based Virtualization August 2012 J.R. Arredondo @jrarredondo Page 1 of 6 INTRODUCTION When Rackspace set out to build the Cloud Databases product, we asked many

More information

Clustering Billions of Data Points Using GPUs

Clustering Billions of Data Points Using GPUs Clustering Billions of Data Points Using GPUs Ren Wu ren.wu@hp.com Bin Zhang bin.zhang2@hp.com Meichun Hsu meichun.hsu@hp.com ABSTRACT In this paper, we report our research on using GPUs to accelerate

More information

Nutaq. PicoDigitizer 125-Series 16 or 32 Channels, 125 MSPS, FPGA-Based DAQ Solution PRODUCT SHEET. nutaq.com MONTREAL QUEBEC

Nutaq. PicoDigitizer 125-Series 16 or 32 Channels, 125 MSPS, FPGA-Based DAQ Solution PRODUCT SHEET. nutaq.com MONTREAL QUEBEC Nutaq PicoDigitizer 125-Series 16 or 32 Channels, 125 MSPS, FPGA-Based DAQ Solution PRODUCT SHEET QUEBEC I MONTREAL I N E W YO R K I nutaq.com Nutaq PicoDigitizer 125-Series The PicoDigitizer 125-Series

More information

Accelerating I/O- Intensive Applications in IT Infrastructure with Innodisk FlexiArray Flash Appliance. Alex Ho, Product Manager Innodisk Corporation

Accelerating I/O- Intensive Applications in IT Infrastructure with Innodisk FlexiArray Flash Appliance. Alex Ho, Product Manager Innodisk Corporation Accelerating I/O- Intensive Applications in IT Infrastructure with Innodisk FlexiArray Flash Appliance Alex Ho, Product Manager Innodisk Corporation Outline Innodisk Introduction Industry Trend & Challenge

More information

Mellanox Cloud and Database Acceleration Solution over Windows Server 2012 SMB Direct

Mellanox Cloud and Database Acceleration Solution over Windows Server 2012 SMB Direct Mellanox Cloud and Database Acceleration Solution over Windows Server 2012 Direct Increased Performance, Scaling and Resiliency July 2012 Motti Beck, Director, Enterprise Market Development Motti@mellanox.com

More information

Understanding Line Scan Camera Applications

Understanding Line Scan Camera Applications Understanding Line Scan Camera Applications Discover the benefits of line scan cameras, including perfect, high resolution images, and the ability to image large objects. A line scan camera has a single

More information

Cloud-Based Apps Drive the Need for Frequency-Flexible Clock Generators in Converged Data Center Networks

Cloud-Based Apps Drive the Need for Frequency-Flexible Clock Generators in Converged Data Center Networks Cloud-Based Apps Drive the Need for Frequency-Flexible Generators in Converged Data Center Networks Introduction By Phil Callahan, Senior Marketing Manager, Timing Products, Silicon Labs Skyrocketing network

More information

Data and Control Plane Interconnect solutions for SDN & NFV Networks Raghu Kondapalli August 2014

Data and Control Plane Interconnect solutions for SDN & NFV Networks Raghu Kondapalli August 2014 Data and Control Plane Interconnect solutions for SDN & NFV Networks Raghu Kondapalli August 2014 Title & Abstract Title: Data & Control Plane Interconnect for SDN & NFV networks Abstract: Software defined

More information

CHAPTER FIVE RESULT ANALYSIS

CHAPTER FIVE RESULT ANALYSIS CHAPTER FIVE RESULT ANALYSIS 5.1 Chapter Introduction 5.2 Discussion of Results 5.3 Performance Comparisons 5.4 Chapter Summary 61 5.1 Chapter Introduction This chapter outlines the results obtained from

More information

Data Center and Cloud Computing Market Landscape and Challenges

Data Center and Cloud Computing Market Landscape and Challenges Data Center and Cloud Computing Market Landscape and Challenges Manoj Roge, Director Wired & Data Center Solutions Xilinx Inc. #OpenPOWERSummit 1 Outline Data Center Trends Technology Challenges Solution

More information

Lots of Video on the Internet Random Thoughts. Dave Oran IAB Retreat May 28, 2006

Lots of Video on the Internet Random Thoughts. Dave Oran IAB Retreat May 28, 2006 Lots of Video on the Internet Random Thoughts Dave Oran IAB Retreat May 28, 2006 Voice all over Again? In early 1996 Steve Deering said to me: This VoIP stuff is going to destroy the Internet and it ll

More information

Chapter 11 I/O Management and Disk Scheduling

Chapter 11 I/O Management and Disk Scheduling Operating Systems: Internals and Design Principles, 6/E William Stallings Chapter 11 I/O Management and Disk Scheduling Dave Bremer Otago Polytechnic, NZ 2008, Prentice Hall I/O Devices Roadmap Organization

More information

VPX Implementation Serves Shipboard Search and Track Needs

VPX Implementation Serves Shipboard Search and Track Needs VPX Implementation Serves Shipboard Search and Track Needs By: Thierry Wastiaux, Senior Vice President Interface Concept Defending against anti-ship missiles is a problem for which high-performance computing

More information

3D MODEL DRIVEN DISTANT ASSEMBLY

3D MODEL DRIVEN DISTANT ASSEMBLY 3D MODEL DRIVEN DISTANT ASSEMBLY Final report Bachelor Degree Project in Automation Spring term 2012 Carlos Gil Camacho Juan Cana Quijada Supervisor: Abdullah Mohammed Examiner: Lihui Wang 1 Executive

More information

GPU Architecture. Michael Doggett ATI

GPU Architecture. Michael Doggett ATI GPU Architecture Michael Doggett ATI GPU Architecture RADEON X1800/X1900 Microsoft s XBOX360 Xenos GPU GPU research areas ATI - Driving the Visual Experience Everywhere Products from cell phones to super

More information

Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage

Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage White Paper Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage A Benchmark Report August 211 Background Objectivity/DB uses a powerful distributed processing architecture to manage

More information

Database Management Systems

Database Management Systems 4411 Database Management Systems Acknowledgements and copyrights: these slides are a result of combination of notes and slides with contributions from: Michael Kiffer, Arthur Bernstein, Philip Lewis, Anestis

More information

Learn CUDA in an Afternoon: Hands-on Practical Exercises

Learn CUDA in an Afternoon: Hands-on Practical Exercises Learn CUDA in an Afternoon: Hands-on Practical Exercises Alan Gray and James Perry, EPCC, The University of Edinburgh Introduction This document forms the hands-on practical component of the Learn CUDA

More information

Lustre Networking BY PETER J. BRAAM

Lustre Networking BY PETER J. BRAAM Lustre Networking BY PETER J. BRAAM A WHITE PAPER FROM CLUSTER FILE SYSTEMS, INC. APRIL 2007 Audience Architects of HPC clusters Abstract This paper provides architects of HPC clusters with information

More information

GPGPU Computing. Yong Cao

GPGPU Computing. Yong Cao GPGPU Computing Yong Cao Why Graphics Card? It s powerful! A quiet trend Copyright 2009 by Yong Cao Why Graphics Card? It s powerful! Processor Processing Units FLOPs per Unit Clock Speed Processing Power

More information

GPU-Based Network Traffic Monitoring & Analysis Tools

GPU-Based Network Traffic Monitoring & Analysis Tools GPU-Based Network Traffic Monitoring & Analysis Tools Wenji Wu; Phil DeMar wenji@fnal.gov, demar@fnal.gov CHEP 2013 October 17, 2013 Coarse Detailed Background Main uses for network traffic monitoring

More information

Cisco UCS and Fusion- io take Big Data workloads to extreme performance in a small footprint: A case study with Oracle NoSQL database

Cisco UCS and Fusion- io take Big Data workloads to extreme performance in a small footprint: A case study with Oracle NoSQL database Cisco UCS and Fusion- io take Big Data workloads to extreme performance in a small footprint: A case study with Oracle NoSQL database Built up on Cisco s big data common platform architecture (CPA), a

More information

DICOM Correction Item

DICOM Correction Item Correction Number DICOM Correction Item CP-626 Log Summary: Type of Modification Clarification Rationale for Correction Name of Standard PS 3.3 2004 + Sup 83 The description of pixel spacing related attributes

More information

Integrated Sensor Analysis Tool (I-SAT )

Integrated Sensor Analysis Tool (I-SAT ) FRONTIER TECHNOLOGY, INC. Advanced Technology for Superior Solutions. Integrated Sensor Analysis Tool (I-SAT ) Core Visualization Software Package Abstract As the technology behind the production of large

More information

Parallel Computing. Benson Muite. benson.muite@ut.ee http://math.ut.ee/ benson. https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage

Parallel Computing. Benson Muite. benson.muite@ut.ee http://math.ut.ee/ benson. https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage Parallel Computing Benson Muite benson.muite@ut.ee http://math.ut.ee/ benson https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage 3 November 2014 Hadoop, Review Hadoop Hadoop History Hadoop Framework

More information

FPGAs in Next Generation Wireless Networks

FPGAs in Next Generation Wireless Networks FPGAs in Next Generation Wireless Networks March 2010 Lattice Semiconductor 5555 Northeast Moore Ct. Hillsboro, Oregon 97124 USA Telephone: (503) 268-8000 www.latticesemi.com 1 FPGAs in Next Generation

More information

Data Centric Systems (DCS)

Data Centric Systems (DCS) Data Centric Systems (DCS) Architecture and Solutions for High Performance Computing, Big Data and High Performance Analytics High Performance Computing with Data Centric Systems 1 Data Centric Systems

More information

FPGA Accelerator Virtualization in an OpenPOWER cloud. Fei Chen, Yonghua Lin IBM China Research Lab

FPGA Accelerator Virtualization in an OpenPOWER cloud. Fei Chen, Yonghua Lin IBM China Research Lab FPGA Accelerator Virtualization in an OpenPOWER cloud Fei Chen, Yonghua Lin IBM China Research Lab Trend of Acceleration Technology Acceleration in Cloud is Taking Off Used FPGA to accelerate Bing search

More information

2-Megapixel Sony Progressive CMOS Sensor with Super Wide Dynamic Range and High Frame Rate

2-Megapixel Sony Progressive CMOS Sensor with Super Wide Dynamic Range and High Frame Rate SD-2020 2-Megapixel 20X Optical Zoom Speed Dome IP Camera 1/2.8" Sony Progressive CMOS Sensor Full HD 1080p + D1 Real-Time at Dual Streaming Up to 20x Optical Zoom Up to 30 fps @ 1080p Full HD Weather-Proof

More information

7 MEGAPIXEL 180 DEGREE IP VIDEO CAMERA

7 MEGAPIXEL 180 DEGREE IP VIDEO CAMERA Scallop Imaging is focused on developing, marketing and manufacturing its proprietary video imaging technology. All our activities are still proudly accomplished in Boston. We do product development, marketing

More information

BIG data big problems big opportunities Rudolf Dimper Head of Technical Infrastructure Division ESRF

BIG data big problems big opportunities Rudolf Dimper Head of Technical Infrastructure Division ESRF BIG data big problems big opportunities Rudolf Dimper Head of Technical Infrastructure Division ESRF Slide: 1 ! 6 GeV, 850m circonference Storage Ring! 42 public and CRG beamlines! 6000+ user visits/y!

More information

High Performance OpenStack Cloud. Eli Karpilovski Cloud Advisory Council Chairman

High Performance OpenStack Cloud. Eli Karpilovski Cloud Advisory Council Chairman High Performance OpenStack Cloud Eli Karpilovski Cloud Advisory Council Chairman Cloud Advisory Council Our Mission Development of next generation cloud architecture Providing open specification for cloud

More information

CUBIX ACCEL-APP SYSTEMS Linux2U Rackmount Elite

CUBIX ACCEL-APP SYSTEMS Linux2U Rackmount Elite CUBIX ACCEL-APP SYSTEMS Linux2U Rackmount Elite Linux2U Rackmount Elite is a Host Engine 2U computer connected to Xpander Rackmount Elite and is designed to run up to four GPUs plus 2x 8-channel slots

More information

Pentaho High-Performance Big Data Reference Configurations using Cisco Unified Computing System

Pentaho High-Performance Big Data Reference Configurations using Cisco Unified Computing System Pentaho High-Performance Big Data Reference Configurations using Cisco Unified Computing System By Jake Cornelius Senior Vice President of Products Pentaho June 1, 2012 Pentaho Delivers High-Performance

More information

NVIDIA IndeX Enabling Interactive and Scalable Visualization for Large Data Marc Nienhaus, NVIDIA IndeX Engineering Manager and Chief Architect

NVIDIA IndeX Enabling Interactive and Scalable Visualization for Large Data Marc Nienhaus, NVIDIA IndeX Engineering Manager and Chief Architect SIGGRAPH 2013 Shaping the Future of Visual Computing NVIDIA IndeX Enabling Interactive and Scalable Visualization for Large Data Marc Nienhaus, NVIDIA IndeX Engineering Manager and Chief Architect NVIDIA

More information