Prototyping a Programmable Computing/Networking Switch

1 Heterogeneous Computing, Gwangju, Korea, Sep. 20, 2012. Prototyping a Programmable Computing/Networking Switch. JongWon Kim, Namgon Kim, and Namgon Lucas Kim, Gwangju Institute of Science and Technology, jongwon@nm.gist.ac.kr

2 Toward Balanced Service Composition based on Programmable (and Virtualized) Resources

3 Multi-Screen Content Consumption
[Figure: multi-screen content consumption scenario. Labels: content servers; cloud; logically centralized controller software (VisualXCoordinator); switch control software; VisualXSwitch (Com + Net); control agent (having partial knowledge); many flows; screen capturing, tiling, transcoding, screen-casting; visualization sharing and content sharing at home, at airport lobby, at cafeteria, at street, at office.]

4 Multi-party Visual Sharing
- OptIPortal: Termination Device for the OptIPuter Global Backplane
- SAGE (Scalable Adaptive Graphics Environment)
- Visualcasting enables integration of HD streams into tiled displays (Apr. 2008)
9/20/2012 KIISE

5 Programmable Computing/Networking Switch: Why?
- Emergence of various network-connected consumer devices
  - Demand for accessing video content from heterogeneous devices with different screens and different capabilities is increasing
- Real-time content adaptation is challenging
  - Transcoding the content while maintaining good video quality
  - Maintaining appropriately short input-to-output latency
- We introduce a programmable switching node supporting in-network processing with balanced use of computing/networking resources

6 Networking Service and SDN
- Networking Service: the collection of network-centric services that assist the networking (e.g., transport) of diverse flows for computing-centric services
- SDN (Software-Defined Networking): restructures the networking paradigm by exposing network APIs so that any software can program the networking nodes as it wants
- By employing the programmability of SDN, we attempt to fill the gap left by existing network services
[Figure: layered stack: flows / Networking Service / SDN / Network Substrate]

7 Features of NetOpen Networking Service
- Extended flow-based networking: an extension of flow-based networking to reinforce the balanced utilization of both networking and computing resources
- Primitive-based, loosely coupled creation of services
  - Service primitives as linkage points between NetOpen networking services and programmable network substrates
  - Identify and then link the key features (from resource capabilities)
- Interactive service operation via intuitive UIs

8 A Road Map to Build SmartX Nodes
[Figure: road map toward the SmartX Node (NetOpen + MediaX). Labels, grouped as they appear in the figure:
- NetOpen Switch Nodes: NetOpen Node v1.0, v1.1, v1.2, v1.3, v2.0; Wireless NetOpen Node v1.1 (wallbox)
- MediaX Nodes: PC-based MediaX Node v1.0; PC-based MediaX Node with GPU v1.1; Cloud-based MediaX Node v1.2
- Mobile SmartX Nodes: Mobile SmartX Node v0.9 (tablet, netbook, laptop); Mobile MediaX Node with GPU (tablet, netbook, laptop)
- Other labels: MediaX Cloud Node; OF + NetFPGA; OF + Click; OF + Click + GPU; OF + Click + GPU + Storage + 10G NIC; OpenVSwitch + Cloud; + networking]

9 NetOpen Switch Node (v1.2)
- A programmable networking switch node with in-network processing support
- Design issues
  - An independent processing module for each service functionality
  - For each flow, a customized data plane can be built by selectively combining processing modules
  - This buildup can be controlled by a logically centralized controller
[Figure: flows / Networking Service / SDN / Network Substrate, on commodity processors (CPU, GPU)]
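The per-flow data plane idea can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: the module names (`upper`, `truncate`), the `Controller` class, and the flow key layout are hypothetical stand-ins, not the NetOpen implementation, which builds its modules as Click elements.

```python
from typing import Callable, Dict, List, Tuple

# A processing module takes a payload and returns the transformed payload.
Module = Callable[[bytes], bytes]

# Hypothetical module registry (stand-ins for real service functionalities).
MODULES: Dict[str, Module] = {
    "upper": lambda data: data.upper(),   # placeholder for e.g. transcoding
    "truncate": lambda data: data[:8],    # placeholder for e.g. resizing
}

# Hypothetical flow key: (source IP, destination IP, destination port).
FlowKey = Tuple[str, str, int]


class Controller:
    """Toy logically centralized controller holding per-flow module chains."""

    def __init__(self) -> None:
        self.flow_table: Dict[FlowKey, List[str]] = {}

    def install(self, flow: FlowKey, module_names: List[str]) -> None:
        # Reject chains that refer to modules the node does not provide.
        unknown = [name for name in module_names if name not in MODULES]
        if unknown:
            raise KeyError(f"unknown modules: {unknown}")
        self.flow_table[flow] = list(module_names)


def process(controller: Controller, flow: FlowKey, payload: bytes) -> bytes:
    """Run the payload through the module chain installed for its flow."""
    # Flows without an entry are passed through unchanged.
    for name in controller.flow_table.get(flow, []):
        payload = MODULES[name](payload)
    return payload
```

Installing `["upper", "truncate"]` for one flow and nothing for another gives each flow its own data plane while the module implementations stay independent of each other.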

10 In-network Processing Leveraging Heterogeneous Computing
- In-network processing
  - Additional processing on the packets before forwarding
  - Middle-boxes provide required processing functionalities, e.g., packet caching, transcoding, network coding
- Leverage heterogeneous computing (with CPU/GPU/...) for in-network processing
  - Powerful CPUs with multiple cores (e.g., Intel Xeon E7-2820, AMD FX-8150)
  - GPUs and GPGPUs (General-Purpose Graphics Processing Units) with hundreds of cores (NVIDIA GTX580: 512 CUDA cores)

11 How to Handle In-network Processing
- In-network processing for a flow is described by a task, defined as a set of actions (provided by processing modules) mapped to a flow
- For example, a task for a video flow can be composed of decode, resize, and forward actions with segmentation and packetization
  - Segmentation: checks the application-layer header of each packet and batches the payload into the current segment (assuming sequential transport of video frames)
  - Packetization: converts a segment back into packets
[Figure: <Segmentation> and <Packetization>. A packet of size P consists of a header and data D; packet data is batched into a segment of size S, and a segment is split back into packets.]
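Segmentation and packetization can be sketched in Python as follows. This is a minimal illustration, assuming a hypothetical 4-byte application-layer header that carries only a sequence number; the header layout used by the actual prototype is not given in the slides.

```python
import struct
from typing import Iterable, Iterator, List

# Hypothetical application-layer header: one 32-bit sequence number.
HEADER = struct.Struct("!I")


def segment(packets: Iterable[bytes], segment_size: int) -> Iterator[bytes]:
    """Strip headers and batch payloads into segments of segment_size bytes."""
    buf = bytearray()
    expected = None
    for pkt in packets:
        (seq,) = HEADER.unpack_from(pkt)
        # Sequential transport is assumed, so a gap is treated as an error.
        if expected is not None and seq != expected:
            raise ValueError(f"out-of-order packet: got {seq}, expected {expected}")
        expected = seq + 1
        buf += pkt[HEADER.size:]
        while len(buf) >= segment_size:
            yield bytes(buf[:segment_size])
            del buf[:segment_size]
    if buf:
        # Flush the final, possibly shorter, segment.
        yield bytes(buf)


def packetize(seg: bytes, payload_size: int, first_seq: int = 0) -> List[bytes]:
    """Split a segment back into packets, each with a fresh header."""
    return [
        HEADER.pack(first_seq + i) + seg[off:off + payload_size]
        for i, off in enumerate(range(0, len(seg), payload_size))
    ]
```

Packetizing 100 bytes with a 30-byte payload size and re-segmenting with a 50-byte segment size returns the original bytes split into two segments, which is the round trip a processing task relies on.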

12 Early Design of NetOpen Switch Node for In-network Processing
- We need a special-purpose task dispatcher, the shader, between the task dispatchers and the GPUs to avoid the performance degradation caused by multiple task dispatchers accessing a GPU at the same time

13 Prototype Implementation
- Hardware
  - CPU: 2 Intel Xeon quad-core CPUs (1.6 GHz) with 12 GB memory
  - GPU: NVIDIA GTX590 with 2 stream processors (each has 512 cores) and 3 GB GDDR5 memory
  - NIC: six 1G NICs, two 2-port 1G NICs, and two WiFi NICs
- Software (Click + OpenFlow)
  - Click: a toolkit for building a modular software router by combining Click elements (i.e., small modules that each have a specific functionality)
  - Each service primitive is implemented as a Click element that is controlled based on OpenFlow

14 Supported Primitives for In-network Processing
- DXT compression (P_DXTC) / decompression (P_DXTD)
  - DXT (i.e., S3 texture compression) is a lightweight compression scheme that compresses an image frame in 4x4-pixel blocks
  - The compression of each pixel block is completely independent, so we significantly reduce the compression time by implementing DXT compression on hundreds of GPU cores
- Rate shaping (P_rshape)
  - Controls the rate of traffic received on a network interface
  - Traffic at or below the specified rate is sent, whereas traffic that exceeds the rate is dropped or delayed
  - Implemented using the BandwidthShaper element of Click
- YUV2RGB conversion (P_Y2R)
  - Recent camera models produce YUV-format video content because of the small size of the YUV format (i.e., half the size of the RGB format)
  - Because the DXT compressor only takes RGB-format pixel blocks, we need a conversion element to support various pixel-block formats
- Resize (P_resize)
  - Changes the width and height of a video frame
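The send-or-drop behaviour of rate shaping can be illustrated with a token bucket, one common way to enforce a rate. The sketch below is an assumption for illustration only: the `TokenBucket` class and its parameters are hypothetical, and it does not claim to mirror how Click's BandwidthShaper element works internally.

```python
class TokenBucket:
    """Toy rate limiter: packets within the configured rate pass, others are rejected."""

    def __init__(self, rate_bps: float, burst_bytes: int) -> None:
        self.rate = rate_bps / 8.0          # refill rate in bytes per second
        self.capacity = float(burst_bytes)  # maximum burst size in bytes
        self.tokens = float(burst_bytes)    # start with a full bucket
        self.last = 0.0                     # time of the previous decision (seconds)

    def allow(self, size_bytes: int, now: float) -> bool:
        """Return True if a packet of size_bytes may be sent at time now."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        # Over the rate: the caller may drop the packet or queue it for later.
        return False
```

Passing the timestamp explicitly keeps the example deterministic. A caller that wants delaying rather than dropping would hold rejected packets in a queue and retry them once enough tokens have accumulated.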

15 Task Dispatcher Threads
[Figure: per-flow customized data plane. Labels: main thread with flow batcher, forwarding element, and flow table (actions); packets for forwarding and packets for in-network processing; segmentation handlers; input task queue and output task queue; tasks (segment + actions, segment + output action); task dispatchers; processing elements (CPU) and processing elements (GPU).]
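The hand-off between an input task queue, dispatcher threads, and an output task queue can be sketched with Python's standard `queue` and `threading` modules. This is a simplified model under stated assumptions: the task layout (sequence id, segment, list of actions) and the `run` helper are hypothetical, and the actions here are plain CPU functions rather than GPU kernels.

```python
import queue
import threading
from typing import Callable, List, Optional, Tuple

# Hypothetical task layout: (sequence id, segment bytes, actions to apply in order).
Task = Tuple[int, bytes, List[Callable[[bytes], bytes]]]


def dispatcher(inq: "queue.Queue[Optional[Task]]",
               outq: "queue.Queue[Tuple[int, bytes]]") -> None:
    """Take tasks from the input queue, apply their actions, emit the results."""
    while True:
        task = inq.get()
        if task is None:  # sentinel: no more work for this thread
            break
        seq, seg, actions = task
        for action in actions:
            seg = action(seg)
        outq.put((seq, seg))


def run(tasks: List[Task], workers: int = 2) -> List[bytes]:
    """Process all tasks with a pool of dispatcher threads, preserving order."""
    inq: "queue.Queue[Optional[Task]]" = queue.Queue()
    outq: "queue.Queue[Tuple[int, bytes]]" = queue.Queue()
    threads = [threading.Thread(target=dispatcher, args=(inq, outq))
               for _ in range(workers)]
    for t in threads:
        t.start()
    for task in tasks:
        inq.put(task)
    for _ in threads:
        inq.put(None)  # one sentinel per dispatcher thread
    for t in threads:
        t.join()
    results = []
    while not outq.empty():
        results.append(outq.get())
    # Dispatchers may finish out of order, so restore the sequence order.
    return [seg for _, seg in sorted(results, key=lambda item: item[0])]
```

Sorting by sequence id on the output side is one simple way to keep segments of a flow in order when several dispatchers work in parallel.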

16 A Processing Element for GPU-based DXT Compression
- The task dispatcher passes a segment to the processing element, which returns the DXT-compressed segment
- Steps inside the processing element
  1) Copy the segment from host memory to GPU device memory
  2) Launch the kernel (DXT compression): each 4x4 pixel block becomes a DXT block
  3) Copy the DXT-compressed segment from GPU device memory back to host memory
- GPU cores: e.g., 2x 512 cores for the GTX 590

17 DXT: DirectX Texture Compression
- S/W-based lightweight lossy compression scheme
- Compresses a 4x4 block of pixels (512-bit or 384-bit) to 64 bits or 128 bits
- Fixed ratio per 4x4-pixel block quantity

FOURCC       Description            Comp. ratio   Texture
DXT1 (BC1)   Opaque / 1-bit alpha   8:1 or 6:1    Simple non-alpha
DXT3 (BC2)   Explicit alpha         4:1           Sharp alpha
DXT5 (BC3)   Interpolated alpha     4:1           Gradient alpha

- Most VGA cards support H/W-accelerated DXT decompression
- DXT1 (BC1) example
  - Color0 > Color1: code 0 = Color0; code 1 = Color1; code 2 = (2*Color0 + Color1)/3; code 3 = (Color0 + 2*Color1)/3
  - Color0 <= Color1: code 0 = Color0; code 1 = Color1; code 2 = (Color0 + Color1)/2; code 3 = Black
[Figure: 4x4 RGB pixels to a DXT1 block: 32 bits (2 x 16-bit colors, Color 0 and Color 1) plus 32 bits (16 x 2-bit codes)]
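The DXT1 code table and the compression ratios can be checked with a short Python sketch. It assumes the two endpoint colors are stored as 16-bit RGB565 values and uses integer division for the interpolated entries; exact rounding varies between decoders, so treat the arithmetic as illustrative. The function names are hypothetical.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def rgb565_to_rgb888(c: int) -> RGB:
    """Expand a 16-bit RGB565 color to 8 bits per channel."""
    r, g, b = (c >> 11) & 0x1F, (c >> 5) & 0x3F, c & 0x1F
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))


def dxt1_palette(color0: int, color1: int) -> List[RGB]:
    """Build the four palette entries addressed by the 2-bit codes 0..3."""
    c0, c1 = rgb565_to_rgb888(color0), rgb565_to_rgb888(color1)
    if color0 > color1:
        # Four-color mode: two interpolated colors.
        c2 = tuple((2 * a + b) // 3 for a, b in zip(c0, c1))
        c3 = tuple((a + 2 * b) // 3 for a, b in zip(c0, c1))
    else:
        # Three-color mode: one midpoint color, code 3 is black.
        c2 = tuple((a + b) // 2 for a, b in zip(c0, c1))
        c3 = (0, 0, 0)
    return [c0, c1, c2, c3]


def decode_block(color0: int, color1: int, codes: int) -> List[RGB]:
    """Decode 16 pixels; codes holds 2 bits per pixel, pixel 0 in the lowest bits."""
    palette = dxt1_palette(color0, color1)
    return [palette[(codes >> (2 * i)) & 0x3] for i in range(16)]


def compression_ratio(bits_per_pixel: int, block_bits: int = 64) -> float:
    """Ratio between an uncompressed 4x4 block and its compressed size."""
    return (16 * bits_per_pixel) / block_bits
```

With 32-bit pixels a 4x4 block is 512 bits and with 24-bit pixels it is 384 bits, so a 64-bit DXT1 block gives 8:1 or 6:1, while the 128-bit DXT3 and DXT5 blocks give 4:1 for 32-bit input.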

18 DXT Compression: GPU-based parallelization

19 Color Converter
- UYVY (YUV422) to RGB
- Each CUDA thread converts one UYVY packed pixel
[Figure: U Y2 V Y1 to A B G R, A B G R (Pixel 1, Pixel 2)]
- Conversion formula
  - R: 1.164(Y - 16) (V - 128)
  - G: 1.164(Y - 16) (V - 128) (U - 128)
  - B: 1.164(Y - 16) (U - 128)
  - A: 0
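A CPU-side Python sketch of the UYVY to RGBA conversion is given below. The chroma coefficients (1.596, 0.813, 0.391, 2.018) and their signs are an assumption: they are the values commonly paired with the 1.164(Y - 16) luma term for BT.601 video-range data, not values recovered from the slide. The byte order U, Y1, V, Y2 within each 4-byte group and the function names are likewise assumptions.

```python
from typing import List, Tuple

RGBA = Tuple[int, int, int, int]


def clamp(x: float) -> int:
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(x))))


def yuv_to_rgba(y: int, u: int, v: int) -> RGBA:
    """Convert one YUV sample; coefficients are assumed BT.601 video-range values."""
    c = 1.164 * (y - 16)
    r = c + 1.596 * (v - 128)
    g = c - 0.813 * (v - 128) - 0.391 * (u - 128)
    b = c + 2.018 * (u - 128)
    return (clamp(r), clamp(g), clamp(b), 0)  # alpha is fixed at 0, as on the slide


def uyvy_to_rgba(data: bytes) -> List[RGBA]:
    """Convert packed UYVY bytes; each 4-byte group yields two pixels."""
    if len(data) % 4 != 0:
        raise ValueError("UYVY data length must be a multiple of 4 bytes")
    pixels: List[RGBA] = []
    for i in range(0, len(data), 4):
        u, y1, v, y2 = data[i:i + 4]
        # Both pixels of a group share the same chroma samples.
        pixels.append(yuv_to_rgba(y1, u, v))
        pixels.append(yuv_to_rgba(y2, u, v))
    return pixels
```

On a GPU, the body of the loop is what a single thread would execute for its own 4-byte group, which is why the conversion parallelizes so easily.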

20 DXT Compression: apply the FastDXT algorithm [2]

21 DXT Compression: Case #1
- Separate color conversion and DXT compression
  - Different processing properties in CUDA threads
  - E.g.: 1920x1080 frame, 64 threads per block, max blocks per kernel launch
  - Color conversion launches the kernel 6 times; DXT compression launches the kernel once
- GPU: NVIDIA GeForce GTS450 (1.61 GHz, 192 cores, 1 GB RAM with 128-bit bus)
- Experiment with CUDA Compute Visual Profiler
  - Input source: 4K, 1 frame
  - Total: 19 ms; processing: 11.65 ms
[Figure: profiler timeline (us)]

22 DXT Compression: Case #2
- Pipelining based on slices
- Concurrent CUDA kernel execution with data copy (using pinned memory)
[Figure: pipelining; host-to-device memory copy performance]
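Why slicing helps can be seen from a simple two-stage pipeline model, sketched here in Python. It is a back-of-the-envelope model under stated assumptions: copy and kernel time divide evenly across slices, one copy and one kernel can overlap at any moment, and the copy back to the host and launch overheads are ignored. The function names and any numbers fed into them are hypothetical, not measurements from the slides.

```python
def serial_time(copy_ms: float, kernel_ms: float) -> float:
    """Time when the whole frame is copied first and then processed."""
    return copy_ms + kernel_ms


def pipelined_time(copy_ms: float, kernel_ms: float, slices: int) -> float:
    """Makespan when the frame is split into equal slices and copy overlaps compute."""
    if slices < 1:
        raise ValueError("slices must be at least 1")
    c = copy_ms / slices    # copy time per slice
    k = kernel_ms / slices  # kernel time per slice
    # The first copy and the last kernel cannot overlap with anything;
    # in between, the slower stage sets the pace.
    return c + (slices - 1) * max(c, k) + k
```

With one slice the model reduces to the serial case, and as the slice count grows the total approaches the time of the slower stage alone, which is the benefit that overlapping copies and kernels aims for.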

23 DXT Compression: Case #2 (Tuning) ms ( 4.936ms) with 128 Threads (us)

24 NetOpen Switch: Experimental Verification
- Service 1: shortest-path connection
- Service 2: in-network DXT compression
- Service 3: rate-shaped connection
[Figure: testbed topology. A NOX core controls NetOpen nodes N01 and N02, which provide flow-based forwarding, GPU-based DXT compression, and rate shaping. Hosts H01 to H04 (HD sender with HD camera, HD receiver, background traffic sender, background traffic receiver) attach to the nodes through interfaces eth0 to eth6.]

25 Verification Results (Functionality)
[Figure panels:]
- Frame rates at H03 with an 800 Mbps background flow injected
- Frame rates at H03 with in-network DXT compression (background: 500 Mbps)
- Frame rates at H03 with in-network DXT compression (background: 800 Mbps)
- Frame rates at H03 with in-network DXT compression (background traffic rate-shaped in-network from 800 Mbps to 500 Mbps)

26 Conclusion
- The NetOpen switch node
  - A programmable network substrate supporting the concept of NetOpen networking service
  - Design and prototype of a NetOpen switch node providing in-network processing
  - Provides service functionalities as programmable processing modules that can be enabled independently
  - For each flow, a customized data plane can be built by selectively combining service functionalities (under logically centralized control)
- Future work
  - Enhancing the performance of in-network processing
  - Continuing with the next versions of the NetOpen switch node

27 Gwangju Institute of Science & Technology Thank you! Send Inquiry to

Graphics Cards and Graphics Processing Units. Ben Johnstone Russ Martin November 15, 2011

Graphics Cards and Graphics Processing Units. Ben Johnstone Russ Martin November 15, 2011 Graphics Cards and Graphics Processing Units Ben Johnstone Russ Martin November 15, 2011 Contents Graphics Processing Units (GPUs) Graphics Pipeline Architectures 8800-GTX200 Fermi Cayman Performance Analysis

More information

IP Video Rendering Basics

IP Video Rendering Basics CohuHD offers a broad line of High Definition network based cameras, positioning systems and VMS solutions designed for the performance requirements associated with critical infrastructure applications.

More information

GPGPU Computing. Yong Cao

GPGPU Computing. Yong Cao GPGPU Computing Yong Cao Why Graphics Card? It s powerful! A quiet trend Copyright 2009 by Yong Cao Why Graphics Card? It s powerful! Processor Processing Units FLOPs per Unit Clock Speed Processing Power

More information

NVIDIA GeForce GTX 580 GPU Datasheet

NVIDIA GeForce GTX 580 GPU Datasheet NVIDIA GeForce GTX 580 GPU Datasheet NVIDIA GeForce GTX 580 GPU Datasheet 3D Graphics Full Microsoft DirectX 11 Shader Model 5.0 support: o NVIDIA PolyMorph Engine with distributed HW tessellation engines

More information

Autodesk Revit 2016 Product Line System Requirements and Recommendations

Autodesk Revit 2016 Product Line System Requirements and Recommendations Autodesk Revit 2016 Product Line System Requirements and Recommendations Autodesk Revit 2016, Autodesk Revit Architecture 2016, Autodesk Revit MEP 2016, Autodesk Revit Structure 2016 Minimum: Entry-Level

More information

Introduction to GPGPU. Tiziano Diamanti t.diamanti@cineca.it

Introduction to GPGPU. Tiziano Diamanti t.diamanti@cineca.it t.diamanti@cineca.it Agenda From GPUs to GPGPUs GPGPU architecture CUDA programming model Perspective projection Vectors that connect the vanishing point to every point of the 3D model will intersecate

More information

Computer Graphics Hardware An Overview

Computer Graphics Hardware An Overview Computer Graphics Hardware An Overview Graphics System Monitor Input devices CPU/Memory GPU Raster Graphics System Raster: An array of picture elements Based on raster-scan TV technology The screen (and

More information

NVIDIA CUDA Software and GPU Parallel Computing Architecture. David B. Kirk, Chief Scientist

NVIDIA CUDA Software and GPU Parallel Computing Architecture. David B. Kirk, Chief Scientist NVIDIA CUDA Software and GPU Parallel Computing Architecture David B. Kirk, Chief Scientist Outline Applications of GPU Computing CUDA Programming Model Overview Programming in CUDA The Basics How to Get

More information

Networking Virtualization Using FPGAs

Networking Virtualization Using FPGAs Networking Virtualization Using FPGAs Russell Tessier, Deepak Unnikrishnan, Dong Yin, and Lixin Gao Reconfigurable Computing Group Department of Electrical and Computer Engineering University of Massachusetts,

More information

ultra fast SOM using CUDA

ultra fast SOM using CUDA ultra fast SOM using CUDA SOM (Self-Organizing Map) is one of the most popular artificial neural network algorithms in the unsupervised learning category. Sijo Mathew Preetha Joy Sibi Rajendra Manoj A

More information

Power Benefits Using Intel Quick Sync Video H.264 Codec With Sorenson Squeeze

Power Benefits Using Intel Quick Sync Video H.264 Codec With Sorenson Squeeze Power Benefits Using Intel Quick Sync Video H.264 Codec With Sorenson Squeeze Whitepaper December 2012 Anita Banerjee Contents Introduction... 3 Sorenson Squeeze... 4 Intel QSV H.264... 5 Power Performance...

More information

Introduction to GPU Architecture

Introduction to GPU Architecture Introduction to GPU Architecture Ofer Rosenberg, PMTS SW, OpenCL Dev. Team AMD Based on From Shader Code to a Teraflop: How GPU Shader Cores Work, By Kayvon Fatahalian, Stanford University Content 1. Three

More information

Cloud Gaming & Application Delivery with NVIDIA GRID Technologies. Franck DIARD, Ph.D. GRID Architect, NVIDIA

Cloud Gaming & Application Delivery with NVIDIA GRID Technologies. Franck DIARD, Ph.D. GRID Architect, NVIDIA Cloud Gaming & Application Delivery with NVIDIA GRID Technologies Franck DIARD, Ph.D. GRID Architect, NVIDIA What is GRID? Using efficient GPUS in efficient servers What is Streaming? Transporting pixels

More information

OpenFlow with Intel 82599. Voravit Tanyingyong, Markus Hidell, Peter Sjödin

OpenFlow with Intel 82599. Voravit Tanyingyong, Markus Hidell, Peter Sjödin OpenFlow with Intel 82599 Voravit Tanyingyong, Markus Hidell, Peter Sjödin Outline Background Goal Design Experiment and Evaluation Conclusion OpenFlow SW HW Open up commercial network hardware for experiment

More information

Open Source Network: Software-Defined Networking (SDN) and OpenFlow

Open Source Network: Software-Defined Networking (SDN) and OpenFlow Open Source Network: Software-Defined Networking (SDN) and OpenFlow Insop Song, Ericsson LinuxCon North America, Aug. 2012, San Diego CA Objectives Overview of OpenFlow Overview of Software Defined Networking

More information

The Evolution of Computer Graphics. SVP, Content & Technology, NVIDIA

The Evolution of Computer Graphics. SVP, Content & Technology, NVIDIA The Evolution of Computer Graphics Tony Tamasi SVP, Content & Technology, NVIDIA Graphics Make great images intricate shapes complex optical effects seamless motion Make them fast invent clever techniques

More information

The Future Of Animation Is Games

The Future Of Animation Is Games The Future Of Animation Is Games 王 銓 彰 Next Media Animation, Media Lab, Director cwang@1-apple.com.tw The Graphics Hardware Revolution ( 繪 圖 硬 體 革 命 ) : GPU-based Graphics Hardware Multi-core (20 Cores

More information

Interactive Level-Set Deformation On the GPU

Interactive Level-Set Deformation On the GPU Interactive Level-Set Deformation On the GPU Institute for Data Analysis and Visualization University of California, Davis Problem Statement Goal Interactive system for deformable surface manipulation

More information

An Efficient Application Virtualization Mechanism using Separated Software Execution System

An Efficient Application Virtualization Mechanism using Separated Software Execution System An Efficient Application Virtualization Mechanism using Separated Software Execution System Su-Min Jang, Won-Hyuk Choi and Won-Young Kim Cloud Computing Research Department, Electronics and Telecommunications

More information

Network Virtualization and Application Delivery Using Software Defined Networking

Network Virtualization and Application Delivery Using Software Defined Networking Network Virtualization and Application Delivery Using Software Defined Networking Project Leader: Subharthi Paul Washington University in Saint Louis Saint Louis, MO 63130 Jain@cse.wustl.edu Keynote at

More information

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip.

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip. Lecture 11: Multi-Core and GPU Multi-core computers Multithreading GPUs General Purpose GPUs Zebo Peng, IDA, LiTH 1 Multi-Core System Integration of multiple processor cores on a single chip. To provide

More information

GPU-based Decompression for Medical Imaging Applications

GPU-based Decompression for Medical Imaging Applications GPU-based Decompression for Medical Imaging Applications Al Wegener, CTO Samplify Systems 160 Saratoga Ave. Suite 150 Santa Clara, CA 95051 sales@samplify.com (888) LESS-BITS +1 (408) 249-1500 1 Outline

More information

Introduction to GP-GPUs. Advanced Computer Architectures, Cristina Silvano, Politecnico di Milano 1

Introduction to GP-GPUs. Advanced Computer Architectures, Cristina Silvano, Politecnico di Milano 1 Introduction to GP-GPUs Advanced Computer Architectures, Cristina Silvano, Politecnico di Milano 1 GPU Architectures: How do we reach here? NVIDIA Fermi, 512 Processing Elements (PEs) 2 What Can It Do?

More information

GPUs for Scientific Computing

GPUs for Scientific Computing GPUs for Scientific Computing p. 1/16 GPUs for Scientific Computing Mike Giles mike.giles@maths.ox.ac.uk Oxford-Man Institute of Quantitative Finance Oxford University Mathematical Institute Oxford e-research

More information

================================================================== CONTENTS ==================================================================

================================================================== CONTENTS ================================================================== Disney Epic Mickey 2 : The Power of Two Read Me File ( Disney) Thank you for purchasing Disney Epic Mickey 2 : The Power of Two. This readme file contains last minute information that did not make it into

More information

CLOUD GAMING WITH NVIDIA GRID TECHNOLOGIES Franck DIARD, Ph.D., SW Chief Software Architect GDC 2014

CLOUD GAMING WITH NVIDIA GRID TECHNOLOGIES Franck DIARD, Ph.D., SW Chief Software Architect GDC 2014 CLOUD GAMING WITH NVIDIA GRID TECHNOLOGIES Franck DIARD, Ph.D., SW Chief Software Architect GDC 2014 Introduction Cloud ification < 2013 2014+ Music, Movies, Books Games GPU Flops GPUs vs. Consoles 10,000

More information

Transcend the Vision. Embedded Graphic Solutions that Lead to New Territory. Embedded Graphic Solutions. www.advantech.com

Transcend the Vision. Embedded Graphic Solutions that Lead to New Territory. Embedded Graphic Solutions. www.advantech.com Transcend the Vision Embedded Graphic Solutions that Lead to New Territory Embedded Graphic Solutions www.advantech.com Compact Product Portfolio Designed for Compatibility One-slot Design Low Profile

More information

Xeon+FPGA Platform for the Data Center

Xeon+FPGA Platform for the Data Center Xeon+FPGA Platform for the Data Center ISCA/CARL 2015 PK Gupta, Director of Cloud Platform Technology, DCG/CPG Overview Data Center and Workloads Xeon+FPGA Accelerator Platform Applications and Eco-system

More information

Low power GPUs a view from the industry. Edvard Sørgård

Low power GPUs a view from the industry. Edvard Sørgård Low power GPUs a view from the industry Edvard Sørgård 1 ARM in Trondheim Graphics technology design centre From 2006 acquisition of Falanx Microsystems AS Origin of the ARM Mali GPUs Main activities today

More information

Qualified Apple Mac Workstations for Avid Media Composer v5.0.x

Qualified Apple Mac Workstations for Avid Media Composer v5.0.x Qualified Apple Mac Workstations for Media Composer v5.0.x Qualified Workstation Two 2.66GHz 6-Core Intel Xeon Westmere (12 cores) 6 GB Ram (6x1GB) ATI Radeon HD 5770 1GB ^ Nitris Mojo Mojo Mojo SDI or

More information

System requirements for Autodesk Building Design Suite 2017

System requirements for Autodesk Building Design Suite 2017 System requirements for Autodesk Building Design Suite 2017 For specific recommendations for a product within the Building Design Suite, please refer to that products system requirements for additional

More information

Applications to Computational Financial and GPU Computing. May 16th. Dr. Daniel Egloff +41 44 520 01 17 +41 79 430 03 61

Applications to Computational Financial and GPU Computing. May 16th. Dr. Daniel Egloff +41 44 520 01 17 +41 79 430 03 61 F# Applications to Computational Financial and GPU Computing May 16th Dr. Daniel Egloff +41 44 520 01 17 +41 79 430 03 61 Today! Why care about F#? Just another fashion?! Three success stories! How Alea.cuBase

More information

Introduction GPU Hardware GPU Computing Today GPU Computing Example Outlook Summary. GPU Computing. Numerical Simulation - from Models to Software

Introduction GPU Hardware GPU Computing Today GPU Computing Example Outlook Summary. GPU Computing. Numerical Simulation - from Models to Software GPU Computing Numerical Simulation - from Models to Software Andreas Barthels JASS 2009, Course 2, St. Petersburg, Russia Prof. Dr. Sergey Y. Slavyanov St. Petersburg State University Prof. Dr. Thomas

More information

Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data

Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data Amanda O Connor, Bryan Justice, and A. Thomas Harris IN52A. Big Data in the Geosciences:

More information

GPU Parallel Computing Architecture and CUDA Programming Model

GPU Parallel Computing Architecture and CUDA Programming Model GPU Parallel Computing Architecture and CUDA Programming Model John Nickolls Outline Why GPU Computing? GPU Computing Architecture Multithreading and Arrays Data Parallel Problem Decomposition Parallel

More information

~ Greetings from WSU CAPPLab ~

~ Greetings from WSU CAPPLab ~ ~ Greetings from WSU CAPPLab ~ Multicore with SMT/GPGPU provides the ultimate performance; at WSU CAPPLab, we can help! Dr. Abu Asaduzzaman, Assistant Professor and Director Wichita State University (WSU)

More information

GPU File System Encryption Kartik Kulkarni and Eugene Linkov

GPU File System Encryption Kartik Kulkarni and Eugene Linkov GPU File System Encryption Kartik Kulkarni and Eugene Linkov 5/10/2012 SUMMARY. We implemented a file system that encrypts and decrypts files. The implementation uses the AES algorithm computed through

More information

Real-time Visual Tracker by Stream Processing

Real-time Visual Tracker by Stream Processing Real-time Visual Tracker by Stream Processing Simultaneous and Fast 3D Tracking of Multiple Faces in Video Sequences by Using a Particle Filter Oscar Mateo Lozano & Kuzahiro Otsuka presented by Piotr Rudol

More information

Several tips on how to choose a suitable computer

Several tips on how to choose a suitable computer Several tips on how to choose a suitable computer This document provides more specific information on how to choose a computer that will be suitable for scanning and postprocessing of your data with Artec

More information

GPU Architecture. Michael Doggett ATI

GPU Architecture. Michael Doggett ATI GPU Architecture Michael Doggett ATI GPU Architecture RADEON X1800/X1900 Microsoft s XBOX360 Xenos GPU GPU research areas ATI - Driving the Visual Experience Everywhere Products from cell phones to super

More information

Recent Advances and Future Trends in Graphics Hardware. Michael Doggett Architect November 23, 2005

Recent Advances and Future Trends in Graphics Hardware. Michael Doggett Architect November 23, 2005 Recent Advances and Future Trends in Graphics Hardware Michael Doggett Architect November 23, 2005 Overview XBOX360 GPU : Xenos Rendering performance GPU architecture Unified shader Memory Export Texture/Vertex

More information

E6895 Advanced Big Data Analytics Lecture 14:! NVIDIA GPU Examples and GPU on ios devices

E6895 Advanced Big Data Analytics Lecture 14:! NVIDIA GPU Examples and GPU on ios devices E6895 Advanced Big Data Analytics Lecture 14: NVIDIA GPU Examples and GPU on ios devices Ching-Yung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science IBM Chief Scientist,

More information

Real-Time BC6H Compression on GPU. Krzysztof Narkowicz Lead Engine Programmer Flying Wild Hog

Real-Time BC6H Compression on GPU. Krzysztof Narkowicz Lead Engine Programmer Flying Wild Hog Real-Time BC6H Compression on GPU Krzysztof Narkowicz Lead Engine Programmer Flying Wild Hog Introduction BC6H is lossy block based compression designed for FP16 HDR textures Hardware supported since DX11

More information

Lecture Notes in Computer Science: Media-Oriented Service Overlay Network Architecture over Future Internet Research for Sustainable Testbed

Lecture Notes in Computer Science: Media-Oriented Service Overlay Network Architecture over Future Internet Research for Sustainable Testbed Lecture Notes in Computer Science: Media-Oriented Overlay Network Architecture over Future Internet Research for Sustainable Testbed Sungwon Lee 1, Sang Woo Han 2, Jong Won Kim 2, and Seung Gwan Lee 1.

More information

L20: GPU Architecture and Models

L20: GPU Architecture and Models L20: GPU Architecture and Models scribe(s): Abdul Khalifa 20.1 Overview GPUs (Graphics Processing Units) are large parallel structure of processing cores capable of rendering graphics efficiently on displays.

More information

Next Generation GPU Architecture Code-named Fermi

Next Generation GPU Architecture Code-named Fermi Next Generation GPU Architecture Code-named Fermi The Soul of a Supercomputer in the Body of a GPU Why is NVIDIA at Super Computing? Graphics is a throughput problem paint every pixel within frame time

More information

Silverlight for Windows Embedded Graphics and Rendering Pipeline 1

Silverlight for Windows Embedded Graphics and Rendering Pipeline 1 Silverlight for Windows Embedded Graphics and Rendering Pipeline 1 Silverlight for Windows Embedded Graphics and Rendering Pipeline Windows Embedded Compact 7 Technical Article Writers: David Franklin,

More information

The Lagopus SDN Software Switch. 3.1 SDN and OpenFlow. 3. Cloud Computing Technology

The Lagopus SDN Software Switch. 3.1 SDN and OpenFlow. 3. Cloud Computing Technology 3. The Lagopus SDN Software Switch Here we explain the capabilities of the new Lagopus software switch in detail, starting with the basics of SDN and OpenFlow. 3.1 SDN and OpenFlow Those engaged in network-related

More information

Boundless Security Systems, Inc.

Boundless Security Systems, Inc. Boundless Security Systems, Inc. sharper images with better access and easier installation Product Overview Product Summary Data Sheet Control Panel client live and recorded viewing, and search software

More information

Parallelization of video compressing with FFmpeg and OpenMP in supercomputing environment

Parallelization of video compressing with FFmpeg and OpenMP in supercomputing environment Proceedings of the 9 th International Conference on Applied Informatics Eger, Hungary, January 29 February 1, 2014. Vol. 1. pp. 231 237 doi: 10.14794/ICAI.9.2014.1.231 Parallelization of video compressing

More information

MIDeA: A Multi-Parallel Intrusion Detection Architecture

MIDeA: A Multi-Parallel Intrusion Detection Architecture MIDeA: A Multi-Parallel Intrusion Detection Architecture Giorgos Vasiliadis, FORTH-ICS, Greece Michalis Polychronakis, Columbia U., USA Sotiris Ioannidis, FORTH-ICS, Greece CCS 2011, 19 October 2011 Network

More information

Real-Time Realistic Rendering. Michael Doggett Docent Department of Computer Science Lund university

Real-Time Realistic Rendering. Michael Doggett Docent Department of Computer Science Lund university Real-Time Realistic Rendering Michael Doggett Docent Department of Computer Science Lund university 30-5-2011 Visually realistic goal force[d] us to completely rethink the entire rendering process. Cook

More information

Using Fuzzy Logic Control to Provide Intelligent Traffic Management Service for High-Speed Networks ABSTRACT:

Using Fuzzy Logic Control to Provide Intelligent Traffic Management Service for High-Speed Networks ABSTRACT: Using Fuzzy Logic Control to Provide Intelligent Traffic Management Service for High-Speed Networks ABSTRACT: In view of the fast-growing Internet traffic, this paper propose a distributed traffic management

More information

Overview. Lecture 1: an introduction to CUDA. Hardware view. Hardware view. hardware view software view CUDA programming

Overview. Lecture 1: an introduction to CUDA. Hardware view. Hardware view. hardware view software view CUDA programming Overview Lecture 1: an introduction to CUDA Mike Giles mike.giles@maths.ox.ac.uk hardware view software view Oxford University Mathematical Institute Oxford e-research Centre Lecture 1 p. 1 Lecture 1 p.

More information

Getting Started with RemoteFX in Windows Embedded Compact 7

Getting Started with RemoteFX in Windows Embedded Compact 7 Getting Started with RemoteFX in Windows Embedded Compact 7 Writers: Randy Ocheltree, Ryan Wike Technical Reviewer: Windows Embedded Compact RDP Team Applies To: Windows Embedded Compact 7 Published: January

More information

The Collaboratorium & Remote Visualization at SARA. Tijs de Kler SARA Visualization Group (tijs.dekler@sara.nl)

The Collaboratorium & Remote Visualization at SARA. Tijs de Kler SARA Visualization Group (tijs.dekler@sara.nl) The Collaboratorium & Remote Visualization at SARA Tijs de Kler SARA Visualization Group (tijs.dekler@sara.nl) The Collaboratorium! Goals Support collaboration, presentations and visualization for the

More information
