Power Parallel Programming for Mobile Devices

1 Power Parallel Programming for Mobile Devices Calin Cascaval, Sept. 10, Qualcomm Technologies, Inc.

2 Outline. Mobile computing market: what do users care about? Mobile computing landscape: what does the platform look like? Applications and developers: who is writing code for mobile? Browser wars revisited: why does it matter? Parallel heterogeneous programming made easier using MARE.

3 Mobile Market

4 Mobile Market Size. The size of the mobile market is roughly doubling every two years. [Chart: market size in billions of US $ for Mobile, Desktop, Servers, and HPC, with a "> 20x" annotation (forecast).] Source: IDC. HPC forecast: Intersect360.

5 What do users care about? Source: McKinsey's Consumer and Shopping Insights

6 What about next morning? Source: McKinsey's Consumer and Shopping Insights

7 The Internet of Everything. Connectivity: an estimated 25B devices connected by 2020, talking to each other, making choices, and taking decisions. Context: use context awareness to filter the data that reaches every person (time, location, taste). Control: allow users to control and customize their experience through new, mobile-centric interfaces. Enabling whole new classes of applications, where power is even more critical.

8 What's in a smartphone?

9 Anatomy of a Mobile SoC: Qualcomm Snapdragon. Source: http://

10 Mobile Constraints: Energy (figure label "AA"), Thermal

11 Mobile Applications. Connectivity: social networking and sharing; media rich, data heavy. Gaming: on-device and multiplayer games with realistic graphics. Browsing: information gateway. Imaging: computational photography, search and processing of images and video.

12 Mobile Software Development. Developers: think in terms of services and features; write portable apps, with performance not the main concern; use JavaScript frameworks. Hardware: accelerators, DSP, GPU, CPU(s), shown on an axis from high power efficiency to low power efficiency. [Figure: block diagrams of SoC hardware, including an ARM926EJ-S subsystem with AHB/AXI interconnect, the GPU shader system (SP #0 to SP #3, texture pipes, render backends, GMEM), OCMEM with data mover and controller, and video blocks (VSP, VPP) with shared RAM banks.]

13 How can we easily write parallel applications that use all available cores in a battery-powered device?

14 Our Vision of a Mobile Software Stack. [Stack diagram: Web Apps and Native Apps; JavaScript frameworks; Browser Engine; Domain Specific Parallel Libraries; MARE.] Performance: domain-specific libraries. Exploit domain knowledge to provide composable libraries for all programmers; hide hardware complexity. Portability: Zoomm Web App Engine. Use of concurrency to optimize execution of Web Apps; hardware exploitation through the browser. Programmability: MARE. Parallel, heterogeneous programming made easier; power and performance optimizations.

15 Browsers: why do we care?

16 Web Browser Usage. [Chart: browsing usage, with categories Information access and Web Applications and values 89% and 85%.] Source: Illuminas, 2013

17 Browser Execution Time Breakdown. [Pie chart: ARM Cortex A9 execution time split across rendering, JavaScript, layout, CSS, parsing, and others; slice values, unpaired: 5%, 4%, 19%, 20%, 31%, 21%.] Insight: there is no one dominant component. Parallelization must address the entire browser structure! Average over the top 30 Alexa sites (May 2010), WebKit browser on a 400MHz Cortex A9, Linux.

18 Zoomm: Pervasive Concurrency. [Architecture diagram: URL and events enter through the User Interface; Resource Manager with prefetching and image decoding; per-page DOM Engine with HTML parsing, timers, and events; CSS parsing and styling; per-page JavaScript Engine with compilation and execution; Rendering Engine with layout tree, layout, and render.] Cascaval et al.: Zoomm: a parallel web browser engine for multicore mobile devices, PPoPP 2013

19 Zoomm: Page Load Performance. [Chart: page load time in seconds, WebKit vs. Zoomm, for CNN, BBC, Yahoo, Guardian, NYT, Facebook, Engadget, and QQ.] HTC Jetstream, MSM8660, 2-core, 1.5GHz, Aug

20 Qualcomm MARE: Multicore Asynchronous Runtime Environment

21 The Smartphone of 2010. All applications run on a single-core Qualcomm Snapdragon processor at 1GHz. [Diagram: App 1 to App 4 sharing one core.]

22 The Smartphone of 2013. Multiple applications can run simultaneously on a quad-core Snapdragon at 2+GHz. [Diagram: App 1 to App 4, each on its own core, Core 1 to Core 4.]

23 What is Qualcomm MARE? MARE is a programming model and a runtime system that provides simple yet powerful abstractions for parallel, power-efficient software. A simple C++ API allows developers to express concurrency. It is a user-level library that runs on any Android device, and on Linux, Mac OS X, and Windows platforms. The goal of MARE is to reduce the effort required to write apps that fully utilize heterogeneous SoCs.

24 MARE Workflow. Focus on your application and not on the hardware. Understand the algorithms; partition the algorithms into independent units of work; set up task dependencies; link with the MARE runtime. [Diagram: Algorithms, MARE Application, MARE Runtime.]

25 Qualcomm MARE API Concepts. Tasks are units of work that can be asynchronously executed. Groups are sets of tasks that can be canceled or waited on.

26 Advantages of using Qualcomm MARE. Simple: tasks are a natural way to express parallelism. Productive: focus on application logic, not on thread management. Efficient: task dependences allow the MARE runtime to perform more intelligent scheduling decisions.

27 Hello World!

    #include <stdio.h>
    #include <mare/mare.h>

    int main() {
        mare::runtime::init();                                     // Initialize MARE
        auto hello = mare::create_task([] { printf("hello "); });  // Create task
        auto world = mare::create_task([] { printf("World!"); });  // Create task
        hello >> world;                                            // Set dependency
        mare::launch(hello);                                       // Launch hello task
        mare::launch(world);                                       // Launch world task
        mare::wait_for(world);                                     // Wait to complete
        mare::runtime::shutdown();                                 // Shutdown MARE
        return 0;
    }

28 Parallel Programming Patterns in MARE. MARE offers several commonly used parallel patterns: mare::pfor_each processes the elements of a collection; mare::pscan performs an in-place parallel prefix operation for the elements of a collection; mare::transform performs a map operation on the elements of a collection. Programmers can also create their own patterns using the MARE API. Parallel data structures: concurrent queues, hash tables, etc.

29 Bullet Physics Parallelization using MARE. Three major components account for about 75% of total execution time. Constraint Solver: finds non-interacting groups of objects and solves them as independent tasks; coarse-grained task parallelization. Rendering: very coarse-grained task parallelization; offloaded to a separate thread to allow for continuous simulation. Narrow Phase Collision Detection: fine-grained data parallelization. About 2x FPS improvement on typical mobile devices. Dramatically improves the smoothness of rendering and the overall user experience.

30 Bullet Physics Parallelization. [Chart: frames per second by frame number, MARE vs. serial.]

31 How can you try MARE? Stay tuned! Planning preview availability on Qualcomm Developer Network. Fall 2013: beta release for the multicore version. We want your feedback! What scenarios will you use it for? Usability and features of the library.

32 Acknowledgments. Manticore team: Pablo Montesinos Ortego, Michael Weber, Wayne Piekarski, Behnam Robatmili, Dario Suarez-Gracia, Vrajesh Bhavsar, Jimi Xenidis, Han Zhao, Kishore Puskuri. Interns: Madhukar Kedlaya, Christian Delozier, Freark Van Der Berg, Christoph Kershbaumer. Former members: Mehrdad Reshadi, Seth Fowler, Alex Shye, Dilma DaSilva. Qualcomm: Nayeem Islam, Mark Bapst, Charles Bergan, Matt Grob, Ben Gaster. MulticoreWare: Wen-Mei Hwu, Mark Allender, Kevin Wu.

33 Challenges. Power and performance: custom hardware for power efficiency; heterogeneity is the game; expertly designed and optimized frameworks that hide hardware complexity. Programmability: tools and languages to enable programmers to understand and express concurrency and application semantics; composable building blocks. Mobile has exciting applications and a huge impact.

34 Legal Disclaimers. All opinions expressed in this presentation are mine and may not necessarily reflect those of my employer. Qualcomm and Snapdragon are trademarks of Qualcomm Incorporated, registered in the United States and other countries. All Qualcomm Incorporated trademarks are used with permission. Other products and brand names may be trademarks or registered trademarks of their respective owners.


More information

WEB, HYBRID, NATIVE EXPLAINED CRAIG ISAKSON. June 2013 MOBILE ENGINEERING LEAD / SOFTWARE ENGINEER

WEB, HYBRID, NATIVE EXPLAINED CRAIG ISAKSON. June 2013 MOBILE ENGINEERING LEAD / SOFTWARE ENGINEER WEB, HYBRID, NATIVE EXPLAINED June 2013 CRAIG ISAKSON MOBILE ENGINEERING LEAD / SOFTWARE ENGINEER 701.235.5525 888.sundog fax: 701.235.8941 2000 44th St. S Floor 6 Fargo, ND 58103 www.sundoginteractive.com

More information

JSR proposal: Enhanced Hybrid APIs

JSR proposal: Enhanced Hybrid APIs JSR proposal: Enhanced Hybrid APIs Introduc;on HTML5 is not the future of apps. While developers dream of 'write once run everywhere' the fragmented support for and limited APIs within HTML5 make this

More information

The End of Personal Computer

The End of Personal Computer The End of Personal Computer Siddartha Reddy N Computer Science Department San Jose State University San Jose, CA 95112 408-668-5452 siddartha.nagireddy@gmail.com ABSTRACT Today, the dominance of the PC

More information

Advanced Rendering for Engineering & Styling

Advanced Rendering for Engineering & Styling Advanced Rendering for Engineering & Styling Prof. B.Brüderlin Brüderlin,, M Heyer 3Dinteractive GmbH & TU-Ilmenau, Germany SGI VizDays 2005, Rüsselsheim Demands in Engineering & Styling Engineering: :

More information

159.735. Final Report. Cluster Scheduling. Submitted by: Priti Lohani 04244354

159.735. Final Report. Cluster Scheduling. Submitted by: Priti Lohani 04244354 159.735 Final Report Cluster Scheduling Submitted by: Priti Lohani 04244354 1 Table of contents: 159.735... 1 Final Report... 1 Cluster Scheduling... 1 Table of contents:... 2 1. Introduction:... 3 1.1

More information

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture Last Class: OS and Computer Architecture System bus Network card CPU, memory, I/O devices, network card, system bus Lecture 3, page 1 Last Class: OS and Computer Architecture OS Service Protection Interrupts

More information

Dynamic Resolution Rendering

Dynamic Resolution Rendering Dynamic Resolution Rendering Doug Binks Introduction The resolution selection screen has been one of the defining aspects of PC gaming since the birth of games. In this whitepaper and the accompanying

More information

MEng, BSc Applied Computer Science

MEng, BSc Applied Computer Science School of Computing FACULTY OF ENGINEERING MEng, BSc Applied Computer Science Year 1 COMP1212 Computer Processor Effective programming depends on understanding not only how to give a machine instructions

More information

Connecting Software. CB Mobile CRM Windows Phone 8. User Manual

Connecting Software. CB Mobile CRM Windows Phone 8. User Manual CB Mobile CRM Windows Phone 8 User Manual Summary This document describes the Windows Phone 8 Mobile CRM app functionality and available features. The document is intended for end users as user manual

More information

An Overview of Amazon Silk Amazon s new cloud-powered browser

An Overview of Amazon Silk Amazon s new cloud-powered browser An Overview of Amazon Silk Amazon s new cloud-powered browser Jon Jenkins Twitter - @jonjenk jjenkin@amazon.com O Reilly Velocity EU November 8, 2011 Web Page Complexity is Steadily Increasing Amazon s

More information

Manjrasoft Market Oriented Cloud Computing Platform

Manjrasoft Market Oriented Cloud Computing Platform Manjrasoft Market Oriented Cloud Computing Platform Innovative Solutions for 3D Rendering Aneka is a market oriented Cloud development and management platform with rapid application development and workload

More information

Motivation: Smartphone Market

Motivation: Smartphone Market Motivation: Smartphone Market Smartphone Systems External Display Device Display Smartphone Systems Smartphone-like system Main Camera Front-facing Camera Central Processing Unit Device Display Graphics

More information

Responsive Web Design. vs. Mobile Web App: What s Best for Your Enterprise? A WhitePaper by RapidValue Solutions

Responsive Web Design. vs. Mobile Web App: What s Best for Your Enterprise? A WhitePaper by RapidValue Solutions Responsive Web Design vs. Mobile Web App: What s Best for Your Enterprise? A WhitePaper by RapidValue Solutions The New Design Trend: Build a Website; Enable Self-optimization Across All Mobile De vices

More information

owncloud Enterprise Edition on IBM Infrastructure

owncloud Enterprise Edition on IBM Infrastructure owncloud Enterprise Edition on IBM Infrastructure A Performance and Sizing Study for Large User Number Scenarios Dr. Oliver Oberst IBM Frank Karlitschek owncloud Page 1 of 10 Introduction One aspect of

More information

HPC with Multicore and GPUs

HPC with Multicore and GPUs HPC with Multicore and GPUs Stan Tomov Electrical Engineering and Computer Science Department University of Tennessee, Knoxville CS 594 Lecture Notes March 4, 2015 1/18 Outline! Introduction - Hardware

More information

Radeon GPU Architecture and the Radeon 4800 series. Michael Doggett Graphics Architecture Group June 27, 2008

Radeon GPU Architecture and the Radeon 4800 series. Michael Doggett Graphics Architecture Group June 27, 2008 Radeon GPU Architecture and the series Michael Doggett Graphics Architecture Group June 27, 2008 Graphics Processing Units Introduction GPU research 2 GPU Evolution GPU started as a triangle rasterizer

More information

FLOATING-POINT ARITHMETIC IN AMD PROCESSORS MICHAEL SCHULTE AMD RESEARCH JUNE 2015

FLOATING-POINT ARITHMETIC IN AMD PROCESSORS MICHAEL SCHULTE AMD RESEARCH JUNE 2015 FLOATING-POINT ARITHMETIC IN AMD PROCESSORS MICHAEL SCHULTE AMD RESEARCH JUNE 2015 AGENDA The Kaveri Accelerated Processing Unit (APU) The Graphics Core Next Architecture and its Floating-Point Arithmetic

More information

LOOKING FOR AN AMAZING PROCESSOR. Product Brief 6th Gen Intel Core Processors for Desktops: S-series

LOOKING FOR AN AMAZING PROCESSOR. Product Brief 6th Gen Intel Core Processors for Desktops: S-series Product Brief 6th Gen Intel Core Processors for Desktops: Sseries LOOKING FOR AN AMAZING PROCESSOR for your next desktop PC? Look no further than 6th Gen Intel Core processors. With amazing performance

More information

Using Mobile Processors for Cost Effective Live Video Streaming to the Internet

Using Mobile Processors for Cost Effective Live Video Streaming to the Internet Using Mobile Processors for Cost Effective Live Video Streaming to the Internet Hans-Joachim Gelke Tobias Kammacher Institute of Embedded Systems Source: Apple Inc. Agenda 1. Typical Application 2. Available

More information

Performance monitoring at CERN openlab. July 20 th 2012 Andrzej Nowak, CERN openlab

Performance monitoring at CERN openlab. July 20 th 2012 Andrzej Nowak, CERN openlab Performance monitoring at CERN openlab July 20 th 2012 Andrzej Nowak, CERN openlab Data flow Reconstruction Selection and reconstruction Online triggering and filtering in detectors Raw Data (100%) Event

More information

Computer Graphics Hardware An Overview

Computer Graphics Hardware An Overview Computer Graphics Hardware An Overview Graphics System Monitor Input devices CPU/Memory GPU Raster Graphics System Raster: An array of picture elements Based on raster-scan TV technology The screen (and

More information

All ju The State of Software Development Today: A Parallel View. June 2012

All ju The State of Software Development Today: A Parallel View. June 2012 All ju The State of Software Development Today: A Parallel View June 2012 2 What is Parallel Programming? When students study computer programming, the normal approach is to learn to program sequentially.

More information

ICN based Scalable Video Conferencing on Virtual Edge Service Routers (VSER) Platform

ICN based Scalable Video Conferencing on Virtual Edge Service Routers (VSER) Platform Security Level: 客 户 伙 伴 在 右 35pt B0 体 : ium rial 32pt B0 体 based Scalable Video Conferencing on Virtual Edge Service Routers (VSER) Platform 配 色 建 议 不 超 以 下 方 案 内 只 用 22pt 色 体 : ular rial Asit Chakraborti,

More information

Bringing Big Data Modelling into the Hands of Domain Experts

Bringing Big Data Modelling into the Hands of Domain Experts Bringing Big Data Modelling into the Hands of Domain Experts David Willingham Senior Application Engineer MathWorks david.willingham@mathworks.com.au 2015 The MathWorks, Inc. 1 Data is the sword of the

More information

separate the content technology display or delivery technology

separate the content technology display or delivery technology Good Morning. In the mobile development space, discussions are often focused on whose winning the mobile technology wars how Android has the greater share of the mobile market or how Apple is has the greatest

More information

Avira Secure Backup INSTALLATION GUIDE. HowTo

Avira Secure Backup INSTALLATION GUIDE. HowTo Avira Secure Backup INSTALLATION GUIDE HowTo Table of contents 1. Introduction... 3 2. System Requirements... 3 2.1 Windows...3 2.2 Mac...4 2.3 ios (iphone, ipad and ipod touch)...4 3. Avira Secure Backup

More information

Performance Analysis of Web-browsing Speed in Smart Mobile Devices

Performance Analysis of Web-browsing Speed in Smart Mobile Devices Performance Analysis of Web-browsing Speed in Smart Mobile Devices Yu-Doo Kim and Il-Young Moon Korea University of Technology and Education, kydman@koreatech.ac.kr Abstract The rapid growth of telecommunication

More information