Computer Architecture
- Amie Carpenter
- 8 years ago
1 Computer Architecture Slide Sets WS 2013/2014 Prof. Dr. Uwe Brinkschulte M.Sc. Benjamin Betting Part 6 Fundamentals in Performance Evaluation Computer Architecture Part 6 page 1 of 22 Prof. Dr. Uwe Brinkschulte, M.Sc. Benjamin Betting
2 Why performance evaluation?
- Comparison of computers
- Selection of a computer
- Changes in the configuration of an existing computer (tuning)
- Design of computers
- Verification or validation of design decisions
Methods for performance evaluation:
(1) analytical methods
(2) measurements
3 Aspects for evaluation
- modularity: Is the system composed of mostly independent parts, so-called modules?
- orthogonality: Does every module offer its own set of functions to the system? Is no particular function offered by more than one module?
- adequacy: Do the performance and cost of a module match its weight within the whole system?
- virtuality: Have the physical limits of the hardware modules been lifted for the user? (Example: virtual memory)
- symmetry: Is it possible to derive the function of unknown parts from the properties of some known parts of the architecture, e.g. parts of the ISA?
- transparency: Are non-relevant parts of the architecture hidden from the user? (Example: transparent coprocessor)
4 Analytical methods
Performance measures (hypothetical maximum performance!):
- MIPS (Millions of Instructions per Second)
- MFLOPS (Millions of Floating Point Operations per Second)
Mix (also calculated, not measured):
- In a mix, the average execution time of each instruction is calculated and scaled by a characteristic weight.
Core programs:
- Typical application programs, written for the evaluated computer
- No measurements; the overall execution time is calculated from the execution times of the individual machine instructions
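The mix calculation can be sketched as a weighted average. This is a minimal illustration, not the method of any particular standard mix: the function name `mix_average_time`, the instruction classes, the weights, and the execution times are all hypothetical.

```python
def mix_average_time(mix):
    """Return the weighted average instruction time of an instruction mix.

    `mix` maps an instruction class to a pair (weight, execution_time_ns).
    The weights are relative frequencies and are expected to sum to 1.
    """
    total_weight = sum(weight for weight, _ in mix.values())
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weight * time_ns for weight, time_ns in mix.values())


# Hypothetical mix: classes, weights, and times are made up for illustration.
example_mix = {
    "alu": (0.50, 1.0),
    "load_store": (0.30, 2.0),
    "branch": (0.15, 3.0),
    "float": (0.05, 10.0),
}
avg_ns = mix_average_time(example_mix)
print(f"average instruction time: {avg_ns:.2f} ns")
```

Because every input is a calculated or datasheet value, the result is an analytical estimate rather than a measurement.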
5 Performance measures
Runtime:
$$\text{runtime} = \#\text{clock cycles} \times \text{clock period}$$
MIPS (million instructions per second):
$$\text{MIPS} = \frac{\text{instruction count}}{\text{runtime} \times 10^{6}}$$
$$\text{MIPS} = \frac{\text{instruction count}}{\#\text{clock cycles} \times \text{clock period} \times 10^{6}} = \frac{\text{instruction count} \times \text{clock frequency}}{\#\text{clock cycles} \times 10^{6}}$$
$$\text{MIPS} = \frac{\text{clock frequency}}{\text{CPI} \times 10^{6}} = \frac{\text{clock frequency} \times \text{IPC}}{10^{6}}$$
CPI (cycles per instruction):
$$\text{CPI} = \frac{\#\text{clock cycles}}{\text{instruction count}}$$
IPC (instructions per cycle):
$$\text{IPC} = \frac{1}{\text{CPI}}$$
MFLOPS (million floating point operations per second):
$$\text{MFLOPS} = \frac{\#\text{executed floating point instructions}}{\text{runtime} \times 10^{6}}$$
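As a worked example, the definitions on this slide can be written as small functions. The function names and the sample numbers (instruction count, cycle count, clock frequency) are assumptions chosen for illustration only.

```python
def runtime_s(clock_cycles, clock_frequency_hz):
    """runtime = # clock cycles * clock period, with period = 1 / frequency."""
    return clock_cycles / clock_frequency_hz


def cpi(clock_cycles, instruction_count):
    """Cycles per instruction."""
    return clock_cycles / instruction_count


def ipc(clock_cycles, instruction_count):
    """Instructions per cycle, the reciprocal of CPI."""
    return instruction_count / clock_cycles


def mips(instruction_count, clock_cycles, clock_frequency_hz):
    """Million instructions per second."""
    return instruction_count / (runtime_s(clock_cycles, clock_frequency_hz) * 1e6)


def mflops(fp_instruction_count, clock_cycles, clock_frequency_hz):
    """Million floating point operations per second."""
    return fp_instruction_count / (runtime_s(clock_cycles, clock_frequency_hz) * 1e6)


# Hypothetical program: 2e9 instructions in 4e9 cycles on a 2 GHz clock.
instructions, cycles, frequency = 2e9, 4e9, 2e9
print(runtime_s(cycles, frequency))           # runtime in seconds
print(cpi(cycles, instructions))              # cycles per instruction
print(mips(instructions, cycles, frequency))  # equals frequency / (CPI * 1e6)
```

The last line illustrates that the two forms of the MIPS formula agree: dividing the instruction count by the runtime gives the same value as dividing the clock frequency by CPI.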
6 Drawbacks of performance measures
- CPI, IPC, MIPS and MFLOPS depend on the instruction set.
- CPI, IPC, MIPS and MFLOPS depend on the program.
- CPI, IPC, MIPS and MFLOPS depend on the microarchitecture.
Conclusions:
- A greater MIPS or MFLOPS rating does not necessarily mean more performance!
- It is of vital importance to choose well-suited test applications (benchmarks)!
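A small numeric sketch shows why a higher MIPS rating need not mean a shorter runtime. Both machines and all numbers are hypothetical: machine A is assumed to need many simple instructions for the program, machine B fewer but more complex ones.

```python
def runtime_and_mips(instruction_count, cpi, clock_frequency_hz):
    """Return (runtime in seconds, MIPS rating) for one program on one machine."""
    runtime = instruction_count * cpi / clock_frequency_hz
    mips = instruction_count / (runtime * 1e6)
    return runtime, mips


# Same program, same 1 GHz clock, different instruction sets (made-up numbers).
runtime_a, mips_a = runtime_and_mips(10e9, cpi=1.0, clock_frequency_hz=1e9)
runtime_b, mips_b = runtime_and_mips(4e9, cpi=2.0, clock_frequency_hz=1e9)

print(f"A: {runtime_a:.1f} s at {mips_a:.0f} MIPS")
print(f"B: {runtime_b:.1f} s at {mips_b:.0f} MIPS")
```

Under these assumptions machine B finishes sooner although its MIPS rating is half that of machine A, because the instruction count differs between the two instruction sets.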
7 Measurements
Benchmarks:
- Existing or synthetic programs are used to measure performance.
- These programs are translated (compiled) and executed on the evaluated computer.
- Therefore not only the computer hardware but also the compiler influences the outcome of a benchmark.
Monitoring:
- Monitors are used to observe parts of the computer at run time.
- Therefore, interesting quantities inside the computer can be measured in addition to the overall outcome of a benchmark (e.g. cache utilization, network traffic, ...).
- Monitoring can be done in hardware or software.
8 Benchmark terminology
- benchmark: A test program.
- benchmark suite: A set of benchmarks.
- synthetic benchmark: A test program that is only useful as a benchmark.
- kernel benchmark: A very small synthetic benchmark. Usually a time-intensive part of a real program is chosen. Kernel benchmarks are well suited for design and simulation but normally unsuitable for comparing complete systems.
- benchmark application: A complete program that is additionally used as a benchmark. The opposite of a synthetic benchmark.
9 SPEC benchmarks
SPEC: Standard Performance Evaluation Corporation
- since 1989
- consortium of different manufacturers
- general-purpose computer applications, mainly to measure speed and throughput
Several benchmark suites, e.g.:
- SPEC95
- SPECweb96
- SPEC JVM98
- SPEC JBB2000
- SPEC CINT 2006
- SPEC CFP 2006
10 SPECmarks
- Goal: comparable values for different systems
- But: single values do not always reflect real relations, so they are only a first indication for selecting or judging a computer
- CPU performance together with cache, memory and compiler is measured; the operating system and I/O are less relevant
- Integer test programs (ANSI C)
- Floating-point test programs (Fortran 77)
- SPECmark: this characteristic is the geometric mean of the individual program characteristics contained in the suite
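The geometric mean used for the SPECmark can be sketched as follows. The slide only states that individual program characteristics are combined by a geometric mean; treating each characteristic as the ratio of a reference time to the measured time is an assumption here, and the function names and times are hypothetical.

```python
import math


def geometric_mean(values):
    """Return the n-th root of the product of n positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("need at least one value, all positive")
    # Summing logarithms avoids overflow for long lists of large ratios.
    return math.exp(sum(math.log(v) for v in values) / len(values))


def suite_score(reference_times, measured_times):
    """Combine per-program speed ratios (reference / measured) into one score."""
    ratios = [ref / measured for ref, measured in zip(reference_times, measured_times)]
    return geometric_mean(ratios)


# Made-up times in seconds for a two-program suite.
score = suite_score(reference_times=[100.0, 400.0], measured_times=[50.0, 50.0])
print(f"suite score: {score:.2f}")
```

Unlike the arithmetic mean, the geometric mean of ratios gives the same ranking of two machines regardless of which machine is chosen as the reference.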
11 SPEC-CINT2006: 12 integer test programs (C, C++)
name         description
perlbench    PERL interpreter
bzip2        bzip compression program
gcc          GNU C compiler version 3.2
mcf          Simplex algorithm for traffic planning
gobmk        AI implementation of the game Go
hmmer        Protein sequence analysis based on a hidden Markov model
sjeng        Chess program
libquantum   Quantum computer simulator
h264ref      H.264 codec
omnetpp      OMNET++ discrete event simulator
astar        Route planning
xalancbmk    XML translator
12 SPEC-CFP2006: 17 floating-point test programs (C, C++, FORTRAN)
name         description
bwaves       Fluid dynamics algorithm
gamess       Quantum chemistry algorithm
milc         Physics algorithm
zeusmp       Fluid dynamics algorithm
gromacs      Newton's equations of motion
cactusADM    Equation solver for Einstein's evolution equation
leslie3d     Fluid dynamics algorithm
namd         Biomolecular simulation
dealII       Finite elements
soplex       Simplex algorithm
povray       Image rendering
calculix     Finite elements
GemsFDTD     Maxwell equation solver
tonto        Quantum chemistry
lbm          Lattice Boltzmann simulator
wrf          Weather modeling
sphinx3      Speech recognition
13 More popular benchmark suites
Basic Linear Algebra Subprograms (BLAS):
- For numerical applications
- Core of the LINPACK software package for solving linear equation systems
- Used for the TOP 500 list of the fastest parallel computers
Whetstone benchmark:
- Developed in the seventies, a single program with a lot of floating-point calculations
Dhrystone benchmark:
- Improvement of Whetstone, developed in the eighties
Powerstone benchmark suite:
- To compare the energy consumption of microprocessors and microcontrollers
14 Powerstone benchmark suite
name       description
auto       Vehicle control
bilv       Logical and shift operations
bilt       Graphical application
compress   UNIX compression program
crc        CRC error detection
des        Data encryption
dhry       Dhrystone
engine     Engine control
fir_int    Integer FIR filter
g3fax      FAX group 3
g721       Audio compression
jpeg       JPEG 24-bit compression
pocsag     Communication protocol for pagers
servo      Hard disk control
summin     Handwriting recognition
ucbqsort   Quick sort
v42bits    Modem operation
whet       Whetstone
15 Monitoring
- Monitors are components that record the states of a system during its normal operation.
- Contents of registers, flags and buffers, as well as traffic on data paths, are recorded.
- Monitors are used to observe and debug systems.
16 Monitoring
Generally, monitors can be classified into:
a) Hardware monitors
A hardware monitor is a separate component that is physically connected to the locations of the target system where measurements take place. Hardware monitors typically consist of comparators and counters to create data, memories to store it and buses for data transport. Thus, hardware monitors use their own resources.
17 Monitoring
b) Software monitors
A software monitor is a program that collects measurement data through interfaces provided by the operating system, the programming language or the application program. A software monitor uses the resources of the observed system to collect, transport and store data.
c) Hybrid monitors
A hybrid monitor is a mixed hardware and software monitor. Often simple elements like counters and memories are implemented in hardware, while more complex observation functions are implemented in software.
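A software monitor of the kind described under b) can be sketched with an interface provided by the programming language. The example below uses Python's `sys.setprofile` hook to count function calls; the names `monitor_calls` and `fib` are hypothetical, and the sketch only illustrates the idea rather than any particular monitoring tool.

```python
import sys
from collections import Counter


def monitor_calls(target, *args):
    """Run target(*args) and count calls to Python functions while it runs."""
    counts = Counter()

    def profiler(frame, event, arg):
        # "call" is reported each time a Python function is entered.
        if event == "call":
            counts[frame.f_code.co_name] += 1

    sys.setprofile(profiler)   # install the monitor
    try:
        result = target(*args)
    finally:
        sys.setprofile(None)   # always remove the monitor again
    return result, counts


def fib(n):
    """Naive recursive Fibonacci, used here as the observed program."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


result, call_counts = monitor_calls(fib, 5)
print(result, call_counts["fib"])
```

The counter lives in the memory of the observed process and the hook runs on the same CPU, so the monitor consumes resources of the system it observes, as stated above.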
18 Monitoring constraints
1. Accessing information
Ideally, monitoring is integrated into the hardware and software components of a system during design. Software monitors are cheaper than hardware monitors, but they may influence the system's run-time behavior.
2. Non-intrusive (reaction-free) monitoring
Hardware monitors and most hybrid monitors store the recorded data in their own memories. Software monitors have to use the memories of the observed system. Thus, hardware monitors are less intrusive than software monitors.
19 Monitoring constraints
3. Amount of recorded data and its further processing
Most purposes, especially debugging, require observations with high resolution. For the accurate analysis of program errors, the machine instruction causing the error has to be identified. For other purposes, e.g. a global performance analysis, a coarser resolution is sufficient.
Although it often seems necessary to record observable data at the level of machine instruction execution, this would generate traces much larger than the memory usage of the observed application. Thus, the cost of storing this large amount of data and the general difficulty of processing the trace data prohibit a complete recording of traces at machine instruction level.
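A back-of-the-envelope calculation makes the trace volume concrete. All three inputs below (instruction rate, record size, observation time) are assumptions picked for illustration, not figures from the slides.

```python
def trace_size_bytes(instructions_per_second, bytes_per_record, seconds):
    """Size of a complete trace with one record per executed instruction."""
    return instructions_per_second * bytes_per_record * seconds


# Assumed values: 10^9 instructions per second, 8 bytes per trace record
# (for example an instruction address), 10 seconds of observation.
size = trace_size_bytes(1_000_000_000, 8, 10)
print(f"{size / 1e9:.0f} GB of trace data")
```

Under these assumptions ten seconds of execution already produce tens of gigabytes, which is why a coarser resolution is chosen whenever the purpose allows it.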
20 Instrumentation
One way of software monitoring is to insert measuring commands, e.g. loop or time counters, into the program code. This is called instrumentation. Instrumentation can be performed by the user, the compiler, the class library or the operating system.
[Figure: an instrumented program runs on the computer; a measurement system collects the measurement results.]
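Instrumentation by the user can be sketched with a loop counter and a time counter inserted into ordinary code. The function `instrumented_sum_of_squares` and the counter names are hypothetical; the computation itself is just a stand-in for the program being observed.

```python
import time


def instrumented_sum_of_squares(n):
    """Sum i*i for i in [0, n) with inserted measuring commands."""
    counters = {"loop_iterations": 0, "elapsed_s": 0.0}

    start = time.perf_counter()            # inserted: start of time counter
    total = 0
    for i in range(n):
        total += i * i                     # original program logic
        counters["loop_iterations"] += 1   # inserted: loop counter
    counters["elapsed_s"] = time.perf_counter() - start  # inserted: stop timer

    return total, counters


total, counters = instrumented_sum_of_squares(1000)
print(total, counters["loop_iterations"], f"{counters['elapsed_s']:.6f} s")
```

The inserted statements execute inside the observed program, so they lengthen the runtime they are trying to measure.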
21 Monitoring overview
method            system state         accuracy       tools
direct            hardware             very high      hardware monitor
instrumentation   hardware             high           instrumented program
trace driven      hard- and software   satisfactory   simulation program + hardware trace
simulation        software             sufficient     simulation program
22 Typical load-dependent parameters
- throughput: the average number of jobs completed per time unit. A job may be the execution of an instruction or a program, saving a data block or sending a message.
- utilization: the throughput (average number of jobs completed) divided by the maximum possible throughput.
- response time: the average time needed to complete a job.
- utilization ratio: the time spent working on the jobs divided by the whole operating time.
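The four parameters can be computed from a simple job log. The sketch below makes several assumptions that are not in the slides: a single server that handles one job at a time, a log of (arrival, start, finish) times, and response time measured from arrival to completion. The function name and the sample data are hypothetical.

```python
def load_parameters(jobs, operating_time, max_throughput):
    """Compute load-dependent parameters from (arrival, start, finish) records.

    Times share one unit; max_throughput is in jobs per that unit.
    """
    completed = len(jobs)
    throughput = completed / operating_time
    busy_time = sum(finish - start for _, start, finish in jobs)
    return {
        "throughput": throughput,
        "utilization": throughput / max_throughput,
        "response_time": sum(finish - arrival for arrival, _, finish in jobs) / completed,
        "utilization_ratio": busy_time / operating_time,
    }


# Made-up log: three jobs observed over 10 time units, at most 0.5 jobs per unit.
job_log = [(0, 0, 2), (1, 2, 5), (6, 6, 8)]
params = load_parameters(job_log, operating_time=10, max_throughput=0.5)
for name, value in params.items():
    print(f"{name}: {value:.3f}")
```

In this example the second job arrives while the first is still running, so its response time includes waiting time in addition to its own processing time.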
More informationTesting & Assuring Mobile End User Experience Before Production. Neotys
Testing & Assuring Mobile End User Experience Before Production Neotys Agenda Introduction The challenges Best practices NeoLoad mobile capabilities Mobile devices are used more and more At Home In 2014,
More informationMaximizing Hadoop Performance and Storage Capacity with AltraHD TM
Maximizing Hadoop Performance and Storage Capacity with AltraHD TM Executive Summary The explosion of internet data, driven in large part by the growth of more and more powerful mobile devices, has created
More informationOKLAHOMA SUBJECT AREA TESTS (OSAT )
CERTIFICATION EXAMINATIONS FOR OKLAHOMA EDUCATORS (CEOE ) OKLAHOMA SUBJECT AREA TESTS (OSAT ) FIELD 081: COMPUTER SCIENCE September 2008 Subarea Range of Competencies I. Computer Use in Educational Environments
More informationA Lab Course on Computer Architecture
A Lab Course on Computer Architecture Pedro López José Duato Depto. de Informática de Sistemas y Computadores Facultad de Informática Universidad Politécnica de Valencia Camino de Vera s/n, 46071 - Valencia,
More informationMaximize Performance and Scalability of RADIOSS* Structural Analysis Software on Intel Xeon Processor E7 v2 Family-Based Platforms
Maximize Performance and Scalability of RADIOSS* Structural Analysis Software on Family-Based Platforms Executive Summary Complex simulations of structural and systems performance, such as car crash simulations,
More informationApplications to Computational Financial and GPU Computing. May 16th. Dr. Daniel Egloff +41 44 520 01 17 +41 79 430 03 61
F# Applications to Computational Financial and GPU Computing May 16th Dr. Daniel Egloff +41 44 520 01 17 +41 79 430 03 61 Today! Why care about F#? Just another fashion?! Three success stories! How Alea.cuBase
More informationA NOVEL RESOURCE EFFICIENT DMMS APPROACH
A NOVEL RESOURCE EFFICIENT DMMS APPROACH FOR NETWORK MONITORING AND CONTROLLING FUNCTIONS Golam R. Khan 1, Sharmistha Khan 2, Dhadesugoor R. Vaman 3, and Suxia Cui 4 Department of Electrical and Computer
More informationOnline Adaptation for Application Performance and Efficiency
Online Adaptation for Application Performance and Efficiency A Dissertation Proposal by Jason Mars 20 November 2009 Submitted to the graduate faculty of the Department of Computer Science at the University
More informationApplying Data Analysis to Big Data Benchmarks. Jazmine Olinger
Applying Data Analysis to Big Data Benchmarks Jazmine Olinger Abstract This paper describes finding accurate and fast ways to simulate Big Data benchmarks. Specifically, using the currently existing simulation
More informationE6895 Advanced Big Data Analytics Lecture 14:! NVIDIA GPU Examples and GPU on ios devices
E6895 Advanced Big Data Analytics Lecture 14: NVIDIA GPU Examples and GPU on ios devices Ching-Yung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science IBM Chief Scientist,
More informationPerformance evaluation
Performance evaluation Arquitecturas Avanzadas de Computadores - 2547021 Departamento de Ingeniería Electrónica y de Telecomunicaciones Facultad de Ingeniería 2015-1 Bibliography and evaluation Bibliography
More informationWiggins/Redstone: An On-line Program Specializer
Wiggins/Redstone: An On-line Program Specializer Dean Deaver Rick Gorton Norm Rubin {dean.deaver,rick.gorton,norm.rubin}@compaq.com Hot Chips 11 Wiggins/Redstone 1 W/R is a Software System That: u Makes
More informationDigitale Signalverarbeitung mit FPGA (DSF) Soft Core Prozessor NIOS II Stand Mai 2007. Jens Onno Krah
(DSF) Soft Core Prozessor NIOS II Stand Mai 2007 Jens Onno Krah Cologne University of Applied Sciences www.fh-koeln.de jens_onno.krah@fh-koeln.de NIOS II 1 1 What is Nios II? Altera s Second Generation
More informationPrecise and Accurate Processor Simulation
Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin Madison http://www.ece.wisc.edu/~pharm Performance Modeling Analytical
More informationWhat is LOG Storm and what is it useful for?
What is LOG Storm and what is it useful for? LOG Storm is a high-speed digital data logger used for recording and analyzing the activity from embedded electronic systems digital bus and data lines. It
More informationComputing Performance Benchmarks among CPU, GPU, and FPGA
Computing Performance Benchmarks among CPU, GPU, and FPGA MathWorks Authors: Christopher Cullinan Christopher Wyant Timothy Frattesi Advisor: Xinming Huang Abstract In recent years, the world of high performance
More informationSIPAC. Signals and Data Identification, Processing, Analysis, and Classification
SIPAC Signals and Data Identification, Processing, Analysis, and Classification Framework for Mass Data Processing with Modules for Data Storage, Production and Configuration SIPAC key features SIPAC is
More informationBenchmarking Large Scale Cloud Computing in Asia Pacific
2013 19th IEEE International Conference on Parallel and Distributed Systems ing Large Scale Cloud Computing in Asia Pacific Amalina Mohamad Sabri 1, Suresh Reuben Balakrishnan 1, Sun Veer Moolye 1, Chung
More informationLast Class: OS and Computer Architecture. Last Class: OS and Computer Architecture
Last Class: OS and Computer Architecture System bus Network card CPU, memory, I/O devices, network card, system bus Lecture 3, page 1 Last Class: OS and Computer Architecture OS Service Protection Interrupts
More informationUnderstanding applications using the BSC performance tools
Understanding applications using the BSC performance tools Judit Gimenez (judit@bsc.es) German Llort(german.llort@bsc.es) Humans are visual creatures Films or books? Two hours vs. days (months) Memorizing
More information