Unified Architectural Support for Soft-Error Protection or Software Bug Detection
|
|
|
- Daniel Curtis
- 10 years ago
- Views:
Transcription
1 Unified Architectural Support for Soft-Error Protection or Software Bug Detection Martin Dimitrov and Huiyang Zhou School of Electrical Engineering and Computer Science
2 Motivation It is a great challenge to build reliable computer systems with buggy software and unreliable hardware Software bugs account for 40% of system failures With technology scaling, transient faults are predicted to grow at least in proportion to the number of devices
3 Observation: Proposed Approach Soft-errors and software bugs manifest in similar ways. program localities can be used to detect abnormal behavior (soft-errors or bugs) during program execution. Unified Architectural Support: Utilize a type of value locality Limited Variance in Data Values (LVDV) to detect abnormal behavior A cache-like structure tracks LVDV locality and detects violations
4 Presentation Outline Using localities to detect abnormal behavior Limited variance in data values (LDVD) locality Proposed architectural design Locality based soft-error protection Locality based software-bug detection Summary
5 Using Locality to Detect Abnormal Behavior Execution results of instruction A: 20, 20, 20, 20,..., Execution results of instruction B: 1, 2, 3 1, 2, , 2, 3, 4 Violation of the constant or stride locality hints the possibility of an error, or a software bug.
6 Limited Variance in Data Values (LVDV) Variance between two values is defined as a simple XOR The variance specifies the fraction of bits which vary between the two values the rest of the bits do not change and can be protected
7 An Example of LVDV Locality Execution results of instruction A: 1, The result of 1 XOR 60
8 An Example of LVDV Locality Execution results of instruction A: 1, 60, The result of 60 XOR 12
9 An Example of LVDV Locality Execution results of instruction A: 1, 60, 12, The result of 12 XOR 40
10 An Example of LVDV Locality Execution results of instruction A: 1, 60, 12, 40, The result of 40 XOR 2
11 An Example of LVDV Locality Execution results of instruction A: 1, 60, 12, 40, 2,... No apparent pattern! However, the values are usually within a certain small range portion of result bits protected by LVDV
12 An Example of LVDV Locality Execution results of instruction A: 1, 60, 12, 40, 2,..., The result of 2 XOR
13 LVDV Captures Region Locality Memory addresses produced by instruction A: 0x , 0x x , 0x , 0x Heap memory addresses produced by instruction A exhibit no stride locality portion of result bits protected by LVDV
14 Proposed Architectural Support LVDV table PC Tag Confidence Encoded Variance Last Value The main structure in our architecture is an LVDV table.
15 Interaction of the LVDV table with the processor pipeline
16 Locality-Based Soft-Error Detection When an anomaly is detected, squash the pipeline as in a branch misprediction The anomaly is detected promptly and re-execution fixes the soft-error A single LVDV table can catch soft-errors in the computational logic, control logic such as decoder, renaming table, issue queue and operand selection logic
17 Benefits of Locality-Based Soft-Error Detection Provides information redundancy (no redundant execution required) Provides efficient, opportunistic protection of multiple hardware structures
18 Experimental Methodology Error injection into the Issue Queue and Functional Units Compare our approach to: Implicit Redundancy Through Reuse (IRTR) [Gomaa et al., ISCA 2005] Squash on L2-cache miss (SL2) [Weaver et al., ISCA 2004] Squash on Branch misprediction (BR-squash) [Wang et al., DSN 2005]
19 Strength of the LVDV Locality 90% 80% 70% 60% 50% 40% 30% 20% 10% 0% Percentage of Protected Result Bits 1K entries 4k entries 2K entries 8K entries 50 % gap bzip2 gzip perl twolf vpr gcc mcf parser vortex ammp art equake mesa swim wupwise Average
20 Performance Overhead 35.00% 30.00% 25.00% 20.00% 17 % 0.3% 0.0% 0.0% 2.3% 0.5% 0.0% Performance Overheads 0.7% 31 % 0.0% 0.6% 0.0% 0.0% 0.0% 2.1% 2.1% 0.7% LVDV SL2 BR-squash 15.00% 0.0% 0.0% 10.00% 5.00% 0.00% gap bzip2 gzip perl twolf vpr gcc mcf parser vortex ammp art equake mesa swim wupwise H_Mean
21 Soft-Error Protection of Issue Queues 50% 45% 40% 35% Protection of Issue Queue Percent Errors Removed by LVDV Percent Errors Removed by SL2 Percent Errors Removed by BR-squash 28 % 30% 25% 20% 15% 10% 5% 0% gap bzip2 gzip perl twolf vpr gcc mcf parser vortex ammp art equake mesa swim wupwise G_Mean LVDV removes 28% of critical errors MTTF improvement of 39%
22 Soft-Error Protection of Functional Units 70% 60% 50% Protection of Functional Units Percent errors removed by LVDV Percent error removed by IRTR 42 % 40% 30% 20% 10% 0% gap bzip2 gzip perl twolf vpr gcc mcf parser vortex ammp art equake mesa swim wupwise G_Mean LVDV removes up to 61% of critical errors and 42% on average MTTF improvement of up to 156% and 72% on average
23 Locality-Based Software Bug Detection Training phase LVDV table learns invariance information from multiple successful program runs OR from a single long run. Bug detection phase LVDV table detects anomalies during the failing run Anomalies are reported as possible root-causes of a software bug Misses in the LVDV table are also signaled as new-code anomalies We used a 4K entries table and tracked the variance in every store instruction
24 Locality-Based Software Bug Detection Shown to pinpoint latent program bugs Can detect bugs, which do not violate any programming rules Our approach can be viewed as a hardware implementation of DIDUCE [Hangal et al, ICSE 2002] Hardware offers the following advantages: Performance efficiency Binary compatibility Runtime monitoring
25 Case Study Incorrect Bounds Checking char newname[max]; /* convert the filename */ for(i=0;i<strlen(original);i++){ /*bug*/ if( isupper( original[i] ) ){ newname[i]= tolower(original[i]); continue; } newname[i] = original[i]; /*detection*/ } newname[i] = '\0'; /*detection*/ A buffer overflow in polymorph-0.4.0
26 Case Study Warm-up Phase char newname[max]; /* convert the filename */ for(i=0;i<strlen(original);i++){ /*bug*/ if( isupper( original[i] ) ){ newname[i]= tolower(original[i]); continue; } newname[i] = original[i]; /*detection*/ } newname[i] = '\0'; /*detection*/ &newname[3]: 7fffaf33 &newname[6]: 7fffaf36 &newname[11]: 7fffaf3b &newname[2]: 7fffaf32 &newname[17]: 7fffaf41 &newname[9]: 7fffaf39 &newname[5]: 7fffaf35 &newname[16]: 7fffaf40 conf: 15 prev_result: 7fffaf40 encoded variance: 7
27 Case Study Detection Phase char newname[max]; &newname[85]: 7fffaf85 /* convert the filename */ for(i=0;i<strlen(original);i++){ /*bug*/ if( isupper( original[i] ) ){ newname[i]= tolower(original[i]); continue; } newname[i] = original[i]; /*detection*/ } newname[i] = '\0'; /*detection*/ conf: 15 encoded variance: 7 variance: 8
28 Summary of Results 4K LVDV provides similar bug detection capabilities as software approach DIDUCE Also similar number of false-positives as DIDUCE Polymorph bc Ncompress (input 1) ncompress (input 2) gzip DIDUCE K LVDV
29 Summary Soft-errors and software bugs manifest in similar ways during execution We can use localities to detect both Proposed mechanism protects multiple structures from soft-errors ( 39% and 72% MTTF of IQ and FU) Similar bug detection capabilities to DIDUCE (a software based bug detection scheme)
Intel Pentium 4 Processor on 90nm Technology
Intel Pentium 4 Processor on 90nm Technology Ronak Singhal August 24, 2004 Hot Chips 16 1 1 Agenda Netburst Microarchitecture Review Microarchitecture Features Hyper-Threading Technology SSE3 Intel Extended
Performance Characterization of SPEC CPU2006 Integer Benchmarks on x86-64 64 Architecture
Performance Characterization of SPEC CPU2006 Integer Benchmarks on x86-64 64 Architecture Dong Ye David Kaeli Northeastern University Joydeep Ray Christophe Harle AMD Inc. IISWC 2006 1 Outline Motivation
Solution: start more than one instruction in the same clock cycle CPI < 1 (or IPC > 1, Instructions per Cycle) Two approaches:
Multiple-Issue Processors Pipelining can achieve CPI close to 1 Mechanisms for handling hazards Static or dynamic scheduling Static or dynamic branch handling Increase in transistor counts (Moore s Law):
Code Coverage Testing Using Hardware Performance Monitoring Support
Code Coverage Testing Using Hardware Performance Monitoring Support Alex Shye Matthew Iyer Vijay Janapa Reddi Daniel A. Connors Department of Electrical and Computer Engineering University of Colorado
Categories and Subject Descriptors C.1.1 [Processor Architecture]: Single Data Stream Architectures. General Terms Performance, Design.
Enhancing Memory Level Parallelism via Recovery-Free Value Prediction Huiyang Zhou Thomas M. Conte Department of Electrical and Computer Engineering North Carolina State University 1-919-513-2014 {hzhou,
OpenMP in Multicore Architectures
OpenMP in Multicore Architectures Venkatesan Packirisamy, Harish Barathvajasankar Abstract OpenMP is an API (application program interface) used to explicitly direct multi-threaded, shared memory parallelism.
Precise and Accurate Processor Simulation
Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin Madison http://www.ece.wisc.edu/~pharm Performance Modeling Analytical
INSTRUCTION LEVEL PARALLELISM PART VII: REORDER BUFFER
Course on: Advanced Computer Architectures INSTRUCTION LEVEL PARALLELISM PART VII: REORDER BUFFER Prof. Cristina Silvano Politecnico di Milano [email protected] Prof. Silvano, Politecnico di Milano
Scalable Distributed Register File
Scalable Distributed Register File Ruben Gonzalez, Adrian Cristal, Miquel Pericàs, Alex Veidenbaum and Mateo Valero Computer Architecture Department Technical University of Catalonia (UPC), Barcelona,
CAVA: Using Checkpoint-Assisted Value Prediction to Hide L2 Misses
CAVA: Using Checkpoint-Assisted Value Prediction to Hide L2 Misses LUIS CEZE, KARIN STRAUSS, JAMES TUCK, and JOSEP TORRELLAS University of Illinois at Urbana Champaign and JOSE RENAU University of California,
Chapter 4 Register Transfer and Microoperations. Section 4.1 Register Transfer Language
Chapter 4 Register Transfer and Microoperations Section 4.1 Register Transfer Language Digital systems are composed of modules that are constructed from digital components, such as registers, decoders,
High Efficiency Counter Mode Security Architecture via Prediction and Precomputation
High Efficiency Counter Mode Security Architecture via Prediction and Precomputation Weidong Shi Hsien-Hsin S. Lee Mrinmoy Ghosh Chenghuai Lu Alexandra Boldyreva School of Electrical and Computer Engineering
Overview. CISC Developments. RISC Designs. CISC Designs. VAX: Addressing Modes. Digital VAX
Overview CISC Developments Over Twenty Years Classic CISC design: Digital VAX VAXÕs RISC successor: PRISM/Alpha IntelÕs ubiquitous 80x86 architecture Ð 8086 through the Pentium Pro (P6) RJS 2/3/97 Philosophy
A single register, called the accumulator, stores the. operand before the operation, and stores the result. Add y # add y from memory to the acc
Other architectures Example. Accumulator-based machines A single register, called the accumulator, stores the operand before the operation, and stores the result after the operation. Load x # into acc
Computer Organization and Components
Computer Organization and Components IS5, fall 25 Lecture : Pipelined Processors ssociate Professor, KTH Royal Institute of Technology ssistant Research ngineer, University of California, Berkeley Slides
Run-Time Type Checking for Binary Programs
Run-Time Type Checking for Binary Programs Michael Burrows 1, Stephen N. Freund 2, and Janet L. Wiener 3 1 Microsoft Corporation, 1065 La Avenida, Mountain View, CA 94043 2 Department of Computer Science,
Lecture: Pipelining Extensions. Topics: control hazards, multi-cycle instructions, pipelining equations
Lecture: Pipelining Extensions Topics: control hazards, multi-cycle instructions, pipelining equations 1 Problem 6 Show the instruction occupying each stage in each cycle (with bypassing) if I1 is R1+R2
Fine-Grained User-Space Security Through Virtualization. Mathias Payer and Thomas R. Gross ETH Zurich
Fine-Grained User-Space Security Through Virtualization Mathias Payer and Thomas R. Gross ETH Zurich Motivation Applications often vulnerable to security exploits Solution: restrict application access
COMP 303 MIPS Processor Design Project 4: MIPS Processor Due Date: 11 December 2009 23:59
COMP 303 MIPS Processor Design Project 4: MIPS Processor Due Date: 11 December 2009 23:59 Overview: In the first projects for COMP 303, you will design and implement a subset of the MIPS32 architecture
More on Pipelining and Pipelines in Real Machines CS 333 Fall 2006 Main Ideas Data Hazards RAW WAR WAW More pipeline stall reduction techniques Branch prediction» static» dynamic bimodal branch prediction
Software Obfuscation Scheme based on XOR Encoding Scheme
Software Obfuscation Scheme based on XOR Encoding Scheme KDDI R&D Laboratories, Inc. 27 th Mar. 2008 2008 2008 KDDI R&D Laboratories, Inc. Inc. All All right right reserved 1 My Research Area Software
NetFlow probe on NetFPGA
Verze #1.00, 2008-12-12 NetFlow probe on NetFPGA Introduction With ever-growing volume of data being transferred over the Internet, the need for reliable monitoring becomes more urgent. Monitoring devices
I Control Your Code Attack Vectors Through the Eyes of Software-based Fault Isolation. Mathias Payer, ETH Zurich
I Control Your Code Attack Vectors Through the Eyes of Software-based Fault Isolation Mathias Payer, ETH Zurich Motivation Applications often vulnerable to security exploits Solution: restrict application
Advanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2
Lecture Handout Computer Architecture Lecture No. 2 Reading Material Vincent P. Heuring&Harry F. Jordan Chapter 2,Chapter3 Computer Systems Design and Architecture 2.1, 2.2, 3.2 Summary 1) A taxonomy of
A Performance Counter Architecture for Computing Accurate CPI Components
A Performance Counter Architecture for Computing Accurate CPI Components Stijn Eyerman Lieven Eeckhout Tejas Karkhanis James E. Smith ELIS, Ghent University, Belgium ECE, University of Wisconsin Madison
Verification and Validation of Software Components and Component Based Software Systems
Chapter 5 29 Verification and Validation of Software Components and Component Based Christina Wallin Industrial Information Technology Software Engineering Processes ABB Corporate Research [email protected]
1/20/2016 INTRODUCTION
INTRODUCTION 1 Programming languages have common concepts that are seen in all languages This course will discuss and illustrate these common concepts: Syntax Names Types Semantics Memory Management We
CPU Organization and Assembly Language
COS 140 Foundations of Computer Science School of Computing and Information Science University of Maine October 2, 2015 Outline 1 2 3 4 5 6 7 8 Homework and announcements Reading: Chapter 12 Homework:
Systems I: Computer Organization and Architecture
Systems I: Computer Organization and Architecture Lecture : Microprogrammed Control Microprogramming The control unit is responsible for initiating the sequence of microoperations that comprise instructions.
CLASS: Combined Logic Architecture Soft Error Sensitivity Analysis
CLASS: Combined Logic Architecture Soft Error Sensitivity Analysis Mojtaba Ebrahimi, Liang Chen, Hossein Asadi, and Mehdi B. Tahoori INSTITUTE OF COMPUTER ENGINEERING (ITEC) CHAIR FOR DEPENDABLE NANO COMPUTING
CS:APP Chapter 4 Computer Architecture. Wrap-Up. William J. Taffe Plymouth State University. using the slides of
CS:APP Chapter 4 Computer Architecture Wrap-Up William J. Taffe Plymouth State University using the slides of Randal E. Bryant Carnegie Mellon University Overview Wrap-Up of PIPE Design Performance analysis
Chapter 01: Introduction. Lesson 02 Evolution of Computers Part 2 First generation Computers
Chapter 01: Introduction Lesson 02 Evolution of Computers Part 2 First generation Computers Objective Understand how electronic computers evolved during the first generation of computers First Generation
Oracle Solaris Studio Code Analyzer
Oracle Solaris Studio Code Analyzer The Oracle Solaris Studio Code Analyzer ensures application reliability and security by detecting application vulnerabilities, including memory leaks and memory access
PROBLEMS #20,R0,R1 #$3A,R2,R4
506 CHAPTER 8 PIPELINING (Corrisponde al cap. 11 - Introduzione al pipelining) PROBLEMS 8.1 Consider the following sequence of instructions Mul And #20,R0,R1 #3,R2,R3 #$3A,R2,R4 R0,R2,R5 In all instructions,
HOMEWORK # 2 SOLUTIO
HOMEWORK # 2 SOLUTIO Problem 1 (2 points) a. There are 313 characters in the Tamil language. If every character is to be encoded into a unique bit pattern, what is the minimum number of bits required to
On the Importance of Thread Placement on Multicore Architectures
On the Importance of Thread Placement on Multicore Architectures HPCLatAm 2011 Keynote Cordoba, Argentina August 31, 2011 Tobias Klug Motivation: Many possibilities can lead to non-deterministic runtimes...
Computer Architecture TDTS10
why parallelism? Performance gain from increasing clock frequency is no longer an option. Outline Computer Architecture TDTS10 Superscalar Processors Very Long Instruction Word Processors Parallel computers
LSN 2 Computer Processors
LSN 2 Computer Processors Department of Engineering Technology LSN 2 Computer Processors Microprocessors Design Instruction set Processor organization Processor performance Bandwidth Clock speed LSN 2
Instruction Set Architecture (ISA)
Instruction Set Architecture (ISA) * Instruction set architecture of a machine fills the semantic gap between the user and the machine. * ISA serves as the starting point for the design of a new machine
Between Mutual Trust and Mutual Distrust: Practical Fine-grained Privilege Separation in Multithreaded Applications
Between Mutual Trust and Mutual Distrust: Practical Fine-grained Privilege Separation in Multithreaded Applications Jun Wang, Xi Xiong, Peng Liu Penn State Cyber Security Lab 1 An inherent security limitation
CS 6290 I/O and Storage. Milos Prvulovic
CS 6290 I/O and Storage Milos Prvulovic Storage Systems I/O performance (bandwidth, latency) Bandwidth improving, but not as fast as CPU Latency improving very slowly Consequently, by Amdahl s Law: fraction
On Demand Loading of Code in MMUless Embedded System
On Demand Loading of Code in MMUless Embedded System Sunil R Gandhi *. Chetan D Pachange, Jr.** Mandar R Vaidya***, Swapnilkumar S Khorate**** *Pune Institute of Computer Technology, Pune INDIA (Mob- 8600867094;
CPU Performance Evaluation: Cycles Per Instruction (CPI) Most computers run synchronously utilizing a CPU clock running at a constant clock rate:
CPU Performance Evaluation: Cycles Per Instruction (CPI) Most computers run synchronously utilizing a CPU clock running at a constant clock rate: Clock cycle where: Clock rate = 1 / clock cycle f = 1 /C
The Data Quality Continuum*
The Data Quality Continuum* * Adapted from the KDD04 tutorial by Theodore Johnson e Tamraparni Dasu, AT&T Labs Research The Data Quality Continuum Data and information is not static, it flows in a data
The Behavior of Efficient Virtual Machine Interpreters on Modern Architectures
The Behavior of Efficient Virtual Machine Interpreters on Modern Architectures M. Anton Ertl and David Gregg 1 Institut für Computersprachen, TU Wien, A-1040 Wien [email protected] 2 Department
11.1 inspectit. 11.1. inspectit
11.1. inspectit Figure 11.1. Overview on the inspectit components [Siegl and Bouillet 2011] 11.1 inspectit The inspectit monitoring tool (website: http://www.inspectit.eu/) has been developed by NovaTec.
Selection of Techniques and Metrics
Selection of Techniques and Metrics Raj Jain Washington University in Saint Louis Saint Louis, MO 63130 [email protected] These slides are available on-line at: http://www.cse.wustl.edu/~jain/cse567-08/
Cloud Computing. Up until now
Cloud Computing Lecture 11 Virtualization 2011-2012 Up until now Introduction. Definition of Cloud Computing Grid Computing Content Distribution Networks Map Reduce Cycle-Sharing 1 Process Virtual Machines
Compiler-Assisted Binary Parsing
Compiler-Assisted Binary Parsing Tugrul Ince [email protected] PD Week 2012 26 27 March 2012 Parsing Binary Files Binary analysis is common for o Performance modeling o Computer security o Maintenance
Týr: a dependent type system for spatial memory safety in LLVM
Týr: a dependent type system for spatial memory safety in LLVM Vítor De Araújo Álvaro Moreira (orientador) Rodrigo Machado (co-orientador) August 13, 2015 Vítor De Araújo Álvaro Moreira (orientador) Týr:
Stream Processing on GPUs Using Distributed Multimedia Middleware
Stream Processing on GPUs Using Distributed Multimedia Middleware Michael Repplinger 1,2, and Philipp Slusallek 1,2 1 Computer Graphics Lab, Saarland University, Saarbrücken, Germany 2 German Research
CSE 141L Computer Architecture Lab Fall 2003. Lecture 2
CSE 141L Computer Architecture Lab Fall 2003 Lecture 2 Pramod V. Argade CSE141L: Computer Architecture Lab Instructor: TA: Readers: Pramod V. Argade ([email protected]) Office Hour: Tue./Thu. 9:30-10:30
CHAPTER 7: The CPU and Memory
CHAPTER 7: The CPU and Memory The Architecture of Computer Hardware, Systems Software & Networking: An Information Technology Approach 4th Edition, Irv Englander John Wiley and Sons 2010 PowerPoint slides
CS 61C: Great Ideas in Computer Architecture Virtual Memory Cont.
CS 61C: Great Ideas in Computer Architecture Virtual Memory Cont. Instructors: Vladimir Stojanovic & Nicholas Weaver http://inst.eecs.berkeley.edu/~cs61c/ 1 Bare 5-Stage Pipeline Physical Address PC Inst.
A Taxonomy to Enable Error Recovery and Correction in Software
A Taxonomy to Enable Error Recovery and Correction in Software Vilas Sridharan ECE Department rtheastern University 360 Huntington Ave. Boston, MA 02115 [email protected] Dean A. Liberty Advanced Micro
UNIVERSITY OF CALIFORNIA, DAVIS Department of Electrical and Computer Engineering. EEC180B Lab 7: MISP Processor Design Spring 1995
UNIVERSITY OF CALIFORNIA, DAVIS Department of Electrical and Computer Engineering EEC180B Lab 7: MISP Processor Design Spring 1995 Objective: In this lab, you will complete the design of the MISP processor,
picojava TM : A Hardware Implementation of the Java Virtual Machine
picojava TM : A Hardware Implementation of the Java Virtual Machine Marc Tremblay and Michael O Connor Sun Microelectronics Slide 1 The Java picojava Synergy Java s origins lie in improving the consumer
Data Memory Alternatives for Multiscalar Processors
Data Memory Alternatives for Multiscalar Processors Scott E. Breach, T. N. Vijaykumar, Sridhar Gopal, James E. Smith, Gurindar S. Sohi Computer Sciences Department University of Wisconsin-Madison 1210
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
: A Compilation Framework for Lifelong Program Analysis & Transformation Chris Lattner Vikram Adve University of Illinois at Urbana-Champaign {lattner,vadve}@cs.uiuc.edu http://llvm.cs.uiuc.edu/ ABSTRACT
BugBench: Benchmarks for Evaluating Bug Detection Tools
BugBench: Benchmarks for Evaluating Bug Detection Tools Shan Lu, Zhenmin Li, Feng Qin, Lin Tan, Pin Zhou and Yuanyuan Zhou Department of Computer Science University of Illinois at Urbana Champaign, Urbana,
