CLASS: Combined Logic Architecture Soft Error Sensitivity Analysis


Mojtaba Ebrahimi, Liang Chen, Hossein Asadi, and Mehdi B. Tahoori
Institute of Computer Engineering (ITEC), Chair for Dependable Nano Computing (CDNC)
KIT, University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association
www.kit.edu

Soft Errors
  - Soft errors are caused by radiation-induced particle strikes
    - The strike releases electron and hole pairs
    - These are absorbed by the source and drain
    - The transistor state is altered
  - Soft errors at logic level and higher:
    - Bit-flips in flip-flops
    - Transient pulses in combinational gates
  - System Soft Error Rate (SER) grows exponentially with Moore's law [Dixit11IRPS][Ibe10TED]
    - SER is proportional to the number of transistors per system
  [Figure: particle strike on a transistor, between source and drain]
  [Figure: Transistor Device SER vs. Design Rule, adapted from [Ibe10TED]; x-axis: Design Rule (nm), 250 down to 22; y-axis: Soft Error Rate (Normalized)]

Motivation: Importance of SER Estimation
  - Not all soft errors cause system failures
    - Circuit-level masking (logical, timing, and electrical)
    - Architecture-level masking
    - Application-level masking
  - Nodes are not equally susceptible to soft errors
    - Finding the most susceptible modules
    - Applying selective protection schemes
    - Balancing reliability, cost, and performance
  - An SER estimation technique is needed
    - Rank the nodes according to their SER

Motivation: What is missing?
  - Previous Soft Error Rate (SER) estimation techniques
    - Fault injection (time consuming)
    - Analytical (limited accuracy)
      - Architecture-level techniques: regular structures (register file, cache)
      - Circuit-level techniques: irregular structures (controller unit, pipeline)
  - Microprocessors include both regular and irregular structures
    - Independent analysis is not accurate
  - Proposed: CLASS (Combined Logic Architecture Soft error Sensitivity analysis)
    - Combines a circuit-level technique with an architecture-level analysis
    - Discrete time Markov chains for handling different error propagation and masking scenarios

Outline
  - Preliminaries
  - CLASS
  - Experimental Results
  - Conclusions


Analytical Architecture-Level SER Estimation Techniques
  - ACE (Architecturally Correct Execution) analysis is the most common technique [Mukherjee03Micro]
    - a.k.a. lifetime analysis
  - The lifetime of a bit is divided into:
    - ACE, e.g. the program counter is almost always in the ACE state
    - un-ACE (unnecessary for ACE), e.g. the branch predictor is always in the un-ACE state
  - AVF (Architecture Vulnerability Factor)
    - Fraction of time that a bit is in the ACE state
    - Program counter ~ 100%, branch predictor = 0%
  - ACE analysis overestimates the failure rate
    - In some cases by up to 7x [George10DSN]
    - This leads to overdesign

Analytical Circuit-Level SER Estimation Techniques
  - Binary/Algebraic Decision Diagram (BDD/ADD)-based techniques [Zhang06ISQED] [Norman05TCAD] [Krishnaswamy05DATE] [Miskov08TCAD]
    - Based on decision diagrams
    - Highly accurate
    - Scalability problem
  - Error Probability Propagation (EPP)-based techniques [Asadi06ISCAS] [Fazeli10DSN] [Chen12IOLTS]
    - Based on propagation rules
    - Reasonable accuracy
    - Scalable

Shortcoming of Previous Techniques
  - Circuit-level techniques consider:
    [Figure: a circuit, next to what is seen by circuit-level techniques]
  - ACE cannot completely model the effect of irregular structures on error propagation from regular structures


CLASS: A Simple Example
  [Figure: example system with primary inputs, two flip-flops, a memory, and primary outputs, annotated with propagation probabilities]
  [Figure: Discrete Time Markov Chain (DTMC) with states Error at Error Site, Masked, Latent in Memory, Error at Memory Outputs, and Error at Output (Failure)]
  - Propagation from the error site to its output
  - Propagation from memory contents to output (iterative)
    - Saturates in a few iterations

Overview of CLASS
  - Phase I: System Decomposition
    - Irregular structures (combinational and sequential elements)
    - Regular structures (Memory 1, Memory 2, ..., Memory m)
  - Phase II: Independent Analysis
    - Irregular structures: enhanced EPP analysis
    - Regular structures: Memory ACE analysis (MACE)
  - Phase III: Joint Analysis
    - DTMC

Important Error Propagations
  - Propagation from irregular structures to regular structures and vice versa:
    1. Short-term effect: an error in memory inputs affects its outputs immediately
    2. Long-term effect: an error in memory inputs contaminates its contents
    3. MVF (Memory Vulnerability Factor): propagation probability of an error from memory contents to its output
  [Figure: primary inputs, irregular structure, memory (inputs, contents, outputs), and primary outputs, with paths 1 (short-term effect), 2 (long-term effect), and 3 (MVF) marked]

Computation of Short- and Long-term Effects
  - Probability of having a short- and long-term effect for each input port of a simple memory (SP: signal probability):

    Error at port | None      | Output (short-term) | Contents (long-term) | Both
    Address       | 0         | 1-SP(WEn)           | SP(WEn)              | 0
    Data-in       | 1-SP(WEn) | 0                   | 0                    | SP(WEn)
    WEn           | 0         | 0                   | 0                    | 1
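The per-port probabilities above can be encoded as a small lookup. This is a minimal sketch, not the authors' implementation: the function names and the port keys are hypothetical, and counting the "Both" column toward both the short-term and the long-term probability is one reading of the table.

```python
def memory_port_effects(port: str, sp_wen: float) -> dict:
    """Probabilities of no effect, short-term only, long-term only, or both,
    for an error at one input port of a simple memory.

    sp_wen is SP(WEn), the signal probability of the write-enable input.
    The port keys ("address", "data_in", "wen") are hypothetical names.
    """
    if not 0.0 <= sp_wen <= 1.0:
        raise ValueError("SP(WEn) must be a probability")
    table = {
        "address": {"none": 0.0, "short": 1.0 - sp_wen, "long": sp_wen, "both": 0.0},
        "data_in": {"none": 1.0 - sp_wen, "short": 0.0, "long": 0.0, "both": sp_wen},
        "wen": {"none": 0.0, "short": 0.0, "long": 0.0, "both": 1.0},
    }
    return table[port]


def split_error(ep: float, port: str, sp_wen: float) -> tuple:
    """Split an error probability at a port into (short-term, long-term) parts.

    Assumption: a "both" outcome counts toward each of the two effects.
    """
    eff = memory_port_effects(port, sp_wen)
    short_term = ep * (eff["short"] + eff["both"])
    long_term = ep * (eff["long"] + eff["both"])
    return short_term, long_term


# Error probability 0.8 at the Address port with SP(WEn) = 0.25
print(split_error(0.8, "address", 0.25))
```

Each row of the table sums to 1, so every error at a port falls into exactly one of the four outcomes.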

Memory ACE (MACE)
  - Looks at memory boundaries
  - Intervals leading to a read access are ACE
  - MVF_Word: fraction of time that a word is in an ACE state
  - MVF_Memory: average of all MVF_Word
  [Figure: timeline of accesses (Write, Read, Write, Read, Read, Write, Write) with the intervals marked ACE, un-ACE, ACE, un-ACE]
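The MACE rule can be sketched on a per-word access trace. The trace format, the timestamps, and the function names below are hypothetical, and the code assumes that the time after the last access counts as un-ACE, which the slide does not state.

```python
def word_mvf(accesses, t_start, t_end):
    """MVF of one memory word.

    accesses: list of (time, op) sorted by time, with op "R" or "W".
    An interval is ACE when the access that ends it is a read.
    Assumption: the tail after the last access is treated as un-ACE.
    """
    ace_time = 0.0
    prev = t_start
    for time, op in accesses:
        if op == "R":
            ace_time += time - prev
        prev = time
    return ace_time / (t_end - t_start)


def memory_mvf(per_word_accesses, t_start, t_end):
    """MVF of a memory: the average of the per-word MVF values."""
    values = [word_mvf(a, t_start, t_end) for a in per_word_accesses]
    return sum(values) / len(values)


# Same access order as the timeline: Write, Read, Write, Read, Read, Write, Write
word0 = [(1, "W"), (3, "R"), (4, "W"), (6, "R"), (7, "R"), (8, "W"), (9, "W")]
word1 = []  # never accessed, so never ACE under this rule
print(word_mvf(word0, 0, 10))
print(memory_mvf([word0, word1], 0, 10))
```

With these made-up timestamps, the ACE intervals of the first word are 1 to 3, 4 to 6, and 6 to 7, which is half of the observed time.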

Enhanced Circuit-level Technique
  - CLASS is compatible with different circuit-level techniques
    - It inherits the characteristics of the employed circuit-level technique
  - Multi-cycle 4-value EPP [Fazeli10DSN] is employed
    - Based on a topological traversal of the netlist
    - O(n^2) complexity, inaccuracy within 2%
  - The circuit-level technique should be enhanced to:
    - Model the short-term effect of errors in memories
    - Compute the long-term effect probability (treated as an independent error)
  [Figure: worked example of Error Probability (EP) propagation]
    - Flip-flop with EP = 1, through gate A with SP = 0.8: 1 x 0.8 = 0.8, so EP = 0.8 at the memory Address port
    - Memory with SP(WEn) = 0.25: short-term 0.8 x 0.75 = 0.6, long-term 0.8 x 0.25 = 0.2
    - Memory output with EP = 0.6, through gate B with SP = 0.5: 0.6 x 0.5 = 0.3, so EP = 0.3 at the next flip-flop
    - Long-term effect: MVF x EPP(memory output to primary output)
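The numbers in the worked example can be reproduced with a few lines of arithmetic. This is a deliberately simplified sketch: it uses one error probability per signal and assumes an error passes a gate only when the side input allows it, with probability SP. The 4-value EPP of [Fazeli10DSN] is more detailed than this, and the function name is hypothetical.

```python
def through_gate(ep_in: float, side_sp: float) -> float:
    """Simplified propagation rule (assumption): the error survives the gate
    with probability side_sp, the signal probability of the side input."""
    return ep_in * side_sp


ep_flip_flop = 1.0                                 # error site
ep_address = through_gate(ep_flip_flop, 0.8)       # gate A, SP = 0.8

sp_wen = 0.25                                      # SP(WEn)
short_term = ep_address * (1.0 - sp_wen)           # reaches the memory output
long_term = ep_address * sp_wen                    # contaminates the contents

ep_next_flip_flop = through_gate(short_term, 0.5)  # gate B, SP = 0.5

print(ep_address, short_term, long_term, ep_next_flip_flop)
```

The long-term share is not propagated further here; in CLASS it is treated as an independent error and handled through the MVF.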

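Integration of MACE and EPP Using DTMC
  [Figure: a simple system with primary inputs, logic, a memory, and primary outputs]
  - Error propagation DTMC:
    - Error at Error Site -> Error at Output (Failure): P_EO
    - Error at Error Site -> Latent in Memory: P_ELT
    - Error at Error Site -> Masked: 1 - P_ELT - P_EO
    - Latent in Memory -> Error at Memory Outputs: MVF
    - Latent in Memory -> Masked: 1 - MVF
    - Error at Memory Outputs -> Error at Output (Failure): P_MO
    - Error at Memory Outputs -> Latent in Memory: P_MLT
    - Error at Memory Outputs -> Masked: 1 - P_MLT - P_MO
  - Figure annotation on the memory part of the chain: independent of error site
  - Steady state solution of the DTMC

    Symbol | Definition
    P_EO   | P(Error site -> Primary Outputs)
    P_ELT  | P(Error site -> Latent in Memory)
    P_MO   | P(Memory Output -> Primary Outputs)
    P_MLT  | P(Memory Output -> Latent in Memory)
    MVF    | P(Latent in Memory -> Memory Outputs)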
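A sketch of how the steady state of this chain could be computed follows. It treats Masked and Error at Output (Failure) as absorbing states and takes the probability mass that ends in Failure as the failure probability of the error site. The function names and the numeric values are illustrative assumptions, not figures from the slides.

```python
def failure_probability(p_eo, p_elt, p_mo, p_mlt, mvf):
    """Closed-form probability of absorption in Failure.

    From Latent in Memory the chain fails with probability x, where
    x = MVF * (P_MO + P_MLT * x), so x = MVF * P_MO / (1 - MVF * P_MLT).
    """
    if mvf * p_mlt >= 1.0:
        raise ValueError("MVF * P_MLT must be below 1")
    from_latent = mvf * p_mo / (1.0 - mvf * p_mlt)
    return p_eo + p_elt * from_latent


def failure_probability_iterative(p_eo, p_elt, p_mo, p_mlt, mvf, steps=50):
    """Same quantity, obtained by stepping the DTMC from the error site."""
    # State order: error site, latent in memory, memory outputs, failure, masked
    matrix = [
        [0.0, p_elt, 0.0, p_eo, 1.0 - p_elt - p_eo],
        [0.0, 0.0, mvf, 0.0, 1.0 - mvf],
        [0.0, p_mlt, 0.0, p_mo, 1.0 - p_mlt - p_mo],
        [0.0, 0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 0.0, 1.0],
    ]
    dist = [1.0, 0.0, 0.0, 0.0, 0.0]
    for _ in range(steps):
        dist = [sum(dist[i] * matrix[i][j] for i in range(5)) for j in range(5)]
    return dist[3]


# Illustrative values only
params = dict(p_eo=0.2, p_elt=0.3, p_mo=0.4, p_mlt=0.1, mvf=0.5)
print(failure_probability(**params))
print(failure_probability_iterative(**params))
```

The iterative version converges quickly because each pass around the Latent in Memory and Memory Outputs loop scales the remaining mass by MVF x P_MLT.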

Overall Flow
  - Done by commercial tools:
    - System specification -> HDL description -> synthesis -> gate-level netlist
    - Application + gate-level netlist -> post-synthesis simulation -> VCD file
  - CLASS:
    - VCD file analysis -> signal probabilities
    - MACE analysis -> MVF
    - EPP analysis -> EPPs
    - DTMC matrix, built from the MVF and the EPPs
    - Solve the matrix for each cell -> vulnerability of cells
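The VCD analysis step turns simulated waveforms into signal probabilities. A minimal sketch of that idea is shown below. It works on an already extracted list of value changes for a single signal and does not parse the VCD file format itself; the function name and the data are hypothetical.

```python
def signal_probability(changes, t_end):
    """Fraction of the observed time during which a signal is 1.

    changes: list of (time, value) sorted by time, with value 0 or 1.
    The first entry gives the initial value; t_end closes the last interval.
    """
    t_start = changes[0][0]
    high_time = 0
    boundaries = changes[1:] + [(t_end, None)]
    for (time, value), (next_time, _) in zip(changes, boundaries):
        if value == 1:
            high_time += next_time - time
    return high_time / (t_end - t_start)


# Hypothetical waveform: low until 10, high until 30, low until 40, high until 50
print(signal_probability([(0, 0), (10, 1), (30, 0), (40, 1)], 50))
```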


Experimental Setup
  - OpenRISC 1200 processor [www.opencores.org]
  - Synthesis results:
    - 44,480 gates
    - 4,960 flip-flops
    - 4 memory units
    - Overall: 49,444 error sites
  - Benchmarks are selected from MiBench: QSort, CRC32, Stringsearch

Comparison with Other Techniques
  - EPP [Asadi06ISCAS][Fazeli10DSN]
  - ACE [Biswas05ISCA], implemented for caches and their tags
  - Simulation-based Statistical Fault Injection (SFI)
    - Faults are injected during post-synthesis simulation
    - One fault per run of the entire application
    - Each fault injection takes 87 seconds on average
    - Only logic masking is considered
    - Comparable with functional simulation
  - System configuration: Intel Xeon E5520, 16 GB RAM

Failure Probability Inaccuracy Analysis: Regular Structures
  - OpenRISC regular structures: Instruction Cache (IC), Instruction Cache Tag (ICT), Data Cache (DC), Data Cache Tag (DCT)
  - Inaccuracy:

    Technique | Average | Maximum
    CLASS     | 3.4%    | 9.2%
    ACE       | 11.7%   | 23.3%

  - CLASS may overestimate or underestimate, due to the inaccuracy of 4-value EPP
  [Figure: failure probability of each regular structure for SFI, CLASS, and ACE]

Failure Probability Inaccuracy Analysis: Irregular Structures
  - Three representative blocks of OpenRISC: ALU, Instruction Cache FSM (IC FSM), pipeline fetch unit
  - Inaccuracy:

    Technique | Average | Maximum
    CLASS     | 2.1%    | 4.0%
    EPP       | 7.2%    | 26.3%

  - The average inaccuracy of individual error sites is less than 7%
  [Figure: failure probability of each block for SFI, CLASS, and EPP]

Runtime
  - An individual inaccuracy of 7% needs 196 faults at each error site
  - 49,444 possible error sites
  - 49,444 x 196 x 87 s: roughly 27 years of fault injection
  - CLASS is 5 orders of magnitude faster than simulation-based SFI
    - Less than one hour
  - Runtime breakdown: VCD analysis 85%, EPP 14%, solving DTMCs 1%
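The runtime estimate can be checked with a short calculation. Two points in this sketch are assumptions rather than statements from the slides: that the 196 faults come from the usual sample-size formula at 95% confidence with worst-case p = 0.5 and a 7% margin (the formula does yield exactly 196), and that one hour is used as the upper bound for the CLASS runtime.

```python
import math

ERROR_SITES = 49_444
SECONDS_PER_INJECTION = 87       # average time per fault injection
CLASS_RUNTIME_SECONDS = 3600     # "less than one hour", used as an upper bound

# Assumed origin of the 196 faults: n = z^2 * p * (1 - p) / e^2
z, p, margin = 1.96, 0.5, 0.07
faults_per_site = round(z * z * p * (1 - p) / margin ** 2)

total_seconds = ERROR_SITES * faults_per_site * SECONDS_PER_INJECTION
years = total_seconds / (365 * 24 * 3600)
speedup = total_seconds / CLASS_RUNTIME_SECONDS

print(faults_per_site)
print(total_seconds)
print(round(years, 1))
print(round(math.log10(speedup), 2))
```

With the rounded figures quoted on the slide the product comes to about 26.7 years, close to the 27-year figure, and the speedup over a one-hour CLASS run is a little over five orders of magnitude.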


Conclusion
  - CLASS: an SER estimation technique for entire systems
    - Unlike existing techniques, which focus on either regular or irregular structures
    - Interactions between regular and irregular structures are considered
  - Compared to SFI:
    - Inaccuracy is less than 7%
    - 5 orders of magnitude faster
  - Multiple times more accurate than EPP (4x) and ACE (3x)

Thanks for your attention! Q & A

Backup Slides

Sources of Inaccuracy
  - Sources of inaccuracy in the current implementation:
    - Propagation of an error to several memory units
    - Multiple propagations of an error to memory outputs

ACE Analysis: Instruction Cache
  - Access types: Read Miss (RM) and Read Hit (RH)
  - Un-ACE intervals: * -> RM
  - ACE intervals: * -> RH

ACE Analysis: Data Cache (Write-back)
  - Access types: Read Miss (RM), Read Hit (RH), Write Miss (WM), Write Hit (WH)
  - ACE intervals: * -> RH
  - Un-ACE intervals: * -> WH
  - Potentially ACE intervals: * -> WM, * -> RM
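The interval rules for the two caches can be sketched as a lookup keyed by the access that ends each interval. This assumes that "*" stands for any preceding access; the trace format, the timestamps, and the names below are hypothetical.

```python
# Class of an interval, chosen by the access type that ends it
ICACHE_RULES = {"RM": "un-ACE", "RH": "ACE"}
DCACHE_RULES = {
    "RH": "ACE",
    "WH": "un-ACE",
    "WM": "potentially ACE",
    "RM": "potentially ACE",
}


def classify_intervals(accesses, rules):
    """Total time per class for the intervals between consecutive accesses.

    accesses: list of (time, access_type) for one cache line, sorted by time.
    """
    totals = {}
    for (prev_time, _), (time, kind) in zip(accesses, accesses[1:]):
        label = rules[kind]
        totals[label] = totals.get(label, 0) + (time - prev_time)
    return totals


# Hypothetical data cache line: write miss, read hit, write hit, read miss
trace = [(0, "WM"), (4, "RH"), (6, "WH"), (9, "RM")]
print(classify_intervals(trace, DCACHE_RULES))
```

How the potentially ACE time is finally resolved is not described on the slide, so the sketch only reports it as a separate class.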

Enhanced EPP
  - Models the short-term effect of errors in memories
  - Long-term effects are propagated as independent errors
  - Enhanced to calculate the probability of error propagation to memory inputs
  - Only logic masking is considered
  - Multi-cycle 4-value EPP [Fazeli10DSN] is employed
  - CLASS has no limitation regarding circuit-level techniques