Chapters 3 and 4 Fault Tree Analysis

Similar documents
Chapter 9 Reliability Centered Maintenance

Fault Tree Analysis (FTA) and Event Tree Analysis (ETA)

Failure Analysis Methods What, Why and How. MEEG 466 Special Topics in Design Jim Glancey Spring, 2006

Controlling Risks Risk Assessment

University of Ljubljana. Faculty of Computer and Information Science

University of Paderborn Software Engineering Group II-25. Dr. Holger Giese. University of Paderborn Software Engineering Group. External facilities

FMEA and FTA Analysis

RISK ASSESMENT: FAULT TREE ANALYSIS

An Introduction to Fault Tree Analysis (FTA)

3.0 Risk Assessment and Analysis Techniques and Tools

Reliability Reliability Models. Chapter 26 Page 1

Independent analysis methods for Data Centers

Risk Management Tools

Value Paper Author: Edgar C. Ramirez. Diverse redundancy used in SIS technology to achieve higher safety integrity

Fault Tree Analysis (FTA): Concepts and Applications

Mary Ann Lundteigen. Doctoral theses at NTNU, 2009:9 Mary Ann Lundteigen. Doctoral theses at NTNU, 2009:9

Table F-1 NEI MSO IDENTIFICATION CHECKLIST

Generic Reliability Evaluation Method for Industrial Grids with Variable Frequency Drives

IAEA Training Course on Safety Assessment of NPPs to Assist Decision Making. System Analysis. Lecturer. Workshop Information IAEA Workshop

ASSESSMENT OF HUMAN ERROR IMPORTANCE IN PWR PSA

A Methodology for Safety Critical Software Systems Planning

Algebra I Notes Relations and Functions Unit 03a

Occupational safety risk management in Australian mining

Safety Integrity Level (SIL) Assessment as key element within the plant design

RISK MANAGEMENT FOR INFRASTRUCTURE

SIL manual. Structure. Structure

Mitigating safety risk and maintaining operational reliability

Engineering Risk Benefit Analysis

5.1 Identifying the Target Parameter

6.4 Normal Distribution

System Specification. Objectives

Collated Food Requirements. Received orders. Resolved orders. 4 Check for discrepancies * Unmatched orders

Quality Risk Management Tools Quality Risk Management Tool Selection When to Select FMEA: QRM Tool Selection Matrix

FUNCTIONAL SAFETY CERTIFICATE

Calculation of Risk Factor Using the Excel spreadsheet Calculation of Risk Factor.xls

Binary Adders: Half Adders and Full Adders

Experiment # 9. Clock generator circuits & Counters. Eng. Waleed Y. Mousa

Heat. Investigating the function of the expansion valve of the heat pump. LD Physics Leaflets P Thermodynamic cycle Heat pump

Risk Assessment & Management

On-Site Risk Management Audit Checklist for Program Level 3 Process

Systems Assurance Management in Railway through the Project Life Cycle

Viewpoint on ISA TR Simplified Methods and Fault Tree Analysis Angela E. Summers, Ph.D., P.E., President

Safety of Nuclear Power Plant : Human Reliability Analysis (HRA) Prof. Wolfgang Kröger

Absolute Value Equations and Inequalities

SFTA, SFMECA AND STPA APPLIED TO BRAZILIAN SPACE SOFTWARE

4. Critical success factors/objectives of the activity/proposal/project being risk assessed

Product Excellence using 6 Sigma (PEUSS)

Managing Design Changes using Safety-Guided Design for a Safety Critical Automotive System

Prepared for NASA Office of Safety and Mission Assurance NASA Headquarters Washington, DC 20546

Creating, Solving, and Graphing Systems of Linear Equations and Linear Inequalities

Lesson 12 Sequential Circuits: Flip-Flops

Embedded Real-Time Systems (TI-IRTS) Safety and Reliability Patterns B.D. Chapter

Greatest Common Factor and Least Common Multiple

Directory. Maintenance Engineering

Programming Logic controllers

Phase A Aleutian Islands Risk Assessment. Options and Recommended Risk Matrix Approach. April 27, 2010

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Introduction to Statistics for Psychology. Quantitative Methods for Human Sciences

An Introduction to. Metrics. used during. Software Development

(Refer Slide Time 00:56)

Basic Logic Gates. Logic Gates. andgate: accepts two binary inputs x and y, emits x & y. orgate: accepts two binary inputs x and y, emits x y

HEALTH & SAFETY EXECUTIVE NUCLEAR DIRECTORATE ASSESSMENT REPORT. New Reactor Build. EDF/AREVA EPR Step 2 PSA Assessment

USING INSTRUMENTED SYSTEMS FOR OVERPRESSURE PROTECTION. Dr. Angela E. Summers, PE. SIS-TECH Solutions, LLC Houston, TX

Risk. Risk Categories. Project Risk (aka Development Risk) Technical Risk Business Risk. Lecture 5, Part 1: Risk

Reliability Block Diagram RBD

Human Reliability Analysis. Workshop Information IAEA Workshop

Fleetwide Monitoring State of the Art Collaboration Meets Predictive Analytics for Optimizing the Entire Fleet

IAEA Training in level 1 PSA and PSA applications. PSA applications. PSA-based evaluation and rating of operational events

Basic Fundamentals Of Safety Instrumented Systems

CHAPTER 15 NOMINAL MEASURES OF CORRELATION: PHI, THE CONTINGENCY COEFFICIENT, AND CRAMER'S V

Overview of IEC Design of electrical / electronic / programmable electronic safety-related systems

Module 3: Floyd, Digital Fundamental

6-1. Process Modeling

Chapter 9: Analysis Techniques

Root cause analysis. Chartered Institute of Internal Auditors

ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS

3.4.4 Description of risk management plan Unofficial Translation Only the Thai version of the text is legally binding.

C1-108 ASSESSING THE IMPACT OF MAINTENANCE STRATEGIES ON SUPPLY RELIABILITY IN ASSET MANAGEMENT METHODS

DNV GL Assessment Checklist ISO 9001:2015

INCIDENT INVESTIGATION BASED ON CAUSALITY NETWORKS

ISO 9001:2015 Internal Audit Checklist

Safety Integrity Levels

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

L.S. de Carvalho, J. M de Oliveira Neto 1. INTRODUCTION IAEA-CN-164-5P01

5.1 Radical Notation and Rational Exponents

SECOND EDITION THE SECURITY RISK ASSESSMENT HANDBOOK. A Complete Guide for Performing Security Risk Assessments DOUGLAS J. LANDOLL

Improved Utilization of Self-Inspection Programs within the GMP Environment A Quality Risk Management Approach

PROGRAMMABLE LOGIC CONTROL

Item ToolKit Technical Support Notes

Software Safety Tutorial (Status Update)

Effects of Ultra-High Turndown in Hydronic Boilers By Sean Lobdell and Brian Huibregtse January 2014

IT Disaster Recovery and Business Resumption Planning Standards

Techneau, 10. December Risk Evaluation and Decision Support for Drinking Water Systems

QA GLP audits. Alain Piton. Antipolis,, France.

Electrical Safety Tester Verification

Nuclear power plant systems, structures and components and their safety classification. 1 General 3. 2 Safety classes 3. 3 Classification criteria 3

PRA Application to Offshore Drilling Critical Systems

Operating instructions Diffuse reflection sensor. OJ50xx / / 2004

Developing software which should never compromise the overall safety of a system

Transcription:

Chapters 3 and 4 Fault Tree Analysis Marvin Rausand marvin.rausand@ntnu.no RAMS Group Department of Production and Quality Engineering NTNU (Version 0.1) Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 1 / 29

Slides related to the book System Reliability Theory Models, Statistical Methods, and Applications Wiley, 2004 Homepage of the book: http://www.ntnu.edu/ross/ books/srt Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 2 / 29

Learning objectives To learn the key terms and concepts related to fault tree analysis (FTA). To learn how to construct a fault tree. To understand how FTA can be used to identify possible cause of a specified undesired event (i.e., system failure or accident) in a system. To become familiar with the concept of minimal cut sets and understand the significance of a minimal cut set. To become familiar with the input data required for a quantitative FTA. To understand how a quantitative FTA is carried out. Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 3 / 29

What is fault tree analysis? Fault tree analysis (FTA) is a top-down approach to failure analysis, starting with a potential undesirable event (accident) called a TOP event, and then determining all the ways it can happen. The analysis proceeds by determining how the TOP event can be caused by individual or combined lower level failures or events. The causes of the TOP event are connected through logic gates In this book we only consider AND-gates and OR-gates FTA is the most commonly used technique for causal analysis in risk and reliability studies. Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 4 / 29

History FTA was first used by Bell Telephone Laboratories in connection with the safety analysis of the Minuteman missile launch control system in 1962 Technique improved by Boeing Company Extensively used and extended during the Reactor safety study (WASH 1400) Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 5 / 29

FTA main steps Definition of the system, the TOP event (the potential accident), and the boundary conditions Construction of the fault tree Identification of the minimal cut sets Qualitative analysis of the fault tree Quantitative analysis of the fault tree Reporting of results Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 6 / 29

Preparation for FTA The starting point of an FTA is often an existing FMECA and a system block diagram The FMECA is an essential first step in understanding the system The design, operation, and environment of the system must be evaluated The cause and effect relationships leading to the TOP event must be identified and understood Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 7 / 29

Preparation for FTA System block diagram FMECA Fault tree Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 8 / 29

Boundary conditions The physical boundaries of the system (Which parts of the system are included in the analysis, and which parts are not?) The initial conditions (What is the operational stat of the system when the TOP event is occurring?) Boundary conditions with respect to external stresses (What type of external stresses should be included in the analysis war, sabotage, earthquake, lightning, etc?) The level of resolution (How detailed should the analysis be?) Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 9 / 29

Fault tree construction Define the TOP event in a clear and unambiguous way. Should always answer: What e.g., Fire Where e.g., in the process oxidation reactor When e.g., during normal operation What are the immediate, necessary, and sufficient events and conditions causing the TOP event? Connect via AND- or OR-gate Proceed in this way to an appropriate level (= basic events) Appropriate level: Independent basic events Events for which we have failure data Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 10 / 29

Fault tree symbols The OR-gate indicates that the output event occurs if any of the input events occur Logic gates Input events (states) Description of state OR-gate AND-gate The AND-gate indicates that the output event occurs only if all the input events occur at the same time The basic event represents a basic equipment failure that requires no further development of failure causes The undeveloped event represents an event that is not examined further because information is unavailable or because its consequences are insignificant The comment rectangle is for supplementary information Transfer symbols Transfer out Transfer in The transfer-out symbol indicates that the fault tree is developed further at the occurrence of the corresponding transfer-in symbol Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 11 / 29

Example: Redundant fire pumps 1 Valve TOP event = No water from fire water system Fire pump 1 FP1 Fire pump 2 FP2 Engine Causes for TOP event: VF = Valve failure G1 = No output from any of the fire pumps G2 = No water from FP1 G3 = No water from FP2 FP1 = failure of FP1 EF = Failure of engine FP2 = Failure of FP2 Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 12 / 29

Example: Redundant fire pumps 2 No water from fire pump system TOP Valve Valve blocked, or fail to open VF No water from the two pumps G1 No water from pump 1 No water from pump 2 Fire pump 1 FP1 Fire pump 2 FP2 Engine G2 G3 Failure of pump 1 Failure of engine Failure of pump 2 Failure of engine FP1 EF FP2 EF Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 13 / 29

Example: Redundant fire pumps 3 No water from fire pump system TOP Valve blocked, or fail to open VF No water from the two pumps G1 No water from fire pump system TOP No water from pump 1 No water from pump 2 Valve blocked, or fail to open No water from the two pumps Failure of engine G2 G3 VF G1 EF Failure of pump 1 Failure of engine Failure of pump 2 Failure of engine Failure of pump 1 Failure of pump 2 FP1 EF FP2 EF FP1 FP2 The two fault trees above are logically identical. They give the same information. Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 14 / 29

Cut Sets A cut set in a fault tree is a set of basic events whose (simultaneous) occurrence ensures that the TOP event occurs A cut set is said to be minimal if the set cannot be reduced without loosing its status as a cut set The TOP event will therefore occur if all the basic events in a minimal cut set occur at the same time. Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 15 / 29

Qualitative assessment Qualitative assessment by investigating the minimal cut sets: Order of the cut sets Ranking based on the type of basic events involved 1. Human error (most critical) 2. Failure of active equipment 3. Failure of passive equipment Also look for large cut sets with dependent items Rank Basic event 1 Basic event 2 1 Human error Human error 2 Human error Failure of active unit 3 Human error Failure of passive unit 4 Failure of active unit Failure of active unit 5 Failure of active unit Failure of passive unit 6 Failure of passive unit Failure of passive unit Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 16 / 29

Notation Q 0 (t) = Pr(The TOP event occurs at time t) q i (t) = Pr(Basic event i occurs at time t) ˇQ j (t) = Pr(Minimal cut set j fails at time t) Let E i (t) denote that basic event i occurs at time t. E i (t) may, for example, be that component i is in a failed state at time t. Note that E i (t) does not mean that component i fails exactly at time t, but that component i is in a failed state at time t A minimal cut set is said to fail when all the basic events occur (are present) at the same time. The formulas for q i (t) will be discussed later in this presentation. Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 17 / 29

Single AND-gate S TOP E 1 E 2 Event 1 occurs Event 2 occurs E 1 E 2 Let E i (t) denote that event E i occurs at time t, and let q i (t) = Pr(E i (t)) for i = 1,2. When the basic events are independent, the TOP event probability Q 0 (t) is Q 0 (t) = Pr(E 1 (t) E 2 (t)) = Pr(E 1 (t)) Pr(E 2 (t)) = q 1 (t) q 2 (t) When we have a single AND-gate with m basic events, we get m Q 0 (t) = q j (t) j=1 Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 18 / 29

Single OR-gate S TOP E 1 E 2 Event 1 occurs Event 2 occurs E 1 E 2 When the basic events are independent, the TOP event probability Q 0 (t) is Q 0 (t) = Pr(E 1 (t) E 2 (t)) = Pr(E 1 (t)) + Pr(E 2 (t)) Pr(E 1 (t) E 2 (t)) = q 1 (t) + q 2 (t) q 1 (t) q 2 (t) = 1 (1 q 1 (t))(1 q 2 (t)) When we have a single OR-gate with m basic events, we get m Q 0 (t) = 1 (1 q j (t)) j=1 Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 19 / 29

Cut set assessment Min. cut set j fails Basic event j1 occurs E j1 Basic event j2 occurs E j2 Basic event j,r occurs E jr A minimal cut set fails if and only if all the basic events in the set fail at the same time. The probability that cut set j fails at time t is ˇQ j (t) = r q j,i (t) i=1 where we assume that all the r basic events in the minimal cut set j are independent. Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 20 / 29

TOP event probability TOP Min. cut set 1 fails C 1 Min. cut set 2 fails C 2 Min. cut set k fails C k The TOP event occurs if at least one of the minimal cut sets fails. The TOP event probability is k ( Q 0 (t) 1 1 ˇQ j (t) ) (1) j=1 The reason for the inequality sign is that the minimal cut sets are not always independent. The same basic event may be member of several cut sets. Formula (1) is called the Upper Bound Approximation. Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 21 / 29

Types of events Five different types of events are normally used: Non-repairable unit Repairable unit (repaired when failure occurs) Periodically tested unit (hidden failures) Frequency of events On demand probability Basic event probability: q i (t) = Pr(Basic event i occurs at time t) Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 22 / 29

Non-repairable unit Unit i is not repaired when a failure occurs. Input data: Failure rate λ i Basic event probability: q i (t) = 1 e λ it λ i t Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 23 / 29

Repairable unit Unit i is repaired when a failure occurs. The unit is assumed to be as good as new after a repair. Input data: Failure rate λ i Mean time to repair, MTTR i Basic event probability: q i (t) λ i MTTR i Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 24 / 29

Periodic testing Unit i is tested periodically with test interval τ. A failure may occur at any time in the test interval, but the failure is only detected in a test or if a demand for the unit occurs. After a test/repair, the unit is assumed to be as good as new. This is a typical situation for many safety-critical units, like sensors, and safety valves. Input data: Failure rate λ i Test interval τ i Basic event probability: q i (t) λ i τ i 2 Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 25 / 29

Frequency Event i occurs now and then, with no specific duration Input data: Frequency f i If the event has a duration, use input similar to repairable unit. Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 26 / 29

On demand probability Unit i is not active during normal operation, but may be subject to one or more demands Input data: Pr(Unit i fails upon request) This is often used to model operator errors. Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 27 / 29

Cut set evaluation Ranking of minimal cut sets: Cut set unavailability The probability that a specific cut set is in a failed state at time t Cut set importance The conditional probability that a cut set is failed at time t, given that the system is failed at time t Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 28 / 29

Conclusions FTA identifies all the possible causes of a specified undesired event (TOP event) FTA is a structured top-down deductive analysis. FTA leads to improved understanding of system characteristics. Design flaws and insufficient operational and maintenance procedures may be revealed and corrected during the fault tree construction. FTA is not (fully) suitable for modeling dynamic scenarios FTA is binary (fail success) and may therefore fail to address some problems Marvin Rausand (RAMS Group) System Reliability Theory (Version 0.1) 29 / 29