Safety and Hazard Analysis



Similar documents
The Therac 25 A case study in safety failure. Therac 25 Background

Software Engineering. Computer Science Tripos 1B Michaelmas Richard Clayton

Developing software which should never compromise the overall safety of a system

Software Safety Basics

Controlling Risks Risk Assessment

SOFTWARE-IMPLEMENTED SAFETY LOGIC Angela E. Summers, Ph.D., P.E., President, SIS-TECH Solutions, LP

A Methodology for Safety Critical Software Systems Planning

ASSESSMENT OF THE ISO STANDARD, ROAD VEHICLES FUNCTIONAL SAFETY

SFTA, SFMECA AND STPA APPLIED TO BRAZILIAN SPACE SOFTWARE

Effective Application of Software Safety Techniques for Automotive Embedded Control Systems SAE TECHNICAL PAPER SERIES

University of Paderborn Software Engineering Group II-25. Dr. Holger Giese. University of Paderborn Software Engineering Group. External facilities

Topics. Producing Production Quality Software. Concurrent Environments. Why Use Concurrency? Models of concurrency Concurrency in Java

Software testing. Objectives

Risk. Risk Categories. Project Risk (aka Development Risk) Technical Risk Business Risk. Lecture 5, Part 1: Risk

System Specification. Objectives

Functional safety. Essential to overall safety

Understanding Safety Integrity Levels (SIL) and its Effects for Field Instruments

Software Certification and Software Certificate Management Systems

Assurance Cases and Test Design Analysis

Embedded Real-Time Systems (TI-IRTS) Safety and Reliability Patterns B.D. Chapter

Chapter 8 Software Testing

A Static Analyzer for Large Safety-Critical Software. Considered Programs and Semantics. Automatic Program Verification by Abstract Interpretation

Testing of safety-critical software some principles

Guide to Reusable Launch and Reentry Vehicle Software and Computing System Safety

CS 392/681 - Computer Security. Module 16 Vulnerability Analysis

INDEPENDENT VERIFICATION AND VALIDATION OF EMBEDDED SOFTWARE

COMMONWEALTH OF PENNSYLVANIA DEPARTMENT S OF PUBLIC WELFARE, INSURANCE, AND AGING

OSW TN002 - TMT GUIDELINES FOR SOFTWARE SAFETY TMT.SFT.TEC REL07

Calculation of Risk Factor Using the Excel spreadsheet Calculation of Risk Factor.xls

JOURNAL OF MEDICAL INFORMATICS & TECHNOLOGIES Vol. 21/2012, ISSN

Module 10. Coding and Testing. Version 2 CSE IIT, Kharagpur

Overview of IEC Design of electrical / electronic / programmable electronic safety-related systems

Lecture 10: Managing Risk" Risk Management"

Turntable microswitch assembly Plunger

Software Testing. Quality & Testing. Software Testing

Lab - Dual Boot - Vista & Windows XP

IV&V and Software Risk Management

FAULT TOLERANCE FOR MULTIPROCESSOR SYSTEMS VIA TIME REDUNDANT TASK SCHEDULING

Static analysis of numerical programs

Development of a Safety Knowledge Base

Rotating Machinery Diagnostics & Instrumentation Solutions for Maintenance That Matters

Registry Tuner. Software Manual

Specification and Analysis of Contracts Lecture 1 Introduction

Ethical Issues in the Software Quality Assurance Function

Development and Application of POSAFE-Q PLC Platform

Exploratory Testing in an Agile Context

A Comprehensive Safety Engineering Approach for Software Intensive Systems based on STPA

McAfee Network Security Platform Administration Course

Basic Testing Concepts and Terminology

SOFTWARE SAFETY STANDARD

Firewalls Netasq. Security Management by NETASQ

System Aware Cyber Security

Nuclear Science Merit Badge Workbook

Guide to Writing a Project Report

FactoryTalk View Site Edition V5.0 (CPR9) Server Redundancy Guidelines

RADIATION THERAPY FOR BRAIN METASTASES. Facts to Help Patients Make an Informed Decision TARGETING CANCER CARE AMERICAN SOCIETY FOR RADIATION ONCOLOGY

Functional Hazard Assessment (FHA) Report for Unmanned Aircraft Systems

Quality Management. Lecture 12 Software quality management

Outline. multiple choice quiz bottom-up design. the modules main program: quiz.py namespaces in Python

University of Ljubljana. Faculty of Computer and Information Science

Objectives 404 CHAPTER 9 RADIATION

Best Practices for Verification, Validation, and Test in Model- Based Design

UTC3100 and 3170 POS RAID Information

Value Paper Author: Edgar C. Ramirez. Diverse redundancy used in SIS technology to achieve higher safety integrity

Cyber Security and the Canadian Nuclear Industry a Canadian Regulatory Perspective

End to End Data Path Protection

Safety Requirements Specification Guideline

FUEL-16, Troubleshooting Fuel Supply Problems

Pedestrian Struck By Forklift

ELECTRICAL ENGINEERING

Goddard Procedures and Guidelines

Architecture Design & Sequence Diagram. Week 7

Overview of how to test a. Business Continuity Plan

Presentation Overview. Istwaan Knijff EMC & Safety themadag - 03 oktober Sensata Technologies Almelo. What about EMC?

Software Engineering for Real- Time Systems.

MultiPARTES. Virtualization on Heterogeneous Multicore Platforms. 2012/7/18 Slides by TU Wien, UPV, fentiss, UPM

Plain English Guide To Common Criteria Requirements In The. Field Device Protection Profile Version 0.75

Accident Investigation

Theory of electrons and positrons

Safety Analysis for Nuclear Power Plants

ISO Introduction

Intensity-Modulated Radiation Therapy (IMRT)

Rigorous Methods for Software Engineering (F21RS1) High Integrity Software Development

Transcription:

Safety and Hazard Analysis An F16 pilot was sitting on the runway doing the preflight and wondered if the computer would let him raise the landing gear while on the ground - it did A manufacturer of torpedoes for the U.S. Navy wanted to make a 'safe' torpedo. Their initial solution was to cause the torpedo to self-destruct if it made a 180 degree change in course. On the test run for this new 'safe' torpedo the captain fired the torpedo and nothing happened. So the captain ordered the sub back to base, executing a 180 degree turn Nancy Leveson, talk on Software Safety and Reliability Today s Lecture 1. Intro to Software Engineering 2. Inexact quantities 3. Error propagation 4. Floating-point numbers 5. Design process 6. Teamwork 7. Project planning 8. Decision making 9. Professional Engineering 10. Software quality 11. Software safety 12. Intellectual property Fall 2004 SE 101 Introduction to Software Engineering 2 1

Agenda Hazardous software - Therac 25 Safety analysis Software safety analysis Canadian success story - Darlington Fall 2004 SE 101 Introduction to Software Engineering 3 Therac-25 The Therac-25 is a computer-controlled radiationtherapy machine developed by the Atomic Energy of Canada Limited (AECL), a crown corporation. Dual-mode linear accelerator Electron therapy - 5 to 25 million electron volt (MeV) X-ray (or photon) therapy - 25 MeV Therac-25 material from Nancy Leveson and Clark Turner, An Investigation of the Therac-25 Accidents, IEEE Computer, July 1993 Fall 2004 SE 101 Introduction to Software Engineering 4 2

Therac-25 Eleven Therac-25s were installed, five in the States and six in Canada. Between June 1985 and January 1987, there were six massive overdoses. Kennestone Regional Oncology Center - paralyzed, in pain Ontario Cancer Foundation (Hamilton) - died Yakima Valley Memorial Hospital - minor disability, scarring East Texas Cancer Center - died East Texas Cancer Center - died Yakima Valley Memorial Hospital - died Fall 2004 SE 101 Introduction to Software Engineering 5 Therac-25 Mirror Counterweight X-ray mode target Electron mode scan magnets Flattener Fall 2004 SE 101 Introduction to Software Engineering 6 3

Therac-25 - Shared variables Treatment Module if mode/energy specified then calculate index to table of preset offset parameter set offset operating parameters call Magnet if mode/energy changed then start over Hand Module Position turntable Mode & energy level Offset operating parameters Fall 2004 SE 101 Introduction to Software Engineering 7 Therac-25 - Race condition Treatment Module if mode/energy specified then calculate index to table If edits of to mode and energy are started preset offset parameter and finished during call to Magnet, then edits are not detected by Treatment set offset operating parameters call Magnet if mode/energy changed then start over Mode & energy level Hand Module Offset Position turntable operating parameters Fall 2004 SE 101 Introduction to Software Engineering 8 4

Therac-25 - Overflow error Housekeeping Module If Class3 NOT = 0 then check turntable settings Check turntable settings If turntable setting inconsistent with treatment, then set F$mal Treatment Module Increment Class3 Check F$mal if F$mal=0 then system is consistent F$mal Class3 Fall 2004 SE 101 Introduction to Software Engineering 9 Hazardous Software Software is hazardous if It can cause a hazard (i.e., cause other components to become hazardous) It is used to control a hazard NASA Software Safety Guidebook Fall 2004 SE 101 Introduction to Software Engineering 10 5

History of Safety Engineering Prior to 1950, engineers used a fly-fix-fly process to develop safety-critical systems. In the 1950s, the military and aerospace industries started to develop and use predictive safety analysis techniques. Identify hazards Eliminate, reduce, or control hazardous conditions, to avoid or lessen the severity of accidents. Fall 2004 SE 101 Introduction to Software Engineering 11 Safety Analysis Different safety analysis techniques address different aspects of the problem. Identify hazards Demonstrate the absence of specific hazards. Determine the possible damaging effects resulting from hazards Determine the causes of a hazard Identify safety design criteria that will eliminate, reduce, or control identified hazards. Evaluate the adequacy of hazard controls Fall 2004 SE 101 Introduction to Software Engineering 12 6

Safety Analysis Most safety-analysis techniques are aimed at hardware failures or at external threats Failure Modes and Effects Analysis (FMEA) Identify critical hardware components, interfaces Identify possible failure modes for each critical component Determine the worst-case effect from each failure mode Hazard Analyses Identify environmental hazards Identify deviations in designs Identify potential deviations in operational use Identify failures at interfaces between components Fall 2004 SE 101 Introduction to Software Engineering 13 Software Failures Software does not break. Software failures are due to logic or design errors: The software has no coding errors, but is written from incorrect requirements The requirements are correct, but the software has coding errors that deviate from requirements Thus, software-safety processes often try to improve software safety by improving software correctness. Fall 2004 SE 101 Introduction to Software Engineering 14 7

NASA Space Shuttle Safety Process FMEAs are performed on all shuttle hardware and ground support equipment that interfaces with shuttle hardware at launch sites. FMEA-identified hazards that could result in loss of life or vehicle, loss of mission, loss of backup systems, are put on the Critical Items List (CIL) Closed-loop system for hazard documentation, resolution, and approval: Items on the CIL must be addressed or a waiver request must be approved before the Shuttle can fly. Fall 2004 SE 101 Introduction to Software Engineering 15 NASA Space Shuttle Safety Process NASA did not perform hazard analysis on Shuttle software during the software s development, and it does not perform hazard analysis on software upgrades. Striving for correct requirements and code Making software fault-tolerant through the use of redundancy These activities improve software quality and reliability, but they say nothing about whether the software is free of hazards. Fall 2004 SE 101 Introduction to Software Engineering 16 8

Software Safety Analysis Software Failure Modes and Effects Analysis Analyzes software components, or interactions between software and hardware components Software Faults - Data sampling rate - Data collisions - Illegal commands - Commands out of sequence - Time delays, deadlines - Multiple events - Safe modes Failures - Broken sensors - Memory overwritten - Missing parameters - Parameters out of range - Bad input - Power fluctuations - Gamma radiation NASA Software Safety Guidebook Fall 2004 SE 101 Introduction to Software Engineering 17 Software Fault Trees Software Fault Tree Analysis (SFTA) is a topdown approach to softwarefailure analysis that starts with an undesired software event and determines all the ways it can happen. Password Password corrupted file file corrupted 1.1 User User database not not in in database Security Security Breech Breech 1 Illegal Illegal access granted 1.2 Wrong assigned Wrong priority assigned Security granted security after >3 tries Previous recognized logout Previous not not recognized 1.3 1.2.2 1.2.1 1.2.3 1.2.2 >3 1.2.4 1.2.3 tries Figure adapted from Wallace et al., Software Fault Tree Analysis, NASA Software Assurance Technology Center, December 2003 Fall 2004 SE 101 Introduction to Software Engineering 18 9

Therac-25 AECL performed a fault-tree analysis on the Therac-25 that apparently excluded the software. It assumed ming errors had been reduced by extensive testing on a hardware simulator and under field conditions. Computer execution errors are caused by faulty hardware components and by soft (random) errors introduced by alpha particles and electromagnetic noise. Fall 2004 SE 101 Introduction to Software Engineering 19 Darlington Nuclear Generating Station The first nuclear shutdown system implemented completely in software Two independent sub-systems, each of which is responsible for shutting down the nuclear reaction in the event of an accident. Fall 2004 SE 101 Introduction to Software Engineering 20 10

Darlington Nuclear Generating Station Project: To determine whether the software and documentation met standards and could be certified as being safe. Approach: Verification Model requirements as mathematical s Model code fragments as mathematical s on program variables Prove that the program s match the requirements s Fall 2004 SE 101 Introduction to Software Engineering 21 Functional Requirements A Requirements Table represents a mathematical that describes a desired behaviour of the system to be developed. ElevDir(dir,loc,Req[]) = dir = f.(f loc Req[f]) true false f.(f loc Req[f]) true dir Up false Down dir Fall 2004 SE 101 Introduction to Software Engineering 22 11

Function Tables A Function Table represents a mathematical that describes the behaviour of a procedure. Procedure signature Precondition NewDirection (dir, loc, Req) value before value after R 1 = (bottom loc top)!reqabove(loc)! =!ReqBelow(loc)!!ReqAbove(loc)!!ReqBelow(loc)!!ReqAbove(loc)!!ReqBelow(loc)! dir = dir down up NC(Req,loc) Fall 2004 SE 101 Introduction to Software Engineering 23 Systematic Inspections Requirements Design requirement relation requirement relation program program Code program Fall 2004 SE 101 Introduction to Software Engineering 24 12

Reviews and Inspections 0 Well-formedness of tabular expressions requirement relation Fall 2004 SE 101 Introduction to Software Engineering 25 Reviews and Inspections 0 Well-formedness of tabular expressions 1 Requirements validation Domain Experts 1 requirement relation Fall 2004 SE 101 Introduction to Software Engineering 26 13

Reviews and Inspections Domain Experts 1 requirement relation 0 1 2 Well-formedness of tabular expressions Requirements validation - derivation program program 2 2 2 program Fall 2004 SE 101 Introduction to Software Engineering 27 Reviews and Inspections Domain Experts 1 requirement 3 relation 0 Well-formedness of tabular expressions 1 Requirements Validation 2 - derivation 3 verification program program 2 2 2 program Fall 2004 SE 101 Introduction to Software Engineering 28 14

Darlington Experience Experience: 35-person-years task; cost $4 million; found relatively few important discrepancies; but gained confidence in the code Since then: Ontario Hydro and the Atomic Energy Canada Limited (AECL) have developed a family of standards, procedures, and guidelines for developing safety-critical software for use in nuclear power plants, incorporating mathematical models of requirements, design, and code systematic inspections of requirements mathematical verification or rigorous argument that the design meets the requirements the code meets the design Fall 2004 SE 101 Introduction to Software Engineering 29 Summary Ensuring software safety is hard (Therac-25) and expensive (Darlington). The state of the practice focuses on software quality and reliability. The state of the art attempts to adapt safety analysis techniques so that they apply to software hazards. Fall 2004 SE 101 Introduction to Software Engineering 30 15

Announcements In this week s lab WHMIS training and quiz Required to progress to 1B Bring your WATCARD to lab (to get sticker) Promotion Rules For next week s lecture and web review: Intellectual Property IPE Ch. 17 Word Choice #3 Dupré 20,30,81,127,136 Fall 2004 SE 101 Introduction to Software Engineering 31 Announcements Quiz #3 in lab next Thursday. Covers lectures on professional engineering, quality attributes, safey, and intellectual property IPE 2, 3, 4, 17, 19, 20 And word choice Dupré 3,14,17,20,28,30,45,51,61,64,71, 75,81,99,115,127,129,136 SSF Award for Teaching Assistance Excellence The class collectively nominates one teaching assistant from any of your courses. Nominations due Friday November 26 Nomination forms available from Class Reps Fall 2004 SE 101 Introduction to Software Engineering 32 16