Software in safety critical systems
Software safety requirements
Software safety integrity

Budapest University of Technology and Economics
Department of Measurement and Information Systems

Definitions
Programmable electronic (PE)
  o Based on computer technology, which may comprise hardware, software, and input and/or output units
Software
  o Intellectual creation comprising the programs, procedures, data, rules and any associated documentation pertaining to the operation of a data processing system
Safety related software
  o Software that is used to implement safety functions
Software safety integrity
  o Likelihood of software in a PE system achieving its safety functions under all stated conditions within a specified period of time
Systematic safety integrity
  o Part of the safety integrity relating to systematic failures in a dangerous mode of failure
  o Errors in the design or implementation of software cause systematic failures
Determining software SIL
Basic concepts:
  o Software SIL shall be identical to the SIL of the PE system implementing the safety function
  o Exception: A mechanism exists that prevents the failure of a software module from causing the system to go to an unsafe state
Reducing software SIL requires
  o Analysis of failure modes and effects (probabilistic analysis)
  o Analysis of independence between the software and the applied prevention mechanisms
Different software modules may have different SILs

Architecture for reducing software SIL
2 channels and independent acknowledgement; traversal of independent software modules
[Figure: Channel 1 and Channel 2 with modules (1), (2), (3); elements labelled Signature, locking and I/O (4)]
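The two channel idea can be sketched in a few lines. This is only an illustration under assumptions: the original figure is not recoverable, so the signature formula, the names run_channel, expected_signature and release_output, and the use of None as a stand-in for the safe state are all hypothetical. Python is used for brevity only.

```python
from typing import Callable, Optional, Sequence, Tuple

Module = Callable[[int], int]

# Hypothetical marker for "keep the output locked" (safe state).
SAFE_STATE = None


def _fold(signature: int, module_number: int) -> int:
    """Fold a module number into the running traversal signature."""
    return (signature * 31 + module_number) % 65521


def run_channel(modules: Sequence[Module], value: int) -> Tuple[int, int]:
    """Run the modules in order and return (result, traversal signature)."""
    signature = 0
    for number, module in enumerate(modules, start=1):
        value = module(value)
        # A skipped module changes the final signature.
        signature = _fold(signature, number)
    return value, signature


def expected_signature(module_count: int) -> int:
    """Signature that a complete traversal of module_count modules yields."""
    signature = 0
    for number in range(1, module_count + 1):
        signature = _fold(signature, number)
    return signature


def release_output(channel_1: Sequence[Module],
                   channel_2: Sequence[Module],
                   value: int) -> Optional[int]:
    """Unlock the output only if both channels agree and both were fully traversed."""
    result_1, signature_1 = run_channel(channel_1, value)
    result_2, signature_2 = run_channel(channel_2, value)
    expected = expected_signature(len(channel_1))
    if signature_1 != expected or signature_2 != expected:
        return SAFE_STATE  # a module was not executed in one of the channels
    if result_1 != result_2:
        return SAFE_STATE  # the channels disagree
    return result_1
```

The design choice illustrated here is that a fault in one channel cannot reach the I/O by itself: any disagreement or incomplete traversal leaves the output locked.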
Problems and solutions
Systematic failures in complex software:
  o Development of fault free software cannot be guaranteed in case of complex functions
      Goal: Reducing the number of faults (remaining after verification and validation activities) that may cause hazards
      > Fault prevention, fault removal, fault tolerance, fault forecasting
  o Target failure measure cannot be demonstrated by a quantitative analysis
      General techniques do not exist, estimations are questionable
Standards prescribe methods and techniques for software development, operation and maintenance:
  1. Safety lifecycle
  2. Techniques and measures in all phases of the lifecycle
  3. Documentation
  4. Competence and independence of personnel

1. Safety lifecycle
Hardware and software development
PE system architecture (partitioning of functions) determines software requirements
PES integration follows software development
Final step: E/E/PES integration
[Figure boxes: E/E/PES safety requirements specification; E/E/PES architecture; Software safety requirements; Software design and development; Hardware safety requirements; Programmable hardware design and development; Non programmable hardware design and development; PES integration (software and hardware); E/E/PES integration]

Software safety lifecycle
Safety requirements specification has two parts:
  o Software safety functions
  o Software safety integrity levels
Validation planning is required
Integration with PE hardware is required
Final step: Software safety validation
[Figure boxes: Software safety requirements (safety functions, safety integrity); Software safety validation planning; Software design and development; PES integration (hardware and software); Software safety validation]
Example lifecycle (V model)
Development model is not prescribed by the standards
  o V model is characterized by clear conditions to step forward
  o V&V planning is explicitly supported (input for V&V activities)

Software quality assurance
Software Quality Assurance Plan
  o Specifying activities and documents according to ISO 9000-3
      ISO 9001 accreditation
  o Determining all technical and control activities in the lifecycle
      Activities, inputs and outputs (esp. verification and validation)
      Quantitative quality metrics
      Specification of its own updating (frequency, responsibility, methods)
  o External supplier control
Software configuration management
  o Configuration control before release for all artifacts (code, documents, ...)
  o Changes require authorization
Problem reporting and corrective actions (issue tracking)
  o Lifecycle of problems: From reporting through analysis, design and implementation to validation
  o Preventive actions
2. Techniques and measures

Basic approach
Goal: Preventing the introduction of systematic faults and controlling the residual faults
SIL determines the set of techniques to be applied as
  o M: Mandatory
  o HR: Highly recommended (the rationale for not using it should be detailed and agreed with the assessor)
  o R: Recommended
  o -: No recommendation for or against being used
  o NR: Not recommended
Combination of techniques is allowed
  o E.g., alternate or equivalent techniques are marked
Hierarchy of methods is formed (references to tables)
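How the M / HR / R / - / NR categories drive a review of the selected techniques can be sketched as follows. The table content is a placeholder: technique_a, technique_b and technique_c and their ratings are invented for illustration and are not taken from any standard. The function name review_selection is likewise an assumption.

```python
from typing import Dict, List, Set

# Placeholder table: technique -> {SIL: category}. Ratings are invented.
RECOMMENDATIONS: Dict[str, Dict[int, str]] = {
    "technique_a": {1: "R", 2: "R", 3: "HR", 4: "M"},
    "technique_b": {1: "-", 2: "R", 3: "HR", 4: "HR"},
    "technique_c": {1: "R", 2: "-", 3: "NR", 4: "NR"},
}


def review_selection(sil: int,
                     used: Set[str],
                     justified_omissions: Set[str]) -> List[str]:
    """Return findings for a set of techniques used at the given SIL."""
    findings: List[str] = []
    for name in sorted(RECOMMENDATIONS):
        category = RECOMMENDATIONS[name][sil]
        if category == "M" and name not in used:
            findings.append(f"{name}: mandatory at SIL {sil} but not used")
        elif (category == "HR" and name not in used
              and name not in justified_omissions):
            # Omitting an HR technique needs a documented rationale.
            findings.append(f"{name}: highly recommended at SIL {sil}, "
                            "omission needs a rationale agreed with the assessor")
        elif category == "NR" and name in used:
            findings.append(f"{name}: not recommended at SIL {sil}")
    return findings


if __name__ == "__main__":
    for finding in review_selection(4, {"technique_a"}, set()):
        print(finding)
```

Alternative or equivalent techniques (such as 2a and 2b in a standard's table) are not modelled in this sketch.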
Example: Guide to selection of techniques
Software safety requirements specification:
  o Techniques 2a and 2b are alternatives
  o Referred table: Semi formal methods (B.7)

Hierarchy of design methods
Specific techniques: Design
Safety bag technique
  o Independent external monitor ensuring that the main computer performs safe actions
Memorizing executed traces
  o Comparison of program execution with a previously documented reference in order to force the program to fail safely if it attempts to execute a path which is not allowed
Defensive programming
  o Checking anomalous control/data flow and data values during execution
      E.g., checking variable ranges, plausibility, consistency of configuration, availability of hardware, etc.
  o Reacting in a safe manner
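The defensive programming checks named among the design techniques (variable ranges, plausibility, safe reaction) can be sketched as follows. The speed signal, the SpeedLimits type, the function checked_speed and the choice of 0.0 as the safe command are all hypothetical and serve only as an example.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical safe reaction: command standstill.
SAFE_SPEED_COMMAND = 0.0


@dataclass(frozen=True)
class SpeedLimits:
    minimum: float
    maximum: float
    max_step: float  # largest plausible change between two samples


def checked_speed(raw: object, previous: float,
                  limits: SpeedLimits) -> Tuple[float, bool]:
    """Return (value to use, ok flag); fall back to the safe command on any anomaly."""
    # Data value check: must be a real number (bool and NaN are rejected).
    if isinstance(raw, bool) or not isinstance(raw, (int, float)) or raw != raw:
        return SAFE_SPEED_COMMAND, False
    # Range check.
    if not (limits.minimum <= raw <= limits.maximum):
        return SAFE_SPEED_COMMAND, False
    # Plausibility check: the signal cannot jump more than max_step.
    if abs(raw - previous) > limits.max_step:
        return SAFE_SPEED_COMMAND, False
    return float(raw), True


if __name__ == "__main__":
    limits = SpeedLimits(minimum=0.0, maximum=120.0, max_step=10.0)
    print(checked_speed(55.0, 50.0, limits))
    print(checked_speed(80.0, 50.0, limits))
```

Returning an explicit ok flag together with the value lets the caller log the anomaly as well as react safely.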
Hierarchy of V&V methods
Specific techniques: Verification
Probabilistic testing
  o Deriving probabilistic figures about the reliability of components from (automated) testing via environment simulation focusing on frequent trajectories
Test case execution from error seeding
  o Inserting errors in order to estimate the number of remaining errors after testing from the number of inserted and detected errors
Fagan inspections
  o Revealing mistakes by a systematic audit of documents and design artifacts
Sneak circuit analysis
  o Detecting unexpected paths or logic flows (latent conditions inadvertently designed into the system) which initiate undesired functions
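The arithmetic behind error seeding can be sketched as follows. The estimate assumes that seeded errors are found with the same probability as genuine ones, which is the usual simplification and rarely holds exactly. The function name and the figures in the usage example are illustrative only.

```python
def estimate_remaining_faults(seeded_total: int,
                              seeded_found: int,
                              real_found: int) -> float:
    """Estimate the genuine faults still in the software after testing.

    If testing found seeded_found of seeded_total inserted errors, the same
    detection ratio is assumed for genuine faults:
        real_total ~= real_found * seeded_total / seeded_found
    """
    if seeded_total <= 0:
        raise ValueError("at least one error must be seeded")
    if not 0 < seeded_found <= seeded_total:
        raise ValueError("seeded_found must be between 1 and seeded_total")
    if real_found < 0:
        raise ValueError("real_found cannot be negative")
    estimated_real_total = real_found * seeded_total / seeded_found
    return estimated_real_total - real_found


if __name__ == "__main__":
    # 20 errors seeded, 16 of them detected, 40 genuine faults detected.
    print(estimate_remaining_faults(20, 16, 40))
```

With 16 of 20 seeded errors detected, the detection ratio is 0.8, so 40 detected genuine faults suggest about 50 in total and about 10 remaining.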
Application of tools in the lifecycle
Fault prevention:
  o Program translation from high level programming languages
  o MBD, CASE tools: High level modeling and code/configuration generators
Fault removal:
  o Analysis, testing and diagnosis
  o Correction (code modification)
Management tools
  o Contributing both to fault prevention and removal
  o Includes project management, configuration management, issue tracking

Types of tools
Safety concerns of tools
  o Tools potentially introducing faults
      Modeling and programming tools
      Program translation tools
  o Tools potentially failing to detect faults
      Analysis and testing tools
      Project management tools
Requirements
  o Use certified or widely adopted tools
      Increased confidence from use (no evidence of improper results yet)
  o Use the well tested parts without altering the usage
  o Check the output of tools (analysis/diversity)
  o Control access and versions
Safety of programming languages
Factors for selection of languages
  o Functional characteristics (probability of faults)
      Logical soundness (unambiguous definition)
      Complexity of definition (understandability)
      Expressive power
      Verifiability (consistency with specification)
      Vulnerability (security aspects)
  o Availability and quality of tools
  o Expertise available in the design team
Constructs that make verification difficult (IEC 61508):
  o Unconditional jumps excluding subroutine calls
  o Recursion
  o Pointers, heaps or any type of dynamic variables
  o Interrupt handling at source code level
  o Multiple entries and exits of loops and subprograms
  o Implicit variable initialization or declaration
  o Variant records and equivalence
  o Procedural parameters
Coding standards (subsets of languages) are defined
  o Dangerous constructs are excluded (e.g., function pointers)
  o Static checking can be used to verify the subset
Specific (certified) compilers are available
  o Compiler verification kit for third party compilers
Language comparison
  o Wild jumps: Jump to an arbitrary address in memory
  o Overwrites: Overwriting an arbitrary address in memory
  o Model of math: Well defined data types
  o Separate compilation: Type checking across modules

Coding standards for C and C++
MISRA C (Motor Industry Software Reliability Association)
  o Safe subset of C (2004): 141 rules (121 required, 20 advisory)
  o Examples:
      Rule 33 (Required): The right hand side of a "&&" or "||" operator shall not contain side effects.
      Rule 49 (Advisory): Tests of a value against zero should be made explicit, unless the operand is effectively Boolean.
      Rule 59 (Required): The statement forming the body of an "if", "else if", "else", "while", "do... while", or "for" statement shall always be enclosed in braces.
      Rule 104 (Required): Non constant pointers to functions shall not be used.
  o Tools to check MISRA conformance (LDRA, PolySpace, ...)
      Test cases to demonstrate adherence to MISRA rules
MISRA C++ (2008): 228 rules
US DoD, JSF C++: 221 rules (incl. metric guidelines)
  o Joint Strike Fighter Air Vehicle C++ Coding Standard
Interesting facts
Boeing 777: Approx. 35 languages are used
  o Mostly Ada with assembler (e.g., cabin management system)
  o Onboard extinguishers in PL/M
  o Seatback entertainment system in C++ with MFC
European Space Agency:
  o Mandates Ada for mission critical systems
Honeywell: Aircraft navigation data loader in C
Lockheed: F-22 Advanced Tactical Fighter program in Ada 83 with a small amount in assembly
GM trucks: Vehicle controllers mostly in Modula-GM (Modula-GM is a variant of Modula-2)
TGV France: Braking and switching system in Ada
Westinghouse: Automatic Train Protection (ATP) systems in Pascal

Safety critical OS: Required properties
Partitioning in space
  o Memory protection
  o Guaranteed resource availability
Partitioning in time
  o Deterministic scheduling
  o Guaranteed resource availability in time
Mandatory access control for critical objects
  o Not (only) discretionary
Bounded execution time
  o Also for system functions
Support for fault tolerance and high availability
  o Fault detection and recovery / failover
  o Redundancy control
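Partitioning in time with deterministic scheduling can be illustrated by a fixed cyclic schedule: each partition owns a window of known length, and the pattern repeats every major frame. The sketch below is not any real RTOS interface. The partition names, window lengths and function names are assumptions chosen for the example.

```python
from typing import List, Tuple

# (partition name, window length in ms); the list repeats every major frame.
Schedule = List[Tuple[str, int]]


def major_frame_length(schedule: Schedule) -> int:
    """Total length of one repetition of the schedule."""
    return sum(length for _, length in schedule)


def active_partition(schedule: Schedule, time_ms: int) -> str:
    """Return the partition that owns the processor at time_ms."""
    if time_ms < 0:
        raise ValueError("time must not be negative")
    if not schedule or any(length <= 0 for _, length in schedule):
        raise ValueError("schedule needs windows of positive length")
    offset = time_ms % major_frame_length(schedule)
    for name, length in schedule:
        if offset < length:
            return name
        offset -= length
    raise AssertionError("unreachable: offset is always inside the frame")


if __name__ == "__main__":
    # Hypothetical partitions sharing a 35 ms major frame.
    schedule = [("flight_control", 20), ("monitoring", 10), ("maintenance", 5)]
    for t in (0, 20, 30, 35):
        print(t, active_partition(schedule, t))
```

Because the window owner depends only on the time and the static table, a partition that overruns cannot take time from another one, which is the guaranteed resource availability in time listed above.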
Example: Safety and RTOS
Compromise needed
  o Complex RTOS: Difficult to test
  o "Bare machine": Fewer scheduling risks, but high maintenance risks
Example: Tornado for Safety Critical Systems
  o Integrated software solution uses Wind River's securely partitioned VxWorks AE653 RTOS
  o ARINC 653: Time and space partitioning (guaranteed isolation)
  o RTCA/DO-178B: Level A certification
  o POSIX, Ada, C support

3. Documentation
Principles for documentation
Type of documentation
  o Comprehensive (overall lifecycle)
      E.g., Software Verification Plan
  o Specific (for a given lifecycle phase)
      E.g., Software Source Code Verification Report
Document Cross Reference Table
  o Determines documentation for a lifecycle phase
  o Determines relations among documents
Traceability of documents is required
  o Relationships between documents are specified (input, output)
  o Terminology, references, abbreviations are consistent
Merging documents is allowed
  o If the responsible persons (authors) are not required to be independent

Document cross reference table (EN 50128)
[Figure legend: creation of a document; used document in a given phase (read vertically)]
Document structure in EN 50128
30 documents in a systematic structure
  o Specification
  o Design
  o Verification
Example
Software Planning Phase
  o Software Development Plan
  o Software Quality Assurance Plan
  o Software Configuration Management Plan
  o Software Verification Plan
  o Software Integration Test Plan
  o Software/hardware Integration Test Plan
  o Software Validation Plan
  o Software Maintenance Plan
System Development Phase
  o System Requirements Specification
  o System Safety Requirements Specification
  o System Architecture Description
  o System Safety Plan
Software Requirements Specification Phase
  o Software Requirements Specification
  o Software Requirements Test Specification
  o Software Requirements Verification Report
Software Architecture & Design Phase
  o Software Architecture Specification
  o Software Design Specification
  o Software Architecture and Design Verification Report
Software Module Design Phase
  o Software Module Design Specification
  o Software Module Test Specification
  o Software Module Verification Report
Coding Phase
  o Software Source Code & Supporting Documentation
  o Software Source Code Verification Report
Software Module Testing Phase
  o Software Module Test Report
Software Integration Phase
  o Software Integration Test Report
Software/hardware Integration Phase
  o Software/hardware Integration Test Report
Software Validation Phase
  o Software Validation Report
Software Assessment Phase
  o Software Assessment Report
Software Maintenance Phase
  o Software Maintenance Records
  o Software Change Records

4. Competence and independence of personnel
Human factors
In contrast to computers
  o Humans often fail in:
      reacting in time
      following a predefined set of instructions
  o Humans are good at:
      handling unanticipated problems
Human errors
  o Not all kinds of human errors are equally likely
  o Hazard analysis (FMECA) is possible in a given context
  o Results shall be integrated into system safety analysis
Reducing the errors of developers
  o Safe languages, tools, environments
  o Training, experience and redundancy (independence)
Reducing operator errors:
  o Designing ergonomic HMI (patterns are available)
  o Designing to aid the operator rather than take over

Organization
Safety management
  o Quality assurance
  o Safety Organization
Competence shall be demonstrated
  o Training, experience and qualifications
Independence of roles:
  o DES: Designer (analyst, architect, coder, unit tester)
  o VER: Verifier
  o VAL: Validator
  o ASS: Assessor
  o MAN: Project manager
  o QUA: Quality assurance personnel
EN 50128: Independence of personnel
[Figure: roles grouped by person and by organization; groups are separated by semicolons below]
  o SIL 0: DES, VER, VAL; ASS
  o SIL 1 or 2: DES; VER, VAL; ASS
  o SIL 3 or 4: MGR; DES; VER, VAL; ASS
  o or: MGR; ASS; DES; VER; VAL

Specific design aspects: Development of generic software
Overview of the lifecycle
Generic software: It can be used and re-used after parameterization with specific data
[Figure: V model with the boxes System development; Requirement specification; Architecture design; Module design; Module coding; Module testing; Software integration; Software/hardware integration; Software validation; Software assessment; Operation and maintenance; Validation test plan; Integration test plan; Module test plan; Design for parameterization; Parameterization; V&V of parameterization]

Design aspects
Lifecycle phase
  o System development
  o Software requirement specification
  o Architecture design
  o Module design
  o Verification and validation
  o Maintenance
Design aspect
  o Decision about parameterized functions
  o Specification of parameterized functions
  o Specification of data validation
  o Design of interfaces for parameterization
  o Separation of program code and parameters
  o Checking for potential parameter values and their combinations
  o Checking compatibility of changes in program code and parameters
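Separation of program code and parameters, together with data validation over single values and their combinations, can be sketched as follows. The parameter names, their ranges and the combination rule are invented for the example and do not come from any real application.

```python
from typing import Dict, List, Tuple

# Hypothetical parameter specification, kept apart from the program logic:
# name -> (minimum, maximum)
PARAMETER_RANGES: Dict[str, Tuple[int, int]] = {
    "max_speed_kmh": (0, 160),
    "braking_distance_m": (50, 2000),
    "track_count": (1, 8),
}


def validate_parameters(params: Dict[str, object]) -> List[str]:
    """Return a list of validation errors; an empty list means the data is accepted."""
    errors: List[str] = []
    for name, (low, high) in PARAMETER_RANGES.items():
        if name not in params:
            errors.append(f"missing parameter: {name}")
            continue
        value = params[name]
        if isinstance(value, bool) or not isinstance(value, int):
            errors.append(f"{name} must be an integer")
        elif not low <= value <= high:
            errors.append(f"{name}={value} outside [{low}, {high}]")
    for name in params:
        if name not in PARAMETER_RANGES:
            errors.append(f"unknown parameter: {name}")
    # Combination check (invented rule): high speed needs a long braking distance.
    if not errors:
        if params["max_speed_kmh"] > 100 and params["braking_distance_m"] < 700:
            errors.append("braking_distance_m too short for max_speed_kmh above 100")
    return errors


if __name__ == "__main__":
    data = {"max_speed_kmh": 120, "braking_distance_m": 400, "track_count": 2}
    print(validate_parameters(data))
```

The generic program only accepts application data that passes such a check, so a change in the data can be verified without touching the validated code.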
Parameterization activities
Similar lifecycle phase
  o Software requirements specification
  o Software architecture and design
  o Software integration
  o Software validation
  o Software assessment
Activity
  o Specifying application requirements
  o Data preparation
  o Data testing (verification)
  o Data test reporting
  o Parameterization (configuration)
  o Validation of the system parameterized with application data

Summary
Design of safe software
Summary
  o Lifecycle
  o Techniques and measures
  o Documentation
  o Competence and independence of personnel
Specific aspects
  o Safe subsets of programming languages
  o The role of tools
  o Safe operating systems
  o Development of generic software