Designing an Effective Risk Matrix




Henry Ozog

INTRODUCTION

Risk assessment is an effective means of identifying process safety risks and determining the most cost-effective means to reduce risk. Many organizations recognize the need for risk assessment, but most do not have the tools, experience and resources to assess risk quantitatively. Therefore, these organizations use qualitative or semi-quantitative risk assessment tools, such as risk ranking. Although risk matrices are easy to use, unless they are designed properly, they can create liability issues and give a false sense of security.

An effective risk ranking matrix should have the following characteristics:

- Be simple to use and understand
- Not require extensive knowledge of quantitative risk analysis to use
- Have clear guidance on applicability
- Have consistent likelihood ranges that cover the full spectrum of potential scenarios
- Have detailed descriptions of the consequences of concern for each consequence range
- Have clearly defined tolerable and intolerable risk levels
- Show how scenarios that are at an intolerable risk level can be mitigated to a tolerable risk level on the matrix
- Provide clear guidance on what action is necessary to mitigate scenarios with intolerable risk levels

Risk ranking uses a matrix that has ranges of consequence and likelihood as the axes. The combination of a consequence and likelihood range gives an estimate of risk or a risk ranking. Although there are many risk matrices that have been developed and published, the development and application of risk matrices present their own challenges.

Construction of a risk matrix starts by first establishing how the matrix is intended to be used. Some typical uses for risk ranking are process hazard analyses, facility siting studies, and safety audits. A key initial decision that has to be made is to define the risk acceptability or tolerability criteria for the organization using the matrix.
Without adequate consideration of risk tolerability, a risk matrix can be developed that implies a level of risk tolerability much higher than the organization actually desires. Another key aspect of risk matrix design is having the capability to evaluate the effectiveness of risk mitigation measures. The risk matrix should always allow the risk ranking for a scenario to move to a tolerable risk level after implementation of mitigating measures. Otherwise it may be difficult to determine the effectiveness of mitigation measures.

The next step is to define the consequence and likelihood ranges. A typical risk matrix is a four by four grid. Larger matrices usually have more likelihood ranges. First, determine the consequences of interest. These can include personnel safety, public safety, environmental impact, property damage/business interruption, corporate image and legal implications. Each consequence of interest may have a different definition for a specified consequence category. For example, Table 1, which is taken from MIL-STD-882D, shows multiple consequences that can be defined for a single consequence range.

Table 1: Example of Multiple Consequences for a Consequence Range (Source: MIL-STD-882D)

Description (Category): Environmental, Safety, and Health Result Criteria

Catastrophic (I): Could result in death, permanent total disability, loss exceeding $1M, or irreversible severe environmental damage that violates law or regulation.

Critical (II): Could result in permanent partial disability, injuries or occupational illness that may result in hospitalization of at least three personnel, loss exceeding $200K but less than $1M, or reversible environmental damage causing a violation of law or regulation.

Marginal (III): Could result in injury or occupational illness resulting in one or more lost work day(s), loss exceeding $10K but less than $200K, or mitigatible environmental damage without violation of law or regulation where restoration activities can be accomplished.

Negligible (IV): Could result in injury or illness not resulting in a lost work day, loss exceeding $2K but less than $10K, or minimal environmental damage not violating law or regulation.

In this example, each consequence range includes consequences for personnel safety, environmental impact and property damage.
One potential pitfall of placing property damage criteria in the same consequence range as a fatality is that some might read this as the value the company puts on human life. Once the consequence ranges have been defined, the corresponding likelihood ranges can be defined. The risk tolerability of events with different potential consequences should be different. For example, no organization would tolerate a high likelihood of a Bhopal-type event in which thousands of members of the public were killed or injured. However, every organization recognizes that use of hazardous materials poses a risk that cannot be eliminated, but only controlled. Few organizations have established corporate risk tolerability criteria and thus have not defined a common basis for making risk decisions. Table 2, also taken from MIL-STD-882D, provides an example of suggested probability (likelihood) levels.

Table 2: Example of Likelihood Ranges (Source: MIL-STD-882D)

Description* (Level): Specific Individual Item / Fleet or Inventory**

Frequent (A)
Specific individual item: Likely to occur, with a probability of occurrence greater than 10^-1 in that life.
Fleet or inventory: Continuously experienced.

Probable (B)
Specific individual item: Will occur several times in the life of an item, with a probability of occurrence less than 10^-1 but greater than 10^-2 in that life.
Fleet or inventory: Will occur frequently.

Occasional (C)
Specific individual item: Likely to occur some time in the life of an item, with a probability of occurrence less than 10^-2 but greater than 10^-3 in that life.
Fleet or inventory: Will occur several times.

Remote (D)
Specific individual item: Unlikely but possible to occur in the life of an item, with a probability of occurrence less than 10^-3 but greater than 10^-6 in that life.
Fleet or inventory: Unlikely, but can reasonably be expected to occur.

Improbable (E)
Specific individual item: So unlikely, it can be assumed occurrence may not be experienced, with a probability of occurrence less than 10^-6 in that life.
Fleet or inventory: Unlikely to occur, but possible.

*Definitions of descriptive words may have to be modified based on quantity of items involved.
**The expected size of the fleet or inventory should be defined prior to accomplishing an assessment of the system.

In Table 2, likelihood is defined in terms of a probability that the potential consequences will be experienced during the life of the item. For most process facilities, the item of interest is the plant, process or unit being reviewed. Assuming a typical design plant life of 20 years, the probabilities given in the above table can be converted into frequencies by dividing by 20. Therefore, category A (a lifetime probability greater than 10^-1) would have a frequency greater than 0.1/20 = 0.005 per year, or about once every 200 years. In moving from the Frequent to Occasional likelihood range, the frequency drops by a factor of 10 for each range. However, in moving from the Occasional to the Remote likelihood range the frequency changes by a factor of 1000. This arrangement creates likelihood ranges that are narrow at the more frequent end of the scale and very broad at the less frequent end.
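The conversion described above can be sketched in a few lines of Python. This is only an illustration under the text's own assumption of a 20-year design life, with the lifetime probability spread evenly over that life; the names `PLANT_LIFE_YEARS`, `annual_frequency` and `return_period_years` are invented for this example and do not come from MIL-STD-882D.

```python
# Assumption: a 20-year design plant life, as stated in the text.
PLANT_LIFE_YEARS = 20

# Lower bound of the lifetime probability for each Table 2 level.
# Level E has no lower bound, so it is recorded here as 0.0.
LIFETIME_PROBABILITY_FLOOR = {
    "A": 1e-1,
    "B": 1e-2,
    "C": 1e-3,
    "D": 1e-6,
    "E": 0.0,
}


def annual_frequency(lifetime_probability, life_years=PLANT_LIFE_YEARS):
    """Approximate events per year by dividing the lifetime probability by the life."""
    return lifetime_probability / life_years


def return_period_years(lifetime_probability, life_years=PLANT_LIFE_YEARS):
    """Average number of years between events; infinite for a zero probability."""
    if lifetime_probability == 0:
        return float("inf")
    return life_years / lifetime_probability


if __name__ == "__main__":
    for level, floor in LIFETIME_PROBABILITY_FLOOR.items():
        print(level, annual_frequency(floor), return_period_years(floor))
```

For level A, the floor of 10^-1 over 20 years works out to 0.005 events per year, or roughly one event every 200 years. The three-decade jump between the floors of levels C and D is what makes the Remote range so much broader than the ranges above it.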
The other problem with likelihood categories that are defined in terms of frequency is having the relevant data to quantify the frequency of realizing the potential consequences. Generally, this involves determining the frequency of the initiating event and then determining the probability of all other contributing events. Without extensive experience in quantitative risk assessment and a comprehensive database of failure rates, this becomes a judgmental activity and may result in assigning frequencies to scenarios that are much lower than would be determined through quantitative analysis. Because risk ranking is a semi-quantitative tool, it must be conservative and in some cases assign higher than actual frequencies to scenarios. In those cases the company may choose to conduct a quantitative risk analysis to refine the number before investing considerable resources to mitigate that risk.

The final step in developing the risk matrix is to translate the tolerability criteria onto the matrix. At a minimum the risk matrix must have clear blocks where the risk is tolerable or intolerable. Another matrix, taken from the CCPS Guidelines for Hazard Evaluation Procedures, Second Edition, is shown in Table 3.

Table 3: Example Risk Ranking Matrix (Source: CCPS Guidelines for Hazard Evaluation Procedures, Second Edition)

              Consequence
Frequency     1     2     3     4
    4         IV    II    I     I
    3         IV    III   II    I
    2         IV    IV    III   II
    1         IV    IV    IV    III

There are some issues with this example. First, in the first row of the risk-ranking matrix (Table 3), the risk rank changes from a II for consequence category 2 to a IV for consequence category 1. This creates a disconnect in the risk ranking as there is no risk rank of III for events with a frequency of 4.

Table 4 provides a description of the risk ranking categories used in Table 3. For risks ranked I or II there is a time period specified for implementation of mitigation measures. This is a sure way to violate your own procedures and incur the associated liability by recommending mitigating measures that may take longer than the specified time to implement, especially if they require approval of a capital project. Therefore, special procedures and approvals need to be put in place to waive the time limits for those situations.

Also in Table 4, Risk Rank III is described as "Acceptable with controls". This is somewhat confusing, as all scenarios are acceptable with the proper controls. That is the whole point of risk assessment. Do we assume that there is no need to verify that procedures and controls are in place to mitigate scenarios with a Risk Rank of IV?

So how do we avoid these pitfalls and still have an effective risk-ranking tool for use in making risk decisions in day-to-day operations, such as during hazard and operability (HAZOP) studies?
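The gap in the Table 3 example can be checked mechanically. The sketch below encodes the matrix as a lookup, reading rows as frequency and columns as consequence category, which is how the discussion above reads the table. The function names `risk_rank` and `skipped_ranks` are made up for this illustration and are not part of the CCPS guidelines.

```python
# Table 3 risk ranks keyed by frequency row.
# Each list is ordered by consequence category 1, 2, 3, 4.
TABLE_3 = {
    4: ["IV", "II", "I", "I"],
    3: ["IV", "III", "II", "I"],
    2: ["IV", "IV", "III", "II"],
    1: ["IV", "IV", "IV", "III"],
}

# Ranks from most severe (I) to least severe (IV).
RANK_ORDER = ["I", "II", "III", "IV"]


def risk_rank(frequency, consequence):
    """Look up the risk rank for a frequency row and consequence column."""
    return TABLE_3[frequency][consequence - 1]


def skipped_ranks(frequency):
    """Return ranks jumped over between adjacent consequence categories in a row."""
    row = [RANK_ORDER.index(rank) for rank in TABLE_3[frequency]]
    skipped = set()
    for left, right in zip(row, row[1:]):
        low, high = sorted((left, right))
        skipped.update(RANK_ORDER[i] for i in range(low + 1, high))
    return sorted(skipped, key=RANK_ORDER.index)


if __name__ == "__main__":
    for frequency in sorted(TABLE_3, reverse=True):
        print(frequency, TABLE_3[frequency], "skipped:", skipped_ranks(frequency))
```

Only the frequency 4 row reports a skipped rank (III), which is the disconnect noted in the text. The same check could be run on any candidate matrix before it is issued for use.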
One option is to avoid using quantitative frequencies or probabilities for the likelihood ranges and use a layer of protection analysis (LOPA) approach as shown in Table 5. This approach is not perfect, but it is simple to implement and easy for most HAZOP participants to understand. The highest likelihood range (level 4) is defined by the likelihood of the initiating event (e.g., human error, control failure). Then for each level of protection that exists the likelihood range is reduced one level. This approach assumes that each level of protection has a similar failure probability, which is generally acceptable for rough risk screening such as HAZOP risk ranking. Some failures have fairly well defined frequencies and can be used directly as shown in the table. For example, the spontaneous failure of a pressure vessel has a frequency in the range of 10^-5 per year and thus by itself would qualify as a level 1 likelihood. Similar likelihood levels can be defined for other common equipment loss of containment failures like pipe and hose leaks and ruptures.

Table 4: Example Risk Ranking Categories (Source: CCPS Guidelines for Hazard Evaluation Procedures, Second Edition)

Risk Rank (Category): Description

I (Unacceptable): Should be mitigated with engineering and/or administrative controls to a risk ranking of III or less within a specified period such as six months
II (Undesirable): Should be mitigated with engineering and/or administrative controls to a risk ranking of III or less within a specified period such as 12 months
III (Acceptable with controls): Should be verified that procedures or controls are in place
IV (Acceptable as is): No mitigation required

Table 5: Likelihood ranges based on levels of protection

The likelihood ranges below can be used in conjunction with the typical consequence ranges shown in Table 6.

Likelihood Range: Qualitative Frequency Criteria / Typical Scenarios

Level 4: Initiating event or failure / Hose leaks/ruptures
Level 3: One level of protection / Piping leaks
Level 2: Two levels of protection / Full-bore failures of small process lines or fittings
Level 1: Three levels of protection / Tank/process vessel failures

Table 6: Typical Consequence Range Criteria

Consequence Range: Qualitative Safety Consequence Criteria

Level 4
Onsite or offsite: Potential for multiple life-threatening injuries or fatalities.
Environment: Uncontained release with potential for major environmental impact
Property: Plant damage value in excess of $100 million

Level 3
Onsite or offsite: Potential for a single life-threatening injury or fatality.
Environment: Uncontained release with potential for moderate environmental impact
Property: Plant damage value in the range of $10-100 million

Level 2
Onsite or offsite: Potential for an injury requiring a physician's care.
Environment: Uncontained release with potential for minor environmental impact
Property: Plant damage value in the range of $1-10 million

Level 1
Onsite: Potential restricted to injuries requiring no more than first aid.
Offsite: Odor or noise complaint
Environment: Contained release with local impact
Property: Plant damage value in the range of $0.1 to 1 million
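The Table 5 rule (start at the level of the initiating event and drop one level for each layer of protection) can be written as a small helper. This is a rough sketch only: the function name, the dictionary of baseline levels and the choice to clamp at level 1 are assumptions made for illustration, since the table defines no level below 1.

```python
# Highest and lowest likelihood levels in Table 5.
MAX_LEVEL = 4
MIN_LEVEL = 1

# Baseline levels for equipment failures that Table 5 lists as typical scenarios.
# The dictionary keys are illustrative labels, not terms from the source.
BASELINE_LEVEL = {
    "hose leak or rupture": 4,
    "piping leak": 3,
    "full-bore failure of small line or fitting": 2,
    "tank or process vessel failure": 1,
}


def likelihood_level(protection_layers, start_level=MAX_LEVEL):
    """Reduce the starting likelihood level by one for each layer of protection.

    Assumption: the result is clamped at level 1 because Table 5 stops there.
    """
    if protection_layers < 0:
        raise ValueError("protection_layers must be zero or greater")
    return max(MIN_LEVEL, start_level - protection_layers)


if __name__ == "__main__":
    # Initiating event such as a control failure, with two layers of protection.
    print(likelihood_level(2))
    # A vessel failure already sits at level 1 with no added protection.
    print(likelihood_level(0, BASELINE_LEVEL["tank or process vessel failure"]))
```

Treating every layer as worth exactly one level mirrors the simplification in the text that each layer has a similar failure probability. A scenario with unusually weak or strong safeguards would need closer analysis than this screening rule provides.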

The resulting risk matrix is shown in Figure 1, and the action required based on the risk ranking is shown in Table 7.

Figure 1: Risk Matrix

Table 7: Action required by risk level

Risk Level: Action Required

A: Risk mitigation required to risk level C or D
B: Risk mitigation required to risk level C or D
C: Risk mitigation to risk level D is optional
D: No further risk mitigation required

In order to develop effective mitigating measures, it helps to understand how different layers of protection are challenged as a typical incident develops. Table 8 provides the typical activation order of different layers of protection in response to a process deviation. A failure (the initiating event) occurs that takes the process outside of its normal operating range. The basic process controls, alarms, interlocks and operator supervision are the first to respond by adjusting process parameters to return to normal operating range. As the process reaches one or more of its operating limits, the SIS or ESD systems activate to maintain the process in a safe condition by shutting down all or part of the process. This is the last point at which the chemicals in the process can be kept in their primary containment systems. When the process parameters reach the equipment design limits, the relief systems are the next to activate. Once a loss of containment incident has occurred, the only option is to try to reduce its consequences through emergency response.

Table 8: Expected activation order of layers of protection

1. Process or equipment designed for process operating limits
2. Basic process controls and alarms, and operator adjustments to process deviations
3. Critical alarms and operator response to process approaching operating limits
4. Safety Interlock Systems (SIS) or Emergency Shutdown (ESD) systems take action at operating limits
5. Relief systems that activate at equipment design limits
6. Mitigation systems that contain the effects of incidents
7. Plant emergency response to control the effects of incidents
8. Community emergency response to protect the public from the effects of an incident

Table 9: Strategy for reducing risk

1. Inherent (eliminate the hazard by using less hazardous materials, reducing inventory, operating under less severe conditions, incorporating human factors in design)
2. Passive (minimize the hazard through process and equipment design by making the equipment resistant to upsets)
3. Active (detect and control process deviations to avoid exceeding operating limits and equipment design limits)
4. Procedural (prevent or control incidents through administrative controls, such as procedures, training, safe work practices and emergency plans)

As mentioned earlier, different mitigating measures can provide different levels of protection. Table 9 provides a typical strategy for developing recommendations to mitigate scenarios with intolerable risk levels. The most effective mitigation is to make the process more inherently safe. The next most effective mitigating measures are passive systems that do not require any external means of activation, followed by active systems. The least desirable option is to use administrative controls. The latter are prone to failure from either a breakdown of management systems or human error.
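The Table 9 preference order lends itself to a simple sort when a team has several candidate recommendations for one scenario. The sketch below is illustrative only: the function name `rank_measures` and the example measures are hypothetical, and only the ordering of the four strategy types comes from Table 9.

```python
# Strategy types from Table 9, most preferred first.
STRATEGY_PREFERENCE = ["inherent", "passive", "active", "procedural"]


def rank_measures(measures):
    """Sort (description, strategy) pairs from most to least preferred strategy.

    Python's sort is stable, so measures of the same strategy keep their input order.
    """
    return sorted(measures, key=lambda measure: STRATEGY_PREFERENCE.index(measure[1]))


if __name__ == "__main__":
    # Hypothetical candidate measures for a single scenario.
    candidates = [
        ("Add operator training on the startup procedure", "procedural"),
        ("Install a high-pressure shutdown interlock", "active"),
        ("Reduce the inventory of hazardous material", "inherent"),
        ("Rate the vessel for the maximum upset pressure", "passive"),
    ]
    for description, strategy in rank_measures(candidates):
        print(strategy, "-", description)
```

The ordering expresses preference, not feasibility. A lower-ranked measure may still be the practical choice when an inherent or passive option is not available for an existing process.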
The approach presented in Tables 5 and 6 and Figure 1 addresses the main issues associated with the development of risk matrices and simplifies the use of this tool without the need to establish corporate risk tolerability criteria.

- It allows different risks (personnel, public, environmental and business) to be identified and mitigated.
- It is simple to use and does not require expertise in quantitative risk assessment.
- It allows recommendations to be prioritized (from A to D) based on risk level.
- It allows all scenarios to be mitigated to a tolerable (C or D) risk level, and to show on the matrix how the risk was reduced (by reducing likelihood, consequence or both).
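Prioritizing recommendations from A to D and checking that each scenario ends at a tolerable level can be sketched as follows. The action text is taken from Table 7 and the tolerable levels (C and D) from the discussion above; the function names and the sample scenarios are assumptions for this example. The cell assignments of Figure 1 are not reproduced in the text, so the sketch takes risk levels as given inputs.

```python
# Action required for each risk level, as listed in Table 7.
ACTION_REQUIRED = {
    "A": "Risk mitigation required to risk level C or D",
    "B": "Risk mitigation required to risk level C or D",
    "C": "Risk mitigation to risk level D is optional",
    "D": "No further risk mitigation required",
}

# Risk levels the text treats as tolerable.
TOLERABLE_LEVELS = {"C", "D"}


def is_tolerable(risk_level):
    """Return True when the risk level needs no mandatory mitigation."""
    return risk_level in TOLERABLE_LEVELS


def prioritize(scenarios):
    """Order (name, risk_level) pairs from A (most urgent) to D."""
    return sorted(scenarios, key=lambda scenario: scenario[1])


def needs_mitigation(scenarios):
    """Names of scenarios at an intolerable level, most urgent first."""
    return [name for name, level in prioritize(scenarios) if not is_tolerable(level)]


if __name__ == "__main__":
    # Hypothetical scenarios with risk levels already read off the matrix.
    scenarios = [
        ("Pump seal leak", "C"),
        ("Reactor overpressure", "A"),
        ("Tank overfill", "B"),
    ]
    for name, level in prioritize(scenarios):
        print(level, name, "->", ACTION_REQUIRED[level])
    print(needs_mitigation(scenarios))
```

After mitigation is proposed, the same check can be rerun with the revised risk levels to confirm that every scenario has moved to C or D.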

The key to risk management is to identify risks that are intolerable and to mitigate them to a tolerable level. In a PHA study, teams can usually identify ways to reduce the risk of any scenario. The benefit of using a risk matrix is that it identifies those risks that need to be mitigated and therefore allows for more cost-effective risk mitigation. This is becoming increasingly important as companies have reduced their operating budgets and have limited resources to manage risk.

ABOUT THE AUTHOR

Mr. Ozog is a General Partner at iomosaic Corporation. Prior to joining iomosaic, Mr. Ozog was a consultant with Arthur D. Little, Inc. for twenty-one years, where he managed the process safety consulting business. He also worked for seven years at the DuPont Company as a process and startup engineer. Mr. Ozog is an expert in process safety and risk management, process hazard analysis (HAZOP, FMEA, FTA), and process safety auditing. He has helped numerous companies and governmental agencies identify process risks and implement cost-effective mitigation measures. He teaches courses in each of these areas and is also an instructor for the American Institute of Chemical Engineers' Educational Services. Mr. Ozog has a B.S. and M.S. in Chemical Engineering from the Massachusetts Institute of Technology. He is a member of the American Institute of Chemical Engineers and serves on various sub-committees for them.

iomosaic's Consulting Services:

- Auditing
- Calorimetry, Reactivity, and Large-Scale Testing
- Due Diligence Support
- Effluent Handling Design
- Facility Siting
- Fire and Explosion Dynamics
- Incident Investigation and Litigation Support
- Pipeline Safety
- Pressure Relief Design
- Process Engineering Design and Support
- Process Hazards Analysis
- Risk Management Program Development
- Quantitative Risk Assessments
- Software
- Structural Dynamics
- Training

At iomosaic, we are helping our clients discover practical and cost-effective solutions to safety, risk, and business challenges. iomosaic Corporation is a leading provider of safety and risk technology consulting services and software solutions.

WE'RE ON THE WEB: WWW.IOMOSAIC.COM

CONTACT US

iomosaic Salem, Corporate Headquarters
93 Stiles Road, Salem, NH 03079
Tel: 603-893-7009, Fax: 603-251-8384

iomosaic Houston
2401 Fountain View Drive, Suite 850, Houston, TX 77057
Tel: 713-490-5220, Fax: 832-533-7283

iomosaic Minneapolis
401 North 3rd Street, Suite 410, Minneapolis, MN 55401
Tel: 612-338-1669, Fax: 832-533-7283