Hospital Authority. Enhanced Contingency Planning for IT System Failures




For discussion on 27.5.2004
AOM-P325

Hospital Authority
Enhanced Contingency Planning for IT System Failures

Purpose

The purpose of this paper is to highlight enhancements to HAHO and IT Contingency Planning in response to the recent Sasser Worm infection of some personal computers across HA.

Background

2. Since the inception of HA, IT has been a major focus of development. With the growing use of IT in different areas of HA, clinical services now rely heavily on IT systems. Clinicians order over 95% of medications electronically. Because IT system breakdowns and failures pose a major risk, IT contingency plans and backup and recovery procedures have been developed in various service areas at both HAHO and hospital levels, to ensure that critical clinical services can still operate, probably in a degraded mode, and can be resumed after an IT system failure. These include the use of downtime, manual systems, standby and backup facilities. The contingency plans and procedures are reviewed from time to time by both the ITD and hospital management.

3. Apart from the traditional threats that may cause failures of IT systems (e.g. hardware, application software, network, power supply), new threats have been emerging in the IT arena. These include hackers, computer viruses and worms that originate outside HA, can affect the operation of IT systems very rapidly, and may cause major IT service interruptions across the whole of HA.

The Sasser Worm Incident

4. On Monday May 3, just before 11:00 am, the Sasser Worm was introduced into the HA network, most likely via a personal laptop connected to the network. The Sasser Worm attacked a known vulnerability in the Microsoft operating system. In the past, code (viruses and worms) exploiting known Microsoft vulnerabilities took about 6 weeks to be released by hackers, which gave organizations reasonable time to test in-house applications before applying the Microsoft-supplied software patches that protect against such attacks. In this case the worm appeared within 18 days of Microsoft announcing the vulnerability, with the result that a large number of organizations, including HA, had not updated their software in time. As is the case with all anti-virus tool vendors, signatures and removal tools can only be developed once the vendor has access to the actual virus/worm, and the time taken to release fixes varies greatly between vendors.

5. Even though HA's anti-virus software could protect against Sasser Variants A and B, it was Variant C that entered HA, and no anti-virus signature for it was yet available to HA. Our anti-virus software vendor supplied a beta version directly from its development laboratories at 13:30 the same day. This was subsequently rolled out across HA; the outbreak was contained, infected systems were cleaned and most services to end-users were restored later the same day.

Immediate Corrective Actions

6. In view of this increasing risk of a broader IT service outage affecting most clusters, the IT Division cancelled a Strategic Planning Session scheduled for Friday 7 May and convened an urgent meeting with key stakeholders to critically assess the potential impacts of these emerging threats, identify problems and focus on areas for improvement. Briefings were conducted to inform and update the Directors' Meeting, the Senior Executive Meeting and the Supporting Services Development Committee. The following immediate actions have been agreed and are being implemented:

Desktop PC Environment

7. On the technical side, it was decided to strengthen control of the desktop PC environment, particularly the timely application of the latest patches and updating of the latest anti-virus definition files. More automated software and tools will be employed to assist in applying software patches. Policies and guidelines will be reviewed to help project teams and hospital users take precautionary measures in the desktop PC environment to protect against computer virus attacks.

Aligning IT Contingency Planning with HA Contingency Planning

8. Historically in HA, IT outages were contained to the failure of a single application or of more localized technical infrastructure components, which limited the impact on service delivery to patients. As the Sasser incident demonstrated, some IT failures can and will affect the broader HA organization and may occasionally have a serious impact on service delivery to patients, as HA relies on IT for high-volume throughput efficiencies. It has now become necessary to align IT Contingency Planning with overall HA Contingency Planning and to start focusing more on business recovery and continuity.

9. A 3-tier problem definition framework has been developed to better define the impacts on critical clinical services caused by these major IT system failures. The corresponding responses and escalation procedures have also been developed for the different levels of severity. (Refer to Annex 1: Checklist for IT Disaster Control Team):

(i) Response Level 1: substantial disruption in clinical services due to an IT problem in one cluster, with the cluster IT contingency plan being activated.
(ii) Response Level 2: substantial disruption in clinical services due to failure of an IT system in more than one cluster.
(iii) Response Level 3: substantial disruption in clinical services due to failure of a mission-critical IT system in most clusters.

Problem Handling and Communication

IT Disaster Control Team

10. To improve efficiency in handling critical IT problems, it was decided to establish an IT Disaster Control Team, to be convened in case of major IT system failures, to centralize the coordination of problem assessment, monitoring of service impacts and external information dissemination. This team will be headed by the HAHO clinical service coordinators and supported by the IT Cluster Liaison Officers and technical staff. At Response Level 3, the highest level of emergency, the IT Disaster Control Team will be under the direct command of CE.

The Communication Plan

11. The communication channel under IT disasters is outlined in Annex 2, which defines the problem escalation procedures between ITD, clusters and senior management. The level of the IT disaster, the service impacts and the recovery time will be communicated to cluster coordinators, HAHO senior management and other relevant parties when necessary. (Refer to Annex 2 - The Communication Channel under IT Disasters)

12. In order for the IT Disaster Control Team to assess and monitor the service impacts, clusters are required to report on an hourly basis to the IT Disaster Control Team through a Situation Report.

Advice Sought

13. Members are invited to comment on the enhancements to HA/IT Contingency Planning outlined in this paper.

Hospital Authority
AOM\PAPER\325
27 May 2004

Annex 1 to AOM-P325

Checklist for IT Disaster Control Team

IT Contingency Communication & Coordination

Definition
  Response Level 1: Substantial disruption in clinical services due to IT problem in one cluster and cluster IT contingency plan is being activated
  Response Level 2: Substantial disruption in clinical services due to failure of IT system in more than one cluster
  Response Level 3: Substantial disruption in clinical services due to failure of mission critical IT system in most clusters

Disaster Control Team Members
  Response Level 1:
    - Cluster IT disaster control coordinator & team
    - Head Office IT Disaster Control Team (HOITDC)
  Response Level 2:
    + DD(IT)/D(F)
  Response Level 3:
    + CE, D(F), Central Command Committee (if major disaster)

Data Collection and Dissemination
  Response Level 1:
    - Establish channel of data communication & dissemination from hospital to HAHO
    - Situation report from affected cluster hourly
    - Feedback to cluster on time for recovery
    - Assess need to inform other clusters
  Response Level 2:
    + Activate HO IT Disaster Control Centre
    + Establish communication channel to other clusters
    + Alert non-affected clusters
  Response Level 3:
    + Activation of communication channel to all clusters for assessment of service disruption and for updating progress in all clusters (initial hourly communication)

IT Response
  Response Level 1:
    - Cluster IT Contingency Plan
    - HO support for:
        - damage control &
        - disaster recovery
        - activate system-specific contingency
    - Situation monitoring
  Response Level 2:
    + Focus recovery on prioritised clinical areas, e.g. SOPD, pharmacy, A&E, on mission critical systems
    + Estimated time for recovery
  Response Level 3: Same as 2

Service Reorganization
  Response Level 1: Consider reprioritisation of service. Direct patient service should continue as far as possible. Closure of service is not expected. If closure of service of a limited section of the hospital is unavoidable, the authority for decision making rests with CCE. CE should be informed.
  Response Level 2: Same as 1. If there is closure of services involving more than one cluster, or the closure involves major service implications, CE's endorsement should be sought.
  Response Level 3: Same as 2

Clinical Management
  Response Level 1: According to contingency plan
  Response Level 2: Same as 1
  Response Level 3: Same as 2

HR
  Response Level 1: Activate deployment at hospital level if required
  Response Level 2: Activate deployment at cluster level if required
  Response Level 3: Activate deployment at HA level if required

Communication - internal
  Response Level 1:
    - Notify D(F)
    - Alert & notification to Head Office Duty Officer
    - Communication to other clusters if needed
  Response Level 2:
    + Notify CE
    + Notify cluster IT coordinators
        - clinical informatics
        - hospital admin
        - cluster/hospital IT
  Response Level 3:
    + Notify Directors/CCE, Chief Pharmacy Office, Finance

Communication - external
  Response Level 1: Alert & notify Public Affairs Section
  Response Level 2: Situation report to Public Affairs Section
  Response Level 3:
    - Activate notification network as per major disaster
    - Situation report to Public Affairs Section

Liaise with other Government Departments
  Response Level 1: As appropriate for incident at local level
  Response Level 2: As appropriate for incident at central level
  Response Level 3: As appropriate for incident at higher level

Governance
  Response Level 1: As in routine
  Response Level 2: Subsequent reporting
  Response Level 3: CE to inform Chairman and Board with updating. Hospitals to inform Hospital Governing Committee.
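The response level definitions in the checklist amount to a simple classification rule, and the "+" entries for team membership are cumulative. A minimal sketch of that logic follows, for illustration only. The function and variable names are hypothetical, the total number of clusters is supplied by the caller, and the reading of "most clusters" as "more than half" is an assumption, since the paper does not give a threshold.

```python
from enum import IntEnum


class ResponseLevel(IntEnum):
    NONE = 0      # no substantial disruption reported
    LEVEL_1 = 1   # IT problem in one cluster
    LEVEL_2 = 2   # IT system failure in more than one cluster
    LEVEL_3 = 3   # mission-critical IT system failure in most clusters


# Team members added at each level, transcribed from the checklist.
# Each level adds to the levels below it.
TEAM_ADDITIONS = {
    ResponseLevel.LEVEL_1: [
        "Cluster IT disaster control coordinator & team",
        "Head Office IT Disaster Control Team (HOITDC)",
    ],
    ResponseLevel.LEVEL_2: ["DD(IT)/D(F)"],
    ResponseLevel.LEVEL_3: [
        "CE",
        "D(F)",
        "Central Command Committee (if major disaster)",
    ],
}


def classify_response_level(affected_clusters, total_clusters, mission_critical):
    """Map a reported disruption to a response level.

    Assumption (not stated in the paper): "most clusters" is read as
    strictly more than half of all clusters.
    """
    if total_clusters <= 0 or not 0 <= affected_clusters <= total_clusters:
        raise ValueError("affected_clusters must be between 0 and total_clusters")
    if affected_clusters == 0:
        return ResponseLevel.NONE
    if mission_critical and affected_clusters * 2 > total_clusters:
        return ResponseLevel.LEVEL_3
    if affected_clusters > 1:
        return ResponseLevel.LEVEL_2
    return ResponseLevel.LEVEL_1


def team_members(level):
    """Return the cumulative Disaster Control Team membership for a level."""
    members = []
    for lvl, additions in TEAM_ADDITIONS.items():
        if lvl <= level:
            members.extend(additions)
    return members


if __name__ == "__main__":
    # Hypothetical example: 4 of 7 clusters lose a mission-critical system.
    level = classify_response_level(4, 7, mission_critical=True)
    print(level.name, team_members(level))
```

In practice the decision to escalate rests with the officers named in the checklist; the sketch only shows how the three definitions relate to one another.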

Annex 2 to AOM-P325

The Communication Channel under IT Disasters

ITD Problem Escalation Procedure
Cluster IT Contingency Plan Activation

COMMUNICATION CHANNEL BEYOND ITD

Health Informatics (HI) Team Disaster Coordination Officer (Dr KH Lee/Dr James Kong) (via IT disaster hotline)

Situation Level 1 (discussed with DD(IT); inform D(F))
  - Situation monitoring
  - Affected cluster to send report on the hour (00 to 15 min) until stand down
  - Assess need to inform other clusters
  - Inform Head Office Duty Officer, Public Affairs, Chief Pharmacy Office

Level 2 (discuss with D(F) and DD(IT))
  - Affected cluster to send report on the hour (00 to 15 min) until stand down
  - Activate IT control
  - Inform other clusters
  - Inform Head Office Duty Officer, Public Affairs, Chief Pharmacy Office, Finance, Human Resources, Finance Management

Level 3 (HI/D(F) to check with CE)
  - Activate IT control centre
  - All clusters to send report on the hour (00 to 15 min) until stand down
  - Consider activating Central Command Committee
  - Declare to all hospital IT Co-ordinators
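The hourly reporting rule above ("on the hour (00 to 15 min) until stand down") can be checked mechanically. The sketch below is illustrative only. It assumes the rule means a report is expected within the first 15 minutes of each hour, and all function names are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: "00 to 15 min" means the first 15 minutes of each hour, inclusive.
REPORT_WINDOW = timedelta(minutes=15)


def hour_start(ts):
    """Return the top of the hour containing ts."""
    return ts.replace(minute=0, second=0, microsecond=0)


def in_report_window(ts):
    """True if ts falls within the first 15 minutes of its hour."""
    return ts - hour_start(ts) <= REPORT_WINDOW


def next_window_start(ts):
    """Start of the current window if ts is inside it, else the next hour."""
    start = hour_start(ts)
    return start if in_report_window(ts) else start + timedelta(hours=1)


if __name__ == "__main__":
    received = datetime(2004, 5, 3, 11, 20)
    print(in_report_window(received))   # outside the 11:00 window
    print(next_window_start(received))  # next report is due from 12:00
```

A monitoring desk could use a check of this kind to flag clusters whose Situation Report has not arrived within the window.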