ICT Disaster Recovery Plan




Dronfield Henry Fanshawe School
Policy No: S33
ICT Disaster Recovery Plan
Revision No: 1
Date Issued: September 2012
Committee: Author: Statutory RDD
Date Adopted: September 2012
Minute No: 12/20
Review Date: September 2015

The following situations would affect the functionality of the Dronfield Henry Fanshawe School network system. Each is listed with its proposed solution. Please note that backups are taken of all critical servers and data.

Whole School Power Failure
Chance of failure: <0.5%
Effect: Critical

Should the school suffer a complete power failure, all the computers would shut down. The servers have uninterruptible power supplies (UPSs) and would shut themselves down automatically after 15 minutes. The UPSs would shut the servers down safely, resulting in no loss of server data. All data on the desktops which is not already saved to the servers would be lost. A whole school power failure would result in the closure of the school until the situation was resolved. Once the power was restored, all the servers would need to be restarted manually.

Solution: There is no effective workaround to this problem. The school would need to liaise with the utility power supplier to determine the seriousness of the problem and the downtime.

Downtime: Unknown

Partial Power Failure (A-Block): Critical
Chance of failure: <1%
Effect: Critical

A failure of the power supply to A-Block would result in the loss of the Server Room and all the main ICT suites.

Solution: There is no effective workaround to this problem. The school would need to liaise with the utility power supplier and the school electrician to determine the seriousness of the problem and the downtime.

Downtime: Unknown

Partial Power Failure (Server Room): Serious
Effect: Low

A failure of power to the Server Room would result in the failure of the whole network due to the loss of the core switches.

Solution: The server room is powered from 2 different power distribution boards. Should the server room board fail, equipment can be moved to the secondary board.

Downtime: Max 2 hours

Failure of Core Switches
Chance of failure: <0.5%
Effect: None

Each of the core switches has several redundancy features built in, but it is still possible for them to fail totally.

Solution: Each of the core switches has redundancy built in to the switch, and there are two switches per function.

Failure of Room Switches
Chance of failure: <1%
Effect: Low

Failure of room-based switches would cause limited data loss.

Solution: A room-based switch is always held as a spare and used if required.

Downtime: 30 minutes

Failure of Blade Enclosure, Virtual Connects and Power Supplies
Chance of failure: <0.5%
Effect: Critical

Total failure of the Blade Enclosure is very unlikely, as the parts within it are non-moving and have multiple redundancy, including 6 power supplies and 2 network controllers. However, if the enclosure did fail, every server in the enclosure would fail and access to critical data would be restricted.

Solution: The Blade Enclosure and its internal components are covered by a 5 year next working day warranty.

Downtime: Access to admin and curriculum data could be restored within 24 hours via the backup server located in F-Block. The main service which would not be available would be the 200 thin clients located around school; they would be offline until the Blade Enclosure was restored (2 days).

Failure of Individual Blade Host Server (VMware or Citrix)
Chance of failure: <1%
Effect: None

The servers are configured in an n+1 configuration, meaning that there is always 1 extra server available should one fail.

Solution: Failure of a host will result in the resources on that server being shared among the remaining servers.

Downtime: None

Failure of SAN1 (VMware)
Chance of failure: <0.1%
Effect: Low

SAN1 is a storage area network device with multiple redundant parts, including redundant hard disks, power supplies and network connections. A complete failure of the SAN would result in some servers being unavailable, depending on which servers were hosted on that SAN.

Solution: Failure is unlikely, and individual failure of redundant parts would have no effect. In the very unlikely event that the SAN enclosure failed, some data would migrate automatically to SAN2; otherwise data would need to be restored from backup. The SAN enclosure is on a next day warranty and could be replaced.

Downtime: 48 hours

Failure of SAN2 (VMware)
Chance of failure: <0.1%
Effect: Low

SAN2 is a storage area network device with multiple redundant parts, including redundant hard disks, power supplies and network connections. A complete failure of the SAN would result in some servers being unavailable, depending on which servers were hosted on that SAN.

Solution: Failure is unlikely, and individual failure of redundant parts would have no effect. In the very unlikely event that the SAN enclosure failed, some data would migrate automatically to SAN1; otherwise data would need to be restored from backup. The SAN enclosure is on a next day warranty and could be replaced.

Downtime: 48 hours

Failure of SAN3 (Citrix)
Chance of failure: <0.1%
Effect: Low

SAN3 is a storage area network device with multiple redundant parts, including redundant hard disks, power supplies and network connections. A complete failure of the SAN would result in the loss of the Citrix environment and the virtual desktops feeding the thin clients (10ZiGs).

Solution: Failure is unlikely, and individual failure of redundant parts would have no effect. In the very unlikely event that the SAN enclosure failed, we would be reliant on a replacement enclosure being supplied by HP. The SAN enclosure is on a next day warranty and could be replaced.

Downtime: 48 hours

Failure of Master Domain Controller (DHFS-V-DC01)
Effect: Low

The failure of the master domain controller would limit the issuing of DHCP IP addresses. Other domain controllers would take over other functionality such as DNS.

Solution: DHCP would be installed on another domain controller. Fix the master domain controller as soon as possible.

Downtime: 1 hour

Failure of Domain Controllers (DHFS-V-DC02, DHFS-V-DC03 and Backup)
Effect: None

The failure of a domain controller would have limited effect on functionality, as the other DCs would take over its functions. We have enough domain controllers to continue working.

Solution: Fix the failed domain controller as soon as possible.

Failure of Admin Server
Effect: Medium

The failure of the Admin server would cause loss of access to staff and admin based data.

Solution: Give temporary access to the backup server until the Admin server is fixed. Fix the Admin server as soon as possible and restore data from the backup server to the Admin server.

Downtime: 2 to 3 hours

Failure of Storage Server
Effect: Medium

The failure of the Storage server would cause loss of access to student and curriculum based data.

Solution: Give temporary access to the backup server until the Storage server is fixed. Fix the Storage server as soon as possible and restore data from the backup server to the Storage server.

Downtime: 2 to 3 hours

Failure of SQL Servers (SQLServer and SQLServer2)
Chance of failure: <10%
Effect: Low

Loss of MIS system and eportal.

Solution: There are 2 eportal servers, and any failure would have limited effect as the other server could be used. The failed server would then be fixed as soon as possible.

Failure of SQLMain
Effect: High

Loss of SQL database functionality, including access to MIS, Exchequer and Opera.

Solution: Install SQL on another VM and restore the data for MIS, Exchequer and Opera.

Downtime: 1 day

Failure of Web Server
Effect: Medium

Loss of school website and VLE.

Solution: Create another VM and restore data from backup.

Downtime: 1-2 days

Failure of Anti-Virus Server (DHFS-V-CTRL01)

Effect: None

The failure of the Anti-Virus server would result in the AV not being updated on the servers and machines in school. The machines would work with their current updates. The impact of missing AV updates for 1 day would be negligible.

Solution: Recreate the VM and install Sophos.

Failure of VM Management Server (DHFS-V-MGMT01)
Effect: None

This would result in loss of control of the 10ZiG thin clients.

Solution: We would create a new VM and reinstall the 10ZiG management software.

Failure of VMware vCenter Server (DHFS-V-VC01)
Effect: Low (High)

This would result in servers not being able to vMotion to another host. Failure of this server alone would not result in any downtime, but a subsequent loss of a host would mean that the virtual servers on that host would not remain up.

Solution: Restore vCenter to another host (migrate servers to remaining hosts).

Downtime: 0.5 days

Failure of Phone System Server (DHFS-V-Unify)
Effect: High

This would result in a loss of the internal phone system.

Solution: Whilst the server is recovered, any calls to the school would be redirected to a mobile phone in reception or to a personal mobile associated with a DDI.

Downtime: 1 day

Failure of Backup Server
Effect: Low

The failure of the Backup server would leave us unable to back up or recover data.

Solution: Transfer operation of the Backup server temporarily to another machine located in F-Block. Fix the Backup server as soon as possible.

Failure of Proxy Server
Effect: Medium

Should the Proxy server fail, external access to the website, intranet and VLE would be lost.

Solution: Temporarily install all data on another server. Fix the Proxy server as soon as possible.

Downtime: 1 day

Failure of Exchange
Effect: High

Loss of email access.

Solution: Either fix the Exchange server or copy its functionality to another server. Restore backups if necessary.

Downtime: 1-2 days

Failure of Appserv1
Effect: Low

Loss of access to student controlled assessment work and Impero.

Solution: Fix the server and restore backups if necessary.

Downtime: 1 day

Failure of Citrix Environment Servers
Effect: Medium

Servers:
DHFS-V-CSG01
DHFS-V-CTX01
DHFS-V-CTXDS01
DHFS-V-DDC01
DHFS-V-DDC02
DHFS-V-FS01
DHFS-V-WI01
DHFS-V-WI02
NFS_03

The above servers control the Citrix environment. Although these servers do have some redundancy, there are 2 servers whose failure would cause a total loss of access to virtual desktops. These are DHFS-V-CSG01 and NFS_03.

Solution: Each of the servers can be reinstalled or restored from backup. NFS_03 holds the base images; this server would need to be recovered and the images restored.

Downtime: 1-2 days

Failure of Door Control Server (DHFS-V-DoorCtrl)
Effect: Low

The failure of the door control system would result in the door controllers failing to receive updates regarding user identities. The doors would continue to open based on their current data.

Solution: The doors would continue to work, or would be defaulted to open whilst school was in operation. Restore the server as quickly as possible.

Downtime: None
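The single points of failure in the Citrix environment described above can be recorded in a small inventory so that they are easy to check when servers are added or retired. The following is a minimal sketch in Python, not part of the school's tooling. The plan names DHFS-V-CSG01 and NFS_03 as the two servers without a fallback; the "redundant" flags for the other servers are an assumption drawn from the statement that the servers "do have some redundancy", and the function name is hypothetical.

```python
from typing import Dict, List

# Inventory of the Citrix environment servers listed in the plan.
# True means "another server can take over" (an assumption for every
# entry except the two servers the plan names as having no fallback).
CITRIX_SERVERS: Dict[str, bool] = {
    "DHFS-V-CSG01": False,   # named in the plan as a single point of failure
    "DHFS-V-CTX01": True,
    "DHFS-V-CTXDS01": True,
    "DHFS-V-DDC01": True,
    "DHFS-V-DDC02": True,
    "DHFS-V-FS01": True,
    "DHFS-V-WI01": True,
    "DHFS-V-WI02": True,
    "NFS_03": False,         # holds the base images; no fallback per the plan
}


def single_points_of_failure(servers: Dict[str, bool]) -> List[str]:
    """Return the names of servers with no redundant partner, sorted."""
    return sorted(name for name, redundant in servers.items() if not redundant)


if __name__ == "__main__":
    for name in single_points_of_failure(CITRIX_SERVERS):
        print(f"No fallback: {name}")
```

Keeping the flags in one place means the list of servers whose loss would take down all virtual desktops can be regenerated rather than maintained by hand.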

Failure of Print Control Server (DHFS-V-PCUT)
Effect: Low

The print control server looks after all network printing in school. Its failure would result in loss of network printing.

Solution: Install PaperCut on another virtual server and restore the PaperCut settings.

Downtime: 0.5 to 1 day

Failure of Lightspeed Web Filter
Chance of failure: <1%
Effect: High

Loss of secure internet. Internet access could be maintained but it would not be properly filtered.

Solution: This is proprietary equipment and is under a 3 year next day replacement warranty. The school could run unfiltered but this would not be advisable.

Downtime: 1-2 days

Failure of Router
Chance of failure: <0.5%
Effect: Low

Loss of internet and email access.

Solution: The main router is the property of KCOM and as such we have no control over repair times. We do have a backup ADSL line which could be used for access. It sits behind the Lightspeed web filtering server, so secure internet access could be maintained, but the service would be degraded.

Fire in A-Block
Chance of failure: <0.1%
Effect: Critical

A fire in A-Block would be the most serious situation for the computer network, as this is the location of the Server Room containing all the servers and core switches, and of most of the major ICT suites.

Solution: Although most of the equipment could be bought reasonably easily off the shelf, the main problem would be redirecting fibre optic cabling. We would be able to get limited access to other buildings within 2-3 days, but full access would take 7-14 days. Restoring the server room and ICT suites would depend on the level of destruction. The main delay would be restoring the infrastructure of the building.

Downtime: Unknown

Fire in Other Blocks (Non A-Block): Serious
Chance of failure: <0.1%
Effect: High

A fire in a building other than A-Block would require restoration of the infrastructure.

Solution: Restore the building infrastructure as soon as possible. ICT equipment could be bought within 7 days. Cabling would take longer.

Downtime: Unknown
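The chance and effect ratings given for each scenario can be collected into a simple risk register for review. The sketch below is illustrative only. It uses a handful of the scenarios and ratings stated in this plan, while the numeric effect weights and the ordering rule (effect first, then the upper bound on chance of failure) are assumptions rather than part of the policy.

```python
from dataclasses import dataclass
from typing import List

# Assumed numeric weights for the plan's effect labels (not from the policy).
EFFECT_WEIGHT = {"None": 0, "Low": 1, "Medium": 2, "High": 3, "Critical": 4}


@dataclass(frozen=True)
class Scenario:
    name: str
    max_chance_pct: float  # upper bound stated in the plan, e.g. "<0.5%" -> 0.5
    effect: str            # effect label as written in the plan


# A subset of the scenarios, with ratings copied from the plan.
SCENARIOS: List[Scenario] = [
    Scenario("Failure of Router", 0.5, "Low"),
    Scenario("Whole School Power Failure", 0.5, "Critical"),
    Scenario("Failure of Lightspeed Web Filter", 1.0, "High"),
    Scenario("Failure of Blade Enclosure", 0.5, "Critical"),
    Scenario("Failure of SQL Servers", 10.0, "Low"),
    Scenario("Fire in A-Block", 0.1, "Critical"),
    Scenario("Failure of Room Switches", 1.0, "Low"),
]


def rank(scenarios: List[Scenario]) -> List[Scenario]:
    """Order scenarios by effect (most severe first), then by chance."""
    return sorted(
        scenarios,
        key=lambda s: (-EFFECT_WEIGHT[s.effect], -s.max_chance_pct),
    )


if __name__ == "__main__":
    for s in rank(SCENARIOS):
        print(f"{s.effect:<8} <{s.max_chance_pct}%  {s.name}")
```

Sorting on effect before likelihood keeps the rare but school-closing events at the top of the list. A different weighting, such as multiplying chance by effect, would be an equally reasonable choice.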