Council Meeting - 14 May 2013
ICT systems - Back-up, business continuity and disaster recovery proposals
1. Purpose of report
To seek approval for improvements to the Council's ICT back-up arrangements and the provision of a full business continuity and disaster recovery solution.
2. Key issues
2.1 The Council currently backs up its Westport House (WPH) ICT systems to magnetic tape cassettes, which are stored offsite at the Purbeck Sports Centre (PSC). Conversely, the PSC backup tapes are stored at WPH. The current regime is inadequate because not all the data can be backed up each night. Other risks associated with this arrangement are that tapes could be lost or damaged in transit, and that the process is labour intensive. Furthermore, should there be an incident involving damage to hardware or loss of one of the buildings, it would be necessary to purchase and configure recovery equipment before backed up systems could be restored. This represents a significant extra risk to business continuity and the Council's disaster recovery capability.
2.2 Management Team has undertaken a review of the situation with the IT Manager and has concluded that continuing with the current inadequate arrangements is not advisable. As a minimum, steps need to be taken to enable WPH systems to be backed up online to PSC, and the PSC systems should be relocated to WPH. Taking these steps would also provide an opportunity to equip PSC for possible use as a restoration and relocation site should an incident occur that results in Westport House being unavailable for use.
2.3 However, the range of options available to the Council (see section 5, Further information, below for details) includes:
- Relocate PSC systems and data to WPH and back up online to PSC using a Network Attached Storage (NAS) device;
- Relocate PSC systems and data to WPH and back up online to PSC using a Storage Area Network (SAN) device;
- Relocate PSC systems and data to WPH and implement a full business continuity and disaster recovery standby facility at PSC; and
- Commission the private sector or another council to receive backup data and provide business continuity arrangements.
2.4 Policy Group considered this matter on 17 April 2013 and endorsed the following recommendation.
3. Recommendation
Council gives approval for:
(1) the relocation of PSC systems and data to WPH with a full business continuity and disaster recovery standby facility at PSC;
(2) expenditure amounting to £44,100 in respect of the estimated capital set-up costs and £3,800, rising to £7,500 from 2016/17, in respect of the estimated ongoing revenue cost.
4. Policy issues
4.1 How will this affect the environment, social issues and the local economy?
The Council depends on its IT systems and up-to-date data to provide services to local people, businesses and visitors. The proposed improvements to the back-up arrangements and the provision of a full business continuity and disaster recovery solution would reduce the potential disruption to services were there to be an event that resulted in damage to the ICT server room or WPH being unavailable.
4.2 Implications
4.2.1 Resources
Under the current arrangements, IT staff manually change over tapes and convey them between sites, which typically takes up to 45 minutes per day, time which could otherwise be spent resolving IT issues for staff and Councillors. The staff time that could be redeployed to other vital work is estimated to cost £7,300 per annum.
Continuing with the current inadequate arrangements would necessitate the replacement of the existing tape library device in the near future (see paragraph 5.8 below). The device would not be replaced if online back-ups are implemented. There would also be no need to purchase backup tapes and cleaning cassettes, which cost the Council up to £200 per annum.
As explained in paragraphs 5.4 and 5.5 below, a new communications link between WPH and PSC is also required irrespective of any changes to the back-up arrangements. The expenditure shown in the following tables would therefore be incurred whether or not the proposed online back-up and business continuity and disaster recovery standby facility at PSC are implemented.

Capital item                                                      Cost (£)
Replacement tape library                                             4,000
DPSN connection setup cost                                           4,600
Excess construction charge (telegraph pole carry) (DPSN)             6,000
Consultancy (DPSN)                                                   1,300
Total                                                               15,900

Revenue item                                          2013/14-2015/16  2016/17-2017/18
Support & maintenance (tape library)                                0              400
100Mb fibre DPSN circuit to PSC: circuit charge                 3,000            3,000
90Mb core charge: DPSN circuit to WPH                             800              800
Total                                                           3,800            4,200
Note: the first 3 years of tape library support and maintenance are included in the purchase price.

The unapproved capital programme for 2013/14 includes a provision for ICT business continuity and disaster recovery infrastructure improvements totalling £55,000. The estimated incremental capital and revenue costs for each of the options given in paragraph 2.3 are as follows:
4.2.1.1 Relocate PSC systems and data to WPH and back up online to PSC using a Network Attached Storage (NAS) device.
No replacement tape library would be required, so in addition to the estimated DPSN costs set out above at 4.2.1, the extra costs for implementing this basic option would be:

Capital item (extra to 'continue as now' option)                  Cost (£)
Network Attached Storage (NAS) device                                6,000
Consultancy                                                          1,300
Total                                                                7,300

Revenue item (extra to 'continue as now' option)      2013/14-2015/16  2016/17-2017/18
Support & maintenance (NAS)                                         0              600
Total                                                               0              600
Note: the first 3 years of NAS support and maintenance are included in the purchase price.

4.2.1.2 Relocate PSC systems and data to WPH and back up online to PSC using a Storage Area Network (SAN) device.
In addition to the estimates above at 4.2.1.1, the extra costs for implementing this higher specification option would be:

Capital item (extra to NAS option)                                Cost (£)
Storage Area Network (SAN) device                                    9,000
SAN Fibre Channel switch                                             3,500
Consultancy                                                            900
Total                                                               13,400

Revenue item (extra to NAS option)                    2013/14-2015/16  2016/17-2017/18
Support & maintenance (SAN and switch)                              0            1,500
Total                                                               0            1,500
Note: the first 3 years of SAN and switch support and maintenance are included in the purchase price.

4.2.1.3 Relocate PSC systems and data to WPH and implement a full business continuity and disaster recovery standby facility at PSC.
In addition to the estimates above at 4.2.1.2, the extra costs for implementing this full recovery option would be:

Capital item (extra to SAN option)                                Cost (£)
Structured cabling changes for recovery centre and internal PSC      4,000
ESX virtual server host equipment at PSC                             8,000
Consultancy (including project scoping)                              3,500
Total                                                               15,500

Revenue item (extra to SAN option)                    2013/14-2015/16  2016/17-2017/18
Support & maintenance (ESX server host)                             0            1,600
Total                                                               0            1,600
Note: the first 3 years of ESX server host support and maintenance are included in the purchase price.

The total capital cost for implementing this option, which is the preferred option, is estimated to be £28,200, with an additional revenue cost of up to £3,300 from 2016/17, on top of the expenditure that would be incurred if the Council were to continue with the existing back-up arrangements (which is not recommended).
4.2.1.4 Relocate PSC systems and data to WPH and commission the private sector or another council to provide business continuity and disaster recovery facilities.
Indicative figures have been obtained for off-site server hosting and disaster recovery / business continuity provision, which show estimated one-off set-up costs of £50,000 with running costs of £16,000 for a three year contract. It is likely the actual cost would be lower than this following a competitive tendering exercise.
4.2.2 Equalities
There are no equality issues arising from this report.
5. Further information
5.1 The Council is required to put in place business continuity management arrangements to continue to provide services to the public.
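As an illustrative cross-check, the headline figures in section 3 can be reconciled against the tables and totals in section 4.2.1. The sketch below is a minimal example only: the function and variable names are hypothetical, the amounts are copied from the report, and the treatment of the tape library support charge (dropped under the preferred option because no replacement tape library would be bought) is an assumption drawn from paragraph 4.2.1 rather than a stated calculation.

```python
# Illustrative cross-check of the cost figures quoted in sections 3 and 4.2.1.
# All names here are hypothetical; only the amounts come from the report.

# One-off capital costs incurred if the current tape regime continues.
BASELINE_CAPITAL = {
    "replacement tape library": 4_000,
    "DPSN connection setup": 4_600,
    "excess construction charge (DPSN)": 6_000,
    "consultancy (DPSN)": 1_300,
}

# Annual revenue costs as (2013/14-2015/16, 2016/17-2017/18).
BASELINE_REVENUE = {
    "tape library support": (0, 400),
    "DPSN circuit to PSC": (3_000, 3_000),
    "DPSN core charge (WPH)": (800, 800),
}

# Support charges added by each step of the preferred option from 2016/17.
EXTRA_SUPPORT_FROM_2016_17 = {"NAS": 600, "SAN and switch": 1_500, "ESX host": 1_600}

# Incremental capital for the preferred option, as stated in paragraph 4.2.1.3.
PREFERRED_EXTRA_CAPITAL = 28_200


def baseline_capital_total() -> int:
    """Capital cost of continuing as now."""
    return sum(BASELINE_CAPITAL.values())


def baseline_revenue_totals() -> tuple[int, int]:
    """Annual revenue cost of continuing as now, for each period."""
    early = sum(first for first, _ in BASELINE_REVENUE.values())
    late = sum(second for _, second in BASELINE_REVENUE.values())
    return early, late


def preferred_capital_total() -> int:
    """Baseline capital plus the stated increment for the preferred option."""
    return baseline_capital_total() + PREFERRED_EXTRA_CAPITAL


def preferred_revenue_from_2016_17() -> int:
    """Annual revenue cost of the preferred option from 2016/17.

    Assumption: the tape library support charge drops out because
    no replacement tape library is purchased under this option.
    """
    dpsn_only = sum(
        late
        for name, (_, late) in BASELINE_REVENUE.items()
        if name != "tape library support"
    )
    return dpsn_only + sum(EXTRA_SUPPORT_FROM_2016_17.values())


if __name__ == "__main__":
    print("Baseline capital:", baseline_capital_total())
    print("Baseline revenue (early, late):", baseline_revenue_totals())
    print("Preferred capital:", preferred_capital_total())
    print("Preferred revenue from 2016/17:", preferred_revenue_from_2016_17())
```

Under these assumptions the baseline capital comes to 15,900, the preferred option's capital to 44,100, and its revenue cost from 2016/17 to 7,500, which is 3,300 more than the 4,200 payable under the existing arrangements. These match the amounts quoted in the recommendation and in paragraph 4.2.1.3.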
5.2 Although data is backed up to magnetic tape, there are risks associated with these arrangements: for example, the previous day's backups could be incomplete, and tapes from WPH may not be delivered to PSC (and vice versa). Furthermore, neither WPH nor PSC is currently equipped for use as a standby site if the other were to be inaccessible to staff or so badly damaged that it could not be used. Recovery equipment would have to be purchased at the time of an incident, and its setup and the restoration of the backed up data would significantly increase the recovery time for the various Council systems.
5.3 As indicated above at paragraph 2.3, the options available to the Council include:
5.3.1 Relocate PSC systems and data to WPH and back up online to PSC using a Network Attached Storage (NAS) device.
This option would allow the Council to relocate PSC systems to the WPH datacentre and deliver business applications to PSC staff in a more efficient and resilient manner. The optic fibre connection required to do this would also facilitate the backing up of systems and data directly from WPH to the PSC outside normal business hours. The NAS option would provide the minimum specification and functionality for online storage. However, it would not contribute to the Council's business continuity arrangements. Recovery equipment would still need to be purchased and configured, and backed up data restored to it, should an emergency or business continuity event occur.
5.3.2 Relocate PSC systems and data to WPH and back up online to PSC using a Storage Area Network (SAN) device.
This proposal would improve on the NAS option (5.3.1) by providing a storage infrastructure equivalent to the Council's current production environment, which supports over 40 virtualised servers and associated data. This environment could provide the basis of a platform to support a replication/recovery centre for the Council's WPH offices and datacentre.
Recovery equipment would still need to be purchased and configured, and backed up data restored to it, should an emergency or business continuity event occur.
5.3.3 Relocate PSC systems and data to WPH and implement a full business continuity and disaster recovery standby facility at PSC.
This would be a further improvement and requires a SAN device with associated infrastructure (as described in option 5.3.2 above), replication arrangements and a recovery server. This option has the fullest functionality and would enable the Council to fail over to a remote site (PSC) in the event of a business continuity or disaster recovery incident and continue to provide services. It would allow for testing to ensure the solution would work correctly in the event of an emergency or other business continuity event.
5.3.4 Relocate PSC systems and data to WPH and commission the private sector or another council to provide full business continuity and disaster recovery facilities.
Management Team also considered the possibility of backing up online to another council or the private sector with standby facilities. However, this was discounted as it was felt that reliance on external service providers for critical services involves higher risk and is more expensive than the in-house solution.
5.4 The Council's main computer network at WPH currently connects to the PSC through an encrypted Virtual Private Network (VPN) connection carried over the Council's and PSC's respective Internet connections. This connection is subject to external contention by other (public) internet users and often results in less than satisfactory performance for PSC staff accessing resources at the WPH datacentre. A new robust communications link between PSC and WPH is required to address the current shortcomings. This would also enable the Council to implement the proposed improvements to the back-up arrangements and the business continuity and disaster recovery solution.
5.5 The Council currently pays for a 100Mb circuit into the Dorset Public Services Network (DPSN) from WPH. 10Mb of this is used for the shared service partnerships (Revenues & Benefits, HR Payroll and the Dorset Waste Partnership). The remaining 90Mb capacity is currently unused and could be used to connect WPH to the PSC. The estimated cost of using this remaining capacity is shown in the revenue tables above.
5.6 Network Attached Storage (NAS) devices are mainly used as dedicated devices for file-oriented data. NAS is a lower cost storage option which is ideal for static or non-transactional data (i.e. appropriate for file storage but not as good for virtual servers, as used by the Council).
5.7 Storage Area Network (SAN) devices mainly provide high performance shared disk space.
SAN devices are used to provide production environments for virtual servers, transactional databases and fast changing data. SAN devices cost more than most NAS devices, being manufactured for prolonged higher performance loading and faster data input / output.
5.8 The existing Quantum tape library device, used to produce the back-up tapes, is nearing the end of its life (5 years is the accepted industry lifecycle) and a replacement would be required in the near future at an estimated one-off cost of £4,000. The large number of moving parts, physical tape media and daily use put the device at a higher risk of failure than other hardware devices.
5.9 The current PSC lease has approximately 6 years left to run, but it is likely that the Council will seek a new lease on expiry of the existing lease. The possible use of PSC as a standby restoration and relocation site is therefore seen as a long term investment.
5.10 Any structured network cabling work required at the PSC could be carried out as a variation to the contract for the dry-side improvements scheduled for summer 2013.
5.11 It is notable that there have not been any occasions when WPH has been unavailable since it was built in 1982. However, reliance on in-house ICT systems was not so critical in the past, when many services were manual, paper-based or hosted by the County Council. As such, the risks to the Council are now much greater were there to be an event that resulted in damage to the ICT server rooms or WPH being unavailable.
Background papers:
Report to Management Team on 6 March 2013
For further information contact:
Paul Gammon, IT Manager
Phil McStraw, General Manager Central Services