
PSCIOC Glossary for Pan-Canadian Approach on Sharing of Data Centres Survey, Version 1.0

This glossary was developed to define the terms used in the survey for the Pan-Canadian Approach on Sharing of Data Centres, with a view to establishing a common understanding of terms for participating PSCIOC members and stakeholders.

Table 1. Data Centre (General)

Business Continuity Planning (BCP): A collection of procedures and information which is developed, compiled and maintained in readiness for use in the event of an emergency or disaster, for the purpose of continuing business processes until the emergency is resolved. (i)

Clustered System: An architecture that ties together uniprocessor, symmetric multiprocessing (SMP) and/or massively parallel processing (MPP) systems, with all nodes sharing access to disks. Also called a shared-disk system. (ii)

Data Centre: The data centre is the department in an enterprise that houses and maintains back-end information technology (IT) systems and data stores: its mainframes, servers and databases. In the days of large, centralized IT operations, this department and all the systems resided in one physical place, hence the name data centre. (iii)

Data Centre Infrastructure Management (DCIM): DCIM software tools monitor, measure, manage and/or control data centre performance, utilization and energy consumption. Ideally, DCIM provides this functionality for all IT-related equipment (such as servers, storage and network switches) and infrastructure components (such as PDUs and CRACs). DCIM enables IT professionals to understand energy consumption from a holistic data centre perspective down to individual components, such as servers (and, in some cases, even semiconductor chips). (iv)

Data Centre Module: A pre-engineered, standardized building block that can be configured as the user wishes, to facilitate standardization in a changeable environment. (v)

Disaster Recovery Planning (DRP): Methods and procedures for returning a data centre to full operation after a catastrophic interruption (e.g., including recovery of lost data), and the use of alternative network circuits to re-establish communications channels in the event that the primary channels are disconnected or malfunctioning. (vi)

Downtime: The period during which equipment or a machine is not functional or cannot work. It may be due to technical failure, machine adjustment, maintenance, or non-availability of inputs such as materials, labour or power. Average downtime is usually built into the price of goods produced, to recover its cost from the sales revenue. Opposite of uptime. Also called waiting time. (vii)

Redundant Capacity Components: The number of active capacity components in a system beyond the minimum number of units required to support the IT load is referred to as redundant. If one unit of capacity is required to support the computer equipment, more than one unit of capacity is installed. Terms such as N+1 (need plus one) or N+2 are commonly applied to capacity component counts but not distribution paths. (viii)

Server Room: A room that houses mainly computer servers. In information technology circles, the term is generally used for smaller arrangements of servers; larger groups of servers are housed in data centres. Server rooms usually contain headless computers connected remotely via KVM switch, SSH, VNC, or remote desktop. (ix)

Single Point of Failure (SPOF): A part of a system that, if it fails, will stop the entire system from working. Single points of failure are undesirable in any system with a goal of high availability or reliability. (x)

Stringered Raised Floors: This type of raised floor generally consists of a vertical array of steel pedestal assemblies (each assembly is made up of a steel base plate, tubular upright, and a head) uniformly spaced on two-foot centres and mechanically fastened to the concrete floor. The steel pedestal head has a stud that is inserted into the pedestal upright, and the overall height is adjustable with a leveling nut on the welded stud of the pedestal head. (xi)

Stringerless Raised Floors: One non-earthquake type of raised floor generally consists of an array of pedestals that provide the necessary height for routing cables and also serve to support each corner of the floor panels. With this type of floor, there may or may not be provisioning to mechanically fasten the floor panels to the pedestals. This stringerless type of system (having no mechanical attachments between the pedestal heads) provides maximum accessibility to the space under the floor. However, stringerless floors are significantly weaker than stringered raised floors in supporting lateral loads and are not recommended. (xii)

Structural Platforms: One type of structural platform consists of members constructed of steel angles or channels that are welded or bolted together to form an integrated platform for supporting equipment. This design permits equipment to be fastened directly to the platform without the need for toggle bars or supplemental bracing. Structural platforms may or may not contain panels or stringers. (xiii)

Uptime: The part of active time during which equipment, a machine, or a system is either fully operational or is ready to perform its intended function. Opposite of downtime. (xiv)

Uninterruptible Power Supply (UPS): Uninterruptible power supplies (UPSs) are devices that maintain the supply of power to a load even when the AC input power is interrupted or disturbed. This is typically accomplished by drawing the necessary power from a stored energy source, such as a battery. (xv)

Table 2. Tiered Reliability

Tier Classification System: Tier classifications from the Uptime Institute describe the site-level infrastructure topology required to sustain data centre operations in accordance with pre-defined Tier levels (I-IV). (xvi)

Tier Certification: The Uptime Institute is the only organization worldwide that certifies data centre designs and facilities to the Tier Classification System (I-IV). Tier Certification is an unbiased, third-party validation of the Tier level that applies to enterprise and managed-service data centres. Tier Certification recognizes organizational accomplishment and industry achievement. (xvii)

Tier I, Basic Site Infrastructure:
The fundamental requirements:
a) A Tier I basic data centre has non-redundant capacity components and a single, non-redundant distribution path serving the computer equipment.
b) Twelve hours of on-site fuel storage for engine generator(s).

The performance confirmation tests:
a) There is sufficient capacity to meet the needs of the site.
b) Planned work will require most or all of the site infrastructure systems to be shut down, affecting computer equipment, systems, and users.
The operational impacts:
a) The site is susceptible to disruption from both planned and unplanned activities. Operation (human) errors of site infrastructure components will cause a data centre disruption.
b) An unplanned outage or failure of any capacity system, capacity component, or distribution element will impact the computer equipment.
c) The site infrastructure must be completely shut down on an annual basis to safely perform necessary preventative maintenance and repair work. Urgent situations may require more frequent shutdowns. Failure to regularly perform maintenance significantly increases the risk of unplanned disruption as well as the severity of the consequential failure. (xviii)

Tier II, Redundant Site Infrastructure Capacity Components:
The fundamental requirements:
a) A Tier II data centre has redundant capacity components and a single, non-redundant distribution path serving the computer equipment.
b) Twelve hours of on-site fuel storage for N capacity.
The performance confirmation tests:
a) Redundant capacity components can be removed from service on a planned basis without causing any of the computer equipment to be shut down.
b) Removing distribution paths from service for maintenance or other activity requires shutdown of computer equipment.
c) There is sufficient permanently installed capacity to meet the needs of the site when redundant components are removed from service for any reason.

The operational impacts:
a) The site is susceptible to disruption from both planned activities and unplanned events. Operation (human) errors of site infrastructure components may cause a data centre disruption.
b) An unplanned capacity component failure may impact the computer equipment. An unplanned outage or failure of any capacity system or distribution element will impact the computer equipment.
c) The site infrastructure must be completely shut down on an annual basis to safely perform preventative maintenance and repair work. Urgent situations may require more frequent shutdowns. Failure to regularly perform maintenance significantly increases the risk of unplanned disruption as well as the severity of the consequential failure. (xix)

Tier III, Concurrently Maintainable Site Infrastructure:
The fundamental requirements:
a) A concurrently maintainable data centre has redundant capacity components and multiple independent distribution paths serving the computer equipment. Only one distribution path is required to serve the computer equipment at any time.
b) All IT equipment is dual-powered as defined by the Institute's Fault Tolerant Power Compliance Specification, Version 2.0, and installed properly to be compatible with the topology of the site's architecture. Transfer devices, such as point-of-use switches, must be incorporated for computer equipment that does not meet this specification.
c) Twelve hours of on-site fuel storage for N capacity.
The performance confirmation tests:
a) Each and every capacity component and element in the distribution paths can be removed from service on a planned basis without impacting any of the computer equipment.
b) There is sufficient permanently installed capacity to meet the needs of the site when redundant components are removed from service for any reason.

The operational impacts:
a) The site is susceptible to disruption from unplanned activities. Operation errors of the site infrastructure components may cause a computer disruption.
b) An unplanned outage or failure of any capacity system will impact the computer equipment.
c) An unplanned outage or failure of any capacity component or distribution element may impact the computer equipment.
d) Planned site infrastructure maintenance can be performed by using the redundant capacity components and distribution paths to safely work on the remaining equipment.
e) During maintenance activities, the risk of disruption may be elevated. (This maintenance condition does not defeat the Tier rating achieved in normal operations.) (xx)

Tier IV, Fault Tolerant Site Infrastructure:
The fundamental requirements:
a) A fault tolerant data centre has multiple, independent, physically isolated systems that provide redundant capacity components and multiple, independent, diverse, active distribution paths simultaneously serving the computer equipment. The redundant capacity components and diverse distribution paths shall be configured such that N capacity is providing power and cooling to the computer equipment after any infrastructure failure.
b) All IT equipment is dual-powered as defined by the Institute's Fault Tolerant Power Compliance Specification, Version 2.0, and installed properly to be compatible with the topology of the site's architecture. Transfer devices, such as point-of-use switches, must be incorporated for computer equipment that does not meet this specification.
c) Complementary systems and distribution paths must be physically isolated from one another (compartmentalized) to prevent any single event from simultaneously impacting both systems or distribution paths.
d) Continuous Cooling is required. For more information, see the Institute publication Continuous Cooling Is Required for Continuous Availability.
e) Twelve hours of on-site fuel storage for N capacity.
The performance confirmation tests:
a) A single failure of any capacity system, capacity component, or distribution element will not impact the computer equipment.
b) The system itself automatically responds ("self-heals") to a failure to prevent further impact to the site.
c) Each and every capacity component and element in the distribution paths can be removed from service on a planned basis without impacting any of the computer equipment.
d) There is sufficient capacity to meet the needs of the site when redundant components or distribution paths are removed from service for any reason.
The operational impacts:
a) The site is not susceptible to disruption from a single unplanned event.
b) The site is not susceptible to disruption from any planned work activities.
c) The site infrastructure maintenance can be performed by using the redundant capacity components and distribution paths to safely work on the remaining equipment.
d) During maintenance activity where redundant capacity components or a distribution path are shut down, the computer equipment is exposed to an increased risk of disruption in the event a failure occurs on the remaining path. This maintenance configuration does not defeat the Tier rating achieved in normal operations.
e) Operation of the fire alarm, fire suppression, or the emergency power off (EPO) feature may cause a data centre disruption. (xxi)
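The four Tier entries above vary along two topology attributes: whether capacity components are redundant, and whether there are multiple independent (and, for Tier IV, fault-tolerant, physically isolated) distribution paths; Table 1's Redundant Capacity Components entry supplies the N+1 vocabulary. The sketch below (Python chosen arbitrarily; the function names are hypothetical) illustrates only this topology mapping; actual Tier Certification also depends on the performance confirmation tests and operational criteria listed above.

```python
def redundancy_label(installed_units: int, required_units: int) -> str:
    """Express installed capacity relative to N, the minimum number of
    units needed to support the IT load (see Table 1, 'Redundant
    Capacity Components')."""
    if installed_units < required_units:
        return "insufficient"
    spare = installed_units - required_units
    return "N" if spare == 0 else f"N+{spare}"

def tier_from_topology(redundant_components: bool,
                       multiple_paths: bool,
                       fault_tolerant: bool) -> str:
    """Rough mapping from site-infrastructure topology to a Tier label,
    following the fundamental requirements quoted above."""
    if fault_tolerant and redundant_components and multiple_paths:
        return "Tier IV"   # physically isolated, active distribution paths
    if redundant_components and multiple_paths:
        return "Tier III"  # concurrently maintainable
    if redundant_components:
        return "Tier II"   # redundant components, single path
    return "Tier I"        # non-redundant components, single path

# Three UPS units installed where two are required to carry the load:
print(redundancy_label(3, 2))                  # N+1
print(tier_from_topology(True, False, False))  # Tier II
```

A site earns the highest tier whose requirements it meets, which is why the checks run from Tier IV downward.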

Table 3. Network

Internet Exchange Point: An Internet Exchange Point (IXP) is a place where multiple ISPs interconnect their networks. Potentially many peering sessions can be established across a single well-populated IXP peering fabric or across private peering sessions over (typically) fiber cross-connects. (xxii)

IP Switching Equipment: Internet Protocol (IP) routing and switching equipment provides Internet transit, interconnection with more carriers and suppliers, and provisioning for virtual private network (VPN) services. (xxiii)

Multihoming: Multihoming is a mechanism used to configure one computer with more than one network interface and multiple IP addresses. It provides enhanced and reliable Internet connectivity without compromising efficient performance. The multihomed computer is known as the host and is directly or indirectly connected to more than one network. (xxiv)

Network Operations Centre (NOC): The central location where a company's servers and networking equipment are located. The NOC may reside either within a company's campus or at an external location. (xxv)

Structured Cabling: Premises cabling, also called structured cabling, is a standardized cabling system designed to carry voice, data and video signals in a commercial or residential environment. Traditionally, cabling has focused on Cat 5 unshielded twisted-pair cable as used in Ethernet networks, but network architectures are changing. Since these cabling standards were first developed around 1990, networks have increased in speed 1,000 times, from 10 Mb/s to 10 Gb/s, promoting the use of fiber, especially in backbone networks and workstation connections. (xxvi)
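The Multihoming entry describes a host reachable over more than one network. One simple way to picture the benefit is order-based failover across upstream addresses. The sketch below is hypothetical (Python; the connection test is injected as a callable so no real sockets are involved, and the IP addresses come from the reserved documentation ranges):

```python
def connect_multihomed(addresses, try_connect):
    """Attempt each upstream address in order and return the first one
    that answers. Multihoming lets the host fall back to another
    network when one link fails."""
    for addr in addresses:
        if try_connect(addr):
            return addr
    raise ConnectionError("all upstream networks unreachable")

# Simulate the first ISP link being down and the second being up:
link_up = {"203.0.113.1": False, "198.51.100.1": True}
print(connect_multihomed(["203.0.113.1", "198.51.100.1"], link_up.get))
# 198.51.100.1
```

In a real deployment this selection is usually handled by routing (e.g., BGP or policy routes) rather than application code; the sketch only shows the failover idea.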

Table 4. Environmental

Humidity Range: The recommended humidity is 25%-45%, with a 5.5 C (41.9 F) dew point. Air conditioning systems help control humidity by cooling the return-space air below the dew point: with too much humidity, water may begin to condense on internal components. If the atmosphere is too dry, static electricity discharge problems may damage components; ancillary humidification systems may add water vapor when the humidity is too low. (xxvii)

HVAC: Short for heating, ventilation, and air conditioning. The system is used to provide heating and cooling services to buildings. HVAC systems have become the required industry standard for construction of new buildings. Before the creation of this system, the three elements were usually split between three or more devices. (xxviii)

LEED Standard: Leadership in Energy and Environmental Design (LEED) consists of a suite of rating systems for the design, construction and operation of high-performance green buildings, homes and neighborhoods. Points are distributed across major credit categories such as Sustainable Sites, Water Efficiency, Energy and Atmosphere, Materials and Resources, and Indoor Environmental Quality. (xxix)

Temperature Range: The recommended temperature range for a data centre is 18-27 C (64-81 F). The temperature in a data centre will naturally rise because the electrical power used heats the air. Unless the heat is removed, the ambient temperature will rise, resulting in electronic equipment malfunction. By controlling the air temperature, the server components at the board level are kept within the manufacturer's specified temperature/humidity range. (xxx)
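Taken together, the Humidity Range and Temperature Range entries define a recommended operating envelope. The check below is a simplified sketch (Python; the function name is hypothetical) using only the two ranges quoted in Table 4; the full ASHRAE guideline also constrains the dew point:

```python
def in_recommended_envelope(temp_c: float, rel_humidity_pct: float) -> bool:
    """True when a sensor reading falls inside the ranges quoted in
    Table 4: roughly 18-27 C (64-81 F) and 25-45 % relative humidity."""
    return 18.0 <= temp_c <= 27.0 and 25.0 <= rel_humidity_pct <= 45.0

print(in_recommended_envelope(22.0, 35.0))  # True
print(in_recommended_envelope(30.0, 35.0))  # False: too warm
print(in_recommended_envelope(22.0, 15.0))  # False: too dry, static risk
```

A monitoring system would typically evaluate such a predicate per sensor and alarm on sustained excursions rather than single readings.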

Table 5. Colocation

Colocation: Locating customer equipment in a third-party data centre. Colocation often refers to Internet service providers (ISPs) or cloud computing providers that furnish the floor space, electrical power and high-speed links to the Internet for a customer's Web servers. Colocation eliminates having to build a secure facility that provides power and air conditioning for company-owned servers. In addition, colocation centres are often located near major Internet connecting points and can provide access to multiple Tier 1 Internet backbones. Although most equipment monitoring is performed remotely by the customer, a colocation data centre may offer equipment maintenance and troubleshooting arrangements. Note: this model differs from shared services in that customers are only renting space, as opposed to having access to managed services. (xxxi)

Shared Infrastructure: Any system that pools resources in a single chassis and spreads them among independent server nodes within that same chassis to achieve optimized performance. The advantages of a shared infrastructure system are found in three main areas: power efficiency, thermal efficiency, and compute density. Each of these three areas affects the cost of building and maintaining a data centre. (xxxii)

Shared Services: A delivery model in which a shared-service centre, supported by dedicated people, processes and technologies, acts as a centralized provider of a defined business function for use by multiple enterprise constituencies. The service centre may be physical, virtual or logical. Note: this model differs from colocation in that customers have access to managed services, as opposed to just the rental of space. (xxxiii)

Table 6. Cloud Computing

Converged Infrastructure: Converged infrastructure is achieved through a systematic approach that brings all server, storage, and networking resources together into pools of resources. It brings together management tools, policies, and processes so resources and applications are managed in a holistic, integrated manner. It integrates security to provide protection from today's sophisticated security threats at both the perimeter and interior of a business. (xxxiv)

Data as a Service (DaaS): An information provision and distribution model in which data files (including text, images, sounds, and videos) are made available to customers over a network, typically the Internet. (xxxv)

Hybrid Cloud Computing: The combination of external public cloud computing services and internal resources (either a private cloud or traditional infrastructure, operations and applications) in a coordinated fashion to assemble a particular solution. Hybrid cloud computing implies significant integration or coordination between the internal and external environments at the data, process, management or security layers. (xxxvi)

Infrastructure as a Service (IaaS): In this most basic cloud service model, cloud providers offer computers, as physical or more often as virtual machines, and other resources. The virtual machines are run as guests by a hypervisor, such as Xen or KVM. Management of pools of hypervisors by the cloud operational support system leads to the ability to scale to support a large number of virtual machines. (xxxvii)

Multitenancy: Multi-tenancy is an architecture in which a single instance of a software application serves multiple customers. Each customer is called a tenant. Tenants may be given the ability to customize some parts of the application, such as the color of the user interface (UI) or business rules, but they cannot customize the application's code. (xxxviii)

On-Demand Self-Service: On-demand self-service allows users to obtain, configure and deploy cloud services themselves using cloud service catalogues, without requiring the assistance of IT. (xxxix)

Platform as a Service (PaaS): A category of cloud computing services that provide a computing platform and a solution stack as a service. Along with SaaS and IaaS, it is a service model of cloud computing. In this model, the consumer creates the software using tools and/or libraries from the provider. The consumer also controls software deployment and configuration settings. The provider supplies the networks, servers, storage and other services. (xl)

Private Cloud: A virtual private cloud (VPC) is a dynamically configurable pool of public cloud computing resources that requires the use of encryption protocols, tunneling protocols and other security procedures to transfer data between a private enterprise and a cloud service provider. A VPC essentially turns the provider's multi-tenant architecture into a single-tenant architecture. (xli)

Public Cloud: A public cloud is one based on the standard cloud computing model, in which a service provider makes resources, such as applications and storage, available to the general public over the Internet. Public cloud services may be free or offered on a pay-per-usage model. (xlii)

Security as a Service (SECaaS): An outsourcing model for security management. Typically, Security as a Service involves applications such as anti-virus software delivered over the Internet, but the term can also refer to security management provided in-house by an external organization. (xliii)

Software as a Service (SaaS): Software that is owned, delivered and managed remotely by one or more providers. If the vendor requires user organizations to install software on-premises using their infrastructures, then the application isn't SaaS. SaaS delivery requires a vendor to provide remote, outsourced access to the application, as well as maintenance and upgrade services for it. The infrastructure and IT operations supporting the applications must also be outsourced to the vendor or another provider. The provider delivers an application based on a single set of common code and data definitions, which are consumed in a one-to-many model by all contracted customers at any time. Customers may be able to extend the data model by using configuration tools supplied by the provider, but without altering the source code. (xliv)

Storage as a Service (STaaS): A business model in which a large service provider rents space in its storage infrastructure on a subscription basis. The economy of scale in the service provider's infrastructure allows it to provide storage much more cost-effectively than most individuals or corporations can provide their own storage, when total cost of ownership is considered. Storage as a Service is often used to solve offsite backup challenges. (xlv)

Testing as a Service (TEaaS): An outsourcing model in which testing activities associated with some of an organization's business activities are performed by a service provider rather than employees. (xlvi)

Virtualization: The abstraction of IT resources that masks the physical nature and boundaries of those resources from resource users. An IT resource can be a server, a client, storage, networks, applications or OSs. Essentially, any IT building block can potentially be abstracted from resource users. Abstraction enables better flexibility in how different parts of an IT stack are delivered, thus enabling better efficiency (through consolidation or variable usage) and mobility (shifting which resources are used behind the abstraction interface), and even alternative sourcing (shifting the service provider behind the abstraction interface, such as in cloud computing). (xlvii)

Table 7. Security

Data Security: Data security is the practice of keeping data protected from corruption and unauthorized access. The focus behind data security is to ensure privacy while protecting personal or corporate data. (xlviii)

Fire Protection System: Data centres feature fire protection systems, including passive and active design elements, as well as implementation of fire prevention programs in operations. Prevention tools often include smoke detectors, fire sprinkler systems, aspirating smoke detectors, clean agent gaseous fire suppression systems, and firewalls (physical fireproof barriers). For mission-critical data centres, fireproof vaults with a Class rating are necessary to meet NFPA 75 standards. (xlix)

Network Security: Measures taken to protect a communications pathway from unauthorized access to, and accidental or willful interference with, regular operations. (l)

Physical Security: Physical security plays a large role with data centres. Physical access to the site is usually restricted to selected personnel, with controls including bollards and mantraps. Video camera surveillance and permanent security guards are almost always present if the data centre is large or contains sensitive information on any of the systems within. The use of fingerprint-recognition mantraps is starting to be commonplace. (li)

Prepared by:
Erin Sullivan
Management Consultant
Sierra Systems Inc.
October 3,

Appendix: References

(URLs were not preserved in this transcription.)

i. Introduction to Business Continuity Planning, SANS Institute InfoSec Reading Room, 2002. Accessed 28 September 2012.
ii. IT Dictionary, Gartner, 2012. Accessed 1 October 2012.
iii. IT Dictionary, Gartner, 2012. Accessed 1 October 2012.
iv. IT Dictionary, Gartner, 2012. Accessed 1 October 2012.
v. Niles, Susan, Standardization and Modularity in Data Centre Physical Infrastructure, Schneider Electric, page 4, 2011. Accessed 28 September 2012.
vi. IT Dictionary, Gartner, 2012. Accessed 1 October 2012.
vii. What is Downtime?, BusinessDictionary.com. Accessed 28 September 2012.
viii. Turner, W. Pitt, IV, et al., Tier Classifications Define Site Infrastructure Performance, Uptime Institute, LLC, 2008. Accessed 28 September 2012.
ix. Data Center, Wikipedia, 28 September 2012. Accessed 28 September 2012.
x. Dooley, K., Designing Large-Scale LANs, O'Reilly, page 31, 2002. Accessed 1 October 2012.
xi. Data Center, Wikipedia, 28 September 2012. Accessed 28 September 2012.
xii. Data Center, Wikipedia, 28 September 2012. Accessed 28 September 2012.
xiii. Data Center, Wikipedia, 28 September 2012. Accessed 28 September 2012.
xiv. What is Uptime?, BusinessDictionary.com. Accessed 28 September 2012.
xv. Uninterruptable Power Supplies, Natural Resources Canada, 20 April 2009. Accessed 1 October 2012.
xvi. Data Center Site Infrastructure Tier Standard: Topology, Uptime Institute, LLC, page 1, 2010. Accessed 1 October 2012.
xvii. Tier Certifications of Designs and Facilities, Uptime Institute, LLC, page 2. Accessed 2 October 2012.
xviii. Data Center Site Infrastructure Tier Standard: Topology, Uptime Institute, LLC, pages 1-2, 2010. Accessed 1 October 2012.
xix. Data Center Site Infrastructure Tier Standard: Topology, Uptime Institute, LLC, page 2, 2010. Accessed 1 October 2012.
xx. Data Center Site Infrastructure Tier Standard: Topology, Uptime Institute, LLC, pages 2-3, 2010. Accessed 1 October 2012.
xxi. Data Center Site Infrastructure Tier Standard: Topology, Uptime Institute, LLC, page 3, 2010. Accessed 1 October 2012.
xxii. What is an IXP?, DrPeering International, 2011. Accessed 1 October 2012.
xxiii. May/June 2012 Issue of BICSI News Magazine, BICSI, May-June 2012. Accessed 1 October 2012.
xxiv. Multihoming, Techopedia. Accessed 1 October 2012.
xxv. NOC, TechTerms.com, 2012. Accessed 28 September 2012.
xxvi. Structured Cabling Association. Accessed 28 September 2012.
xxvii. 2008 ASHRAE Environmental Guidelines for Datacom Equipment, American Society of Heating, Refrigerating, and Air Conditioning Engineers (ASHRAE), 2008. Accessed 1 October 2012.
xxviii. What is HVAC?, BusinessDictionary.com. Accessed 28 September 2012.
xxix. Leadership in Energy and Environmental Design, Wikipedia, 30 September 2012. Accessed 2 October 2012.
xxx. 2008 ASHRAE Environmental Guidelines for Datacom Equipment, ASHRAE, 2008. Accessed 1 October 2012.
xxxi. Colocation, PCMag.com. Accessed 1 October 2012.
xxxii. Normandeau, Kevin, Shared Infrastructure Benefits the Data Center, Digital Realty, 6 April 2011. Accessed 1 October 2012.
xxxiii. IT Dictionary, Gartner, 2012. Accessed 1 October 2012.
xxxiv. Converged Infrastructure: Accelerate your IT to achieve better business results, HP, 2012. Accessed 1 October 2012.
xxxv. Data as a Service, TechTarget.com. Accessed 28 September 2012.
xxxvi. IT Dictionary, Gartner, 2012. Accessed 1 October 2012.
xxxvii. Infrastructure as a Service, Wikipedia, 28 September 2012. Accessed 28 September 2012.
xxxviii. Multi-tenancy, TechTarget.com. Accessed 28 September 2012.
xxxix. Perera, David, The Real Obstacle to Federal Cloud Computing, Fierce Government IT, 12 July 2012. Accessed 1 October 2012.
xl. Mell, Peter, and Timothy Grance, "The NIST Definition of Cloud Computing", National Institute of Standards and Technology, September 2011. Accessed 1 October 2012.
xli. Virtual Private Cloud, TechTarget.com. Accessed 28 September 2012.
xlii. Public Cloud, TechTarget.com. Accessed 28 September 2012.
xliii. Security as a Service, TechTarget.com. Accessed 28 September 2012.
xliv. IT Dictionary, Gartner, 2012. Accessed 1 October 2012.
xlv. Storage as a Service, Wikipedia, 28 September 2012. Accessed 28 September 2012.
xlvi. Testing as a Service, TechTarget.com. Accessed 28 September 2012.
xlvii. IT Dictionary, Gartner, 2012. Accessed 1 October 2012.
xlviii. What is Data Security?, Spam Laws, 2012. Accessed 2 October 2012.
xlix. Data Center, Wikipedia, 1 October 2012. Accessed 2 October 2012.
l. IT Dictionary, Gartner, 2012. Accessed 1 October 2012.
li. Data Center, Wikipedia, 28 September 2012. Accessed 28 September 2012.