Tier 3+
Title: Design of a Shared Tier III+ Data Center: A Case Study with Design Alternatives and Selection Criteria

Abstract: Schoolcraft College is constructing a High Availability (HA) Data Center targeting an Uptime Institute Tier rating of III+. The new center will provide colocation hosting services to the education, municipal, and commercial communities. The design criteria for this facility include substantial redundancy in power, cooling, network, and security. For example, the power infrastructure includes a continuous-duty natural gas generator that will supplement the DTE public power grid, so that campus power capacity should remain flat even when this 150+ rack facility is fully occupied. The data center is also being designed to give students first-hand exposure to the skills needed to design and operate such a high-performance facility, without compromising security or uptime for municipal, commercial, and educational customers. The design criteria will be presented, alternative designs discussed, and the final selections presented and justified.
Concurrently maintainable
Redundant capacity components
Multiple independent distribution paths
One distribution path required to serve computer equipment at any time
Dual-powered IT equipment
Twelve hours of on-site engine generator fuel storage for N capacity
Additional Tier 4 Features
Fault tolerant
N capacity power/cooling available after any infrastructure failure
Multiple independent active distribution paths
50% of unplanned downtime is caused by batteries
Source: Liebert High Availability Response Team, Down Unit Analysis
Average cost of an unplanned outage
Total cost of unplanned shutdown, by industry
Source: Ponemon Institute
Space is money
Reputation defines success or failure
Customer service must be paramount
PUE drives profitability/cost containment
Design flexibility
100% uptime
Tier 3+ rated data center
PUE 1.5
Minimum of N+1 redundancy in all critical systems
Private power generation
2N back-up power
Carrier neutral with multiple carriers
System Design
Basic Co-Location: Space, Power/Cooling, Access to Carriers, Security
And Beyond: Lease Equipment, Disaster Recovery, Maintenance, Remote Hands, DBMS, OS Mgmt.
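The N+1 and 2N redundancy targets listed above can be made concrete with a small sizing sketch. The function name `units_required` and the example load and unit ratings are illustrative assumptions, not figures from the design; N is taken to be the smallest number of identical units that can carry the load.

```python
import math


def units_required(load_kw: float, unit_kw: float, scheme: str) -> int:
    """Return how many identical units a redundancy scheme calls for.

    N is the minimum number of units needed to carry load_kw.
    Supported schemes: "N", "N+1" (one spare unit), "2N" (full duplicate set).
    """
    if load_kw <= 0 or unit_kw <= 0:
        raise ValueError("load_kw and unit_kw must be positive")
    n = math.ceil(load_kw / unit_kw)
    if scheme == "N":
        return n
    if scheme == "N+1":
        return n + 1
    if scheme == "2N":
        return 2 * n
    raise ValueError(f"unknown scheme: {scheme}")


if __name__ == "__main__":
    # Hypothetical example: a 600 kW load served by 250 kW units.
    for scheme in ("N", "N+1", "2N"):
        print(scheme, units_required(600, 250, scheme))
```

With these assumed numbers, N is 3 units, N+1 is 4, and 2N is 6, which shows why 2N is usually reserved for the systems where a full duplicate path is worth the cost.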
High-density power, averaging 5 kW/rack
24/7 monitoring
24/7 client access
Latest high-efficiency HVAC system
Inert gas primary fire suppression system
Dual-authentication physical security with biometrics
CHP generator vs. fuel cell
Single vs. dual utility feeds
Number of failover protections
Single vs. dual backup power
UPS battery vs. flywheel
UPS Tier III vs. IV (A+B)
Busway vs. conduit/wire
Branch circuit monitoring panel vs. busway vs. PDU
Transfer switch vs. PLC failover management
On-site load bank vs. maintenance service
[Figure: UPS DC bus voltage (V dc) vs. time (seconds) during a grid disturbance, with the float voltage marked. 98% of disturbances last < 10 sec.]
Data Center Power Plant
[Figure: A/B power distribution, shown in three stages: 1.) From the Row, 2.) To the Rack, 3.) To inside the Rack (back of rack). Labels: Power Busway A, Power Busway B, BusPlug, Dual Corded Server (Two Power Supplies), Solid State Transfer Switch, Single Corded Device, Power Strip (a.k.a. PDU or Power Distribution Unit).]
Single vs. multiple carriers
Single vs. multiple entry points (diverse entry)
Single vs. multiple carrier paths (diverse path)
Single vs. dual lateral connection per carrier
Entry pathways owned vs. carrier owned
Single vs. dual core carrier routers to MDF
Single vs. dual edge switches (HSRP)
Cisco 6500 series vs. Nexus with SDN
Carrier battery plant vs. operator A-B UPS
SCDC and MERIT Network Relationship
Merit will resell SCDC colocation services.
Merit delivers 150 Mbps into the Applied Sciences via an AT&T EVC.
Merit Network services will come into the SCDC via an AT&T or Level3 last mile; Merit already has a relationship with both carriers.
Merit has approximately 3000 miles of fiber network in Michigan.
Merit has relationships with, and fiber access to, area public school districts and universities, which are potential SCDC clients.
Cooling 101
Get air to the front of the device so the device fans can pull it through to the back
Cooling is the largest non-IT power usage
Leakage
Open spaces create air mixing, turbulence, and loss of efficiency
Design layout
CFD failure mode verified
Area/perimeter (blowing up the balloon)
Ducted supply and/or ducted return
Raised floor, in-row, economizers, etc.
Hot aisle vs. cold aisle containment
DX (air), glycol, chilled water
DX pumped refrigerant with free cooling below 55 °F
Heat exchangers
N+1 cooling capacity
Liebert DSe pumped refrigerant with free cooling
CFD validation
Highest efficiency with hot aisle containment
Typical PUE = 2.0
[Figure: hot exhaust air and cold supply air paths]
PUE = Total D.C. Power / Power to Racks
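A short sketch may help show what the PUE figures mean in kilowatts. It uses the standard definition of PUE as total data center power divided by the power delivered to the racks. The function names are hypothetical, and the 750 kW rack load is an assumption obtained by multiplying the 150 racks from the abstract by the 5 kW/rack average; it is not a stated design load.

```python
def pue(total_kw: float, rack_kw: float) -> float:
    """Power usage effectiveness: total data center power / power to racks."""
    if rack_kw <= 0:
        raise ValueError("rack_kw must be positive")
    if total_kw < rack_kw:
        raise ValueError("total_kw cannot be smaller than rack_kw")
    return total_kw / rack_kw


def overhead_kw(rack_kw: float, pue_value: float) -> float:
    """Non-IT power (cooling, conversion losses, lighting) implied by a PUE."""
    if pue_value < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return rack_kw * (pue_value - 1.0)


if __name__ == "__main__":
    rack_kw = 750.0  # assumed: 150 racks x 5 kW/rack
    print(pue(1500.0, rack_kw))       # a facility drawing 1500 kW in total
    print(overhead_kw(rack_kw, 2.0))  # overhead at the typical PUE
    print(overhead_kw(rack_kw, 1.5))  # overhead at the target PUE
```

Under these assumptions, moving from a PUE of 2.0 to 1.5 halves the non-IT overhead, from 750 kW to 375 kW, for the same rack load.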
24/7 alarm active
An MCOLES-certified PA330 police authority, co-located in the data center building
Dual authentication with prox card and biometrics
24/7 monitored (by third party)
CCTV megapixel security cameras with remote viewing
Motion-activated video recording with 90-day minimum retention
Non-clients/vendors 100% escorted
Building vs. Room
UPS vs. Feeder
General vs. First responder activation
Fire Suppression Activation
Code requirements
Equipment servicing room
CRACs and IT equipment vs. CRACs only
EPO
First responder only
Equipment servicing room CRACs
Agent effectiveness
IT equipment power optional
Power for lighting & utility outlets

NEC Article 645 - B Disconnecting Means (Emergency Power Off)
Section 645.10 of the 2008 NEC requires a disconnecting means for each zone in the IT room. Section 645.10 of the 2011 NEC offers two alternatives for the disconnecting means: (A) covers remote disconnect controls, with the same requirements as the 2008 NEC, and (B) covers critical operations data systems. Critical operations data systems (defined in 645.2) are permitted to have alternate disconnecting means provided that five additional conditions are met:
(1) An approved shutdown procedure has been established.
(2) Qualified personnel are continuously available 24/7.
(3) Smoke sensors are in place.
(4) A fire suppression system is in place.
(5) Plenum cables are used for signaling.
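The five conditions summarized above for the 645.10(B) alternative lend themselves to a simple checklist. The sketch below is only an illustration of that checklist as the slide states it: the class and field names are hypothetical, and it is not a substitute for the code text or for a determination by the authority having jurisdiction.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class RoomStatus:
    """One flag per condition listed on the slide (names are hypothetical)."""
    shutdown_procedure_approved: bool
    qualified_staff_24x7: bool
    smoke_sensors_installed: bool
    fire_suppression_installed: bool
    plenum_signaling_cables: bool


def unmet_conditions(status: RoomStatus) -> List[str]:
    """Return the names of the conditions that are not yet satisfied."""
    return [f.name for f in fields(status) if not getattr(status, f.name)]


def alternate_disconnect_permitted(status: RoomStatus) -> bool:
    """True only when all five listed conditions are satisfied."""
    return not unmet_conditions(status)


if __name__ == "__main__":
    room = RoomStatus(True, True, False, True, True)
    print(alternate_disconnect_permitted(room))
    print(unmet_conditions(room))
```

Returning the list of unmet conditions, rather than only a yes/no answer, makes the same check usable as a punch list during commissioning.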
New evaporative particulate
Inert gas FM200/ECARO
Dual detector 165
Dry pipe
Dual action - 185
2 detectors active to charge lines
Pellet melt water zone
First responder training
Preventative vs. reactive
How much: granular view vs. sensory overload
Methods & protocols: SNMP, BACnet, Modbus, dry contact
Alerting: email, text, phone call, audible alarms
Response policy
Infrastructure HW vs. network
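The tension between a granular view and sensory overload can be illustrated with a small alert-routing sketch. Everything in it is an assumption made for illustration: the severity names, the mapping of severities to the alerting channels listed above, the thresholds, and the quiet period that suppresses repeated alerts. It does not describe the facility's actual response policy.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from severity to the alerting channels on the slide.
CHANNELS: Dict[str, List[str]] = {
    "normal": [],
    "warning": ["email", "text"],
    "critical": ["email", "text", "phone call", "audible alarm"],
}


def classify(value: float, warn_at: float, crit_at: float) -> str:
    """Classify a reading against two rising thresholds."""
    if value >= crit_at:
        return "critical"
    if value >= warn_at:
        return "warning"
    return "normal"


class AlertRouter:
    """Suppress repeats of the same (sensor, severity) within a quiet period."""

    def __init__(self, quiet_seconds: float) -> None:
        self.quiet_seconds = quiet_seconds
        self._last_sent: Dict[Tuple[str, str], float] = {}

    def route(self, sensor: str, severity: str, now: float) -> List[str]:
        key = (sensor, severity)
        last = self._last_sent.get(key)
        if last is not None and now - last < self.quiet_seconds:
            return []  # repeat within the quiet period: send nothing
        self._last_sent[key] = now
        return list(CHANNELS[severity])


if __name__ == "__main__":
    router = AlertRouter(quiet_seconds=300)
    severity = classify(30.0, warn_at=27.0, crit_at=32.0)
    print(router.route("crac-1-supply-temp", severity, now=0.0))
    print(router.route("crac-1-supply-temp", severity, now=60.0))
```

Keying the quiet period on both sensor and severity means an escalation from warning to critical still goes out immediately, while a reading that merely stays in the warning band does not page anyone again until the period expires.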
Critical to uptime and PUE
Hand in hand with redundancy
SNMP, BACnet modules
BMS
DCIM
Transfer switch
Facility power meter
PUE = Total D.C. Power / Power to Racks
Policy compliance: SSAE16 SOC2, HIPAA
~100 control policies with quality control repository
Operations guide
Risk analysis & mitigation plan with over 100 validation points
Disaster recovery plan
First responder guide
Employee handbook
DCIM & asset management
Incident management & ticketing system times
Preventative
Service-affecting or non-service-affecting
Notification of clients (2-3 weeks in advance)
Network and compute redundancy and DR
CRACs & condensers
Primary transformer
Generator
Switchgear
UPS
Wrap-around maintenance bypass
Breakers (arc flash) & coordination
Fire suppression & EPO
Transfer switches & control logic
Data Center Footprint
Academic program focus: data center design, operation, and management
Continuing education seminars
Teaching lab in the data center for hands-on learning
Lab sponsorships being sought from EMC, Cisco, HP, Dell, etc.
Focus on latest offerings/technologies
An SSAE16 SOC2, HIPAA, PCI compliant facility
Superior infrastructure
Superior redundancy
Superior power
Security
Expertise in commercial data center design and management