Data Center Commissioning: What you need to know
Course # CXENERGY1511
AIA Provider # 50111116
AABC Commissioning Group
CxEnergy Conference, Henderson, NV, April 30, 2015
Presented by: Judson H. Adams, P.E., CxA, ATD
Power Management Corporation
AIA Stuff:
Credit(s) earned on completion of this course will be reported to AIA CES for AIA members. Certificates of Completion for both AIA members and non-AIA members are available upon request.
This course is registered with AIA CES for continuing professional education. As such, it does not include content that may be deemed or construed to be an approval or endorsement by the AIA of any material of construction or any method or manner of handling, using, distributing, or dealing in any material or product.
Course Description
Data centers have become a critically important facility type in our modern economy. Because of their unique requirements, commissioning them properly requires some specific knowledge and skills. This practical session will discuss Uptime tier classifications, working in live data centers, balancing reliability vs. efficiency, and much more. Also featured will be a review of example data center-specific functional performance and integrated systems tests.
Learning Objectives:
At the end of this course, participants will:
1. Be familiar with the Uptime Institute Tier classifications
2. Understand the 5 levels of data center Cx
3. Be able to differentiate Integrated Systems Testing from Functional Performance Testing
4. Understand challenges associated with Cx in a live data center
Data Center
"A facility used to house computer systems and associated components, such as telecommunications and storage systems. It generally includes redundant or backup power supplies, redundant data communications connections, environmental controls (e.g., air conditioning, fire suppression) and various security devices." (Wikipedia, 2015)
- Internet
- Enterprise
- Telecommunications
- Colocation (wholesale and retail)
- Server/computer room
- Telecom closets
Critical Infrastructure
Physical systems designed and constructed specifically to support the hardware, network, and applications that make up the data center:
- Electrical power
- Mechanical cooling
- Fire suppression
- Automation
- Monitoring
Critical Infrastructure
- Building Automation System
- UPS systems
- Chillers
- Power Distribution Modules
- Cooling towers
- Remote Power Panels
- Pumps/VFDs/loop
- Critical distribution busway
- Outside air units
- EPO
- Economizers
- CRAHs/CRACs/IRCs
- Makeup water and pressure-reducing valves
- Power and environmental monitoring systems
- Liquid detection systems
- Lighting control systems
- Electrical substations
- Generators and fuel delivery systems
- Fire alarm and detection systems
- Preaction sprinkler systems
- Transfer gear/ATS
- Distribution switchboards/switchgear
- Gaseous fire suppression systems
Capacity Component
A capacity component is any power-producing or energy-storage equipment, such as:
- Electrical power generator
- UPS and battery system
- Chiller
- Cooling tower
- Chilled water pump
- CRAC/CRAH units
- Fuel storage
- Water storage
Distribution Path
A distribution path is the means by which the power is transferred from the capacity component to the load, such as:
- Transformers
- Electrical feeders
- ATS, switchgear, PDUs
- Hydronic piping
- Refrigerant piping
- Valves
ITEQ Load
The load (expressed in kilowatts) of the user's Information Technology Equipment. What's in the rack:
- Telecommunications provider equipment
- Firewalls
- Routers
- Network switches
- Servers
- Storage arrays (SANs)
- And others
Uptime Institute Tier Classification System
Uptime Institute Tier Classification System
An international standard developed by the Uptime Institute (UTI) that defines criteria for critical infrastructure design based on four levels of classification:
- Tier I: Basic Capacity
- Tier II: Redundant Capacity Components
- Tier III: Concurrently Maintainable
- Tier IV: Fault Tolerant/Continuous Cooling
Uptime Institute Tier Classification System
Tier levels are defined at a specific ITEQ load using the ASHRAE Handbook extreme outdoor design conditions for the specific location (n = 20 years).
Nashville, TN
- Typical cooling design (0.4%) = 94.4 °F DB
- Extreme (20-year high) = 103.1 °F DB
Las Vegas, NV
- Typical cooling design (0.4%) = 108.3 °F DB
- Extreme (20-year high) = 116.1 °F DB
Uptime Institute - Tier I
Tier I: Basic Capacity
- Self-sustaining for 12 hours (does not rely on utilities)
- Number and size of capacity components are sufficient to maintain the operation of the business between maintenance intervals
- Maintenance requires shutting down the data center
- Subject to planned and unplanned outages
Uptime Institute - Tier I
- 12-hour water storage
- N generator(s)
Uptime Institute - Tier I
- N chiller(s)
- N AC unit(s)
Uptime Institute - Tier I Electrical One-Line
From UTI white paper: Tier Classifications Define Site Infrastructure Performance
Note: Example only, not prescriptive
Uptime Institute - Tier II
Tier II: Redundant Capacity Components
Tier I, plus:
- Number of capacity components provides redundancy (N+1) for:
  - Unexpected failure
  - Maintenance without system shutdown
- Maintenance of pathways requires shutdown
- Subject to planned and unplanned outages
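As a rough illustration of the N and N+1 sizing idea above, the following Python sketch counts how many identical capacity components are needed to carry a load (N) and then adds redundant units. The function name, the 1,800 kW load, and the 500 kW unit rating are illustrative assumptions, not values from the Tier standard.

```python
import math


def units_required(load_kw: float, unit_capacity_kw: float, redundant_units: int = 0) -> int:
    """Return the number of identical units to install.

    N is the smallest count whose combined capacity carries the load;
    redundant_units is added on top (1 for N+1).
    """
    if load_kw <= 0 or unit_capacity_kw <= 0:
        raise ValueError("load and unit capacity must be positive")
    n = math.ceil(load_kw / unit_capacity_kw)
    return n + redundant_units


# Hypothetical example: 1,800 kW load served by 500 kW units.
print(units_required(1800, 500))     # N   -> 4
print(units_required(1800, 500, 1))  # N+1 -> 5
```

A real selection would also account for derating at the extreme design conditions and for loads beyond the ITEQ load itself.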
Uptime Institute - Tier II
- N+1 UPS
- N+1 generators
Uptime Institute - Tier II
- N+1 chillers
- N+1 cooling towers
Uptime Institute - Tier II Electrical One-Line
From UTI white paper: Tier Classifications Define Site Infrastructure Performance
Note: Example only, not prescriptive
Uptime Institute - Tier III
Tier III: Concurrently Maintainable
Tier II, plus:
- Multiple distribution pathways for power and cooling (one may be passive)
- All IT equipment has redundant power supplies
- Generators must be rated for continuous operation
- Every single component and distribution pathway can be removed for service or replaced without shutting down the data center
- Maintenance does not require shutdown
Uptime Institute - Tier III
- Double isolation valves
Uptime Institute - Tier III Redundant Pathways
Uptime Institute - Tier III
- Main-Tie-Tie-Main
Uptime Institute - Tier III Electrical One-Line
From UTI white paper: Tier Classifications Define Site Infrastructure Performance
Note: Example only, not prescriptive
Uptime Institute - Tier IV
Tier IV: Fault Tolerant
Tier III, plus:
- Multiple distribution paths, both active
- Compartmentalization between redundant components/pathways
- Class-A Continuous Cooling is required
- Failures are self-healing
- Unexpected failure of any single component will not interrupt the operation of the data center
- The site is not subject to outages from a planned event or a single unplanned event
Uptime Institute - Tier IV
- Continuous cooling during loss of utility
Uptime Institute - Tier IV
- Compartmentalization
Uptime Institute - Tier IV Electrical One-Line
From UTI white paper: Tier Classifications Define Site Infrastructure Performance
Note: Example only, not prescriptive
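Because each tier builds on the one before it, the classification can be pictured as a chain of checks. The Python sketch below is a simplified illustration of that cumulative logic only; the `Topology` fields and the `tier` function are hypothetical names, and this is not the Uptime Institute's certification method.

```python
from dataclasses import dataclass


@dataclass
class Topology:
    # Hypothetical yes/no summary of a design, following the tier slides above.
    redundant_capacity_components: bool  # N+1 capacity components (Tier II)
    concurrently_maintainable: bool      # any component or pathway can be removed for service (Tier III)
    fault_tolerant: bool                 # active paths, compartmentalization, continuous cooling (Tier IV)


def tier(t: Topology) -> str:
    """Return the highest tier whose requirements, and all lower ones, are met."""
    if not t.redundant_capacity_components:
        return "Tier I"
    if not t.concurrently_maintainable:
        return "Tier II"
    if not t.fault_tolerant:
        return "Tier III"
    return "Tier IV"


print(tier(Topology(True, True, False)))  # Tier III
```

Note that a design with fault-tolerant features but no redundant capacity components still reports Tier I here, since the lower-tier requirement is checked first.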
Uptime Institute Tier Availability Matrix

                                                Tier I         Tier II           Tier III            Tier IV
Planned shutdowns for maintenance*              2 every year   3 every 2 years   Not required        Not required
Unplanned equipment or distribution failures**  1.2 per year   1 per year        1 every 2.5 years   1 every 5 years
Annual hours of downtime**                      28.8           22                1.6                 0.8
Availability                                    99.67%         99.75%            99.98%              99.99%

* Based on recommended practice
** Historical data from Uptime Institute members
Source: Data Center Site Infrastructure Tier Standard: Topology 2012 - Uptime Institute, LLC
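The availability percentages in the matrix above follow from the annual downtime hours, assuming a non-leap year of 8,760 hours. A short Python check of that arithmetic (the variable and function names are illustrative):

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 hours (assumes a non-leap year)

# Annual hours of downtime, taken from the matrix above.
annual_downtime_hours = {
    "Tier I": 28.8,
    "Tier II": 22.0,
    "Tier III": 1.6,
    "Tier IV": 0.8,
}


def availability_percent(downtime_hours: float) -> float:
    """Availability = share of the year the site is up, as a percentage."""
    return 100.0 * (1.0 - downtime_hours / HOURS_PER_YEAR)


for name, hours in annual_downtime_hours.items():
    print(f"{name}: {availability_percent(hours):.2f}%")
```

Rounded to two decimals, this reproduces 99.67%, 99.75%, 99.98%, and 99.99%.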
Data Center Commissioning, aka Level 5 Commissioning
Level 5 Commissioning
- Level 0: Design Review
- Level 1: Planning
- Level 2: Factory Acceptance Testing
- Level 3: Pre-Functional Inspections/Startup
- Level 4: Functional Testing
- Level 5: Integrated Systems Testing
Level 0: Design Review
CxA will assist with defining the OPR:
- What is your target ITEQ load?
- What are your resiliency goals (Tier level)?
- Phasing?
- How will the site be tested?
- Make sure testing requirements are in the specifications
Photo source: www.zeniumdatacenters.com/the-risk-of-short-term-data-center-planning/
Level 0: Design Review
Review the Basis of Design:
- Compare to OPR
- Sequence of operations
- Piping flow diagram
- Electrical one-line diagram
- Back-check at CDs
Level 0: Design Review
A few pointers regarding controls:
- Keep sequences simple
- Fewer control points means fewer failures
- Consider resiliency vs. efficiency
- The fallacy of UPS backup
- All capacity components should be self-sustaining
Level 1: Planning
- Begins in design phase
- Prepare outline of testing scripts
- Test for maintenance modes (Tier III)
- Test for failure modes (Tier IV)
Level 1: Planning
- Load bank placement for heat run test?
- How will the different failure modes be tested?
- How does the testing plan integrate with the construction schedule?
Level 1: Planning
Existing Data Center Modifications:
- Identify critical MOPs (Methods of Procedure)
- Is the sequence of construction integrated with Cx activities?
- What is the user's tolerance for risk?
Level 2: Factory Acceptance
- Generators
- Paralleling gear
- UPS systems
- Static switches
- Chillers
- Custom HVAC equipment
Level 2: Factory Acceptance
- Full sequence demonstrated
- Safeties and alarms
- Heat run at full load
- Transients recorded
- Chillers:
  - Quick start
  - Low load
Level 2: Factory Acceptance
Why?
- Testing performed by factory technicians
- Deficiencies identified and corrected prior to shipment
- Test drive: Owner becomes familiar with system before owning it
Level 3: Pre-Functional
Overall readiness for FPTs:
- Equipment matches submittals
- Installation per construction drawings
- SVCs complete
- Manufacturer startup complete
- Electrical acceptance testing
Photo source: http://www.tequipment.net/
Level 4: Functional Testing
Cooling Equipment:
- Controls interface
- Alarm notification
- Team-work mode
- Heat run for containment systems
- Failure modes
- Run enable
- Failsafe
- Loss of comm
- Power restore
Level 4: Functional Performance Test
Example: Chiller Plant Add/Drop Chiller
- New DC3 Expansion (1,800 kW total ITEQ)
- Added a 5th chiller to existing plant of 4
- Replaced all primary CHW pumps (2N)
- Installed active secondary CHW loop (Tier III)
- Live data center
Level 4: Functional Testing
Generator/Transfer Equipment:
- Test individual generators
  - Transients, load steps
  - Alarm notification
  - Thermal imaging
- Test parallel gear/ATS
  - Full heat run at rated load
  - Entire plant
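Load steps for a generator test are usually expressed as percentages of rated load. The sketch below converts a step schedule into load bank kW settings. The 25% increments and the 2,000 kW rating are assumptions for illustration only; the actual steps should come from the project specifications.

```python
def load_steps_kw(rated_kw: float, step_percents=(25, 50, 75, 100)) -> list[float]:
    """Convert a schedule of load steps (percent of rated load) into kW settings."""
    if rated_kw <= 0:
        raise ValueError("rated load must be positive")
    return [rated_kw * p / 100 for p in step_percents]


# Hypothetical 2,000 kW generator stepped in 25% increments.
print(load_steps_kw(2000))  # [500.0, 1000.0, 1500.0, 2000.0]
```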
Level 4: Functional Testing
UPS and STS Equipment:
- Test individual modules
  - Inverter/bypass
  - Maintenance bypass
  - Battery discharge
  - Alarm notification
  - Thermal imaging
- Test parallel gear
  - Full heat run at rated load
  - Entire plant
  - Interaction with generator(s)
Level 4: Functional Testing
Emergency Power Off (EPO):
- Document how to restore
- Confirm control power source
- Avoid normally closed circuits for UPS/PDU equipment
- Interface with smoke dampers
- Is there a way to bypass for maintenance?
Level 5: Integrated Systems Testing
- Integrated Systems Testing (IST) is a comprehensive test protocol that incorporates all electrical power, cooling, and control functions under design load
- IST is typically performed over multiple days
- IST requires participation from all contractors, vendors, the owner's agent, and the CxA
Level 5: Integrated Systems Testing
The "Prove It" Phase:
- Heat Run: Demonstrate the ability of the IT equipment rooms to carry design load under steady-state design conditions
- Low Load Conditions: Demonstrate the ability of the infrastructure to carry Day One loads at steady-state design conditions
Level 5: Integrated Systems Testing
The "Prove It" Phase:
- Loss of Utilities: Demonstrate the ability to automatically react to utility outages (this is NOT a failure; Tier I)
- Maintenance: At load, demonstrate the ability to remove capacity components (Tier II) and pathways (Tier III) from service without interruption
Level 5: Integrated Systems Testing
Perform maintenance procedures:
- Remove a generator/UPS from the bus
- Power down and isolate one electrical switchboard/UPS/PDU completely
- Shut off one distribution panel serving multiple CRAC units
- Isolate a chiller/pump
- Simulate a valve replacement
- Simulate a complete BAS controls outage
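On the capacity side, the maintenance procedures above come down to one question: with any single unit out of service, do the remaining units still carry the load? A minimal sketch of that check, assuming hypothetical unit ratings and load (the function name and all numbers are illustrative, not from a real site):

```python
def survives_single_removal(unit_capacities_kw: list[float], load_kw: float) -> bool:
    """Return True if the load is still covered with any one unit out of service."""
    if not unit_capacities_kw:
        return False  # nothing installed, nothing can carry the load
    total = sum(unit_capacities_kw)
    # Remove each unit in turn and compare what is left against the load.
    return all(total - capacity >= load_kw for capacity in unit_capacities_kw)


# Hypothetical plant: identical 450 kW units against an 1,800 kW load.
print(survives_single_removal([450] * 5, 1800))  # True: four remaining units give 1,800 kW
print(survives_single_removal([450] * 4, 1800))  # False: three remaining units give 1,350 kW
```

This covers capacity components only. Pathway removal (Tier III) also depends on the piping and electrical topology, which a simple sum does not capture and which the field test has to prove.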
Level 5: Integrated Systems Testing
The "Prove It" Phase:
- Equipment Failure: Demonstrate the ability to automatically react to unplanned equipment failures (Tier II)
- Pathway Failure: Demonstrate the ability to automatically react to unplanned pathway failures (Tier IV)
  - Example: Fire in a chiller room or electrical room
  - Assume total loss of room
Level 5: Integrated Systems Testing
Example: Pull the Plug, Low Load
- 3 diesel generators in parallel
- Dual-bus UPS system (A/B)
- Dual static transfer switches (A/B)
- Air-cooled chillers (N+1)
- Variable primary pumping
- Computer Room Air Handlers (CRAH)
Level 5: Integrated Systems Testing
Example: Heat Run and BAS Failure
- New DC3 Expansion (1,800 kW total ITEQ)
- Five air-cooled chillers in parallel
- Four primary CHW pumps (2N)
- DP control (five sensors, fail safe)
- Computer Room Air Handlers (CRAH)
- Live data center
- Tier III design
Commissioning within a live data center
Commissioning within a live data center
New Equipment Commissioning:
- Outages/failures during the testing process impact the business
- Full-load Integrated Systems Testing is sometimes not feasible without scheduled downtime (Tier I and Tier II)
- Medium risks
Full Facility Retro-Commissioning:
- It was never really commissioned before
- Usually uses live load for the test load
- High risks, especially for Tier I and Tier II facilities
Commissioning within a live data center
- The CxA in a mission critical project will often be expected to prepare MOPs (Methods of Procedure) for all critical steps in construction and testing
- The CxA is typically best suited for this role, as he/she has the best understanding of the overall risks related to the procedure
- Critical MOPs must be identified early in the design process
- Team effort
- Must have sign-off by owner, engineers, and contractors
- Perform a dry run in advance
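The sign-off and dry-run requirements above can be treated as a simple gate before a MOP is executed. The sketch below is a hypothetical tracking structure, not a standard MOP format; the `Mop` class, the role names, and the example title are all assumptions.

```python
from dataclasses import dataclass, field

# Parties whose sign-off is required, per the list above (role names are illustrative).
REQUIRED_SIGNOFFS = {"owner", "engineer", "contractor"}


@dataclass
class Mop:
    title: str
    signoffs: set = field(default_factory=set)  # roles that have signed
    dry_run_done: bool = False                  # dry run performed in advance


def ready_to_execute(mop: Mop) -> bool:
    """A MOP is ready only when every required party has signed and a dry run is done."""
    return REQUIRED_SIGNOFFS <= mop.signoffs and mop.dry_run_done


mop = Mop("Isolate chiller for valve replacement", signoffs={"owner", "engineer"})
print(ready_to_execute(mop))  # False: contractor sign-off and dry run are missing

mop.signoffs.add("contractor")
mop.dry_run_done = True
print(ready_to_execute(mop))  # True
```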
Example MOP
Example MOP Log
Questions and Discussion