Electrical Efficiency Modeling of Data Centers

By Neil Rasmussen

White Paper 113
Executive Summary

The conventional models used to estimate the electrical efficiency of data centers are grossly inaccurate for real-world installations. Estimates of electrical losses are typically made by summing the inefficiencies of the various electrical devices, such as power and cooling equipment. This paper shows that the values commonly used for equipment inefficiency are quite inaccurate. A simple and more accurate efficiency model is described that provides a rational basis to identify, quantify, and eliminate waste in power and cooling equipment.
Introduction

The total cost of ownership (TCO) for network-critical physical infrastructure (NCPI) in a typical data center can be $80,000 to $150,000 per rack. Of this cost, the cost of electrical power consumption is a very significant contributor, on the order of 20% of the total cost.[1] This is of interest because much of the electrical power consumption is wasted, and a significant amount of this waste is avoidable. It is estimated that data centers consume 40,000,000,000 kW-hr of electricity per year; reducing the waste associated with this consumption is therefore a material public policy issue as well as a material concern to data center operators.

This paper shows how the typical simplistic models of data center efficiency grossly underestimate the electrical waste in data centers. The entitlement to improve efficiency is consequently much greater than commonly believed. An improved model is presented that gives improved accuracy for data center losses and suggests where energy improvement opportunities are available.

Model for Data Center Electrical Efficiency

The efficiency of a data center is expressed as the fraction of the total electrical power supplied to the data center that is ultimately delivered to the information technology (IT) devices. The value for efficiency in this case falls between zero and 1, and is commonly expressed in percent.

Efficiency = IT Load Power / Total Data Center Input Power

If the data center were 100% efficient, then all of the power supplied to the data center would reach the IT loads. In the real world there are a number of ways that electrical energy is consumed by devices other than the IT loads, including but not limited to: transformers, UPS, power wiring, fans, air conditioners, pumps, humidifiers, and lighting. Some of these devices, like UPS and transformers, are in series with the IT loads, while others, like lighting and fans, are in parallel with the IT loads. In any case, virtually all of the electrical power feeding the data center ultimately ends up as waste heat. A diagram showing where electrical and heat power flows in a typical data center is shown in Figure 1. The data center described by Figure 1 is a typical high availability dual power path data center with N+1 CRAC units, operating at a typical value of 30% of design capacity.

[1] Details on the contributors to TCO are described in APC White Paper #6, Determining Total Cost of Ownership for Data Center and Network Room Infrastructure.
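To make this definition concrete, the short Python sketch below applies the formula to the example data center of Figure 1 (which follows); the component percentages are taken from that figure, and the sketch is purely illustrative.

```python
# Illustrative only: applies the efficiency definition to the Figure 1 example,
# using the power fractions shown in that figure.
power_fractions = {                 # fraction of total electrical input
    "IT equipment": 0.30,
    "Chiller": 0.33,
    "UPS": 0.18,
    "CRAC": 0.09,
    "PDU": 0.05,
    "Humidifier": 0.03,
    "Switchgear / generator": 0.01,
    "Lighting": 0.01,
}

total_input = sum(power_fractions.values())              # = 1.00 (all input power)
efficiency = power_fractions["IT equipment"] / total_input

print(f"Data center efficiency = {efficiency:.0%}")      # -> 30%
```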
Figure 1  Power flow in a typical data center
(Fractions of total electrical input: IT equipment 30%, chiller 33%, UPS 18%, CRAC 9%, PDU 5%, humidifier 3%, switchgear / generator 1%, lighting 1%. Electrical power flows in; nearly all of it leaves the data center as waste heat.)

Note that much less than one-half of the electrical power feeding a data center is actually delivered to the IT loads. The data center in this example is said to be 30% efficient.

Power and cooling equipment in the data center are characterized for their losses by the manufacturers, who provide data regarding electrical efficiency for each device. In the case of power equipment, efficiency is typically expressed in percent; in the case of cooling equipment, efficiency may be expressed as a coefficient of performance (COP), which is the ratio of the heat power removed by the air conditioner to the electrical input power. Lighting power is simply expressed as a constant loss (i.e., a 60 watt light bulb is 0% efficient since it generates 60 watts of heat). The published values of efficiency for similar devices by different manufacturers do not vary dramatically, leading to the simplified view that the losses of a data center could be determined by simply adding the inefficiencies of the various components. Unfortunately, this approach does not provide accurate results in the case of real data centers. The use of equipment manufacturers' efficiency ratings causes users or designers to dramatically overestimate the efficiency, and consequently underestimate the losses, of real data centers.

Factors Required in a Data Center Efficiency Model

The first and largest error in common models for data center efficiency is the assumption that the efficiency of the components is constant and independent of load. In fact, the efficiency of data center power and cooling components varies significantly with load.
The second largest error in data center efficiency modeling is the assumption that the power and cooling components are operating at or near full design load. This error would be insignificant if the efficiency of these devices were independent of load. However, a number of factors work together to cause devices to operate substantially below their rated load capacity.

The third error is the assumption that the heat output of the power and cooling equipment is an insignificant fraction of the IT load, and that the inefficiency of cooling this equipment can be ignored.

The major errors described above actually compound each other, particularly as the IT load is decreased. This is quite relevant, since the actual IT load on real data centers is typically significantly less than the design capacity. The consequence is that the errors above can cause data center losses to be underestimated by a factor of two or even more. Fortunately, a simple model can be constructed that incorporates the above factors and provides reliable efficiency estimates.

Improved Models for Component Efficiency

An improved model for overall data center efficiency depends on how accurately the individual components, such as the UPS, are modeled. Characterizing power and cooling components by using a single efficiency value is the common method, but it is inadequate in real data center installations. The actual efficiency of a component such as a UPS is shown in Figure 2 as a function of load.

Figure 2  Typical efficiency of a UPS vs. load
(Vertical axis: UPS efficiency, 0% to 100%. Horizontal axis: load as a percentage of full power rating, 0% to 100%.)
Note that at light load the efficiency of this device falls to zero. That is because there are some losses, such as control logic losses, that are independent of load. These losses are called by various names, including no-load, shunt, tare, or parallel losses. Additional losses, which give rise to the flat part of the efficiency curve, are those losses that are proportional to the load. These losses are called proportional losses. Furthermore, there are losses that are proportional to the square of the load. These losses are called square-law losses and give rise to an efficiency curve that falls with increasing load.

The UPS described by the data of Figure 2 might be described as having 91% efficiency. This is in fact the efficiency at full load. However, at low loads the description of this device as having 91% efficiency is grossly in error; for example, at 10% load the same UPS exhibits only 50% efficiency. Clearly a single-parameter model for efficiency is inadequate in this case.

In Table 1 the types of losses are mapped against the various types of equipment used in a data center, with typical values. Each type of loss is summed in the last column as the total loss for the component.

Table 1  Typical electrical losses of NCPI components, expressed as a fraction of full-load component rating

NCPI Component   No-load loss   Proportional loss   Square-law loss   Total loss (single parameter)
UPS              4%             5%                  -                 9%
PDU              1.5%           -                   1.5%              3%
Lighting         1%             -                   -                 1%
Wiring           -              -                   1%                1%
Switchgear       -              -                   0.5%              0.5%
Generator        0.3%           -                   -                 0.3%
CRAC             9%             0%                  -                 9%
Humidifier       1%             1%                  -                 2%
Chiller plant    6%             26%                 -                 32%

From Table 1, it can be seen that by characterizing each type of device using no more than two parameters it is possible to create more complete models for the components used in data centers. Note that the losses in the table are expressed as a percentage of the full-load rating of the equipment, and that for actual loads less than the full load, the percentage loss will change in the following way:

No-load loss: loss percentage increases as the load decreases
Proportional loss: loss percentage is independent of load
Square-law loss: loss percentage decreases as the load decreases

The typical UPS of the prior Figure 2 is not accurately modeled by a single efficiency parameter, but is appropriately modeled by the no-load and proportional loss parameters in Table 1.
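To make the multi-parameter loss model concrete, the sketch below computes efficiency as a function of load from no-load, proportional, and square-law loss coefficients, each expressed as a fraction of the component's full power rating. It is a minimal illustration using the UPS coefficients from Table 1; the exact curve depends on the coefficients chosen, but the qualitative behavior of Figure 2, with efficiency collapsing at light load, is reproduced.

```python
def component_efficiency(load, no_load=0.04, proportional=0.05, square_law=0.0):
    """Efficiency at a given load, where `load` and all loss coefficients are
    expressed as fractions of the component's full power rating.
    Default coefficients are the UPS row of Table 1 (illustrative values)."""
    if load <= 0:
        return 0.0
    losses = no_load + proportional * load + square_law * load ** 2
    return load / (load + losses)   # output power / input power

for pct in (10, 25, 50, 100):
    print(f"{pct:3d}% load -> {component_efficiency(pct / 100):.0%} efficient")
```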
Effect of Loading on Data Center Efficiency

The prior section explains that the efficiency of power and cooling systems decreases significantly when the equipment is used below its design rating. This means that any analysis of data center efficiency must properly represent the load as a fraction of the design capacity for the various power and cooling systems. Simple efficiency models, which use only a single efficiency value to model equipment, are insensitive to loading (efficiency does not change with load in these models). Yet it is a fact that in the average data center, power and cooling equipment is routinely operated well below its rated capacity. The result is that simple models significantly overestimate the efficiency of real data centers.

For each type of power or cooling component, there are four reasons why a component is operated below its rated capacity:

1. The data center IT load is simply less than the system design value
2. The component has been purposefully oversized to provide a safety margin
3. The component is operating with other similar components in an N+1 or 2N array
4. The component is oversized to account for the diversity of the load

Reason 1
The data clearly show that the average data center operates at 65% below its design value. This situation is described in more detail in APC White Paper #37, Avoiding Costs From Oversizing Data Center and Network Room Infrastructure. The next sections of this paper will show that underutilization is a very large contributor to data center inefficiency.

Reason 2
It is routine to oversize components via a common practice called derating. The idea here is to avoid operating components near their capacity limits. It is possible to operate facilities with no derating, but derating values of 10-20% are recommended design practice for high availability facilities.

Reason 3
It is common practice to use devices in an N+1 or even 2N configuration to improve reliability and to provide for maintenance of components without powering down the system. Operating a system in an N+1 mode
causes the load to be spread among more components, effectively reducing their loading. For a 2N system, the continuous loading on any single component is less than one-half of its design value. The efficiency of a data center is therefore strongly affected by the operation of devices in N+1 or 2N configurations.

Reason 4
The fourth and final way that a component is used below its rating is due to diversity of the load. This effect is subtle and best illustrated by example. Consider a data center with a 1 MW load, and consider further that a 1.1 MW UPS is selected for this data center. Between the loads and the UPS are 10 power distribution units (PDUs), each feeding a portion of the data center. The question is what the rating of each PDU should be, and therefore at what average load each will operate. At first glance it would appear that if each were rated at 100 kW, the system design would be satisfied. Furthermore, if each PDU ran at full load the data center could carry the entire load. However, in real data centers it is nearly impossible to assure balance of the loads on PDUs. The load on a particular PDU is dictated by the nature of the IT equipment in the region of the data center where the PDU is located. In fact, the loads on various PDUs in real data centers often vary by a factor of 2. If a PDU feeds a section of the data center that is physically utilized to capacity but is still not using the full power capacity of that PDU, then the remaining capacity on that PDU is unusable if the other 9 PDUs are fully loaded. This means that the only way to ensure that the full capacity of the data center can be utilized is to substantially oversize the total PDU capacity. The typical oversizing of PDU capacity is on the order of 30% to 100%. As in the previous examples, this oversizing degrades the system efficiency. It should be noted that the same problem that gives rise to PDU oversizing also drives the oversizing of air handlers.
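The combined effect of the four factors above can be approximated with a simple multiplicative model. The following sketch is an illustration under assumed factor values (the function and the numbers are not taken from this paper); it shows how utilization, derating, redundancy, and diversity compound to push a component far below its rated load.

```python
def component_load_fraction(it_utilization, derating_margin=0.15,
                            redundancy="N+1", n_modules=4,
                            diversity_oversizing=0.3):
    """Rough estimate of the fraction of its rated capacity at which a power
    or cooling component actually operates. All parameter values here are
    assumptions for illustration only."""
    capacity_multiplier = 1.0 + derating_margin           # Reason 2: derating
    capacity_multiplier *= 1.0 + diversity_oversizing     # Reason 4: diversity
    if redundancy == "N+1":                               # Reason 3: redundancy
        capacity_multiplier *= (n_modules + 1) / n_modules
    elif redundancy == "2N":
        capacity_multiplier *= 2.0
    return it_utilization / capacity_multiplier           # Reason 1: utilization

# Example: a data center whose IT load is 35% of the design value
print(f"{component_load_fraction(0.35):.0%} of rated capacity")   # roughly 19%
```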
In addition to loss of efficiency due to the four reasons above, another major error in modeling data center efficiency is the assumption that the heat output of the power and cooling equipment (inefficiency) is an insignificant fraction of the IT load and can therefore be ignored. In fact, heat generated by power and cooling equipment within a data center is no different than heat generated by IT equipment, and must be removed by the cooling system. This places an additional burden on the cooling system, which requires the cooling system to be oversized and creates additional losses in the cooling system. To properly account for these losses, the cooling load must include both the IT equipment and the losses of any power and cooling devices located within the conditioned space.

Improved Model for Data Center Efficiency

Based on the discussion above, it is possible to construct an improved model for data center efficiency. The improved model has the following attributes:

- Components are modeled with a no-load loss, plus a loss proportional to the load, plus a loss proportional to the square of the load
- Oversizing due to component derating is incorporated
- Underutilization due to N+1 or 2N designs is incorporated
- The cooling load includes both the IT load and the heat load due to inefficiency of indoor power and cooling components
- For a given data center installation, the model provides a graphical output of the efficiency as a function of load, understanding that typical data centers operate well below design capacity

An implementation of the model is straightforward and obeys the following general flow (a simplified sketch of this flow is shown just before Figure 3):

1. Determine the average degree of oversizing for each power and cooling component type, given the derating, diversity, and redundancy factors.
2. Determine the operating losses of each type of component using the input load, the fraction of rated load for the component type based on oversizing, the no-load loss, and the proportional loss.
3. Determine the additional proportional loss due to the need for the cooling system to cool power and cooling equipment within the data center.
4. Sum all the losses.
5. Compute and tabulate the losses as a function of the IT load in the data center.

A computer model based on these principles has been implemented to compute energy consumption in the APC Data Center TCO analysis methodology, described in APC White Paper #6, Determining Total Cost of Ownership for Data Center and Network Room Infrastructure.

The Efficiency of Real World Data Centers

Equipped with a better model for data center power consumption, it is possible to make improved estimates of data center efficiency. Using typical values for equipment losses, derating, load diversity oversizing, and redundancy, the efficiency curve of Figure 3 can be developed.
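As a concrete illustration of the general flow just described, the sketch below strings the pieces together: the loss coefficients of Table 1, an assumed oversizing factor per component, and a cooling burden that includes the heat of indoor NCPI equipment. It is a simplified stand-in for the model, not a reproduction of the APC TCO methodology, and the oversizing factors are assumptions, so its numeric output is only indicative.

```python
# Illustrative data center efficiency model following the flow described above.
# Loss coefficients (no-load, proportional, square-law) are the Table 1 values;
# the oversizing factors are assumed and combine derating, diversity, and
# redundancy. All quantities are expressed relative to the design IT load.

COMPONENTS = {
    # name: (no_load, proportional, square_law, oversizing, located indoors)
    "UPS":           (0.040, 0.05, 0.000, 1.4, True),
    "PDU":           (0.015, 0.00, 0.015, 1.6, True),
    "Lighting":      (0.010, 0.00, 0.000, 1.0, True),
    "Wiring":        (0.000, 0.00, 0.010, 1.2, True),
    "Switchgear":    (0.000, 0.00, 0.005, 1.2, False),
    "Generator":     (0.003, 0.00, 0.000, 1.2, False),
    "CRAC":          (0.090, 0.00, 0.000, 1.5, True),
    "Humidifier":    (0.010, 0.01, 0.000, 1.5, True),
    "Chiller plant": (0.060, 0.26, 0.000, 1.3, False),
}
COOLING = {"CRAC", "Humidifier", "Chiller plant"}

def ncpi_losses(it_fraction):
    """Total NCPI losses (as a fraction of the design IT load) when the IT load
    equals `it_fraction` of the design value. Single-pass approximation: heat
    from indoor equipment is accumulated before the cooling components
    (listed last) are evaluated."""
    indoor_heat = 0.0
    total = 0.0
    for name, (k0, k1, k2, oversize, indoor) in COMPONENTS.items():
        rating = oversize                      # component rating vs. design IT load
        load = it_fraction / oversize          # fraction of the component's rating
        if name in COOLING:
            load += indoor_heat / oversize     # cooling also removes indoor NCPI heat
        loss = rating * (k0 + k1 * load + k2 * load ** 2)
        total += loss
        if indoor:
            indoor_heat += loss                # this heat burdens the cooling system
    return total

for pct in (10, 30, 50, 100):
    it = pct / 100
    efficiency = it / (it + ncpi_losses(it))
    print(f"IT load at {pct:3d}% of design capacity -> efficiency {efficiency:.0%}")
```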
Figure 3  Efficiency of a typical data center using the improved model
(Vertical axis: electrical efficiency, 0% to 100%. Horizontal axis: fraction of the power capacity used, 0% to 100%.)

Note that this curve of efficiency vs. load is considerably different from estimates based on conventional calculations that utilize published component efficiencies. A conventional estimate of the efficiency of the data center described by Figure 3 would be a value of 60-70%, independent of load. Note the dramatic decrease in data center efficiency predicted by the improved model, particularly at the lower loads where many data centers actually operate. The model shows that for very lightly loaded data centers the inefficiency effects can be dramatic. For example, given a data center loaded to only 10% of its rated capacity, for every 10 watts delivered to the data center only about 1 watt actually reaches the IT equipment. The remaining 9 watts are lost to inefficiencies in the network-critical physical infrastructure.

Another way to look at these losses is in terms of financial cost. Figure 4 shows the annual electricity cost of a 1 MW data center as a function of IT load. This is based on a typical high availability dual power path design with N+1 CRAC units. A fully loaded electricity cost of $0.10 per kW-hr is assumed for this analysis.
Figure 4  Annual electrical cost for a typical 1 MW data center as a function of the fraction of the design capacity that is utilized
(Vertical axis: annual electrical cost, $0 to $2,000,000. Plotted series: total electrical cost, NCPI load electricity, and IT load electricity. Horizontal axis: fraction of the power capacity used, 0% to 100%.)

Figure 4 shows that the total electricity cost of a 1 MW data center ranges from $600,000 to $1,700,000 per year, depending on the size of the IT load. Note that even if there is no IT load, the cost is over $500,000 per year, driven by the inefficiencies of the power and cooling systems. At the 30% capacity utilization level of the typical data center, over 70% of the electricity costs are caused by inefficiencies of the power and cooling equipment.
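To connect these cost figures back to the efficiency model, the short sketch below reworks the 30% utilization case using the $0.10 per kW-hr rate assumed for Figure 4. The 30% efficiency value is taken from the Figure 1 example rather than read off Figure 4, so the result is illustrative, but it reproduces the roughly 70% NCPI share of the electricity cost.

```python
HOURS_PER_YEAR = 8760
RATE = 0.10            # $/kW-hr, the fully loaded rate assumed for Figure 4

it_load_kw = 300       # a 1 MW data center at 30% utilization
efficiency = 0.30      # typical efficiency at 30% load (Figure 1 example)

total_input_kw = it_load_kw / efficiency              # 1,000 kW drawn from the utility
it_cost = it_load_kw * HOURS_PER_YEAR * RATE          # about $263,000 per year
total_cost = total_input_kw * HOURS_PER_YEAR * RATE   # about $876,000 per year
ncpi_cost = total_cost - it_cost                      # about $613,000 per year

print(f"NCPI share of the annual electricity cost: {ncpi_cost / total_cost:.0%}")
```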
Opportunity to Reduce the Losses

The model clearly shows that the primary contributors to data center electrical costs are the no-load losses of the infrastructure components, which exceed the IT load power consumption in typical situations. It is notable that no-load losses are ignored in conventional analysis; indeed, a review of product specifications shows that the important no-load loss specifications for power and cooling devices are not routinely provided by equipment manufacturers.

An analysis of the data quickly identifies and prioritizes the opportunities for reducing losses and improving the operating efficiency of data centers. These opportunities are as follows:

- By far the biggest opportunity for savings is to reduce the oversizing of data centers by using an adaptable modular architecture that allows the power and cooling infrastructure to grow with the load; potential reduction in losses: 50%
- Improve the efficiency of cooling systems; potential reduction in losses: 30%
- Reduce the no-load losses of power and cooling components in the data center; potential reduction in losses: 10%

A more detailed discussion of efficiency entitlements and efficiency improvement opportunities is the subject of APC White Paper #114, Improving the Electrical Efficiency of Data Centers.

Conclusion

Conventional models of data center efficiency typically overstate efficiency because they do not properly account for the degree to which equipment is oversized, nor for the reduction of efficiency at the reduced loads where most data centers operate. An improved model provides more accurate numeric values for data center efficiency, as well as giving insight into where the losses go and how they can be reduced.

Typical data centers draw more than twice as much power as the IT loads require. The cost associated with this power consumption is a considerable fraction of the total cost of ownership of the system. All of the power consumed beyond the needs of the IT equipment is undesirable, and much of this may be avoidable. Oversizing of data centers is the single biggest contributor to data center inefficiency, suggesting that scalable solutions that grow with the IT load offer a major opportunity to reduce electrical waste and costs. The potential electricity cost savings for a typical 1 MW data center are on the order of $2,000,000 to $4,000,000 over the typical 10-year life of the facility.

Due to the large amount of power and money consumed by data center inefficiency, the reduction of these losses should be of interest to all data center owners, as well as an issue of public policy.
About the Author

Neil Rasmussen is a founder and the Chief Technical Officer of American Power Conversion. At APC, Neil directs the world's largest R&D budget devoted to power, cooling, and rack infrastructure for critical networks, with principal product development centers in Massachusetts, Missouri, Denmark, Rhode Island, Taiwan, and Ireland.

Neil is currently leading the effort at APC to develop modular, scalable data center infrastructure solutions and is the principal architect of APC's InfraStruXure system. Prior to founding APC in 1981, Neil received his bachelor's and master's degrees in electrical engineering from MIT, where he did his thesis on the analysis of a 200 MW power supply for a tokamak fusion reactor. From 1979 to 1981 he worked at MIT Lincoln Laboratory on flywheel energy storage systems and solar electric power systems.