Measure Server delta-T using AUDIT-BUDDY
The ideal tool to facilitate data-driven airflow management

Executive Summary: In many of today's data centers, a significant amount of cold air is wasted because it bypasses the server, or because hot air recirculates back into the cold aisle. The extent of this problem can be determined by comparing the delta-T (temperature differential) at the CRAC unit with the delta-T at the server; the larger the difference, the more significant the cooling inefficiency. In the past, measuring the server delta-T has proven difficult. Now users can place two AUDIT-BUDDY systems (one in the hot aisle and one in the cold aisle) to track changes in delta-T at a given server over an extended period of time and assess the degree of recirculation or bypass airflow present. This simple, portable tool yields a metric that indicates cooling efficiency at the rack level.

202 Worcester Street, Unit 5, North Grafton, MA 01536 | www.purkaylabs.com | info@purkaylabs.com | 1.774.261.4444
Introduction
An efficient data center requires an effective cooling and airflow management system. The modern forced-air-cooled data center is designed to maximize the passage of cold air through the server. In reality, a fair amount of cold air bypasses the servers completely, or hot exhaust air recirculates into the cold aisle, leading to poor efficiency and unnecessarily high electricity usage. The problem has become more difficult with the addition of high-energy-consuming blade servers and rapid changes in an active facility.

It is possible to assess the cooling efficiency by comparing the server delta-T with the HVAC delta-T at the CRAC units. In the past, it was difficult to measure the server delta-T. Now, users can place two AUDIT-BUDDY systems, one in the hot aisle and one in the cold aisle, and track the changes in delta-T at a given server over an extended period of time. The information gathered, when compared to the HVAC delta-T, can be used to assess the degree of recirculation or bypass airflow and address the cooling (in)efficiency.

Data Center Cooling
The data center cooling model has changed significantly over the years. Previously, the cooling objective was to cool the whole data center to a uniform temperature, around 68 °F. [1] Now, the aim is to direct cooling to the servers and not overcool portions of the room that don't need to be cooled. [2] The basic premise is that separating the hot and cold aisles directs the cold supply air to the server intakes and allows hot server exhaust to go directly to the cooling unit return air. [3] This allows data centers to maintain the specific inlet temperature designated by the equipment manufacturer's specifications, while keeping cooling costs low.
(See Diagram 1: Underfloor Airflow Distribution.)

Diagram 1: Underfloor Airflow Distribution

[1] Lucian Lipinsky de Orlov, "Lowering Data Center Cooling Costs with Airflow Modeling and Perforated Raised-Floor Tiles," 27 April 2009, at http://searchdatacenter.techtarget.com/news/1354850/lowering-data-center-cooling-costs-with-airflow-modeling-and-perforated-raised-floor-tiles
[2] Arthur Cole and Wally Phelps, "Simple Steps to a Greener Data Center," 25 November 2008, at http://www.itbusinessedge.com/cm/community/features/interviews/blog/simple-steps-to-a-greener-data-center/?cs=23181
[3] Ibid.
Unfortunately, a fair amount of cold air is wasted. The biggest and most common problem is that the cold supply air bypasses the server rack and mixes with the warm return air without cooling the server. An associated problem occurs when the hot air from the server exhaust recirculates back into the cold aisle instead of returning to the CRAC unit. (See Diagram 2: Bypass & Recirculation Airflow.) Both factors contribute significantly to a poor PUE number, and the CRAC units are forced to work harder to compensate for the wasted cold air, raising the cost of cooling.

Diagram 2: Bypass & Recirculation Airflow

The immediate inclination is to add cooling units, or to crank up the CRAC units and overcool the facility, in order to address the temperature variation at the rack. This mistakenly assumes that the server inlet temperature is too hot or too cold because not enough cold air is being supplied or because the air is not cold enough. Instead, the fundamental issue is that the cold air is not reaching the server efficiently and is being lost through bypass or recirculation airflow. [4] Data Center Managers must improve their airflow management in order to improve cooling conditions. This means identifying and eliminating the airflow waste to ensure that the majority of the cold air supplied to the data center produces effective cooling at the server. [5]

[4] Vali Sorell, "Airflow Management Strategies for Efficient Data Center Cooling," March 2009, at http://searchdatacenter.techtarget.com/tip/air-flow-management-strategies-for-efficient-data-center-cooling
[5] Vali Sorell, "Airflow Management Strategies for Efficient Data Center Cooling," March 2009, at http://searchdatacenter.techtarget.com/tip/air-flow-management-strategies-for-efficient-data-center-cooling
Determining Airflow Waste & Cooling Efficiency through the delta-T ratio
The Data Center Manager must first measure the temperatures at both the CRAC unit and the server. Data Center Managers can then quantify the amount of airflow waste by comparing the server delta-T (at the server rack) with the HVAC delta-T (at the CRAC unit). The server delta-T (ΔT_Server) is the difference between the inlet and outlet temperatures of a server. The HVAC delta-T (ΔT_HVAC) is the difference between the supply air and return air temperatures.

If ΔT_HVAC is the same as or similar to ΔT_Server, then cooling is efficient. This means that there is little to no change in temperatures between the CRAC unit and the server, and all or most of the cold air supply is reaching the server.

If ΔT_HVAC is greater than ΔT_Server, then there is bypass airflow. This means that some of the cold air is returning to the CRAC unit without cooling the server. The greater the difference, the more cold air is bypassing the server, and the more the data center is being overcooled unnecessarily.

If ΔT_HVAC is less than ΔT_Server, then there is recirculation airflow. This means that hot air from the server exhaust recirculates back into the cold aisle instead of returning to the CRAC unit. The greater the difference, the more likely it is that not enough HVAC airflow is reaching the server, and the higher the chance of hot spots.

| Cooling Pattern | Cooling Condition | Result |
|---|---|---|
| ΔT_HVAC = ΔT_Server | Efficient Cooling | Cold air reaches server with no or minimal waste |
| ΔT_HVAC > ΔT_Server | Bypass Airflow | Cold air not reaching server; overcooling |
| ΔT_HVAC < ΔT_Server | Recirculation Airflow | Server exhaust mixing with cold air |

A simple metric to characterize cooling efficiency is the ratio of ΔT_Server to ΔT_HVAC, or the delta-T ratio (ΔT_RATIO).
The ΔT_RATIO is defined as:

$$\Delta T_{\mathrm{RATIO}} = \frac{\Delta T_{\mathrm{Server}}}{\Delta T_{\mathrm{HVAC}}} = \frac{T_{\mathrm{Outlet}} - T_{\mathrm{Inlet}}}{T_{\mathrm{Return}} - T_{\mathrm{Supply}}}$$

With this formula, Data Center Managers can determine the extent of bypass or recirculation airflow.

Table 1: delta-T Ratio (ΔT_RATIO) for Rack 20010

| | Below 0.9 | 0.9 to 1.2 | Above 1.2 |
|---|---|---|---|
| Implication | Bypass dominates: low entering temperatures, excessive HVAC usage | Normal operating conditions | Recirculation dominates: too little HVAC airflow, high entering temperatures |
| Corrective Action | Better containment required for safe operation of servers | Strive for 1.0 | Consider blanking panels and other corrective actions |
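The ratio and the bands of Table 1 can be sketched in a few lines of Python. This is only an illustration, not part of the AUDIT-BUDDY software: the function names are hypothetical, the sample temperatures are made up, and treating both 0.9 and 1.2 as belonging to the normal band is an assumption, since the table does not say how boundary values are handled.

```python
def delta_t_ratio(inlet, outlet, supply, ret):
    """Return (outlet - inlet) / (return - supply); all inputs in the same unit."""
    dt_server = outlet - inlet
    dt_hvac = ret - supply
    if dt_hvac == 0:
        raise ValueError("HVAC delta-T is zero; the ratio is undefined")
    return dt_server / dt_hvac


def classify(ratio):
    """Map a ratio to the three bands of Table 1.

    Assumption: the boundary values 0.9 and 1.2 count as normal.
    """
    if ratio < 0.9:
        return "bypass dominates"
    if ratio <= 1.2:
        return "normal"
    return "recirculation dominates"


# Illustrative readings in degrees F (made up, not measured data).
ratio = delta_t_ratio(inlet=68.0, outlet=86.0, supply=60.0, ret=84.0)
print(round(ratio, 2), classify(ratio))
```

With these sample values the server delta-T is 18 °F against an HVAC delta-T of 24 °F, so the ratio falls in the bypass band.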
The Problem of Measuring ΔT_Server
While ΔT_RATIO is a simple and effective way to gauge cooling efficiency, measuring it is not common practice in the modern data center. The reason is that it is far easier to measure ΔT_HVAC at the CRAC unit than to measure ΔT_Server at the server rack. CRAC units are supplied with temperature sensors and controls at the return and supply. Busy Data Center Managers can easily make adjustments and supply colder air to the racks without any extra measurements. But, as mentioned previously, colder air does not compensate for cooling inefficiencies due to bypass or recirculation airflow. If anything, overcooling gives a false sense of reliability while leaving the server open to hot spots and failures.

Measurement of ΔT_Server is extremely important, but difficult. The temperature at the various server, rack, and aisle locations will vary greatly at any given point. Data Center Managers cannot just hold a thermometer at the inlet and outlet of the server and get an accurate measurement of ΔT_Server. They have to take three factors into account:

(1) Rack Height: Every server rack contains multiple servers, and the temperature will vary greatly across the server profile. One simply has to look at a CFD model or thermal contour map to see that the inlet temperature at the front of the rack varies from top to bottom. One must measure delta-T at different heights to account for variance across the server profile.

(2) Aisle Composition: In the same vein as the individual server rack, there will be temperature variation from rack to rack across an aisle. The inlet temperature at one end of an aisle will not necessarily be the same as the inlet temperature at the other end. This is further complicated by the mix of old servers and high-energy blade servers. One must measure delta-T at different racks to account for variations across an aisle.

(3) Time: The IT load varies throughout the day in an active data center. The IT load in a financial services firm may be much higher during trading hours than at night. As a result, the delta-T will change over time in a dynamic environment due to server loading, cooling algorithms, and airflow variance. One must measure delta-T over a long period of time (24 hours) to account for load variations.

Traditional methods of measurement such as BAS or DCIM are inadequate for measuring ΔT_Server because none of them monitor the inlet and outlet at multiple rack heights simultaneously or measure over a long period of time. Handheld devices are good for spot measurements only and completely miss the time variance. IR guns are simply not accurate enough. [6]

[6] For more information, please see Purkay Labs, "Introducing AUDIT-BUDDY: Monitoring Temperature and Humidity for Greater Data Center Efficiency," at http://www.purkaylabs.com/pdfs/audit_white_paper.pdf
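To make the height and time factors concrete, the sketch below summarizes delta-T per rack height over a series of readings. It is a minimal illustration under stated assumptions: the function name, the height labels, and the sample values are hypothetical, and the inlet and outlet readings at each height are assumed to have been taken at the same instants.

```python
from statistics import mean


def summarize_delta_t(inlet_by_height, outlet_by_height):
    """For each height, return the mean, min, and max of (outlet - inlet) over time.

    Both arguments map a height label to a list of readings taken at the
    same instants, so readings are paired index by index.
    """
    summary = {}
    for height, inlet in inlet_by_height.items():
        outlet = outlet_by_height[height]
        if len(inlet) != len(outlet):
            raise ValueError(f"unequal sample counts at {height}")
        deltas = [o - i for i, o in zip(inlet, outlet)]
        summary[height] = {
            "mean": mean(deltas),
            "min": min(deltas),
            "max": max(deltas),
        }
    return summary


# Made-up readings in degrees F at three heights, three samples each.
inlet = {
    "top": [72.0, 73.0, 74.0],
    "middle": [68.0, 68.0, 69.0],
    "bottom": [65.0, 65.0, 66.0],
}
outlet = {
    "top": [92.0, 95.0, 98.0],
    "middle": [88.0, 89.0, 91.0],
    "bottom": [83.0, 84.0, 86.0],
}
for height, stats in summarize_delta_t(inlet, outlet).items():
    print(height, stats)
```

A wide gap between the minimum and maximum at one height points to time variation, while differing means between heights point to variation across the rack profile.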
Given these challenges (rack height, aisle composition, and time), Data Center Managers have mostly chosen to ignore or approximate server delta-T values. The only practical solution is to use two AUDIT-BUDDY systems to measure the inlet and outlet air temperature at the server.

AUDIT-BUDDY is a stand-alone temperature and humidity monitor that measures true air quality in data centers. Each system consists of three temperature and humidity (TH1) Modules and an adjustable carbon fiber rod. The design permits facility managers to measure inlet air temperature at three different heights (up to 84 inches, or 48U). AUDIT-BUDDY is battery powered and weighs 5.5 lbs, so it can measure multiple servers and then be moved across an aisle to measure multiple racks. The weighted triangular base allows the AUDIT-BUDDY to be placed as close as 1 inch from the server rack, allowing the patent-pending fan design to draw in air and quickly adjust to the thermal ambient.

Measuring ΔT_Server with AUDIT-BUDDY
To measure delta-T across a rack, place one AUDIT-BUDDY on the cold aisle side and the other on the hot aisle side (see Figure 1). The TH1 Modules may be placed at any height on the adjustable carbon fiber rod. This allows three different servers on the rack to be measured at the same time. The TH1 Modules should be placed as shown, as close to the server inlet and exhaust as possible. Figure 2 shows a close-up of the measurement of a particular server. The design of the AUDIT-BUDDY system allows the TH1 Module to be as close as 1 inch from the server.

Figure 1: Measuring ΔT_Server with AUDIT-BUDDY
Figure 2: Close-up of Measurement of One Server

Once the Modules are positioned, both TH1 Modules should be set to collect data using LongScan mode with a sample rate of 1 minute. The sampling duration depends on the test interval. For example, to track delta-T performance through a 9-hour work day, set the collection period to 12 hours or longer. After data collection is complete, the data is transferred from the TH1 Modules to Excel on a PC or Mac using a USB stick.

Analysis of Data
Purkay Labs offers the delta-T Excel Macro along with the delta-T package. The Macro runs on PC or Mac computers running Excel 2010 or higher. To start the Macro, click the Start button.

Figure 3: Start Button for delta-T Macro

Once the Macro is enabled, the program asks the user to identify the data collected by the two TH1 Modules. The data is stored on the USB stick in a special compressed format, in files with the .pkl extension. Each file is named automatically in PMDDHHMM format, based on when data collection was started. For example, if measurement for the inlet TH1 Module in the middle position was started on July 12 at 7:45 PM, the file name will be M7121945.PKL.
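The naming scheme can be decoded with a short parser. The sketch below is an illustration only, not Purkay Labs code: the `ScanStart` type and `parse_scan_name` function are hypothetical, the first character is taken to be a position code based on the single example above ("M" for middle), and only single-digit months are handled because the text does not say how October through December are encoded.

```python
from dataclasses import dataclass


@dataclass
class ScanStart:
    position: str  # first character of the name, "M" in the example
    month: int
    day: int
    hour: int      # 24-hour clock
    minute: int


def parse_scan_name(filename):
    """Parse a PMDDHHMM.PKL file name such as 'M7121945.PKL'."""
    stem, dot, ext = filename.rpartition(".")
    if not dot or ext.upper() != "PKL" or len(stem) != 8:
        raise ValueError(f"not a PMDDHHMM.PKL name: {filename!r}")
    position, rest = stem[0], stem[1:]
    if not (rest.isascii() and rest.isdigit()):
        # Assumption: months 10 to 12 may use a non-digit code that is
        # not documented in the text, so such names are rejected here.
        raise ValueError(f"unsupported date/time field in {filename!r}")
    month, day = int(rest[0]), int(rest[1:3])
    hour, minute = int(rest[3:5]), int(rest[5:7])
    if not (1 <= month <= 9 and 1 <= day <= 31 and hour < 24 and minute < 60):
        raise ValueError(f"date/time out of range in {filename!r}")
    return ScanStart(position, month, day, hour, minute)


print(parse_scan_name("M7121945.PKL"))
```

For the example name, the parser yields position "M", month 7, day 12, hour 19, and minute 45, which matches July 12 at 7:45 PM.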
The program will ask the user to identify both files. It does not matter which file is identified first; the program automatically determines which file represents the inlet readings and which the outlet readings. The program also lines up the measurements even if the two TH1 Modules were started at slightly different times. The ΔT_Server information is calculated and presented automatically. As an option, the program allows the user to enter the Operator and Specific Server Name before saving the information to a .csv file.

Figure 4: Delta-T Macro Output
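The Macro's internals are not described, but the alignment step can be sketched as follows. This is one possible approach under stated assumptions, not the Macro's actual method: readings are paired by timestamp truncated to the minute (matching the 1-minute sample rate), and the log with the lower average temperature is assumed to be the inlet. All names and sample values are hypothetical, including the year in the timestamps.

```python
from datetime import datetime, timedelta
from statistics import mean


def align(log_a, log_b):
    """Pair readings from two logs by timestamp truncated to the minute.

    Each log is a list of (datetime, temperature) tuples.
    """
    def by_minute(log):
        return {ts.replace(second=0, microsecond=0): temp for ts, temp in log}

    a, b = by_minute(log_a), by_minute(log_b)
    common = sorted(a.keys() & b.keys())
    return [(ts, a[ts], b[ts]) for ts in common]


def server_delta_t(log_a, log_b):
    """Return [(timestamp, delta_t)], treating the cooler log as the inlet."""
    rows = align(log_a, log_b)
    if not rows:
        raise ValueError("logs do not overlap in time")
    a_is_inlet = mean(r[1] for r in rows) <= mean(r[2] for r in rows)
    return [(ts, (b - a) if a_is_inlet else (a - b)) for ts, a, b in rows]


# Made-up logs: the inlet unit starts 80 seconds after the outlet unit.
start = datetime(2014, 7, 12, 19, 45, 10)
outlet_log = [(start + timedelta(minutes=i), 90.0 + i) for i in range(4)]
inlet_log = [(start + timedelta(minutes=i + 1, seconds=20), 70.0) for i in range(4)]
for ts, dt in server_delta_t(outlet_log, inlet_log):
    print(ts.strftime("%H:%M"), dt)
```

Only the three minutes covered by both logs produce a delta-T value, and the result is the same whichever log is passed first.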
Figure 5: Delta-T, Inlet, and Outlet Plot over Time

The Macro Utility also allows the user to generate a plot of ΔT_Server over time (Figure 5) and export it to a .gif or .png file. The trend plot makes it possible to correlate wide variations in ΔT_Server with specific server loads. This information is useful either for tuning the server load via virtualization or for changing the cooling strategy around that particular server. For example, a higher ΔT_Server at certain parts of the day may indicate that the airflow is not uniform when the server is carrying a much higher IT load. This would suggest that corrective action may be necessary for that specific server.

The Macro Utility can also integrate the data from six Modules into one report that summarizes the bypass or recirculation airflow present in the rack by calculating an aggregate ΔT_RATIO for the rack where the measurements were taken.

Towards a More Energy Efficient Data Center
The AUDIT-BUDDY Delta-T Package gives the Data Center Manager an economical tool to measure ΔT_Server in a facility in a simple, inexpensive manner. One can get the ΔT_Server of three servers averaged over a period of operation without requiring any additional infrastructure, permanent or temporary installs in the cabinet, or learning a new software tool. Once the average ΔT_Server is gathered, one can get the ΔT_RATIO for the data center by comparing it with ΔT_HVAC. The Manager now has an idea of how efficient the cooling airflow is, the degree of overcooling present, and what corrective actions are required. With the metric in hand, the Manager can decide whether to raise the operating temperature of the facility, shut down some of the extra CRAC units, or do some simple airflow management with tiles. One can measure ΔT_RATIO again after the changes have been made to decide whether they were sufficient or whether there are more efficiency opportunities to be realized.

Given that almost every data center is overcooled because of the factors cited above, the AUDIT-BUDDY Delta-T Package gives the Data Center Manager the metric with which to implement a clear airflow management strategy and reduce operational expense without affecting the reliability of the data center. This low-hanging fruit can be realized quite easily using the AUDIT-BUDDY Delta-T Package, while reducing the carbon footprint of the data center.

Summary
AUDIT-BUDDY offers an economical way to measure ΔT_Server throughout the data center, taking into account variations in server load throughout the day. It is accurate, measures true air temperature, and, best of all, is portable. It may be moved from rack to rack as the dynamics of the data center change. Powered by 3 AA alkaline batteries, AUDIT-BUDDY can be installed in minutes and requires little back-end management. It can be used on demand and stored when not needed. The system imposes no demands on the existing infrastructure and may be deployed throughout the data center floor as required. This inexpensive tool allows the Facility Engineer to manage the data center more efficiently and reduce operational expense by establishing better cooling strategies without putting the servers at risk.

About Purkay Labs
Founded in 2012, Purkay Labs specializes in temperature and humidity monitoring devices. Our flagship product, AUDIT-BUDDY, is a revolutionary standalone monitor that measures free air in the white space. Use this inexpensive tool to spot-check air quality, measure zones not covered by the BAS, prove SLA compliance, and generate real-time CFD. Make data-driven decisions with AUDIT-BUDDY to control energy costs and increase operational efficiency. It is ideal for telecom closets, server rooms, colo facilities, and mission-critical facilities. AUDIT-BUDDY is designed with simplicity and portability in mind. We proudly provide personalized and knowledgeable customer service for our products. Additional information is available at www.purkaylabs.com or by email at info@purkaylabs.com.