Airflow Simulation Solves Data Centre Cooling Problem

The owner's initial design for a data centre in China called for 40 equipment racks filled with blade servers arranged in three rows along the length of a 47-foot by 32-foot room. The owner contacted American Power Conversion (APC), West Kingston, Rhode Island, for a quote on computer room air conditioners (CRACs) to cool the data centre. Because the layout packed a large number of servers into a relatively small space, APC recommended that, instead of using a raised floor to distribute cooling air, the CRACs be located within the rows of equipment racks to improve cooling efficiency. While preparing the quote, APC engineers simulated the initial design and discovered that the failure of a single CRAC would cause temperatures to rise above 90°F, close to the point at which equipment begins shutting down. APC engineers then simulated a number of alternative designs and determined that adding one more row of equipment and one more CRAC would keep the room at safe temperatures even after the failure of a CRAC.
The owner recognized from the beginning the potential for problems in cooling the new data centre. The data centre made extensive use of blade servers, which greatly increase the amount of computing power that can be packed into a given space but at the same time generate much more heat than traditional servers. A standard server cabinet dissipates on the order of 2 to 3 kilowatts, while vendors are now designing blade servers that can demand over 20 kW of cooling per rack. The initial design used the traditional raised-floor approach for distributing cooling air. The drawback of this approach is that the sources of cooling air are located far from the equipment that needs to be cooled. This creates inevitable inefficiencies in moving the air to where it is needed and raises the possibility that significant amounts of cool air will never reach the servers that require it. The simplest way to address this challenge is to add more air-conditioning capacity, but this is expensive and usually does not solve the problem, which is one of air distribution.

Improving on the initial design

Ben Steinberg, senior applications engineer for APC, who was assigned to create the proposal, felt that the initial design could be improved by eliminating the raised floor and deploying CRACs within the rows of equipment. The advantage of this approach is that it can deliver the cooling air to where it is needed with much smaller losses. Steinberg used hand calculations to determine that one of the company's NetworkAIR IR 40 kW in-row precision air conditioning units positioned in each row should be able to handle the data centre's cooling requirements. But Steinberg was far from done. In this type of application, the biggest challenge is usually ensuring that the data centre will continue to operate despite losing an air conditioning unit.
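A capacity hand calculation of the kind described above can be sketched as follows. The per-rack load here is an assumed, illustrative figure; the article does not give the data centre's actual heat load.

```python
import math

# Hypothetical sizing sketch: total heat load vs. in-row CRAC capacity.
RACKS = 40
AVG_RACK_LOAD_KW = 2.5     # assumed average; dense blade racks can exceed 20 kW
CRAC_CAPACITY_KW = 40.0    # rating of one NetworkAIR IR in-row unit

total_load_kw = RACKS * AVG_RACK_LOAD_KW                       # 100.0 kW
units_for_load = math.ceil(total_load_kw / CRAC_CAPACITY_KW)   # 3 units to carry the load
units_n_plus_1 = units_for_load + 1                            # 4 units for N+1 redundancy

print(total_load_kw, units_for_load, units_n_plus_1)           # 100.0 3 4
```

Under these assumed numbers, three units carry the load but a fourth is needed to survive a single failure, which is the shape of the problem the CFD study went on to expose.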
"It's almost impossible to determine if redundancy truly exists just by looking at the design, so I made the decision to use computational fluid dynamics (CFD) to simulate the heat generation, airflow, and heat removal in the room," Steinberg said. CFD can calculate and graphically illustrate complete airflow patterns, including velocities and distributions of variables such as pressure and temperature. As part of the analysis, the user can change the layout of the building or the operating conditions, observe the effect of the changes on the airflow patterns and temperature distribution, and see how they impact cooling performance. Engineers are thus able to quickly evaluate the performance of alternative equipment configurations. APC uses FloVENT CFD software from Mentor Graphics to analyze and optimize data centre cooling configurations. "FloVENT is designed specifically for modeling heating and cooling applications, so it is both easier to use and more powerful than general-purpose CFD codes when evaluating data centre cooling," Steinberg said. "FloVENT also has a team of support engineers that provide excellent support because they have a very good understanding of data centre cooling issues."

Simulating cooling performance of the data centre

Steinberg worked from the computer-aided design drawing of the data centre provided by the owner. His basic approach was to configure the four aisles created by the three rows of equipment as alternating hot and cold aisles. Starting from one of the 47-foot walls, he positioned the servers and CRACs to make the successive aisles cold, hot, cold, and hot. Steinberg positioned the rack-mounted servers so that the rear of the servers blows hot air into the hot aisles while the front of the servers draws in cool air from the cold aisles. He put one CRAC in each row, positioned so that it draws in hot air from a hot aisle and emits cool air to a cold
aisle. This approach optimizes cooling efficiency by minimizing the ability of the hot air to mix with the cold.

Figure: The original design provided by the customer.

He constructed a box representing the room and created cubes representing each rack of equipment. The owner provided the make and model number of each server, and Steinberg obtained technical specifications from the manufacturer's web site to determine their power consumption and airflow. He modeled the cooling units by entering their airflow rate and inlet and outlet temperatures. Steinberg then ran a steady-state analysis of the data centre with all CRAC units operating. The CFD simulation provided the temperatures, airflows, and pressures throughout the room. This information not only made it possible to determine the cooling performance of the design but also helped explain the reasons behind that performance. As expected, the analysis showed that the CRAC units were easily able to cool the data centre. Steinberg then moved on to the more challenging part of the analysis: he removed the CRACs from the model one at a time and reran the simulation. The results showed that the design maintained acceptable temperatures with the CRAC in row 1 or row 2 out of operation, because whichever unit remained in operation was able to keep hot aisle 2 cool. On the other hand, when the CRAC in row 3 was removed from the simulation, temperatures quickly reached the unacceptable 90°F level, primarily because no CRAC unit remained to remove heat from hot aisle 4.
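The single-failure sweep can be mirrored with a toy coverage check. This is simple bookkeeping, not CFD; the row-to-aisle mapping is inferred from the layout described above (rows 1 and 2 flank hot aisle 2, and row 3 is the only row serving hot aisle 4).

```python
# Which hot aisle each row's CRAC draws from, per the three-row layout above.
crac_hot_aisle = {"row1": "hot aisle 2", "row2": "hot aisle 2", "row3": "hot aisle 4"}

def uncooled_after(failed_crac):
    """Hot aisles left with no operating CRAC after one unit fails."""
    served = {aisle for crac, aisle in crac_hot_aisle.items() if crac != failed_crac}
    return set(crac_hot_aisle.values()) - served

for crac in crac_hot_aisle:
    print(crac, "fails ->", uncooled_after(crac) or "still covered")
# row1 and row2 failures leave every hot aisle covered;
# a row3 failure leaves hot aisle 4 with no CRAC at all.
```

The bookkeeping reproduces the qualitative CFD finding: the layout is N+1 with respect to hot aisle 2 but has a single point of failure at hot aisle 4.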
Developing a new design that solved the problem

Figure: A new design developed by APC.

Steinberg went back to the customer and showed them the problem with the original design. The customer was very happy that APC had gone beyond simply responding to the request for proposal and had used simulation to evaluate whether the proposed solution would actually provide reliable computing performance. The data centre owner's executives stated that the facility's critical nature made it essential to maintain redundancy. Steinberg suggested adding a fourth row of equipment with a fourth CRAC unit. He pointed out that the fourth row would give each hot aisle one CRAC unit on each side, providing redundancy in case one unit failed. He also noted that the fourth row would make it possible to expand the data centre in the future without changing the cooling configuration. He further recommended the use of blanking panels over unused vertical space in the rack enclosures to prevent hot server exhaust from taking a shortcut back to the equipment intakes.
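The redundancy argument for the fourth row can be sketched with the same kind of simple bookkeeping (not CFD; the row-to-aisle mapping is assumed from the layout described above): with two rows flanking each hot aisle, any single CRAC failure still leaves every hot aisle served.

```python
# Four-row layout: each hot aisle now has a CRAC-bearing row on each side.
crac_hot_aisle = {"row1": "hot aisle 2", "row2": "hot aisle 2",
                  "row3": "hot aisle 4", "row4": "hot aisle 4"}

def survives_any_single_failure(mapping):
    """True if every hot aisle keeps at least one CRAC under any one-unit failure."""
    aisles = set(mapping.values())
    for failed in mapping:
        served = {aisle for crac, aisle in mapping.items() if crac != failed}
        if aisles - served:
            return False
    return True

print(survives_any_single_failure(crac_hot_aisle))  # True
```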
To confirm this recommendation, Steinberg modified the FloVENT model to add the fourth row of equipment and the fourth CRAC. He simulated the new design with all CRAC units operating, and then again with each of the CRAC units removed in turn. The results showed that regardless of which CRAC was removed, temperatures remained at safe levels of between 75°F and 80°F throughout the data centre. Steinberg showed these results to the owner, who decided to go with the new design and purchase the four CRACs from APC. "It would have been extremely costly to install the equipment in the data centre, run tests, and then discover that it would not properly cool the servers in the event of an equipment failure," Steinberg said. "But it would have been far more costly for the customer to have its data centre go down because of cooling problems. This helps explain why we run simulations for many of our proposals. Simulation provides a fast, relatively inexpensive, and accurate method of evaluating data centre cooling performance without the time and expense required to actually install and test the equipment."
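The way a cooling unit is specified by airflow rate and inlet/outlet temperatures, as in the model above, rests on the air-side energy balance Q = ṁ·cp·ΔT. A minimal sketch, with an assumed airflow figure rather than the NetworkAIR IR's published rating:

```python
# Air-side energy balance: heat removed = mass flow * specific heat * temp change.
RHO_AIR = 1.2       # kg/m^3, air density near room conditions
CP_AIR = 1005.0     # J/(kg*K), specific heat of air

def delta_t_kelvin(heat_kw, airflow_m3_s):
    """Temperature change across a unit removing heat_kw at the given airflow."""
    m_dot = RHO_AIR * airflow_m3_s          # kg/s
    return heat_kw * 1000.0 / (m_dot * CP_AIR)

dt = delta_t_kelvin(40.0, 2.0)   # a 40 kW unit at an assumed 2 m^3/s of air
print(round(dt, 1))              # about 16.6 K across the coil
```

Given any two of heat load, airflow, and temperature difference, the third follows, which is why those are the quantities entered for each CRAC in the model.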