THE INFORMATION SOURCE FOR THE DATA CENTER INDUSTRY Data Center Knowledge Guide to Optimization in the Virtual Data Center Realizing the full potential of a mission critical investment March 2013 Brought to you by
Executive Summary

The modern data center faces challenges from dynamic technology lifecycles and a demanding business environment. Evolving from server closets to multi-million dollar facilities, the modern data center incorporates a number of innovations that contribute to its effective and efficient operation. Facility and IT infrastructure must be agile and keep pace with the business, while maintaining mission-critical distinction. To foster an agile, proactive approach towards facility and IT operations, data center owners and operators must adopt suitable engineering tools and techniques. Monitoring, management and simulation techniques help to:

- understand the cadence of business.
- function as a link to illustrate the efficiency of the data center investment.
- identify, assess and determine the risk of vulnerabilities.
- develop business and technical planning scenarios to deal with risk and uncertainty.
- identify and mitigate lost capacity.

Using the right tools and techniques for the right job equates to monitoring, simulation and DCIM (Data Center Infrastructure Management) working in harmony to protect and sustain the data center investment. Using simulation techniques in a data center, before or during production, can identify conditions that lead to unnecessary costs, lost capacity and service interruptions. This paper will explore the value that monitoring, simulation and DCIM offer, and the additional value of integrating these engineering tools and techniques. It explores the importance of promoting an environment where IT and Facilities departments present a unified approach towards the data center. The engineering tools available to data center managers also present the risk of inaccurate interpretations and unreliable forecasting.
Finally, this paper explores the fallacy of perceived capacity, the complexities of capacity management, and how an integrated simulation platform can act as a catalyst for regaining lost capacity in the data center.

Contents

Engineering Tools
Simulation
DCIM
Integration
Risk Management
The Dangers of Extrapolation
Predictive Analytics
The Virtual Facility
Capacity Management
Investment Efficiency
Summary
Engineering Tools

A data center monitoring system is a fundamental instrument used by IT and Facilities alike to help avoid and recover from operational outages. It is put in place to provide real-time data on the operational conditions that impact the ability to do business. A growing number of elements are being examined in the data center, from the chip level of individual components, through data center infrastructure components, all the way to factors external to the data center that have the potential to impact facilities. Real-time data is gathered from sensors observing facility conditions such as temperature, humidity, water, power and intrusion detection, and from IT monitoring observing equipment availability, network parameters and a variety of services. Engineering tools for monitoring range from SNMP (Simple Network Management Protocol) polling of small components or software services to complete building management systems. The diversity and range of things that are now monitored have fueled innovation in how they are monitored. Monitoring methods range from wireless solutions to new software standards, and have added asset management with RFID tags. Monitoring is an important tool for aggregating and reviewing real-time data about the enterprise, and responding accordingly.

Simulation

Another engineering tool in the data center is simulation. Simulation is a powerful technique that combines aspects of data center design, management, strategy and optimization. It enables a great deal of insight into design efficiencies, operational simulations such as airflow, power and other efficiency calculations, and overall what-if scenario planning. It is a vital practice for the data center, where multiple complex processes and interdependent systems are at work. Insights derived from simulations shed light on design flaws, and let the operator visualize the facility in three dimensions.
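As a concrete illustration, threshold alerting over real-time sensor readings can be sketched in a few lines of Python. This is a hypothetical sketch, not any specific monitoring product's API; the sensor names and limits are invented (the temperature band loosely follows the ASHRAE-recommended inlet range):

```python
# Minimal sketch of threshold-based alerting over real-time sensor data.
# Sensor names and limits are illustrative, not from any specific product.

THRESHOLDS = {
    "temperature_c": (18.0, 27.0),   # ASHRAE-style recommended inlet range
    "humidity_pct":  (20.0, 80.0),
}

def check_readings(readings):
    """Return a list of alert strings for readings outside their limits."""
    alerts = []
    for sensor, value in readings.items():
        lo, hi = THRESHOLDS.get(sensor, (float("-inf"), float("inf")))
        if not (lo <= value <= hi):
            alerts.append(f"{sensor}={value} outside [{lo}, {hi}]")
    return alerts

print(check_readings({"temperature_c": 31.2, "humidity_pct": 45.0}))
```

A production system would feed `check_readings` from SNMP polls or building management system feeds rather than hand-built dictionaries, but the aggregate-then-alert pattern is the same.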
Modern simulation techniques can also aggregate the wealth of data being captured throughout the data center, presenting a real-time view into operational parameters, conditions and alerts. Simulation can digest a variety of inputs from monitoring, DCIM, CAD drawings and other resources, and then provide rich output in the form of visualizations and predictive analytics. A fully engaged simulation analysis will empower a real-time, three-dimensional capacity model of the data center.

The element of data center simulation most often thought of is CFD (Computational Fluid Dynamics). CFD has been used for many years and in many industries to simulate things like racecars, aircraft and most manufactured goods. It provides a scientific and comprehensive design approach to data center cooling management and helps visualize current conditions and analyze proposed modifications. Cooling is a primary aspect of both designing and operating a data center, and CFD serves design and operations in exploring cooling path management, identifying and assessing the impact of various risks, and (virtually) demonstrating the effect a change will have without making it in production. CFD visualizations can analyze model variants and the impact of configuration changes in the data center. This allows for validation in the virtual facility before implementation in the real facility. In existing facilities, as long as the model is built accurately and calibrated, operators can review current conditions, troubleshoot problem areas, or evaluate the potential impact of design upgrades.
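The numerical idea underneath CFD can be hinted at with a toy example. Real data center CFD solves three-dimensional airflow and heat transport; the sketch below only shows the kind of iterative finite-difference relaxation such solvers use, applied to a one-dimensional steady-state temperature profile between an assumed cold supply and hot exhaust (all values invented):

```python
# Toy illustration of the finite-difference relaxation at the heart of
# CFD-style solvers: steady-state 1D heat diffusion between an assumed
# 18 degC cold supply and 40 degC hot exhaust. Real data center CFD solves
# 3D airflow and heat transport; this only shows the numerical idea.

def solve_steady_state(n=11, t_cold=18.0, t_hot=40.0, iters=5000):
    # Fixed boundary temperatures; interior starts at the cold value.
    t = [t_cold] * (n - 1) + [t_hot]
    for _ in range(iters):
        # Jacobi update: each interior point relaxes toward the mean of
        # its two neighbors until the profile stops changing.
        t = [t[0]] + [(t[i - 1] + t[i + 1]) / 2 for i in range(1, n - 1)] + [t[-1]]
    return t

profile = solve_steady_state()
print([round(x, 1) for x in profile])   # converges to a linear gradient
```

The same relax-until-stable principle, scaled to millions of 3D cells with momentum and turbulence equations, is what lets a CFD model predict rack inlet temperatures before a change is made in production.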
DCIM

DCIM (Data Center Infrastructure Management) is the integration of IT and facility disciplines to centralize monitoring, management, and intelligent capacity planning of critical systems in a data center. It is viewed as a convergence across logical and physical layers within the data center. DCIM solutions are implemented to address the growing complexity in IT systems, improve management and capacity planning, improve visibility, improve availability, and balance supply with demand. Having grown rapidly in the past several years, DCIM adoption is sure to expand, as infrastructure visibility, energy consumption tracking and many other capabilities demonstrate its strategic value.

[Diagram: DCIM integration. IT and Facility disciplines converge in DCIM, centralizing monitoring, management, and intelligent capacity planning of critical systems in the data center.]

Integration

The large investment made in a mission critical facility warrants a commensurate investment in the tools selected to develop and operate it. These tools will shape the success of implemented data center designs, and serve as the operational backbone. Monitoring provides the essential visibility into operations, DCIM provides an even more holistic view, and simulation has the ability to bring it all together, complementing and integrating them into an essential approach to managing the virtual facility. The concept of a virtual facility, as pioneered by Future Facilities, is a full three-dimensional mathematical representation of the physical and logical attributes of a data center. This comprehensive visualization captures all simulation data from past, present and future states. The virtual facility captures all facility and IT attributes and monitoring points, and then produces predictive analytics, allowing the operator to visualize and solve capacity problems before committing to change.
Every characteristic of a data center plays a part in the greater data center production, and as such a complete simulation includes building, room, rack and equipment level views for architectural geometry, cooling and ventilation, power and IT configurations. Future Facilities' 6Sigma suite of design and management software incorporates data input from 2D and 3D CAD drawings, as well as integrating with asset management tools, monitoring and DCIM software. Models are built with the purpose of evaluating the risks and benefits of a data center design in a virtual environment. With 6Sigma software the data center can not only be built easily, but also continuously modified as changes are made, and operated with live data fed into the model. In addition to CFD simulations, the virtual facility portrays the comprehensive data center environment including assets, network distribution, cable management and power feeds, among others.
Risk Management

Data center owners and operators face a tremendous amount of risk, in all tangible and intangible aspects of managing an effective facility. Redundant, highly available infrastructure is engineered, and processes are carefully mapped and executed to ensure maximum availability. Relying on assumptions and spreadsheets to mitigate risk and manage capacity throughout the life of the data center, while certainly an economical route, will surely lead to inescapable failure. A principle of risk management is to be dynamic, iterative and responsive to change. Accomplishing this in the complex ecosystem of a data center is challenging, but it is achievable when armed with tools to respond and techniques that enable predictions. Integrating monitoring and DCIM solutions with simulation allows for properly identifying and characterizing risks and assessing vulnerabilities, and empowers both IT and Facilities to respond dynamically. Data center monitoring and DCIM solutions generate valuable data about assets for IT and data center infrastructure. Utilized properly, monitoring presents the opportunity to identify issues as they occur in production and react accordingly. Independently it is an important tool for managing operations and awareness; when integrated with DCIM and simulation, however, it packs a powerful, visual punch.

The Dangers of Extrapolation

The speculative nature of data center operators, coupled with aggregated, real-time data about a facility, can lead to false perceptions about future trends. Engineering schools now teach courses in simulation that instruct students both in proper use and in how to interpret results. Students are also taught that extrapolation is unreliable: past trends do not guarantee future behavior.
When a person views monitoring data and identifies patterns, it is human nature to extrapolate and believe they can solve the mystery of data center issues, and worse yet, predict the future. With millions of dollars invested and more on the line, the educated guess should not be accepted as a foundation for operating the mission critical facility. With the increase in DCIM tools as well as the volume of things to monitor, interpolation of data can also be commonplace. Interpolation is the estimation of an unknown quantity between two known quantities (historical data), or drawing conclusions about missing information from the available information. In either case, the notion of the data speaking for itself involves dangerous assumptions and conclusions that lead to costly and irresponsible decisions. Finance is another area where this kind of prognostication takes place. Trends are identified, guesses are made, and money is on the line. Since the initial outlay and ongoing costs of a data center are substantial and heavily scrutinized for being run effectively, data centers are often led by Chief Financial Officers or others in the finance department. In the finance industry, an actuary uses mathematical and statistical methods to assess risk. Like the actuary, the data center operator can use simulation, backed by actual data and analytics, to develop effective forecasts that predict short- and long-term trends. There is a lot of room for science in analyzing the complexity and volume of data center data, yet many operators remain unaware of what applying this level of analytics can do for efficiency gains, energy savings and return on investment. Simulation informs decision-making through discovery, the building of predictive analytic models, and the implementation of a continuous improvement cycle.
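The danger of extrapolation can be made concrete with a small, invented example: a straight line fitted to the first six months of non-linear load growth badly undershoots the load two years out. All figures below are invented for illustration:

```python
# Illustrates why naive extrapolation misleads: power draw that grows
# non-linearly is fitted with a straight line over its early history,
# then extrapolated. All figures are invented for illustration.

def linear_fit(xs, ys):
    """Ordinary least-squares fit y = a + b*x (pure Python)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def actual(month):
    # Invented workload: power draw (kW) that doubles every 12 months.
    return 100 * 2 ** (month / 12)

months = list(range(7))            # observe only the first six months
a, b = linear_fit(months, [actual(m) for m in months])

predicted_24 = a + b * 24          # straight-line forecast at month 24
actual_24 = actual(24)             # what the non-linear trend really does
print(round(predicted_24), round(actual_24))   # forecast falls far short
```

The straight-line forecast looks plausible inside the observed window, which is exactly why pattern-spotting on a monitoring dashboard feels trustworthy right up until it fails.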
Predictive Analytics

The massive amounts of data collected through monitoring and other systems, or, in popular recent terms, "big data," should not just be collected and viewed, but processed with powerful analytic tools to support a mathematically based approach to predicting trends. Predictive analytics gives an operator actionable insight that contributes towards a true understanding of short- and long-term trends. The power of predictive analytics lies in complete integration of engineering tools and techniques. Alone, each tool has many benefits and serves its purpose. Integrating monitoring and DCIM data with simulation practices can then leverage that data, presenting a comprehensive virtual facility and empowering predictive analytics.
The Virtual Facility

Addressing data center challenges is specifically what the 6Sigma suite of software from Future Facilities is designed to do. Its Virtual Facility is a predictive methodology and visual communication tool for use by Facilities and IT management. Building a simulated data center in 6Sigma is relevant at any time, from design, construction and commissioning through to ongoing operation of the facility. The 6Sigma suite helps bridge communication gaps among all data center stakeholders, with a unified vision of the data center where simulation earnestly reflects the production environment. It works with a large vendor community, including IT and infrastructure equipment manufacturers and consultancies providing architectural, mechanical and electrical services. The Virtual Facility is a platform for building simulations of a data center that help IT and Facility managers administer and optimize data center resources. Utilizing the full potential of a data center is a difficult target to reach, especially when IT assets are in a constant state of flux. Both engineering experimentation and facility management benefit from modeling changes in the simulated facility, with the ability to test the cause and effect of any change prior to final implementation. Data center optimization can take place without putting the data center at risk by using the virtual model.

Capacity Management

When a data center project is initiated and designs are finalized, the expectation and business case for investment almost always assume a utilization rate of 100 percent. All of the IT and supporting infrastructure are assumed to be perfect and to operate exactly according to plan. Once built and in production, however, reality sets in, and the only thing that stays the same is constant change. Only as the capacity margin closes does it become apparent that faulty, human-derived forecasts have concealed the lost compute capacity.
Even with a generally accepted percentage of lost capacity, it ultimately equates to lost money and an irreversible impact on return on investment. Lost capacity is the fragmentation that develops between assumed utilization and actual utilization, and it typically erodes initial return on investment estimates. IT requirements based only on short- and long-term assumptions have a significant impact on capacity fragmentation and on the ability to properly manage the supporting facility power, cooling and space. Similarly, the data center's power, cooling and space are based on certain assumptions about IT requirements, which scale with the business and change dramatically over time. Data centers account for a substantial portion of a business's IT spending, which means capacity losses translate into significant additional cost and lost revenue. Gartner has found that 70% of data center facilities have failed to meet their capacity requirements without some level of renovation, expansion or relocation. In an Uptime Institute survey of 21,000 data center operators, 54% of respondents ranked data center capacity as the driver for their long-term approach to data center energy efficiency, outweighing by far all other drivers. Simulation techniques are used to validate designs of new data centers before they are constructed. The same methods can be utilized years down the road to re-validate, based on current operating conditions. Facility design changes and dynamic load distributions can cause major problems over time and make capacity management extremely difficult. This ultimately leads to lost efficiency, lost capacity and increased costs. Operational validation then combines monitoring data and simulation to examine the continuously changing data center, maximizing IT service availability and long-term capacity.
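The arithmetic behind lost capacity can be sketched as follows. Assume each rack's usable capacity is capped by whichever of power, cooling or space runs out first, so the room never delivers its nameplate total; all figures are invented for illustration:

```python
# Hedged sketch of "lost capacity" arithmetic: the usable capacity of each
# rack is capped by whichever resource (power, cooling, space) runs out
# first, so the room never delivers its nameplate total. Figures invented.

racks = [
    # (power_kw_available, cooling_kw_available, space_u_available)
    (8.0, 6.0, 42),
    (8.0, 8.0, 20),
    (8.0, 7.5, 42),
]
DESIGN_KW_PER_RACK = 8.0
KW_PER_U = 0.2   # assumed average draw per rack unit of IT gear

def usable_kw(power, cooling, space_u):
    # The binding constraint is the minimum of the three resources,
    # with space converted into its equivalent power draw.
    return min(power, cooling, space_u * KW_PER_U)

designed = DESIGN_KW_PER_RACK * len(racks)
usable = sum(usable_kw(*r) for r in racks)
lost_pct = 100 * (designed - usable) / designed
print(f"designed={designed} kW, usable={usable} kW, lost={lost_pct:.0f}%")
```

Even in this three-rack toy, a quarter of the designed capacity is stranded; in a real facility the constraints interact rack by rack, which is why spreadsheets miss what a calibrated simulation catches.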
The 6Sigma suite from Future Facilities presents the comprehensive virtual facility as a dynamic and predictive simulation of the physical and logical data center. Real-time monitoring and asset data is entered, and availability for IT services and compute capacity is produced. Additionally, the CFD component visualizes airflow and helps identify problems and evaluate changes. Together these capabilities bridge IT and Facility functions with a unified engineering tool that helps drive the most efficient and effective facility.
Investment Efficiency

Data centers cost between $10 and $30 million per megawatt and are engineered for maximum availability. With this level of investment it is imperative for IT and Facility departments to work together and thoroughly explore as many avenues as possible to deliver maximum ROI to stakeholders. For the nominal cost of engineering tools, there are a number of examples that demonstrate how they deliver investment efficiency.

1. An investment in simulation, fed by monitoring and DCIM data, is most likely self-funded, offset by the savings it has helped enable.
2. Identifying energy efficiencies within simulated facility models can yield considerable financial gains.
3. Leveraging integrated engineering tools such as monitoring, DCIM and simulation techniques helps prevent costly downtime. The cost of inaction and of relying on faulty intuitions about mission-critical IT and infrastructure will more than likely exceed the minor investment in the engineering tools and the profit from predictive analytics.

Summary

Demonstrate the business value of monitoring, management and simulation by putting big data to work in building a virtual facility, and then using its predictive analytics to avoid lost capacity and achieve maximum efficiency. Applying operational efficiencies across business functions, with the benefit of predictive analytics, evolves the enterprise. Ultimately, business is an exercise in risk management. To accurately address the shared IT and Facility risks, the right tools must be implemented. Employing simulation for a data center builds on the base level of big data provided by monitoring by incorporating additional analytical automation. Every decision an organization makes will impact the risk it is willing to withstand. Streamlining data center operations while maintaining uptime and a flow of business is a daunting task.
Keeping IT infrastructure agile and responding to a highly dynamic business environment involves operating at a rapid rate of change and keeping a close check on expenditures. The high cost of data center design, construction and operations means the business maintains a close watch on the investment, while a proactive business partnership between IT and Facility groups will manage a return on expectation.

4. Identify cost optimization opportunities in a virtual facility by weighing the short- and long-term consequences of facility upgrades, energy savings programs and IT deployments. Addressing these cost drivers helps avoid lost capacity and larger costs such as new construction or expansion projects.

Initial calculations and business plans outlining the return on investment for the data center acted as proof that it was a sound business decision. Engineering tools, integrated to properly depict a virtual facility, act as proof that the investment is being protected and that maximum benefit is being driven from installed capacity.
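The return-on-investment argument can be illustrated with a hedged back-of-envelope calculation. Every figure below is invented, including the assumed recoverable fraction and tool cost, and should be replaced with a facility's own numbers:

```python
# Back-of-envelope payback sketch for an engineering-tools investment.
# Every figure below is invented for illustration only.

capital_cost_per_mw = 20_000_000   # midpoint of the $10-30M per MW range cited
facility_mw = 2.0                  # assumed facility size
stranded_fraction = 0.10           # assumed capacity recoverable via simulation
tool_cost = 250_000                # assumed spend on integrated tooling

# Value of the stranded capital that the recovered capacity represents.
recovered_value = capital_cost_per_mw * facility_mw * stranded_fraction
roi_multiple = recovered_value / tool_cost
print(f"recovered ${recovered_value:,.0f} of stranded capital "
      f"({roi_multiple:.0f}x the tools spend)")
```

Under these assumptions, recovering even a tenth of the stranded capacity dwarfs the tools spend, which is the paper's core claim about investment efficiency stated as arithmetic.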