HOW TO SELECT A COLOCATION PROVIDER
THE TOP 10 CRITERIA TO DISCUSS WHEN TOURING A DATA CENTER

Choosing a data center or colocation provider to house your company's critical IT infrastructure is a huge decision. Unfortunately, colocation providers don't make it easy to differentiate between them. Many data center tours have the same components: a sales rep will show you the data center's battery rooms, cooling equipment, security measures, and generators, all the while assuring you that the facility is highly secure, reliable, well maintained, invulnerable to natural disasters, and capable of high-density computing.

After a few tours, it may seem as though there isn't much difference between one data center and the next. This couldn't be further from the truth. It's important to know what to look for so you can make an informed decision. Migrating your IT environment is risky, often expensive, and certainly time consuming, so you want to make the right choice the first time.

What is a data center?

Data centers exist to provide a place to house a company's information technology (IT) and network equipment. A data center's mission is to create reliability, mitigate risk, and provide uptime for the technology and applications that it enables.

Before selecting a provider, be sure to discuss these criteria to get a clear picture of the data center's capabilities and make the best choice for your company.

1. What is your current track record of delivering 100% continuous critical systems availability or uptime?

Unplanned downtime can be dangerously expensive for your business. According to a recent study by Data Center Knowledge, companies lose an average of $7,900 per minute of an outage. One hiccup can cost careers, damage your company's reputation, and lose customers.
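
To put the per-minute figure in perspective, the arithmetic can be sketched in a few lines of Python. This is a rough illustration, not a costing model: the $7,900 default is the average quoted above, while the function names and the availability percentages in the loop are assumptions chosen for the example. Substitute your own cost per minute if you know it.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year

# Average quoted in the text above; replace with your own figure if known.
COST_PER_MINUTE_USD = 7_900


def outage_cost(minutes_down: float, cost_per_minute: float = COST_PER_MINUTE_USD) -> float:
    """Estimated cost of an outage lasting `minutes_down` minutes."""
    return minutes_down * cost_per_minute


def downtime_minutes_per_year(availability_pct: float) -> float:
    """Minutes of downtime per year implied by an availability percentage."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Illustrative availability levels (assumptions, not figures from the text).
    for pct in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(pct)
        print(f"{pct}% availability: {minutes:.1f} min/year, "
              f"about ${outage_cost(minutes):,.0f}")
```

Even a provider that sounds nearly perfect on paper can imply hours of downtime per year, which is why the questions in this section ask for an actual track record and not a design figure.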

Generally, the critical systems infrastructure in a data center refers to the critical electrical and mechanical (cooling) systems that control and maintain the availability of power, cooling, and humidity to the IT equipment. Do not make the mistake of assuming that your availability or uptime will be based solely on the critical systems design and redundancies of the data center. Data center infrastructure design doesn't guarantee uptime; it is only one part of the equation that results in continuous availability. Design can be viewed as a prediction of uptime, but human error and mismanagement will often foil that prediction.

Ask these questions to find out what the data center's true continuous critical systems uptime record is, and the results will speak for themselves. Focus on results, not a prediction!

- What is the current length of time that you have delivered 100% electrical and mechanical (cooling) critical systems availability (i.e., what is your track record)?
- What is your definition of continuous critical systems availability or uptime?
- Does your continuous critical systems uptime track record include or exclude maintenance windows?
- How many planned downtime events has any portion of the IT equipment within the data center experienced since the data center opened for operations?
- How many unplanned downtime events has any portion of the IT equipment areas within the data center experienced since the data center opened for operations?
- In cases of unplanned downtime events in the past:
  - Were the customers proactively notified?
  - Was a detailed and accurate report identifying the root cause provided?
  - Were updates, resolution details, after-action reports, and plans for future mitigation provided?

2. How often are maintenance windows declared?

Many data centers or colocation providers overuse or manipulate maintenance windows to make their availability results look better or to avoid SLA penalties. Regardless, downtime in a maintenance window, planned or unplanned, is still downtime for the customer or end user. Understanding how and why maintenance windows are used can often shed light on critical systems design, capacity management, operational capabilities, or potential issues with each.

These questions will help shed light on your data center's maintenance window procedures and how they may affect your business.

- What types of maintenance windows are used? Ask to see the last five years of reports for declared scheduled, emergency, or unscheduled maintenance windows.
- Are one or both of our circuits (in the case of redundant circuits or 2N distribution) down during those windows?
- Will any parameters of the Service Level Agreements be infringed upon?
- How many times has a declared maintenance window required downtime for the customer?
- How many times has a declared maintenance window resulted in unplanned or unforeseen downtime for the customer?
- Does the continuous critical systems availability or uptime track record exclude maintenance windows?

3. If a major cause of unplanned downtime or outages in a data center is attributable to human error, how does the data center mitigate it?

A data center's mission is to create reliability, mitigate risk, and provide uptime for the technology and applications that it enables. However, much of the data center industry has adopted society's attitude that human error is unavoidable. Data centers continue to try to design around human error, which is why there is often more focus on the critical systems design than on the management and operational strategies of a data center.

A comprehensive operational strategy carried out by well-trained personnel with an unyielding mindset is the key to continuous uptime. A mindset that emphasizes process discipline, procedural compliance, attention to detail, and individual ownership of the data center's mission must exist across the entire organization. Best practices and checklists are all great tactics, but without a thorough strategy and an operational mindset, they will often be rendered ineffective. Lack of discipline, lack of attention to detail, lack of ownership of the mission, and poor management decisions can creep in. Best practices alone will not mitigate or eliminate unplanned downtime.

Poor management decisions also qualify as human error.
Choosing low cost over quality, accepting the lowest bid, and running minimal or outsourced staffing or maintenance programs are often rationalized as good business decisions, but they actually run contrary to the mission of delivering uptime. Outsourcing the maintenance and lifecycle strategies to third-party vendors may be a cost-effective business decision, but the mindset and ownership of the mission may be lost. The failure of management to take the right actions to accomplish the mission of a data center is human error, no matter what the rationale may be.

Human error can be eliminated. Do not underestimate the impact of operational mindset and ownership of the data center's mission. There are very few checklists that can identify whether a data center has taken the right approach to mitigating human error, but the indications will be there for those who know what to look for and where. Once an operational mindset is in alignment with the right operational strategies, a culture focused on accomplishing the mission will permeate the data center's entire organization.

Use these questions to understand the data center's operational strategies and mindset in order to make an informed judgment about its ability to mitigate human error.

- Does the data center or provider have a 24x7 operations staff made up entirely of full-time employees?
- How much of the operations, maintenance, and cleaning is accomplished and owned by the data center's operations staff?
- How much of the operations and maintenance activity is outsourced to vendors or third parties?
- What are the skill set, experience, and qualification requirements for your data center operations staff?
- What are the operational strategies and policies for:
  - Documentation of all operational process controls?
  - The use of documented procedures, validation, revision, and approval?
  - Initial and continuous training for the entire staff?
  - Staff training on processes and procedures that mitigate or eliminate errors and ensure high levels of service delivery?
  - Change management and control?
  - Equipment selection and purchase?
  - Equipment commissioning and integrated systems testing of the critical systems infrastructure?
  - Preventive and predictive maintenance strategy combined with meaningful testing and trend analysis?
  - Equipment lifecycle strategy?
  - Routine inspections of the data center and all equipment?
  - Cleanliness standards?
  - The use of infrastructure standards for the uniform identification and management of equipment throughout the data center?
  - Service request, problem notification, escalation, and resolution?
  - Replacing equipment before it fails?
  - Continuous, comprehensive, and accurate monitoring and data collection for all critical and essential systems (e.g., DCIM)?
  - Critical systems infrastructure capacity planning and management?
  - Risk mitigation?

4. What are the current capacities and current usage of both electrical and mechanical systems?

In some data centers, the capacity of individual components and systems is not measured or managed effectively. Operators rationalize these choices as business decisions, leaving loads unbalanced or encroaching on redundant capacity. Using capacity that is meant for redundancy or failover could mean that it is unavailable when it is needed. Improper capacity management is a common cause of outages and a leading cause of cascading events in a failure scenario.

There are many ways to track and manage capacities in the critical electrical and mechanical (cooling) distribution systems, ranging from a spreadsheet to more extensive methods. Make sure your data center has enough room to allow your business and its compute requirements to grow. Never settle for more than a 90% load on the Uninterruptible Power Supply (UPS) systems; a load that high could indicate that the data center is reaching the end of its capacity and is potentially at risk of a cascading failure.
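
The two capacity checks described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `UpsSystem` class, its field names, and both helper functions are hypothetical, and the 2N check simply assumes that either side of a redundant pair must be able to carry the combined load on its own. Only the 90% ceiling comes from the text.

```python
from dataclasses import dataclass

UPS_MAX_LOAD_FRACTION = 0.90  # ceiling suggested in the text above


@dataclass
class UpsSystem:
    """One UPS system (hypothetical structure used only for this example)."""
    name: str
    capacity_kw: float
    load_kw: float

    @property
    def load_fraction(self) -> float:
        """Current load as a fraction of rated capacity."""
        return self.load_kw / self.capacity_kw


def over_threshold(systems, limit=UPS_MAX_LOAD_FRACTION):
    """Return the names of UPS systems loaded above `limit`."""
    return [s.name for s in systems if s.load_fraction > limit]


def survives_loss_of_one_side(side_a: UpsSystem, side_b: UpsSystem) -> bool:
    """Assumed 2N rule: either side alone must carry the combined load."""
    total_load_kw = side_a.load_kw + side_b.load_kw
    return total_load_kw <= min(side_a.capacity_kw, side_b.capacity_kw)
```

For example, two 500 kW systems carrying 300 kW and 250 kW would both pass the 90% check, yet the pair would fail the redundancy check because neither side could carry the combined 550 kW alone. That is the kind of encroachment on redundant capacity worth asking about.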

Migrating your infrastructure footprint to another data center because your current one lacks the scalability or capacity to fit your expanding needs will cost time and money. That cost can be avoided if capacity management and currently available capacities are explored during the pre-selection process.

- What are the present loads on all of the capacity components of the critical electrical and cooling systems?
- How do you manage capacity in the electrical distribution system?
- How do you manage capacity in the mechanical (cooling) distribution system?
- What is the policy and process for allocating power and cooling to a customer or end user?

5. Do you have DCIM in place to effectively manage your operations on a day-to-day basis?

A comprehensive Data Center Infrastructure Management (DCIM) system that continuously monitors all critical systems is vital to a highly reliable data center. Monitoring many key data points throughout the critical systems infrastructure allows data center operators to stay aware of changing conditions while proactively managing capacities, trending specific parameters, and making well-informed decisions. DCIM systems can have a wide variety of uses and vary from data center to data center. Be sure that your data center's DCIM includes the following:

- Individual component and system capacity monitoring and management
- Threshold alerting and alarming
- Automatic escalations
- Real-time Power Usage Effectiveness (PUE) measurement
- Real-time branch circuit power usage
- Real-time delivery temperature and humidity measurement
- Dashboard views
- Integrated panel schedule management
- Predictive maintenance and trend analysis

6. Do you have a means for the customer to view adherence to service levels and/or Service Level Agreements (SLAs) in real time?

As a data center or colocation customer, you are usually paying for a service level. Transparency and visibility into the parameters of the services being provided, and into the achievement of the agreed-upon service level, should be a requirement.

- What are the methods of reporting on adherence to Service Level Agreements?
- What tools or reports will be provided for the customer to document service level achievement?

7. What are the highest risks of natural disasters for the area, and what has the data center done to mitigate their impact?

"Location, location, location" is the common adage for houses and businesses alike, but it holds especially true for data centers. There are no perfect locations or perfect situations; however, some places are always better than others. Certain risks can be mitigated or eliminated by site selection, while others cannot and should be considered unacceptable no matter how conveniently located the data center is to your business.

No data center is immune to natural disasters, but colocating in a region that is routinely or even periodically exposed to natural disasters is a questionable business decision. Ask Midwest data centers about their vulnerability to tornadoes, coastal data centers about their vulnerability to hurricanes, and most data centers about their vulnerability to earthquakes and floods. Earthquakes and floods can be devastating. Ask what precautions they have in place, as well as what their strategy is to maintain uptime in the event of a natural disaster.

- Is the data center in a hurricane-prone or tornado-prone area?
- What is the probability of seismic activity, and does the data center have any mitigation measures in place?
- Where is the data center located with respect to the 100-year and 500-year floodplains?
- What is the maximum FEMA-predicted flood elevation for the location of the data center?
- What risk mitigation measures are in place for all potential natural disasters?

8. Has the data center earned any certifications?

There are a variety of certification programs covering data centers, each with its own merits and possibly some drawbacks. Regardless, they are all useful tools to help guide your decision. Be sure to ask the data center for proof of its certifications and the associated audit criteria under a non-disclosure agreement.
This analysis of the quality, reliability, and security of the services provided is invaluable information.

- Are these certifications audited regularly?

9. Are the data center and its provider financially sound and committed?

Look for a data center that has good financial backing and isn't reliant on just a few tenants. Colocating with a data center whose top five clients are all in the same industry and represent more than half of its revenue is risky business. If the loss of just one customer could deal a large financial blow, reconsider. Obviously, if you ask to see the balance sheet and income statements and the numbers already look forbidding, look elsewhere. Understanding the provider's future business plan is also necessary. Running and maintaining a data center is expensive, and you'll be the one who suffers if the provider suddenly needs to cut its operations team or support staff.

- May we review the financial reports?
- Does the colocation provider own the physical structure, building, and property?
- Or is the provider leasing them from a third party?
- Are there plans for selling the colocation business?

10. Is the colocation provider's primary focus on data center infrastructure services?

Many providers are really telecommunications or IT managed service providers that use data center and colocation services as an enabler to get you to purchase additional telecommunications or managed services. Providers are rarely good at everything, and when they try to be everything to everyone, the quality of service delivery and customer care suffers, hurting the customer in turn.

Data centers that are truly carrier, cloud, and managed services neutral, and that do not provide their own competing services, tend to attract best-of-breed companies and service providers. With all these businesses together in one data center, an open marketplace forms in which the colocated customers can exchange services among themselves. A colocation data center should act almost like a shopping mall, offering multiple choices of providers and services to all of its colocated customers.

- Are data center and colocation services the provider's primary focus and core competency?
- Are there multiple carriers, cloud providers, and managed service providers in the data center that you can do business with?
- Is there a neutral business-to-business marketplace that our company can be part of as a colocated customer?

This list doesn't encompass every criterion and question you could raise with a potential data center provider, but the questions above should help you cut through the data center's sales pitch and guide you in the right direction. Hopefully, after you've toured a few data centers and inquired about all the relevant criteria, you'll be able to select the best data center or colocation provider to fit your company and its IT requirements.

Author: Rob McClary, Senior Vice President & General Manager of FORTRUST