Microsoft's Datacenter Best Practices. Darryl Chantry, Datacenter Solutions Architect, Worldwide Datacenter Center of Excellence
We Are Unique in Our Comprehensive Approach: Interactive entertainment, Search/advertising, Mobile, Traditional IT and cloud, Modern desktop
Microsoft's Cloud Environment Consumer and Small Business Services Software as a Service (SaaS) Enterprise Services Third-party Hosted Services Platform as a Service (PaaS) Infrastructure as a Service (IaaS) Global Foundation Services Data Centers Operations Global Network Security
Global Foundation Services Global Network Dark Fiber, Routing, Switching, Load-Balancing Data Centers Design, Build, Operate Global Capacity Microsite Strategy Lower DC to DC costs Security & Compliance Modular Cloud-Scale Designs 300+ Product Teams Microsoft IT (1900 LOB Apps) Cloud Hosting O365/Windows Azure/CRM Lowest $/MW, Rapid Deployment Geo-independent design Tools & Automation Utility Pricing Cost Transparency MOC Microsoft Operation Centers ISO27001, SSAE16, FISMA SCRY & System Center 2012
Microsoft Data Center Scale
Microsoft has more than 10 and fewer than 100 DCs worldwide, plus multiple global CDN locations
Locations: Quincy, Chicago, Dublin, Amsterdam, Japan, Des Moines, Boydton, Hong Kong, San Antonio, Singapore
Quincy, Washington: 27 MW + ITPACs, 100% hydro power
San Antonio, Texas: 27 MW, recycled water for cooling
Chicago, Illinois: 30 MW, water-side economization, containers
Dublin, Ireland: 29 MW, outside air cooling, PODs
"Data Centers have become as vital to the functioning of society as power stations." (The Economist)
Microsoft's credentials in the datacenter
More Than Just Server Racks
A Closer Look Traditional Data Center Infrastructure Generators IT Load Substation Transformer UPS Chiller CRAC Water Supply Cooling Towers Condenser
Microsoft cloud computing journey
Microsoft's Data Center Evolution
Generation 1 (1989-2005): ~2 PUE, Colocation
Generation 2 (2007): 1.4-1.6 PUE, Density
Generation 3 (2008): 1.2-1.5 PUE, Containment
Generation 4 (2011+): 1.12-1.20 PUE, Modular
Server Capacity; 20 year Technology; Rack Density and Deployment; Minimized Resource Impact; Containers, PODs; Scalability & Sustainability; Air & Water Economization; Differentiated SLAs; ITPACs & Colocations; Reduced Carbon, Rightsized; Faster Time to Market; Outside Air Cooled
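PUE (Power Usage Effectiveness) is total facility power divided by IT equipment power, so a PUE of 2 means roughly one watt of overhead for every watt that reaches the servers. The sketch below illustrates what the drop from about 2 to 1.12 implies. The 10 MW IT load is a hypothetical figure chosen for illustration, not a number from the slides.

```python
def facility_power_mw(it_load_mw: float, pue: float) -> float:
    """Total facility power implied by an IT load and a PUE value."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_load_mw * pue


def overhead_mw(it_load_mw: float, pue: float) -> float:
    """Power spent on cooling, distribution losses, etc. (everything except IT)."""
    return facility_power_mw(it_load_mw, pue) - it_load_mw


if __name__ == "__main__":
    IT_LOAD_MW = 10.0  # hypothetical IT load, not taken from the slides
    # PUE values quoted on the slide for the first and latest generations
    for label, pue in [("Generation 1 (~2.0)", 2.0), ("Generation 4 (1.12)", 1.12)]:
        print(f"{label}: overhead {overhead_mw(IT_LOAD_MW, pue):.2f} MW "
              f"on a {IT_LOAD_MW:.0f} MW IT load")
```

Under that assumed load, the overhead falls from about 10 MW to about 1.2 MW.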
Modular Data Center Strategy
Modular Pre-Assembled Components (PACs) are built via a dynamic supply chain, shipped onsite and assembled
Shorter lead time and lower cost
Significant efficiency gains through open-air cooling (adiabatic airside economization)
Smart scalability: 400-2500 servers at a time
Generating & Distributing Cool Air & Water
Energy In = Heat Out
Removing heat is critical to operations
Environmental control is a major source of energy and water consumption
Innovative approaches (airside economization, adiabatic cooling) increase overall efficiency over traditional computer room air conditioning (CRAC)
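"Energy In = Heat Out" can be made concrete with the standard sensible-heat relation Q = m * cp * dT: the airflow needed to carry away a heat load depends on how much the air is allowed to warm as it passes through the equipment. The sketch below is a rough illustration only. The air density and specific heat are textbook approximations, and the example load and temperature rise are hypothetical values, not figures from the slides.

```python
AIR_DENSITY_KG_M3 = 1.2            # approximate density of air near room temperature
AIR_SPECIFIC_HEAT_J_KG_K = 1005.0  # approximate specific heat of air at constant pressure


def airflow_m3_per_s(heat_w: float, delta_t_k: float) -> float:
    """Volumetric airflow needed to remove heat_w watts with a delta_t_k air temperature rise."""
    if delta_t_k <= 0:
        raise ValueError("temperature rise must be positive")
    mass_flow_kg_s = heat_w / (AIR_SPECIFIC_HEAT_J_KG_K * delta_t_k)
    return mass_flow_kg_s / AIR_DENSITY_KG_M3


if __name__ == "__main__":
    # Hypothetical example: one 10 kW rack with a 12 K rise from inlet to outlet
    flow = airflow_m3_per_s(10_000, 12)
    print(f"Required airflow: {flow:.2f} m^3/s")
```

A larger allowed temperature rise lowers the airflow needed, which is one reason containment and economization designs pay attention to inlet and outlet temperatures.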
Key Ingredient: POWER
Primary sizing of data center capacity, typically in MW
Electrical switch gear: from medium voltage (10 kV) down to facility voltage, 400/230 V (110 V) AC
Direct-current (DC) concepts exist but are harder to handle
Efficient data center designs require a delicate balance: Electrical Capacity, Physical Space, Cooling Capacity
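The balance among electrical capacity, physical space and cooling capacity can be sketched as a bottleneck calculation: the facility supports only as many racks as its tightest constraint allows, and capacity beyond that in the other two dimensions is stranded. All inputs below are hypothetical, and the model (uniform rack power, one rack per floor position) is a simplifying assumption.

```python
def supportable_racks(electrical_kw: float, cooling_kw: float,
                      floor_positions: int, rack_kw: float):
    """Return (rack count, limiting resource) for a uniform rack power draw."""
    if rack_kw <= 0:
        raise ValueError("rack power must be positive")
    limits = {
        "electrical": int(electrical_kw // rack_kw),
        "cooling": int(cooling_kw // rack_kw),
        "space": floor_positions,
    }
    bottleneck = min(limits, key=limits.get)
    return limits[bottleneck], bottleneck


if __name__ == "__main__":
    # Hypothetical facility: 1 MW electrical, 800 kW cooling, 150 rack positions, 8 kW racks
    racks, limit = supportable_racks(1000, 800, 150, 8)
    print(f"{racks} racks supportable, limited by {limit}")
```

In this example cooling is the constraint, so 25 racks' worth of electrical capacity and 50 floor positions would go unused.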
Microsoft's Server Design Strategy for its Cloud Infrastructure
Considerations
Optimizing Servers and Storage
Performance vs. Power

Target Load   Actual Load   ssj_ops   Average Active Power (W)   Performance to Power Ratio
100%          99.6%         725,620   172                        4,230
90%           90.6%         659,864   160                        4,136
80%           80.0%         583,048   147                        3,962
70%           70.2%         511,207   136                        3,760
60%           60.3%         439,188   128                        3,428
50%           50.2%         366,118   120                        3,045
40%           39.9%         290,866   112                        2,589
30%           30.1%         219,614   105                        2,097
20%           20.2%         147,012   96.6                       1,522
10%           10.1%         73,322    85.8                       854
Active Idle                 0         53.6                       0

Σ ssj_ops / Σ power = 3,052
Idle: 32% of peak power
10% load: 50% of peak power
20% load: 56% of peak power
50% load: 70% of peak power
80% load: 85% of peak power
5x performance for 40% more power!
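The summary figures on this slide can be recomputed from the measurements. The sketch below uses the ssj_ops and power values exactly as listed. The function names are illustrative. The "5x performance for 40% more power" remark matches the step from 10% to 50% load.

```python
# (target load %, ssj_ops, average active power in W), as listed on the slide
MEASUREMENTS = [
    (100, 725_620, 172.0), (90, 659_864, 160.0), (80, 583_048, 147.0),
    (70, 511_207, 136.0), (60, 439_188, 128.0), (50, 366_118, 120.0),
    (40, 290_866, 112.0), (30, 219_614, 105.0), (20, 147_012, 96.6),
    (10, 73_322, 85.8),
]
ACTIVE_IDLE_POWER_W = 53.6


def overall_ssj_ops_per_watt() -> float:
    """Sum of ssj_ops over sum of power, with active idle included in the power sum."""
    total_ops = sum(ops for _, ops, _ in MEASUREMENTS)
    total_power = sum(power for _, _, power in MEASUREMENTS) + ACTIVE_IDLE_POWER_W
    return total_ops / total_power


def fraction_of_peak_power(load_pct: int) -> float:
    """Power at a given target load as a fraction of power at 100% load."""
    by_load = {load: power for load, _, power in MEASUREMENTS}
    return by_load[load_pct] / by_load[100]


def compare(low_pct: int, high_pct: int):
    """Return (performance ratio, power ratio) when moving from low_pct to high_pct load."""
    by_load = {load: (ops, power) for load, ops, power in MEASUREMENTS}
    low_ops, low_power = by_load[low_pct]
    high_ops, high_power = by_load[high_pct]
    return high_ops / low_ops, high_power / low_power


if __name__ == "__main__":
    print(f"Overall ssj_ops per watt: {overall_ssj_ops_per_watt():.0f}")
    perf, power = compare(10, 50)
    print(f"10% -> 50% load: {perf:.2f}x performance for {power:.2f}x power")
```

The takeaway is the one the slide draws: a lightly loaded server still consumes a large share of its peak power, so raising utilization is cheap in energy terms.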
at System Level
Assumption: Server Price = $2000 + CPUs, Server Power = 150W + CPUs

CPU                 Performance   System Price   System Power   Perf/W   Perf/W/$
2.33 GHz (50W)      1.0           1.0            1.0            1.0      1.0
2.67 GHz (80W)      1.07          1.10           1.24           0.87     0.79
3.167 GHz (120W)    1.18          1.6            1.56           0.76     0.46

The sweet spot is often at low power processors, especially when system price and power are considered
Your mileage may vary!
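The system-level ratios can be approximated from the slide's stated assumption (server power = 150 W plus CPUs). The sketch below assumes two CPUs per server, which is not stated on the slide but reproduces its 1.24 and 1.56 relative system power values. Relative performance and system price are taken from the slide as given, and small differences from the slide's Perf/W and Perf/W/$ values come from rounding of those inputs.

```python
BASE_SERVER_POWER_W = 150  # from the slide's assumption
CPUS_PER_SERVER = 2        # assumption: dual-socket servers (not stated on the slide)

# (label, CPU power in W, relative performance, relative system price), as read from the slide
CPU_OPTIONS = [
    ("2.33 GHz", 50, 1.00, 1.00),
    ("2.67 GHz", 80, 1.07, 1.10),
    ("3.167 GHz", 120, 1.18, 1.60),
]


def system_power_w(cpu_power_w: float) -> float:
    """Whole-server power under the slide's base-plus-CPUs assumption."""
    return BASE_SERVER_POWER_W + CPUS_PER_SERVER * cpu_power_w


def normalized_metrics():
    """Return {label: (relative power, perf/W, perf/W/$)}, normalized to the first CPU option."""
    baseline_power = system_power_w(CPU_OPTIONS[0][1])
    results = {}
    for label, cpu_w, rel_perf, rel_price in CPU_OPTIONS:
        rel_power = system_power_w(cpu_w) / baseline_power
        perf_per_watt = rel_perf / rel_power
        results[label] = (rel_power, perf_per_watt, perf_per_watt / rel_price)
    return results


if __name__ == "__main__":
    for label, (rel_power, ppw, ppw_per_dollar) in normalized_metrics().items():
        print(f"{label}: power {rel_power:.2f}, perf/W {ppw:.2f}, perf/W/$ {ppw_per_dollar:.2f}")
```

Each faster CPU adds less performance than it adds power and price, so both efficiency ratios fall as clock speed rises.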
Purchase Scenarios
Large Bulk Buys: optimized for specific application (Bing, Azure, etc.); multi-vendor Requests for Proposals; competitive bidding; large quantity purchase; variable frequency based on rhythm of business; new RFP for each round
Generic Standards Catalog: small set of configurations based on a modular platform design; refreshed every 12 to 18 months; multi-vendor Requests for Proposals; competitive bidding; purchases throughout the year (quarterly forecast); configuration mix varies; no minimum order size (rack or single server)
Upper Power Domain, Lower Power Domain
2 to 4 Top of Rack Switches (2 to 4 RU)
96 Servers (Up to 48 RU)
2 Bulk Power (2x 3RU)
Battery Pack: 4 Groups of 4 Batteries (13.2 V Nominal); 160 Batteries for 16.8 kW
Input Voltage: 415/480 VAC, 3Ø
Output Voltage: 12 VDC
Capacity: 4.5 kW
Phase Balanced: ±2%
Microsoft's SCRY measurement tool aligns actual resource use with the chargeback model
From: Allocating by Space
To: Allocating by Power
Tracking Power, Tracking Carbon, Tracking Utilization, $
SCRY Billing & Cost Allocation
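The shift from allocating by space to allocating by power can be illustrated with a simple proportional chargeback: each team pays a share of the facility cost equal to its share of measured power draw. This is a generic sketch, not a description of how SCRY works. The team names, power figures and cost are hypothetical.

```python
def allocate_by_power(total_cost: float, power_kw_by_team: dict) -> dict:
    """Split total_cost across teams in proportion to their measured power draw."""
    total_kw = sum(power_kw_by_team.values())
    if total_kw <= 0:
        raise ValueError("total measured power must be positive")
    return {team: total_cost * kw / total_kw for team, kw in power_kw_by_team.items()}


if __name__ == "__main__":
    # Hypothetical monthly facility cost and per-team measured power
    measured_kw = {"search": 300.0, "mail": 150.0, "line-of-business": 50.0}
    for team, cost in allocate_by_power(100_000.0, measured_kw).items():
        print(f"{team}: ${cost:,.2f}")
```

Charging by power rather than floor space gives teams a direct incentive to retire idle servers and choose efficient hardware.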
Driving Operational Excellence Datacenter & Cloud Infrastructure
Driving Operational Excellence Investments $15B+ Investment in Cloud Infrastructure State-of-Art Data Centers One of Largest Networks Globally Geo-Replicated Customer Data People 2,000+ people in cloud infrastructure engineering and operations Office 365 Windows Azure 30,000+ software engineers involved in Cloud-based activities Carbon Footprint Reduction Power Usage Effectiveness (PUE) 1.12-1.20
Operations Framework Repositories Contacts Assets Knowledge Articles Service Disruption Indicators (Monitoring or User-Provided) MOC Service Stack Incident Management Knowledge Management Problem Management Account Management Change Management Service Measurement & Reporting
Comprehensive Compliance Framework INDUSTRY STANDARDS AND REGULATIONS ISO/IEC 27001:2005 EU Model Clauses FISMA/NIST 800-53 Sarbanes-Oxley PCI-DSS HIPAA, etc CONTROLS FRAMEWORK Identify and integrate Regulatory requirements Customer requirements Assess and remediate Eliminate or mitigate gaps in control design PREDICTABLE AUDIT SCHEDULE Test effectiveness and assess risk Attain certifications and attestations Improve and optimize Examine root cause of non-compliance Track until fully remediated CERTIFICATION AND ATTESTATIONS ISO / IEC 27001:2005 certification SSAE 16/ISAE 3402 SOC 1, 2 and 3 PCI DSS certification FISMA certification and accreditation And more
Resilient Applications: Purpose built and tuned for properties resilient to infrastructure failures
Reliable Infrastructure: General purpose colocations for all properties that require reliable infrastructure
Steps to build a Modern Datacenter
Tools = capabilities = skills
Prioritize stack integration over high-end features: whatever you don't buy, you'll end up building and supporting
You will need more skills, not fewer
Prioritize PEOPLE
A lot more Statistics, Quality Assurance, Service Definition and Business Architecture is a good thing
Step 1 Step 2 Step 3 Step 4 Step 5 Step 6 Step 7
SLA Extreme Standardization Base lining / Health Management Service Management Process Engineering & Pre-Production Automation & Orchestration Change Control Self Service
Quality Configuration Units; Design for catastrophe; Agree on service delivered; Quality Assurance Pre-production; Automate best practices; Build a consistent quality delivery cycle; Provide user with a provisioning & service level portal
Datacenter Services from MCS Application Owner Datacenter Manager CIO IT Initiative Owner Infrastructure Manager Service Provider Application Transformation Datacenter Management Datacenter Transformation Datacenter Consolidation and Migration Datacenter Infrastructure Cloud Service Provider
Application Transformation CloudPacks (New!) Application Owner CLOUD-ENABLED APPLICATIONS DCS enables application owners to take advantage of the benefits of a Hyper-V-enabled private cloud
Cloud Governance & Services 1 2 3 4 5 6 7 Planning & Envisioning Business Strategy Service Strategy Service Design Service Implementation Service Operation Service Optimization & Improvement Workshops: Discovery workshop Envisioning workshop Architectural Design Sessions Specialist Offerings: Datacenter Infrastructure Datacenter Management Datacenter Automation Datacenter Transformation Datacenter Consolidation & Migration Business Dependency Network Personas Scenarios Resourcing Plan Project Plan Reference Model Reference Architecture Pricing Model Service Definitions Service Catalog Service Lifecycle Management Product Line Architecture (SP, Lync, Exchange) AD FIM ADFS Windows 2012 SC 2012 SP1 High-level System Poster Testing Methodology Testing Scenarios SharePoint Cloud Pack v0.7 DCS 2.0 SP1 On-boarding Proactive Operation Program Operation Consulting Cloud Service Provider
Microsoft Data Center Resources www.globalfoundationservices.com http://www.windowsazure.com http://www.office365.com