Cloud-Scale Datacenters
Tarmo Tikerpäe, DC SSP, Microsoft Corporation

- 5.8+ billion worldwide queries each month
- 250+ million active users
- 400+ million active accounts
- 2.4+ million emails per day
- 8.6+ trillion objects in Windows Azure storage
- 48+ million users in 41 markets
- 50+ million active users
- 1 in 4 enterprise customers
- 50+ billion minutes of connections handled each day
- 200+ cloud services
- 1+ billion customers
- 20+ million businesses
- 90+ markets worldwide

Huge infrastructure scale is the enabler
Microsoft has datacenter capacity around the world, and we're growing.
Locations: Quincy, Cheyenne, Chicago, Des Moines, San Antonio, Boydton, Dublin, Amsterdam, Hong Kong, Singapore, Shanghai, Japan, Brazil, Australia
35+ factors in site selection:
- Proximity to customers
- Energy, fiber infrastructure
- Skilled workforce

More than just server racks
Datacenter evolution
Generation 1 (1989-2005): Colocation. PUE 2.0+. Server capacity; 20-year technology.
Generation 2 (2007): Density. PUE 1.4-1.6. Rack density and deployment; minimized resource impact.
Generation 3 (2009): Containment. PUE 1.2-1.5. Containers, PODs; scalability and sustainability; air and water economization; differentiated SLAs.
Generation 4 (2012): Modular. PUE 1.12-1.20. ITPACs and colocations; reduced carbon; right-sized; faster time-to-market; outside air cooled.
Generation 5 (future): Integrated. PUE 1.07-1.19. Integrated system; resilient software; common infrastructure; operational simplicity; flexible and scalable.

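PUE (power usage effectiveness) is total facility power divided by IT equipment power, so the per-generation figures on this slide translate directly into overhead per unit of IT load. The following is a minimal sketch of that arithmetic. The function names and the 1,000 kW IT load are illustrative assumptions, and the table uses only the best-case (lowest) PUE quoted for each generation.

```python
def pue(total_facility_kw: float, it_kw: float) -> float:
    """Power usage effectiveness: total facility power / IT equipment power."""
    if it_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_kw


def overhead_kw(pue_value: float, it_kw: float) -> float:
    """Power spent on cooling, distribution losses, etc. for a given IT load."""
    return (pue_value - 1.0) * it_kw


# Best-case PUE per generation, as quoted on the slide.
BEST_CASE_PUE = {
    "Generation 1": 2.0,
    "Generation 2": 1.4,
    "Generation 3": 1.2,
    "Generation 4": 1.12,
    "Generation 5": 1.07,
}

if __name__ == "__main__":
    it_load_kw = 1000.0  # hypothetical IT load, for illustration only
    for generation, value in BEST_CASE_PUE.items():
        extra = overhead_kw(value, it_load_kw)
        print(f"{generation}: PUE {value:.2f}, overhead {extra:.0f} kW")
```

Read this way, moving from a PUE of 2.0 to 1.07 cuts the non-IT overhead for the same IT load from 100% to 7% of that load.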
Evolving services to cloud-scale
Delivering cloud-scale services requires a radical restructuring of technology, processes and people.

Dimension | Enterprise IT | Cloud-Scale
Seats | 10,000 | 1,000,000,000
Talent | Custodians | Designers
Data quality | Directional | Foundational
Data access | Pull | Push
Assessment | Physical | Statistical
Supply chain | Process | Strategic
Budget | Fixed cost | Rates
Architecture | Siloed | Integrated
Application integration | Loose | Tight
Infrastructure | Overhead | Enabler
Reach | Regional | Global

Dimension | Enterprise IT | Cloud-Scale
Hardware | Custom | Commodity
Deployment | Manual | Automated
Availability | Infrastructure | Service
Operability | Siloed | Integrated
Reliability | MTBF, hardware | MTTR, software
Security | Audit | Intrinsic
Network downtime | Impacting | Irrelevant
Network availability | 99.999% | 99.9%
Design | Primary/Backup | Active/Active
Deployment time | Weeks | Minutes
System admin | UI | API

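The contrast between 99.999% and 99.9% network availability, and between MTBF and MTTR as the reliability measure, can be put in numbers. The sketch below is illustrative and not taken from the slides: it uses the standard steady-state formula availability = MTBF / (MTBF + MTTR), and for the active/active case it assumes node failures are independent and that any single surviving node can carry the full load. All function names and sample values are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_minutes_per_year(availability: float) -> float:
    """Expected unavailable minutes per year at a given availability (0..1)."""
    return (1.0 - availability) * HOURS_PER_YEAR * 60


def availability_from_mtbf_mttr(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from mean time between failures and to repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def active_active_availability(node_availability: float, nodes: int) -> float:
    """Service availability when it is up as long as at least one node is up.

    Assumes independent failures, which real deployments only approximate.
    """
    return 1.0 - (1.0 - node_availability) ** nodes


if __name__ == "__main__":
    for a in (0.99999, 0.999):
        print(f"{a:.3%} -> {downtime_minutes_per_year(a):.1f} min/year")
    # Shortening repair time raises availability without touching MTBF.
    print(availability_from_mtbf_mttr(1000.0, 10.0))
    print(availability_from_mtbf_mttr(1000.0, 0.1))
    # Two 99.9% nodes in an active/active pair.
    print(active_active_availability(0.999, 2))
```

Under these assumptions, two 99.9% components in an active/active arrangement give a higher service availability than a single 99.999% component, which is the sense in which lower component availability can be acceptable when resilience moves into software.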
Think, design, deliver differently at cloud-scale
Cloud-Scale IS:
- Resilient services
- Solving hardware problems in software
- Self-healing solutions that route around failures
- Enabling the business
Cloud-Scale is NOT:
- Brittle software or networks
- Relying on redundancy or special-purpose hardware
- People watching screens and responding to alerts
- Reacting to the business

Resilient service architecture: software, hardware, operations, business
[Diagram labels] Developer; Operations; Public, Private, Hybrid Cloud; Hardware; Datacenter; Software; Features; Playbook; Abstraction Layer; Resources; Efficient Performance; Dev-Ops Model; Realistic availability assumptions; Chaos Monkey; Bug Fix; Incident Triage; Capabilities; Service Health; SLA Compliance; Incident & Event Management; Usage Model; Cost Control; Workload Placement; Capacity Advertisement; Real Time Availability; Maintenance Windows; Workload migration; Physical Placement; Integrated Automation; Full Stack TCO; Standards; Capacity; Supply Chain; Availability; Manual Processes; Decision Support; Runtime Telemetry; Machine Learning

Efficient by design
- Big data analytics: one million device sample pool
- Server optimization: remove unnecessary components
- Smart power choices: 415 volt distribution; high-efficiency DC conversions
- Elevated supply temperatures: 10 °C to 32 °C; time-weighted average < 24 °C
- Outside air cooling: chiller-less adiabatic cooling; extremely low energy consumption

Inlet temperature and impact on hard disk failure rates
Layouts compared: HDDs in front (ΔT 1 °C); buried HDDs design (ΔT 20 °C cold, de-rated to ΔT 10 °C hot)

Inlet temp | HDDs in front: case temp, relative AFR | Buried HDDs: case temp, relative AFR
10 °C (50 °F) | 11 °C, 100% | 30 °C, 100%
15 °C (59 °F) | 16 °C, 100% | 34 °C, 100%
20 °C (68 °F) | 21 °C, 100% | 38 °C, 100%
25 °C (77 °F) | 26 °C, 100% | 41 °C, 106%
30 °C (86 °F) | 31 °C, 100% | 45 °C, 131%
35 °C (95 °F) | 36 °C, 100% | 49 °C, 153%
40 °C (104 °F) | 41 °C, 106% | 53 °C, 189%
45 °C (113 °F) | 46 °C, 138% | 56 °C, 231%
50 °C (122 °F) | 51 °C, 179% | 60 °C, 281%

Source: S. Sankar, K. Vaid, M. Shaw, "Impact of Temperature on Hard Disk Drive Reliability in Large Datacenters," Microsoft, IEEE, 2011

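The table on this slide can be used as a lookup from inlet temperature to relative annualized failure rate (AFR) for the two drive placements. The sketch below encodes the slide's values and interpolates linearly between rows. The interpolation is an assumption made here for illustration, not something the slide or the cited paper states, and the names `relative_afr` and `c_to_f` are hypothetical.

```python
from bisect import bisect_right

# Inlet temperatures (°C) and relative AFR (%) as listed on the slide.
INLET_C = [10, 15, 20, 25, 30, 35, 40, 45, 50]
RELATIVE_AFR = {
    "front": [100, 100, 100, 100, 100, 100, 106, 138, 179],
    "buried": [100, 100, 100, 106, 131, 153, 189, 231, 281],
}


def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def relative_afr(inlet_c: float, layout: str) -> float:
    """Relative AFR (%) at an inlet temperature, interpolated between rows.

    Linear interpolation is an assumption; only the tabulated points
    come from the slide.
    """
    series = RELATIVE_AFR[layout]
    if not INLET_C[0] <= inlet_c <= INLET_C[-1]:
        raise ValueError("inlet temperature outside the tabulated range")
    if inlet_c == INLET_C[-1]:
        return float(series[-1])
    i = bisect_right(INLET_C, inlet_c) - 1
    x0, x1 = INLET_C[i], INLET_C[i + 1]
    y0, y1 = series[i], series[i + 1]
    return y0 + (y1 - y0) * (inlet_c - x0) / (x1 - x0)


if __name__ == "__main__":
    for t in (25, 35, 42.5):
        print(t, c_to_f(t), relative_afr(t, "front"), relative_afr(t, "buried"))
```

The lookup makes the design point visible: with drives at the front of the chassis, relative AFR stays at 100% up to a 35 °C inlet, while the buried layout already shows elevated failure rates at 25 °C.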
Energy innovation
In-rack fuel cell research
- Natural gas converted directly to electricity to power servers
- Wastewater treatment methane recovery pilot
Dramatic improvement in holistic efficiency
- "Beyond PUE" removes losses inherent in energy production and delivery
- Efficient energy supply chain from source to motherboard
Increased datacenter reliability
- Fewer moving parts, fewer potential points of failure
- Increased global commonality
Lower infrastructure costs
- Elimination of electrical distribution, power conditioning, and back-up infrastructure
Radically simplified supply chain delivers more data with less
[Diagram labels] Substation; Substation; Datacenter resources; Server; CPU; Datacenter; Fuel Cell; Server; CPU

Growing networks to cloud scale
Geo-redundant service/application design: all nodes active, all nodes stateless
[Chart: network device count growth]
Top 3 most connected networks in the world
Network tiers: DC-to-Internet backbone; DC-to-DC backbone (dark fiber); cache node; edge nodes
- Peer with over 2000 ISPs globally
- Multiple terabits, over 50 points of presence globally
- Global backbone connecting MS datacenters to the Internet
- Multiple terabits of capacity
- Dark fiber based DC-DC backbone to enable high bandwidth between datacenters
- Tens of thousands of route miles of owned dark fiber backbone
- Million+ 10G DWDM route miles of capacity deployed
- Hosting services collocated at user location (metro)
- Multiple terabits of edge interconnect capacity
- Directly connected to more than 2000 networks with over 4,000 connections
Decoupled DCs
- Separation of CPUs, storage, SQL services
- IT capacity unit = STAMP
- DC capacity unit or workload appliance

Infrastructure compliance capabilities
- ISO/IEC 27001:2005 certification
- SSAE 16/ISAE 3402 SOC 1, AT101 SOC 2 and 3
- HIPAA/HITECH
- Perimeter security
- Multi-factor authentication
- Fire suppression
- Extensive monitoring
- PCI Data Security Standard certification
- FedRAMP P-ATO, FISMA certification and accreditation
- Various state, federal, and international privacy laws: EU Data Protection Directive (95/46/EC), California SB1386

What's the benefit for customers?
✓ You can be confident our cloud services have the scale, reliability, and protection to meet your growing needs
✓ Our ongoing quest for cost-effective datacenter designs and operational efficiency helps us offer better value in our cloud
✓ Our approach in designing resilient software helps ensure your services are persistently available
✓ The investments we make allow us to continually advance our own operations, and our sharing improves the industry as a whole

Security at our foundation
- Product team coordination: TWC teams work with all Microsoft Cloud Services to review their security posture
- Threat models review: TWC teams analyze the product teams' threat models to verify that they are complete and current
- Security bugs review: all security bugs are reviewed and addressed using standard bug bars
- Tools use validation: TWC teams ensure that product teams have correctly and appropriately made use of the tools, documented code, and patterns and practices available to them
Operational Security Assurance and Standards Compliance
Phases: Training, Requirements, Design, Implementation, Verification, Release, Response

Information Security Management System
Management system:
- Predictable audit schedule
- Information security management forum
- Risk management program
- Information security policy program
- Compliance framework
Test and audit:
- ISO/IEC 27001:2005 certification
- SSAE 16/ISAE 3402 SOC 1
- AT101 SOC 2 and 3
- PCI DSS certification
- FedRAMP P-ATO, FISMA certification and accreditation
- And more

Governance and controls framework
[Diagram labels] Audits; Requirements; Test Plans; Microsoft Security Policy; Online Services Security Standards; Control Activities; Monitoring; Security Baselines & Narratives; Standard Operating Procedures

Control framework domains
1. Information Security Policies
2. Organization of Information Security
3. Human Resources Security
4. Asset Management
5. Access Control
6. Cryptography
7. Physical and Environmental Security
8. Operations Security
9. Communications Security
10. System Acquisition, Development, and Maintenance
11. Supplier Relationships
12. Information Security Incident Management
13. Information Security Aspects of Business Continuity Management
14. Compliance

Control framework structure
- 63 policy objectives
- 631 unique control activities
Attributes: audit requirements; control owner; documents/records; testing procedures; historical health data; importance data; maturity data

Requirements
Example: awareness training
CONTROL OBJECTIVE
Security awareness training for all employees, contractors, and third-party users must be provided:
- When granted access to resources
- When organizational policies and procedures change
Trainees will be expected to understand these policies and procedures as they relate to relevant job function and protection of sensitive information.
Mapped requirements:
- ISO/IEC 27001:2005 A.5.2.2
- SOX COBIT DS7
- HIPAA 164.308(a)(5)(i)
- PCI-DSS version 2.0 12.6.1

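The awareness-training example shows one control activity satisfying clauses in several standards at once. A minimal sketch of that mapping as a data structure follows. The class and function names are hypothetical and say nothing about how Microsoft's framework is actually implemented; the clause identifiers are copied from the slide as given.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ControlActivity:
    """One control activity and the external clauses it is mapped to."""
    name: str
    objective: str
    # standard name -> clause identifier, as listed on the slide
    requirements: Dict[str, str] = field(default_factory=dict)


def controls_by_standard(
    controls: List[ControlActivity],
) -> Dict[str, List[Tuple[str, str]]]:
    """Invert the mapping: for each standard, list (control name, clause)."""
    index: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for control in controls:
        for standard, clause in control.requirements.items():
            index[standard].append((control.name, clause))
    return dict(index)


awareness_training = ControlActivity(
    name="Security awareness training",
    objective=(
        "Provide security awareness training to all employees, contractors, "
        "and third-party users when access is granted and when policies change"
    ),
    requirements={
        "ISO/IEC 27001:2005": "A.5.2.2",
        "SOX COBIT": "DS7",
        "HIPAA": "164.308(a)(5)(i)",
        "PCI-DSS 2.0": "12.6.1",
    },
)

if __name__ == "__main__":
    for standard, hits in controls_by_standard([awareness_training]).items():
        print(standard, hits)
```

Inverting the mapping is what lets a single test of one control serve as audit evidence for each standard that references it.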
Defense-in-depth: infrastructure security
Security layers: Physical, Network, Host, Application, Data
- Identity and access management
- Configuration and vulnerability scanning
- 24x7x365 incident response

Infrastructure compliance capabilities (as of January 2014)
✓ ISO/IEC 27001:2005 certification
✓ SSAE 16/ISAE 3402 SOC 1, AT101 SOC 2 and 3 attestation
✓ HIPAA/HITECH
✓ PCI Data Security Standard certification
✓ FISMA certification and accreditation
✓ FedRAMP P-ATO by the Joint Authorization Board
✓ Various state, federal, and international privacy laws (95/46/EC, aka EU Data Protection Directive; California SB1386; etc.)

Cloud innovation
OPPORTUNITY FOR SECURITY & COMPLIANCE BENEFITS
Pre-adoption concern:
- 60% cited concerns around data security as a barrier to adoption
- 45% concerned that the cloud would result in a lack of data control
Benefits realized:
- 94% experienced security benefits they didn't previously have on-premise
- 62% said privacy protection increased as a result of moving to the cloud
Windows Azure:
- SECURITY: Design/Operation, Infrastructure, Network, Identity/access, Data
- PRIVACY
- COMPLIANCE

G4S & E-viper
Richard Wallace, Technology Director at G4S, explains: "Security is at the core of everything that we do. We believe Windows Azure is the safest environment we could use to host the eviper system - we conducted a 170-control point assessment and found that Windows Azure was more secure than our existing infrastructure partners. I often get told that I do not know where the data is held. The reality is that I know exactly where the data is held and I know that the data cannot be accessed by anyone other than G4S. By working on a Microsoft cloud we are saving money and also have the ability to scale up operations around the world without a huge capital outlay on data centre infrastructure."

Cloud customer compliance needs
- Customers are ultimately responsible for ensuring their compliance obligations are met
- Microsoft will share its certifications and audit reports to allow customers to establish reliance
Responsibility (divided between cloud customer and cloud provider across IaaS, PaaS, and SaaS):
- Data classification and accountability
- Application level controls
- Operating system controls
- Host level controls
- Identity and access management
- Network controls
- Physical security

Considerations for choice in cloud services provider
- Require that the provider has attained third-party certifications and audits, e.g., ISO/IEC 27001:2005
- Consider the ability of vendors to accommodate changing security and compliance requirements
- Know the value of your data and processes and the security and compliance obligations you need to meet
- Ensure data and services can be brought back in house if necessary
- Ensure a clear understanding of security and compliance roles and responsibilities for delivered services
- Require transparency in security policies and operations

Where is data stored?
- Clear data maps and geographic boundary information provided
- Will commit to data at rest in the United States and Europe
- Azure customers may specify the region where data is stored (Europe, US, Asia)
Who accesses and what is accessed?
- Core customer data accessed only for troubleshooting and malware prevention
- Core customer data access limited to key personnel on an exception basis
- Transparent about subcontractors
- Flow-through of commercial terms to our subcontractors
How to get notified and what do we share?
- We notify you of changes in data map information
- We notify you about new subcontractors
- We make available a summary of our audit reports upon request

What's the benefit for customers?
✓ Our investments in security technologies and procedures help protect information from unauthorized access, use, or disclosure
✓ With the increasing sophistication and volume of attacks, our risk-based controls help us to offer better protection at scale
✓ Our compliance framework, certifications, and attestations can support you in designing a program to meet your compliance needs
✓ These capabilities allow you to trust the cloud services we provide

Microsoft Datacenter Resources
- Microsoft Datacenters Web Site & Team Blogs: microsoft.com/datacenters
- Windows Azure Trust Center: windowsazure.com/trustcenter
- Office 365 Trust Center: trustoffice365.com

© 2014 Microsoft Corporation. All rights reserved. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.