COM 444 Cloud Computing
Lec 4: Cloud Platform Architecture over Virtualized Data Centers
Data Center Design and Networking
Prof. Dr. Halûk Gümüşkaya
haluk.gumuskaya@gediz.edu.tr, haluk@gumuskaya.com
http://www.gumuskaya.com
Computer Engineering Department

Data Center Design and Networking
1. What is a Data Center?
2. What does a Data Center Look Like?
3. Warehouse-Scale Data Center Design
4. Power and Cooling Requirements
5. Data-Center Interconnection Networks
6. Design Considerations for WSC
7. Data Centers around the World

What is a Data Center (Cloud)?
A single-site cloud (aka datacenter) consists of:
- Compute nodes (grouped into racks)
- Switches connecting the racks
- A network topology, e.g., hierarchical
- Storage (backend) nodes connected to the network
- A front-end for submitting jobs
- Software services
A geographically distributed cloud consists of:
- Multiple such sites
- Each site perhaps with a different structure and services

What's New in Today's Clouds?
Four major features:
1. Massive scale
2. On-demand access: pay-as-you-go, no upfront commitment; anyone can access it
3. Data-intensive nature: what was MBs has now become TBs, PBs, and even EBs
4. New cloud programming paradigms: MapReduce/Hadoop, NoSQL/Cassandra/MongoDB, and many others; high accessibility and ease of programmability; lots of open-source projects
Servers on Clusters
Clusters: commodity computers connected by commodity Ethernet switches:
1. More scalable than conventional servers
2. Much cheaper than conventional servers (~20X cheaper for capacity equivalent to the largest servers)
3. Dependability via extensive redundancy
4. Few operators for 1000s of servers
- Careful selection of identical HW/SW
- Virtual machine monitors simplify operation

Data Center Design and Networking
1. What is a Data Center?
2. What does a Data Center Look Like?
3. Warehouse-Scale Data Center Design
4. Power and Cooling Requirements
5. Data-Center Interconnection Networks
6. Design Considerations for WSC
7. Data Centers around the World

What does a Datacenter Look Like?
Cloud is built on massive datacenters:
- Data centers and cooling plant (the size of a football field)
- Views from the front, back, and inside
- Some are highly secure (e.g., financial info)
Google data center in The Dalles, Oregon.
A single data center can easily contain 10,000 racks with 100 cores in each rack (1,000,000 cores total).
Cloud is built on Massive Datacenters
Data centers range in size from edge facilities to megascale (100K to 1M servers). This data center is 11.5 times the size of a football field.

What if even a Data Center is not Big Enough?
Build additional data centers. Where? How many?

Economies of Scale
Approximate costs for a small data center (~1K servers) and a larger, 400K-server data center:

Technology      | Cost in small data center   | Cost in large data center    | Ratio
Network         | $95 per Mbps/month          | $13 per Mbps/month           | 7.1
Storage         | $2.20 per GB/month          | $0.40 per GB/month           | 5.7
Administration  | ~140 servers/administrator  | >1000 servers/administrator  | 7.1

The larger the data center, the lower the operational cost.

Network of Data Centers: Global Distribution
Data centers are often globally distributed. Example: Google data center locations (inferred). For more info: http://www.google.com/about/datacenters/
Microsoft has about 100 data centers, large or small, distributed around the globe.
Why? Need to be close to users (physics!), cheaper resources, and protection against failures.

Trend: Modular Data Center: Warehouse-Scale Computer
Modular data center in shipping containers. Need more capacity? Just deploy another container!
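The cost ratios in the table above follow directly from the per-unit costs; a minimal sanity check using only the table's own numbers (the quoted ratios of 7.1 and 5.7 are the source book's figures, and small differences from the raw quotients are due to rounding of the underlying costs):

```python
# Economies of scale: per-unit cost ratios between a ~1K-server and a
# ~400K-server data center, taken from the table above.

small = {"network_per_mbps": 95.0, "storage_per_gb": 2.20, "servers_per_admin": 140}
large = {"network_per_mbps": 13.0, "storage_per_gb": 0.40, "servers_per_admin": 1000}

network_ratio = small["network_per_mbps"] / large["network_per_mbps"]    # ~7.3 (quoted as 7.1)
storage_ratio = small["storage_per_gb"] / large["storage_per_gb"]        # ~5.5 (quoted as 5.7)
admin_ratio   = large["servers_per_admin"] / small["servers_per_admin"]  # ~7.1

print(f"network: {network_ratio:.1f}x cheaper per Mbps")
print(f"storage: {storage_ratio:.1f}x cheaper per GB")
print(f"admin:   {admin_ratio:.1f}x more servers per administrator")
```

Every ratio favors the large facility, which is the quantitative core of the "larger is cheaper" claim.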
Warehouse-Scale Computer (WSC)
Provides Internet services: search, social networking, online maps, video sharing, online shopping, email, cloud computing, etc.
Differences with HPC clusters:
- HPC clusters have higher-performance processors and networks
- HPC clusters emphasize thread-level parallelism; WSCs emphasize request-level parallelism
Differences with data centers:
- Datacenters consolidate different machines and software into one location
- Datacenters emphasize virtual machines and hardware heterogeneity in order to serve varied customers

Larger Datacenter Growth
- One at a time: 1 system; racking & networking: 14 hrs ($1,330)
- Rack at a time: ~40 systems; install & networking: 0.75 hrs ($60)
- Container at a time: ~1,000 systems; no packaging to remove, no floor space required; power, network, & cooling only; weatherproof & easy to transport
Datacenter construction takes 24+ months. Both new builds and DC expansion require regulatory approval.

Data Center Videos to Watch
- Inside Google's Data Center (CBS News, November 2012): http://www.youtube.com/watch?v=pbx7rgqegg8
- A virtual walk through Facebook's datacenter in Prineville, Oregon (Facebook OpenCompute). Source: Gigaom article from 2012: http://gigaom.com/cleantech/a-rare-look-inside-facebooksoregon-data-center-photos-video/
- Microsoft GFS Datacenter Tour: http://www.youtube.com/watch?v=hoxa1l1pqiw
- Timelapse of a datacenter construction from the inside (Fortune 500 company)

Data Center Design and Networking
1. What is a Data Center?
2. What does a Data Center Look Like?
3. Warehouse-Scale Data Center Design
4. Power and Cooling Requirements
5. Data-Center Interconnection Networks
6. Design Considerations for WSC
7. Data Centers around the World
Racks
Equipment (e.g., servers) is typically placed in racks. Equipment is designed in a modular fashion to fit into rack units (1U, 2U, etc.). A single rack can hold up to 42 1U servers.
A blade server is a stripped-down computer with a modular design.

The Architecture of a Small Server Cluster
~1,000 servers interconnected by an Ethernet switch and housed in a warehouse or in a container environment:
- Server in 1U or blade enclosure format
- 7-foot rack with Ethernet switch
- Small cluster with a cluster-level Ethernet switch/router
Rack-level switches can use 1- or 10-Gbps links.
These are the typical elements in warehouse-scale systems.

Architecture of WSC
WSCs often use a hierarchy of networks for interconnection. The networking fabric of a WSC is often organized as a 2-level hierarchy:
- 1-Gbps Ethernet switches with up to 48 ports are essentially a commodity component, costing less than $30/Gbps per server to connect a single rack.
- Each rack holds up to 42 1U servers connected to a rack switch.
- Rack switches are uplinked to a switch higher in the hierarchy.
- The uplink has 48/n times lower bandwidth, where n = # of uplink ports.
- The goal is to maximize locality of communication relative to the rack.

Standard Data Center Networking for the Cloud to Access the Internet
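The "48/n times lower bandwidth" bullet above can be made concrete. A minimal sketch (the port counts are the slide's commodity-switch figures; the alternative value of n is an illustrative assumption):

```python
# Rack-switch oversubscription: 48 server-facing 1-Gbps ports share n uplink
# ports, so the uplink offers 48/n times less bandwidth than the servers can
# generate in aggregate.

def uplink_oversubscription(server_ports=48, uplink_ports=8, link_gbps=1):
    server_bw = server_ports * link_gbps  # aggregate traffic the servers can offer
    uplink_bw = uplink_ports * link_gbps  # capacity out of the rack
    return server_bw / uplink_bw

print(uplink_oversubscription())                # 6.0 with 8 uplink ports
print(uplink_oversubscription(uplink_ports=4))  # 12.0 with 4 uplink ports
```

This is why maximizing communication locality within the rack matters: traffic that stays below the rack switch never pays the oversubscription penalty.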
Data Center Networking
Load balancer: application-layer routing
- receives external client requests
- directs workload within the data center
- returns results to the external client (hiding data center internals from the client)
Rich interconnection among switches and racks:
- increased throughput between racks (multiple routing paths possible)
- increased reliability via redundancy
(Figure: border router, access router, load balancers, Tier-1 switches, Tier-2 switches, TOR switches, and server racks.)

Storage and Array Switch
Storage options:
- Use disks inside the servers, or
- Network Attached Storage (NAS) through Infiniband
WSCs generally rely on local disks. The Google File System (GFS) uses local disks and maintains at least 3 replicas.
Switch that connects an array of racks:
- An array switch should have 10X the bisection bandwidth of a rack switch.
- The cost of an n-port switch grows as n².
- Such switches often utilize content-addressable memory chips and FPGAs.
(Courtesy of Hennessy and Patterson, 2012)

Memory and Storage Hierarchy of a WSC
(Courtesy of Luiz Andre Barroso and Urs Hölzle, Google Inc., 2009)
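The "at least 3 replicas" policy above has a direct capacity cost; a minimal sketch (the rack size and disk counts are illustrative assumptions, not from the slide):

```python
# GFS-style replication: with R replicas of every block, usable capacity is
# raw capacity divided by R (ignoring metadata and spare-space overheads).

def usable_tb(num_disks, tb_per_disk, replicas=3):
    return num_disks * tb_per_disk / replicas

# Illustrative: a rack of 40 servers, each with four 1-TB local disks
raw_tb = 40 * 4 * 1
print(raw_tb, round(usable_tb(40 * 4, 1), 1))  # 160 TB raw -> ~53.3 TB usable
```

Triplication trades two-thirds of the raw disk space for the ability to survive disk, server, and even whole-rack failures without losing data.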
A Programmer's View of the Storage Hierarchy of a Typical WSC
A server consists of a number of processor sockets, each with a multicore CPU, an internal cache hierarchy, local shared and coherent DRAM, and a number of directly attached disk drives.
- The DRAM and disk resources within the rack are accessible through the first-level rack switches (assuming some sort of remote procedure call API to them).
- All resources in all racks are accessible via the cluster-level switch.

Bandwidth and Latency between these Layers: Performance Across Blades
Consider bandwidth and latency across blades. The network is usually the bottleneck.

Example: Quantifying Latency, Bandwidth, and Capacity
Assume a system with 2,000 servers, each with 8 GB of DRAM and four 1-TB disk drives. Each group of 40 servers is connected through a 1-Gbps link to a rack-level switch that has an additional eight 1-Gbps ports used for connecting the rack to the cluster-level switch.
Network latency numbers assume a socket-based TCP/IP transport, and networking bandwidth values assume that each server behind an oversubscribed set of uplinks is using its fair share of the available cluster-level bandwidth.
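The example above fixes every quantity needed to total up the cluster; a minimal sketch of the arithmetic, using only the numbers given:

```python
# Totals for the WSC example: 2,000 servers, 8 GB DRAM and four 1-TB disks
# each, 40 servers per rack on 1-Gbps links, 8 uplink ports per rack switch.

servers, dram_gb = 2000, 8
disks_per_server, tb_per_disk = 4, 1
servers_per_rack, uplink_ports = 40, 8

total_dram_gb = servers * dram_gb                         # 16,000 GB (~15.6 TB)
total_disk_tb = servers * disks_per_server * tb_per_disk  # 8,000 TB (8 PB)
racks = servers // servers_per_rack                       # 50 racks
oversub = (servers_per_rack * 1) / (uplink_ports * 1)     # 5.0x uplink oversubscription

print(total_dram_gb, total_disk_tb, racks, oversub)
```

The 5x oversubscription factor is exactly why the "fair share" caveat matters: a server's off-rack bandwidth is one fifth of its 1-Gbps link rate when every server in the rack is active.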
Latency, Bandwidth, and Capacity of a WSC

WSC Memory Hierarchy
Servers can access DRAM and disks on other servers using a NUMA-style interface.
(Courtesy of Hennessy and Patterson, 2012)

Data Center Design and Networking
1. What is a Data Center?
2. What does a Data Center Look Like?
3. Warehouse-Scale Data Center Design
4. Power and Cooling Requirements
5. Data-Center Interconnection Networks
6. Design Considerations for WSC
7. Data Centers around the World

Typical Datacenter Layout
Power Consumption in Servers: Power and Cooling Requirements
The cooling system also uses water (evaporation and spills), e.g., 70,000 to 200,000 gallons per day for an 8 MW facility.
Power cost breakdown:
- Chillers: 30-50% of the power used by the IT equipment
- Air conditioning: 10-20% of the IT power, mostly due to fans
How many servers can a WSC support?
- Each server's nameplate power rating gives its maximum power consumption.
- To get the actual figure, measure power under actual workloads.
- Oversubscribe cumulative server power by 40%, but monitor power closely.

Measuring Efficiency of a WSC
Power Utilization Effectiveness (PUE) = total facility power / IT equipment power
- The median PUE in a 2006 study was 1.69.
Performance:
- Latency is an important metric because it is seen by users.
- Bing study: users will use search less as response time increases.
- Service Level Objectives (SLOs) / Service Level Agreements (SLAs), e.g., 99% of requests must complete below 100 ms.

Efficiency of a WSC
Figure 4.9: The cooling system in a raised-floor data center with hot-cold air circulation supporting water heat exchange facilities (Courtesy of Hennessy and Patterson, 2012)
(Courtesy of Luiz Andre Barroso and Urs Hölzle, Google Inc., 2009)
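The PUE definition above is a single division; a minimal sketch (the facility power figures are illustrative assumptions chosen to reproduce the 1.69 median cited above):

```python
# PUE = total facility power / IT equipment power.  A PUE of 1.69 means
# 0.69 W of cooling and power-distribution overhead per watt of IT load.

def pue(total_facility_kw, it_equipment_kw):
    return total_facility_kw / it_equipment_kw

# Illustrative: a facility drawing 8,450 kW in total to run 5,000 kW of IT load
print(pue(8450, 5000))  # 1.69, the 2006 median
print(pue(5500, 5000))  # 1.10, typical of a modern, highly optimized WSC
```

An ideal facility would have PUE = 1.0; everything above 1.0 is power spent on chillers, fans, and distribution losses rather than computation.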
Green Cloud Data Centers
Keeping Computers Cool

Data Center Design and Networking
1. What is a Data Center?
2. What does a Data Center Look Like?
3. Warehouse-Scale Data Center Design
4. Power and Cooling Requirements
5. Data-Center Interconnection Networks
6. Design Considerations for WSC
7. Data Centers around the World

Requirements of the Interconnection Network
The data-center interconnection network design must meet five special requirements:
- Low latency
- High bandwidth
- Low cost
- Message-Passing Interface (MPI) communication support
- Fault tolerance
The design of an inter-server network must satisfy both point-to-point and collective communication patterns among all server nodes.
Application Traffic Support
The network topology should support all MPI communication patterns; both point-to-point and collective MPI communications must be supported. The network should have high bisection bandwidth to meet this requirement. For example, one-to-many communications are used to support distributed file access: one can use one or a few servers as metadata master servers, which need to communicate with slave server nodes in the cluster. To support the MapReduce programming paradigm, the network must be designed to perform the map and reduce functions at high speed.

Network Expandability
Data centers are not built by piling up servers in multiple racks today. Instead, data-center owners buy server containers, where each container holds several hundred or even thousands of server nodes. The owners can just plug in the power supply, the outside connection link, and the cooling water, and the whole system will just work. This is quite efficient and reduces the cost of purchasing and maintaining servers. One approach is to establish the connection backbone first and then extend the backbone links to reach the end servers.

Google Container-Based Data Center
http://www.youtube.com/watch?v=zrwpsfplx8i
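The bisection-bandwidth requirement above can be quantified: cut the network into two equal halves and ask how much bandwidth crosses the cut. A minimal sketch, reusing the rack shape of the earlier 2,000-server example (the comparison against uplink capacity is an illustrative assumption about where the bisection cut falls in a 2-level tree):

```python
# Full bisection bandwidth: splitting N servers into two halves should leave
# (N/2) * link_rate of capacity across the cut, so either half can drive its
# links at line rate toward the other half.

def full_bisection_gbps(servers, link_gbps=1):
    return servers // 2 * link_gbps

servers, servers_per_rack, uplink_gbps = 2000, 40, 8

print(full_bisection_gbps(servers))                     # 1000 Gbps needed for full bisection
racks = servers // servers_per_rack                     # 50 racks
print(racks // 2 * uplink_gbps)                         # ~200 Gbps of uplink on each side of the cut
```

An oversubscribed 2-level tree falls well short of full bisection, which is what motivates the fat-tree and BCube designs discussed in the following slides.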
Fault Tolerance and Graceful Degradation
The interconnection network should provide some mechanism to tolerate link or switch failures. In addition, multiple paths should be established between any two server nodes in a data center. Fault tolerance of servers is achieved by replicating data and computing among redundant servers; similar redundancy should apply to the network structure. On the software side, the software layer should be aware of network failures, and packet forwarding should avoid using broken links. In case of failures, the network structure should degrade gracefully amid limited node failures. Hot-swappable components are desired.

Two Approaches to Building Data-Center-Scale Networks
- Switch-centric: switches are used to connect the server nodes, and the design requires no modification on the server side.
- Server-centric: the design modifies the operating system running on the servers; special drivers are designed for relaying the traffic. Switches still have to be organized to achieve the connections.

A Fat-Tree Interconnection Network for Data Centers
- The failure of an aggregation switch or a core switch will not affect the connectivity of the whole network.
- The failure of an edge switch can only affect a small number of end server nodes.

A Fat-Tree Interconnection Network for Data Centers (cont.)
The topology is organized into two layers. Server nodes are in the bottom layer, and edge switches connect the nodes in the bottom layer. The upper layer aggregates the lower-layer edge switches. A group of aggregation switches, edge switches, and their leaf nodes forms a pod. Core switches provide paths among different pods.
The fat-tree structure provides multiple paths between any two server nodes. This provides fault-tolerance capability, with an alternate path in case of isolated link failures. The extra switches in a pod provide higher bandwidth to support cloud applications with massive data movement.
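The pod structure described above matches the standard k-ary fat tree (due to Al-Fares et al., not named on the slide), whose size is fixed entirely by the switch port count k; a sketch of the sizing arithmetic under that assumption:

```python
# k-ary fat tree from identical k-port switches: k pods, each with k/2 edge
# and k/2 aggregation switches; (k/2)^2 core switches; each edge switch
# serves k/2 servers, so the tree supports k^3/4 servers in total.

def fat_tree(k):
    assert k % 2 == 0, "switch port count must be even"
    return {
        "pods": k,
        "edge_switches": k * (k // 2),
        "agg_switches": k * (k // 2),
        "core_switches": (k // 2) ** 2,
        "servers": k ** 3 // 4,
    }

print(fat_tree(4))   # 4 pods, 4 core switches, 16 servers
print(fat_tree(48))  # commodity 48-port switches scale to 27,648 servers
```

The multiple equal-cost paths between pods come from the (k/2)^2 core switches: any single core or aggregation switch failure leaves alternate routes intact, exactly as the slide claims.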
Modular Data Center in Shipping Containers
A modern data center is structured as a shipyard of server clusters housed in truck-towed containers. Inside the container, hundreds of blade servers are housed in racks surrounding the container walls. The SGI ICE Cube container can house 46,080 processing cores or 30 PB of storage per container. A large-scale data center built with modular containers appears as a big shipping yard of container trucks.

Motivations for Container-Based Data Centers
The container-based data center was motivated by demand for:
- Lower power consumption
- Higher computer density
- Mobility, to relocate data centers to better locations with lower electricity costs, better cooling water supplies, and cheaper housing for maintenance engineers
Sophisticated cooling technology enables up to an 80% reduction in cooling costs compared with traditional warehouse data centers. Both chilled air circulation and cold water flow through the heat exchange pipes to keep the server racks cool and easy to repair.

Interconnection of Modular Data Centers
Container-based data-center modules are meant for the construction of even larger data centers using a farm of container modules.

A Server-Centric Network for a Modular Data Center
Among the proposed designs of container modules, Guo, et al. have developed a server-centric BCube network (next figure) for interconnecting modular data centers. The servers are represented by circles, and switches by rectangles. BCube provides a layered structure: the bottom layer contains all the server nodes, and they form Level 0; Level-1 switches form the top layer of BCube0.
Figure 4.12: BCube, a high-performance, server-centric network for building modular datacenters. (Courtesy of C. Guo, et al., ACM SIGCOMM Computer Communication Review, Oct. 2009 [25])
BCube provides a kernel module in the server OS to perform routing operations.
The kernel module supports packet forwarding when incoming packets are not destined for the current node. Such a modification of the kernel does not influence upper-layer applications.
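BCube's recursive construction (from the Guo et al. paper cited above) fixes how far a single container can scale; a sketch of the standard sizing, where BCube_k is built from n-port switches:

```python
# BCube sizing: BCube_0 is n servers on one n-port switch; BCube_k connects
# n copies of BCube_(k-1) through n^k additional switches.  The result has
# n^(k+1) servers, (k+1)*n^k switches, and k+1 NIC ports per server.

def bcube(n, k):
    return {
        "servers": n ** (k + 1),
        "switches": (k + 1) * n ** k,
        "ports_per_server": k + 1,
    }

print(bcube(8, 1))   # 64 servers, 16 switches, 2 NICs per server
print(bcube(48, 1))  # 2,304 servers in a single container
```

Because every server has k+1 NICs, each on a different switch level, the server-resident kernel module can route around a failed switch at any level, which is the fault-tolerance property the slides emphasize.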
Inter-Module Connection Networks
The BCube is commonly used inside a server container. The containers are considered the building blocks for data centers, so beyond the design of the intra-container network, one needs another level of networking among multiple containers. In the next figure, Wu, et al. have proposed a network topology for inter-container connection using the aforementioned BCube network as a building block. The proposed network is named MDCube (Modularized Datacenter Cube). This network connects multiple BCube containers by using the high-speed switches in the BCube; the MDCube is constructed by shuffle networks among multiple containers.

Modularized Datacenter Cube
Figure 4.13: A 2-D MDCube is constructed from 9 BCube containers. (Courtesy of Wu, et al., ACM CoNEXT 09, Dec. 2009 [77])

Data Center Design and Networking
1. What is a Data Center?
2. What does a Data Center Look Like?
3. Warehouse-Scale Data Center Design
4. Power and Cooling Requirements
5. Data-Center Interconnection Networks
6. Design Considerations for WSC
7. Data Centers around the World

Design Considerations for WSC
- Cost-performance: small savings add up.
- Energy efficiency: affects power distribution and cooling; work per joule.
- Dependability via redundancy.
- Network I/O.
- Interactive and batch processing workloads.
- Ample computational parallelism is not a concern: most jobs are totally independent (request-level parallelism).
- Operational costs count: power consumption is a primary constraint when designing the system.
- Scale brings both opportunities and problems: WSCs can afford customized systems, since they are purchased in volume.
WSCs Offer Economies of Scale
WSCs offer economies of scale that cannot be achieved with a conventional datacenter:
- 5.7 times reduction in storage costs
- 7.1 times reduction in administrative costs
- 7.3 times reduction in networking costs
This has given rise to cloud services such as Amazon Web Services:
- Utility computing
- Based on open-source virtual machine and operating system software
(Courtesy of Hennessy and Patterson, 2012)

Data-Center Management Issues
- Making common users happy: the data center should be designed to provide quality service to the majority of users for at least 30 years.
- Controlled information flow: information flow should be streamlined; sustained services and high availability (HA) are the primary goals.
- Multiuser manageability: the system must be managed to support all functions of a data center, including traffic flow, database updating, and server maintenance.
- Scalability to prepare for database growth: the system should allow growth as workload increases; the storage, processing, I/O, power, and cooling subsystems should all be scalable.

Data-Center Management Issues (cont.)
- Reliability in virtualized infrastructure: failover, fault tolerance, and VM live migration should be integrated to enable recovery of critical applications from failures or disasters.
- Low cost to both users and providers: the cost to users and providers of the cloud system built over the data centers should be reduced, including all operational costs.
- Security enforcement and data protection: data privacy and security defense mechanisms must be deployed to protect the data center against network attacks and system interrupts, and to maintain data integrity against user abuses or network attacks.
- Green information technology: saving power and upgrading energy efficiency are in high demand.

Challenges/Issues in Cloud Computing
Challenges in Cloud Computing (1)
Concerns from the industry (providers):
- Replacement cost: exponential increase in the cost to maintain the infrastructure
- Vendor lock-in: the lack of a standard API or protocol can be very serious
- Standardization: the lack of a standard metric for QoS limits popularity
- Security and confidentiality: a trust model for cloud computing is needed
- Control mechanism: users do not have any control over the infrastructure

Challenges in Cloud Computing (2)
Concerns from the research community:
- Conflicts with legacy programs: developing a new application is difficult due to lack of control
- Provenance: how to reproduce results on different infrastructures
- Reduction in latency: no specially designed interconnect is used, and there is very low controllability over the interconnect layout due to abstraction
- Programming model: hard to debug, and the programming is naturally error-prone; details about the infrastructure are hidden
- QoS measurement: especially for ubiquitous computing, where context changes

Data Center Design and Networking
1. What is a Data Center?
2. What does a Data Center Look Like?
3. Warehouse-Scale Data Center Design
4. Power and Cooling Requirements
5. Data-Center Interconnection Networks
6. Design Considerations for WSC
7. Data Centers around the World

Colocation Data Centers
Currently there are 3056 colocation data centers from 95 countries in the index.
http://www.datacentermap.com/datacenters.html
Colocation Turkey
Currently there are 29 colocation data centers from 7 areas in Turkey (Türkiye).
http://www.datacentermap.com/datacenters.html

Data Center Map
The data centers listed are just the ones updated by users and editors. In addition, corporate data centers are conspicuously missing from the list, for instance those set up by multinationals like Google, Microsoft, and Intel. However, the site offers a comprehensive list of data centers grouped country by country, which gives a clear picture of the distribution of datacenters globally.

Emerson Report: State of the Data Center 2011
Locations of Google Data Centers
http://www.google.com/about/datacenters/inside/locations/

Acknowledgements
These slides have been based in part upon the original slides of a number of books and professors, including:
- Distributed and Cloud Computing: From Parallel Processing to the Internet of Things, K. Hwang, G. Fox, and J. Dongarra, Morgan Kaufmann Publishers, 2012.
- The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, L. A. Barroso and U. Hölzle (Google Inc.), Mark D. Hill, series editor, Morgan & Claypool, 2009.
- High Performance Datacenter Networks: Architectures, Algorithms, and Opportunities, D. Abts and J. Kim, Mark D. Hill, series editor, Morgan & Claypool, 2011.