Concepts Introduced in Chapter 6: Warehouse-Scale Computers




Concepts Introduced in Chapter 6: Warehouse-Scale Computers

- introduction to warehouse-scale computing
- programming models
- infrastructure and costs
- cloud computing

A cluster is a collection of desktop computers or servers connected by a local area network to act as a single larger computer. A warehouse-scale computer (WSC) is a cluster comprising tens of thousands of servers. The cost may be on the order of $150M for the building, the electrical and cooling infrastructure, and the servers and networking equipment; such a facility houses 50,000 to 100,000 servers.

A WSC can be used to provide internet services:
- search - Google
- social networking - Facebook
- video sharing - YouTube
- online sales - Amazon
- cloud computing services - Rackspace
- and many more applications

Important Design Factors for WSCs

WSC goals and requirements in common with servers:
- cost-performance - work done per dollar
- energy efficiency - work done per joule
- dependability via redundancy
- network I/O
- interactive and batch-processing workloads

WSC aspects that are distinct from servers:
- Ample parallelism is always available in a WSC.
- Operational costs represent a greater fraction of the total cost of a WSC.
- Customization is easier at the scale of a WSC.

Programming Models for WSCs

MapReduce (or the open-source Hadoop) is the most popular framework for batch processing in a WSC. Map applies a programmer-supplied function to each logical input record to produce a set of key-value pairs. Reduce collapses these values using another programmer-supplied function. Both tasks are highly parallel.
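The map-then-reduce flow described above can be sketched in a few lines of sequential Python. This is a minimal illustration of the programming model using a word-count job, not Hadoop's actual API; the function names are illustrative.

```python
from collections import defaultdict

def map_fn(record):
    # Map: programmer-supplied function emitting (key, value) pairs
    # for each logical input record.
    return [(word, 1) for word in record.split()]

def reduce_fn(key, values):
    # Reduce: programmer-supplied function collapsing all values for a key.
    return key, sum(values)

def map_reduce(records):
    groups = defaultdict(list)
    for record in records:             # map phase (parallel across records in a WSC)
        for key, value in map_fn(record):
            groups[key].append(value)  # shuffle: group values by key
    return dict(reduce_fn(k, v) for k, v in groups.items())  # reduce phase

print(map_reduce(["the cat", "the dog"]))  # {'the': 2, 'cat': 1, 'dog': 1}
```

In a real WSC, the map calls run in parallel on thousands of servers and the shuffle moves key groups over the network, but the contract between the two programmer-supplied functions is exactly this.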

Programming Models for WSCs (cont.)

There is often high variability in performance between different WSC servers for a variety of reasons:
- varying load on servers
- a file may or may not be in a file cache
- distance over the network can vary
- hardware anomalies

A WSC will start backup executions on other nodes when tasks have not yet completed and take the result that finishes first. A WSC also has to cope with variability in load, both on individual servers and across the entire WSC. WSC services are often implemented with in-house software to reduce costs and optimize for performance.

Storage for a WSC

A WSC uses local disks inside the servers as opposed to network-attached storage (NAS), and relies on data (file) replication to help with read performance and availability. The Google File System (GFS) uses local disks and maintains at least three replicas to improve dependability, covering not only disk failures but also power failures to a rack or a cluster of racks by placing the replicas on different clusters. A read is serviced by one of the three replicas, but a write has to go to all three replicas.

WSC Networking

A WSC uses a hierarchy of networks for interconnection. The standard rack holds 48 servers connected by a 48-port Ethernet switch. A rack switch has 2 to 8 uplinks to a higher-level switch, so the bandwidth leaving the rack is 6 (48/8) to 24 (48/2) times less than the bandwidth within the rack. More expensive array switches allow higher connectivity, and there may also be Layer 3 routers to connect the arrays together and to the Internet. The goal of the software is to maximize locality of communication relative to the rack.

WSC Location

- proximity to Internet backbone optical fibers
- proximity to users of the service, to reduce Internet access latency
- electricity availability and cost
- property tax rate
- low risk from environmental disasters
- stability of the country
- low temperature, to decrease cooling cost
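The 6x-to-24x oversubscription figures for rack uplinks follow directly from the port counts quoted above. A small sketch of the arithmetic, assuming each uplink has the same bandwidth as a server port:

```python
def rack_oversubscription(server_ports, uplinks):
    # Ratio of aggregate intra-rack bandwidth to bandwidth leaving the rack,
    # assuming uplink ports and server ports run at the same speed.
    return server_ports / uplinks

# 48-server rack with 8 uplinks: traffic leaving the rack sees 6x less bandwidth.
print(rack_oversubscription(48, 8))  # 6.0
# Same rack with only 2 uplinks: 24x less bandwidth.
print(rack_oversubscription(48, 2))  # 24.0
```

This ratio is why the software tries to keep communication local to the rack: a job whose traffic crosses rack switches competes for 1/6 to 1/24 of the bandwidth it would have inside the rack.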

WSC Power and Cooling

Power usage of just the WSC IT equipment:
- 33% for processors
- 30% for DRAM
- 10% for disks
- 5% for networking
- 22% for other components within the servers

Air conditioning is used to cool the server room, requiring 10% to 20% of IT equipment power, due mostly to fans. Chilled water is often used to cool the air, requiring 30% to 50% of IT equipment power. Outside cooling towers can leverage lower outside temperatures.

Measuring WSC Efficiency

Power: power utilization effectiveness (PUE) is a widely used, simple metric:

    PUE = total_facility_power / IT_equipment_power

The median PUE reported in a 2006 study was 1.69.

Performance: Bandwidth is an important metric, as there may be many simultaneous user requests or metadata-generation batch jobs. Latency is also an important metric, as it is seen by users when they make requests. Users will use a search engine less as the response time increases. Users are also more productive in responding to interactive information when the system response time is faster, as they are less distracted.

Google WSC Innovations to Improve Energy Efficiency

- Modified server containers.
- Separated hot and cold chambers to reduce variation in air temperature, which allows air to be delivered at higher temperatures due to less severe worst-case hot spots. Operating servers at higher temperatures allowed use of cooling towers instead of the more inefficient traditional chillers.
- Shrunk the distance of the air circulation loop to reduce the energy required to move air.
- Located WSCs in more temperate climates to allow more use of evaporative cooling.
- Deployed extensive monitoring to measure actual PUE.
- Designed motherboards that need only a single 12-volt supply, so that a UPS could be provided using standard batteries with each server.

Google's PUE was 1.23 in 2007 and 1.12 in 2011.

Cost of a WSC
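The PUE formula is a single division, but a worked example makes the scale concrete. The 8 MW / 5 MW facility below is hypothetical, chosen only to land near the 1.69 median from the 2006 study:

```python
def pue(total_facility_power_kw, it_equipment_power_kw):
    # PUE = total facility power / IT equipment power.
    # 1.0 is the ideal: every watt delivered goes to IT equipment;
    # the excess is cooling, power distribution losses, lighting, etc.
    return total_facility_power_kw / it_equipment_power_kw

# Hypothetical facility drawing 8 MW total to power 5 MW of IT equipment:
print(pue(8000, 5000))  # 1.6
```

By this metric, Google's reported 1.12 in 2011 means only 12% overhead beyond the IT equipment itself, versus 69% for the 2006 median facility.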
- Capital expenditures (CAPEX): the cost to build a WSC, which includes the building, the power and cooling infrastructure, and the initial IT equipment (servers and networking equipment).
- Operational expenditures (OPEX): the cost to operate a WSC, which includes buying replacement equipment, electricity, and salaries.
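Because CAPEX is paid up front and OPEX accrues over time, cost comparisons usually amortize CAPEX into a monthly figure. A rough sketch of that calculation; the 10-year lifetime and $1M/month OPEX are hypothetical assumptions (real models amortize the building and the IT equipment over different lifetimes), and only the $150M order of magnitude comes from the text above:

```python
def monthly_cost(capex, amortization_years, monthly_opex):
    # Spread the up-front CAPEX evenly over its amortization period
    # and add the recurring monthly OPEX.
    return capex / (amortization_years * 12) + monthly_opex

# Hypothetical: $150M facility amortized over 10 years, $1M/month OPEX.
print(round(monthly_cost(150e6, 10, 1e6)))  # 2250000
```

Even with these crude assumptions, the amortized CAPEX ($1.25M/month here) is comparable to the OPEX, which is why operational costs loom so much larger for a WSC than for individual servers.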

Advent of Cloud Computing

Cloud computing can be thought of as providing computing as a utility, where customers pay for only what they use, just as we do for electricity. Cloud computing relies on increasingly larger WSCs, which provide several benefits if properly set up and operated:
- improvements in operational techniques
- economies of scale
- reduced customer risk of over-provisioning or under-provisioning

Improvements in Operational Techniques

WSCs have led to innovation in system software to provide high reliability:
- failover - automatically restarting an application that fails, without requiring administrative intervention
- firewall - examining each network packet to determine whether or not it should be forwarded to its destination
- virtual machine - a software layer that executes applications like a physical machine
- protection against denial-of-service attacks

Economies of Scale

WSCs offer economies of scale that cannot be achieved with a conventional data center:
- 5.7 times reduction in storage costs
- 7.1 times reduction in administrative costs
- 7.3 times reduction in networking costs
- volume-discount price reductions
- a PUE of perhaps 1.2, versus a PUE of 2.0 for a data center
- better utilization of the WSC by making it available to the public

Reducing Customer Risks

WSCs reduce the risks of over-provisioning or under-provisioning, particularly for start-up companies. Providing too much equipment means overspending. Providing too little equipment means demand may not be met, which can give a bad impression to potential new customers.

Amazon Web Services

Amazon offered Amazon Simple Storage Service (Amazon S3) and Amazon Elastic Compute Cloud (Amazon EC2) in 2006.
- Relied on virtual machines, which provided better protection for users and simplified software distribution within a WSC. The ability to reliably kill a virtual machine made it easier to control resource usage, and being able to limit use of resources simplified providing multiple price points for customers. Virtual machines also improved flexibility in server configuration.
- Relied on open-source software.
- Provided service at very low cost.
- No contract required.

Fallacies and Pitfalls

- Fallacy: capital costs of a WSC facility are higher than those of the servers it houses.
- Pitfall: trying to save power with inactive low-power modes versus active low-power modes.
- Pitfall: using too wimpy a processor when trying to improve WSC cost-performance.
- Fallacy: replacing all disks with Flash memory will improve the cost-performance of a WSC.