Scalable Internet Services and Load Balancing



Similar documents
Scalable Internet Services and Load Balancing

Lecture 3: Scaling by Load Balancing 1. Comments on reviews i. 2. Topic 1: Scalability a. QUESTION: What are problems? i. These papers look at

TRILL Large Layer 2 Network Solution

CS514: Intermediate Course in Computer Systems

1. Comments on reviews a. Need to avoid just summarizing web page asks you for:

Testing & Assuring Mobile End User Experience Before Production. Neotys

Database Scalability and Oracle 12c

Web Caching and CDNs. Aditya Akella

Large-Scale Web Applications

Agenda. Distributed System Structures. Why Distributed Systems? Motivation

Lustre Networking BY PETER J. BRAAM

How To Balance A Load Balancer On A Server On A Linux (Or Ipa) (Or Ahem) (For Ahem/Netnet) (On A Linux) (Permanent) (Netnet/Netlan) (Un

How To Understand The Power Of The Internet

Load Balancing and Sessions. C. Kopparapu, Load Balancing Servers, Firewalls and Caches. Wiley, 2002.

DEPLOYMENT GUIDE Version 1.1. DNS Traffic Management using the BIG-IP Local Traffic Manager

Why SSL is better than IPsec for Fully Transparent Mobile Network Access

Netsweeper Whitepaper

Internet Content Distribution

Network Virtualization and Data Center Networks Data Center Virtualization - Basics. Qin Yin Fall Semester 2013

SiteCelerate white paper

Table of Contents. Cisco How Does Load Balancing Work?

Application Delivery Networking

Server Traffic Management. Jeff Chase Duke University, Department of Computer Science CPS 212: Distributed Information Systems

Introduction: Why do we need computer networks?

Building a Highly Available and Scalable Web Farm

Load-Balancing Introduction (with examples...)

Content Delivery Networks

Cloud Computing. Chapter 4 Infrastructure as a Service (IaaS)

Flexible SDN Transport Networks With Optical Circuit Switching

Content Delivery Networks

Software Defined Networking What is it, how does it work, and what is it good for?

Network Management and Monitoring Software

Amazon Web Services Primer. William Strickland COP 6938 Fall 2012 University of Central Florida

Cluster Computing. ! Fault tolerance. ! Stateless. ! Throughput. ! Stateful. ! Response time. Architectures. Stateless vs. Stateful.

Measuring the Web: Part I - - Content Delivery Networks. Prof. Anja Feldmann, Ph.D. Dr. Ramin Khalili Georgios Smaragdakis, PhD

Cloud Computing. Chapter 8 Virtualization

SANE: A Protection Architecture For Enterprise Networks

The Sierra Clustered Database Engine, the technology at the heart of

Router Architectures

Non-blocking Switching in the Cloud Computing Era

ScaleArc idb Solution for SQL Server Deployments

Intel Ethernet Switch Load Balancing System Design Using Advanced Features in Intel Ethernet Switch Family

Data Center Content Delivery Network

White Paper A10 Thunder and AX Series Load Balancing Security Gateways

Cisco Application Networking for IBM WebSphere

GLOBAL SERVER LOAD BALANCING WITH SERVERIRON

Availability Digest. Redundant Load Balancing for High Availability July 2013

Exploiting Remote Memory Operations to Design Efficient Reconfiguration for Shared Data-Centers over InfiniBand

NLoad Balancing Stackable Switch

TRILL for Service Provider Data Center and IXP. Francois Tallet, Cisco Systems

How Solace Message Routers Reduce the Cost of IT Infrastructure

DATA COMMUNICATOIN NETWORKING

In Memory Accelerator for MongoDB

安 瑞 科 技 物 聯 網 對 應 用 交 付 器 (ADC) 的 需 求 及 應 用 實 例 徐 乃 丁 博 士 研 發 副 總 裁 / 技 術 長

TRILL for Data Center Networks

VMware Virtual SAN 6.2 Network Design Guide

Avaya P330 Load Balancing Manager User Guide

TÓPICOS AVANÇADOS EM REDES ADVANCED TOPICS IN NETWORKS

Software Defined Networking What is it, how does it work, and what is it good for?

Hewlett Packard - NBU partnership : SAN (Storage Area Network) или какво стои зад облаците

Global Server Load Balancing

Lecture 7: Data Center Networks"

Windows 8 SMB 2.2 File Sharing Performance

Datacenter Operating Systems

Mirror File System for Cloud Computing

Data Centers and Cloud Computing. Data Centers

Scalability of web applications. CSCI 470: Web Science Keith Vertanen

Network Attached Storage. Jinfeng Yang Oct/19/2015

Brocade One Data Center Cloud-Optimized Networks

Walmart s Data Center. Amadeus Data Center. Google s Data Center. Data Center Evolution 1.0. Data Center Evolution 2.0

High-Performance DNS Services in BIG-IP Version 11

Architecting Data Center Networks in the era of Big Data and Cloud

Data Center Infrastructure of the future. Alexei Agueev, Systems Engineer

How To Understand The Power Of A Content Delivery Network (Cdn)

Direct NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle

F5 and Oracle Database Solution Guide. Solutions to optimize the network for database operations, replication, scalability, and security

Binary search tree with SIMD bandwidth optimization using SSE

LSI SAS inside 60% of servers. 21 million LSI SAS & MegaRAID solutions shipped over last 3 years. 9 out of 10 top server vendors use MegaRAID

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

FWSM introduction Intro 5/1

Adapting Distributed Hash Tables for Mobile Ad Hoc Networks

AND Recorder 5.4. Overview. Benefits. Datenblatt

MEASURING WORKLOAD PERFORMANCE IS THE INFRASTRUCTURE A PROBLEM?

Superior Disaster Recovery with Radware s Global Server Load Balancing (GSLB) Solution

ZEN LOAD BALANCER EE v3.04 DATASHEET The Load Balancing made easy

Cloud Optimize Your IT

Lightweight DNS for Multipurpose and Multifunctional Devices

Load Balancing Security Gateways WHITE PAPER

Private Cloud Storage for Media Applications. Bang Chang Vice President, Broadcast Servers and Storage

MikroTik RouterOS Workshop Load Balancing Best Practice. Warsaw MUM Europe 2012

Accelerating Network Virtualization Overlays with QLogic Intelligent Ethernet Adapters

Getting More Performance and Efficiency in the Application Delivery Network

Transcription:

Scalable Services and Load Balancing Kai Shen Services brings ubiquitous connection based applications/services accessible to online users through Applications can be designed and launched quickly and immediately open to many many potential users Trends: Centralization of applications/services Scalability requirements: many simultaneous user accesses; large amount of hosted data, 11/12/2014 CSC 257/457 - Fall 2014 1 11/12/2014 CSC 257/457 - Fall 2014 2 Step 1 Crawling Crawling get all these Web pages out there: First retrieve some root pages; Parse their content and follow hyperlinks to retrieve more pages; Repeat the last step. There are many choices at each step of crawling Depth first search vs. breadth first search? Hyperlink with high likelihood pointing to a high quality page? High in links Pointed to from high quality pages Spam identification and avoidance Performance Analysis for Crawling What are the resources involved? CPU processing for TCP/HTTP protocol handling and the parsing of page content writing to disk storage network bandwidth to remote web sites Assume average page size 10KB raw processing power of a single CPU 1000 pages/sec I/O to a single disk 100 seeks/sec up to 100 pages/sec network bandwidth from/to the T1 link (1.5Mbit/s) 12 pages/sec T3 link (45Mbit/s) 360 pages/sec 11/12/2014 CSC 257/457 - Fall 2014 3 11/12/2014 CSC 257/457 - Fall 2014 4 1

Step 2 Indexing Step 3 Online Search Indexing crawled raw web pages are not easy to search. we index them to formats that are easy to search. As part of indexing, we need to give each page an ID using a hash function. Computer: Page #123 Page #357 Firewall Web server/ Query handler Local-area network Index server Networks: Page #124 Page #468 Page server Fast intersection? Scalability, reliability 11/12/2014 CSC 257/457 - Fall 2014 5 11/12/2014 CSC 257/457 - Fall 2014 6 Partitioning and Replication Index servers (partition 1) Data Center Networking Thousands of servers with hierarchical network switching Local-area network Index servers (partition 2) Web server/ Query handlers Link or network layer protocols? Redundant backbone links Equal Cost Multi Path (ECMP) routing Page servers 11/12/2014 CSC 257/457 - Fall 2014 7 11/12/2014 CSC 257/457 - Fall 2014 8 2

Load Balancing over Servers Popular sites like Google or Facebook receive tens or hundreds of millions of hits per day. A large number of replicated servers are used at these sites. Key question: how to balance client requests over these servers? Goals: Performance Ease of deployment 11/12/2014 CSC 257/457 - Fall 2014 9 Load Balancing on Servers Technique 1 DNS Rotation IP address of Facebook.com? IP address of Facebook.com? DNS server 11/12/2014 CSC 257/457 - Fall 2014 10 Discussions on DNS Rotation Advantages Require almost no change on the existing architecture Load Balancing on Servers Technique 2 Cooperative Offloading Problems DNS Caching Rigid load balancing policy can t balance based on runtime load changes slow or no adjustment in response to failures 11/12/2014 CSC 257/457 - Fall 2014 11 11/12/2014 CSC 257/457 - Fall 2014 12 3

Discussions on Cooperative Offloading Can be combined with the DNS rotation. Advantages: More flexible policy is possible Be more responsive to runtime workload and server failures (to a certain degree) Problems: Need software changes on servers Longer delay 11/12/2014 CSC 257/457 - Fall 2014 13 Cooperative Offloading with TCP Handoff [Pai et al. ASPLOS1998] What does 1.3 do? What does 1.4 do? What does client do? 1.3 1.3 1.4 for CNN.com Where is the TCP state? All packets in a TCP connection must offload to one server! 11/12/2014 CSC 257/457 - Fall 2014 14 TCP Handoff vs. Cooperative Offloading Load Balancing on Servers Technique 3 Load Balancing Delays Improvement over cooperative offloading but not perfect Software changes on the servers Not perfect but improvement over cooperative offloading 1.1 1.1 Firewall LB 128.111.1.1 1.2 1.2 All packets in a TCP connection must offload to one server. for CNN.com 11/12/2014 CSC 257/457 - Fall 2014 15 11/12/2014 CSC 257/457 - Fall 2014 16 4

More About Load Balancing Summary Load balancing policies in LB routers (Goal: transparency, plugand play) Simple rotation Least number of active connections Shortest response time How deep do we look into the network protocol stack? Network layer (IP)? Transport layer (TCP/UDP)? Application layer? Scalable servers partitioning replication Load balancing for servers DNS rotation Cooperative offloading (w. TCP handoff) Load balancing router Changes required on the components: DNS server?? Web server?? Client???? 11/12/2014 CSC 257/457 - Fall 2014 17 11/12/2014 CSC 257/457 - Fall 2014 18 5