LARGE SCALE INTERNET SERVICES



Similar documents
Chapter 2 TOPOLOGY SELECTION. SYS-ED/ Computer Education Techniques, Inc.

Web Application Hosting Cloud Architecture

Multiple Public IPs (virtual service IPs) are supported either to cover multiple network segments or to increase network performance.

Large-Scale Web Applications

Content Delivery Networks (CDN) Dr. Yingwu Zhu

Scalability of web applications. CSCI 470: Web Science Keith Vertanen

Cisco Application Networking for IBM WebSphere

How To Balance A Load Balancer On A Server On A Linux (Or Ipa) (Or Ahem) (For Ahem/Netnet) (On A Linux) (Permanent) (Netnet/Netlan) (Un

Web Caching and CDNs. Aditya Akella

Availability Digest. Redundant Load Balancing for High Availability July 2013

Techniques for Scaling Components of Web Application

Web DNS Peer-to-peer systems (file sharing, CDNs, cycle sharing)

POWER ALL GLOBAL FILE SYSTEM (PGFS)

Testing & Assuring Mobile End User Experience Before Production. Neotys

MEASURING WORKLOAD PERFORMANCE IS THE INFRASTRUCTURE A PROBLEM?

How To Understand The Power Of A Content Delivery Network (Cdn)

Wikimedia architecture. Mark Bergsma Wikimedia Foundation Inc.

Performance Prediction, Sizing and Capacity Planning for Distributed E-Commerce Applications

Distributed Systems. 25. Content Delivery Networks (CDN) 2014 Paul Krzyzanowski. Rutgers University. Fall 2014

High Performance Cluster Support for NLB on Window

Wikimedia Architecture Doing More With Less. Asher Feldman Ryan Lane Wikimedia Foundation Inc.

Content Delivery Networks. Shaxun Chen April 21, 2009

AppDirector Load balancing IBM Websphere and AppXcel

Chapter 10: Scalability

White Paper. ThinRDP Load Balancing

Building a Highly Available and Scalable Web Farm

AN EFFICIENT LOAD BALANCING ALGORITHM FOR A DISTRIBUTED COMPUTER SYSTEM. Dr. T.Ravichandran, B.E (ECE), M.E(CSE), Ph.D., MISTE.,

Internet Content Distribution

Why do Internet services fail, and what can be done about it? 1

Siemens PLM Connection. Mark Ludwig

Oracle WebLogic Foundation of Oracle Fusion Middleware. Lawrence Manickam Toyork Systems Inc

LOAD BALANCING TECHNIQUES FOR RELEASE 11i AND RELEASE 12 E-BUSINESS ENVIRONMENTS

Avaya P333R-LB. Load Balancing Stackable Switch. Load Balancing Application Guide

Deployment Topologies

Distributed Systems. 23. Content Delivery Networks (CDN) Paul Krzyzanowski. Rutgers University. Fall 2015

Exploiting Remote Memory Operations to Design Efficient Reconfiguration for Shared Data-Centers over InfiniBand

Cloud Based Application Architectures using Smart Computing

The Critical Role of an Application Delivery Controller

High-Availability and Scalability

High Availability Technical Notice

Migration and Building of Data Centers in IBM SoftLayer with the RackWare Management Module

High Availability Implementation for JD Edwards EnterpriseOne

Background. Industry: Challenges: Solution: Benefits: APV SERIES CASE STUDY Fuel Card Web Portal

Centrata IT Management Suite 3.0

1. Comments on reviews a. Need to avoid just summarizing web page asks you for:

Migration and Building of Data Centers in IBM SoftLayer with the RackWare Management Module

Load Balancing Web Applications

Oracle Collaboration Suite

Private Cloud Storage for Media Applications. Bang Chang Vice President, Broadcast Servers and Storage

.NET UI Load Balancing & Failover

CommuniGate Pro White Paper. Dynamic Clustering Solution. For Reliable and Scalable. Messaging

Enterprise GIS Architecture Deployment Options. Andrew Sakowicz

Amazon Web Services Yu Xiao

This presentation covers virtual application shared services supplied with IBM Workload Deployer version 3.1.

Load-Balancing Introduction (with examples...)

Chapter 1 - Web Server Management and Cluster Topology

DEPLOYMENT GUIDE Version 1.1. Deploying F5 with IBM WebSphere 7

Network Virtualization and Data Center Networks Data Center Virtualization - Basics. Qin Yin Fall Semester 2013

From Internet Data Centers to Data Centers in the Cloud

Gladinet Cloud Enterprise

Common Server Setups For Your Web Application - Part II

Exploring Oracle E-Business Suite Load Balancing Options. Venkat Perumal IT Convergence

Building a Systems Infrastructure to Support e- Business

CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level. -ORACLE TIMESTEN 11gR1

.NET UI Load Balancing & Clustering

Measuring the Web: Part I - - Content Delivery Networks. Prof. Anja Feldmann, Ph.D. Dr. Ramin Khalili Georgios Smaragdakis, PhD

安 瑞 科 技 物 聯 網 對 應 用 交 付 器 (ADC) 的 需 求 及 應 用 實 例 徐 乃 丁 博 士 研 發 副 總 裁 / 技 術 長

DATA COMMUNICATOIN NETWORKING

Distributed Systems. 24. Content Delivery Networks (CDN) 2013 Paul Krzyzanowski. Rutgers University. Fall 2013

LinuxWorld Conference & Expo Server Farms and XML Web Services

Fixed Price Website Load Testing

Web Hosting. Definition. Overview. Topics. 1. Overview of the Web

Ecomm Enterprise High Availability Solution. Ecomm Enterprise High Availability Solution (EEHAS) Page 1 of 7

IBM EXAM QUESTIONS & ANSWERS

How To Run A Web Farm On Linux (Ahem) On A Single Computer (For Free) On Your Computer (With A Freebie) On An Ipv4 (For Cheap) Or Ipv2 (For A Free) (For

Synology High Availability (SHA)

Scalable Internet Services and Load Balancing

High Availability with Postgres Plus Advanced Server. An EnterpriseDB White Paper

Overlay Networks. Slides adopted from Prof. Böszörményi, Distributed Systems, Summer 2004.

BASICS OF SCALING: LOAD BALANCERS

Introduction to Azure: Microsoft s Cloud OS

Building High Performance, High-Availability Clusters

Best Practices for Implementing High Availability for SAS 9.4

Accelerating High-Speed Networking with Intel I/O Acceleration Technology

VERITAS Cluster Server v2.0 Technical Overview

be architected pool of servers reliability and

SCALABLE DATA SERVICES

InterWorx Clustering Guide. by InterWorx LLC

Networking Topology For Your System

Transcription:

1 LARGE SCALE INTERNET SERVICES 2110414 Large Scale Computing Systems Natawut Nupairoj, Ph.D.

Outline 2 Overview Background Knowledge Architectural Case Studies Real-World Case Study

3 Overview

Overview 4 Internet services become very essential and popular Google serves hundreds of millions of search requests per day Main requirements Availability Scalability

Internet Service Application Characteristics 5

6 Background Knowledge

Multi-Tier Architecture 7

Web Based Architecture Revisited 8 search.jsp Web Browser Request Response Web Server search.jsp - params HTML page AppServer

9

System Availability How to ensures a certain absolute degree of operational continuity during a given measurement period Availability includes ability of the user community to access the system, whether to submit new work, update or alter existing work, or collect the results of previous work Model of Availability Active-Standby: HA Cluster or Failover Cluster Active-Active: Server Load Balancing 2110684 - Basic Infrastructure

HA Cluster Redundant servers and other components Only one server is active (master) One server is standing-by Shared storages Pro: Simple Half software license costs Con: Double hardware cost with single performance 2110684 - Basic Infrastructure

Server Load Balancing 12 Spread work between two or more computers, network links, CPUs, hard drives, or other resources, in order to get optimal resource utilization, throughput, or response time Approaches DNS Round-Robin Reverse Proxy Load Balancer 2110684 - Basic Infrastructure

DNS Round-Robin 13

DNS Round-Robin 14 Pro: Inexpensive Con: Load distribution, but not high availability Problem with DNS caching

Reverse Proxy 15 2110684 - Basic Infrastructure

Server Load Balancing 16 Special equipment load balancer to distribute request to servers Clients will see only single virtual host based on virtual IP 2110684 - Basic Infrastructure

Stateful vs. Stateless Servers 17 Stateful server server maintains some persistent data Allow current request to relate to one of the earlier requests, session Stateless server server does not keep data A request is independent from earlier requests Example: Web server, NFS 2110684 - Basic Infrastructure

Stateful Servers 18 connect use db1 select * from disconnect Database Server Example: Database, FTP Server has to maintain some session information of each connection Current request may depend on previous requests Consume server s resources (memory, TCP port, etc.) Lead to limit number of clients it can service If connection is broken, the service is interrupted 2110684 - Basic Infrastructure

Stateless Servers 19 connect Server does not maintain information of each connection GET /index.html disconnect connect GET /i/logo.jpg Web Server Connect-request-replydisconnect cycle Consume less server s resources disconnect Lead to large number of clients it can service Example: Web server, NFS 2110684 - Basic Infrastructure

Web Caching 20 Utilize the fact that LAN has more bandwidth and less accessing latency than WAN t = accessing latency + data size / bandwidth Web pages usually have some popularity User usually goes back-and-forth between pages Users tend to share the same interest (fashion)

Web Page Popularity 21 Source: http://www.useit.com/alertbox/zipf.html

Web Caching Mechanism 22 Source: http://en.wikibooks.org/wiki/computer_networks/http

Web Caching Location 23 Source: http://knowledgehub.zeus.com/articles/2009/08/05/cache_your_website_for_just_a_second

24 Architectural Case Studies D. Oppenheimer and D. Patterson, Architecture and Dependability of Large-Scale Internet Services, IEEE Internet Computing, Sept-Oct 2002 2110684 - Basic Infrastructure

Case Studies 25 Online - an online service/internet portal (Hotmail) Content - a global content-hosting service (File sharing) ReadMostly - a high-traffic Internet service with a very high read-to-write ratio (Wikipedia)

Site Architecture 26 Load balancing servers Front-end servers Run stateless codes to service requests and gather data from back-end servers Web server / AppServer Back-end servers Provide persistent data (databases, files, emails, user profiles) Should utilize RAID-based storages

Online Site Front-end: functional partitioned Back-end: single file, single database 27

Front-end: all the same Back-end: data partitioned 28

ReadMostly Front-end: all the same Back-end: full replication 29

30 Real-World Case Study: ebay R. Shoup and D. Pritchett, The ebay Architecture, SD Forum 2006

ebay 31 Lots of workloads 212 millions registered users 1 billion page views a day 2 petabytes of data Large number of servers 15,000 AppServers (IBM WebSphere) 100 database servers (Oracle) Utilize Akamai (CDN) for static contents

CDN: Akamai 32 Source: http://en.wikipedia.org/wiki/akamai_technologies Reduce bottlenecks by utilizing geographic Client gets contents from the nearest servers (geographically)

ebay Architecture Design Principles 33 Application Tier Segmented by function Horizontal load-balancing Minimize dependencies Data Tier Data partitioned by functional areas Minimize database work No stored procedure / business logic in database Move CPU-intensive work to applications (no join, sort, etc.) AppServers are cheap, databases are bottlenecks

ebay Architecture 34 Source: R. Shoup and D. Pritchett, The ebay Architecture, SD Forum 2006

References 35 D. Oppenheimer and D. Patterson, Architecture and Dependability of Large-Scale Internet Services, IEEE Internet Computing, Sept-Oct 2002 S. Hanselman, A reminder on "Three/Multi Tier/Layer Architecture/Design" brought to you by my late night frustrations, http://www.hanselman.com/blog/areminderonthreemultitierlayerarchitecturedesi gnbroughttoyoubymylatenightfrustrations.aspx, June 2004 R. Shoup and D. Pritchett, The ebay Architecture, SD Forum 2006