IERG 4080 Building Scalable Internet-based Services

Department of Information Engineering, CUHK
Term 1, 2015/16
IERG 4080 Building Scalable Internet-based Services
Lecture 4: Load Balancing
Lecturer: Albert C. M. Au Yeung
30th September, 2015

Web Server + Application Server

[Figure: Client <-> (HTTP request / HTTP response) <-> Nginx Web server <-> Gunicorn WSGI server (the Python application server) running an application developed using Python + Flask]

WSGI
- WSGI stands for Web Server Gateway Interface
- It specifies the interface through which a server and an application communicate
- If an application is written according to the specification, it can run on any server developed according to the same specification
- Applications and servers that use the WSGI interface are said to be WSGI compliant
- WSGI applications can be stacked
Ref: http://wsgi.readthedocs.org/en/latest/

WSGI
Why WSGI?
- Web servers are not capable of running Python applications by themselves
- For Apache, there is a module named mod_python, which enables Apache to execute Python code
- However, mod_python is
  - not a standard specification
  - no longer under active development
- Hence, the Python community came up with WSGI as a standard interface for Python Web applications

WSGI
The Web server is configured to forward certain requests to the WSGI server (specified by a URL)
[Figure: Web Server, WSGI Server, WSGI Interface (callable objects), WSGI Application / Framework]
Ref: http://www.fullstackpython.com/wsgi-servers.html

WSGI Server vs. WSGI Application
Why do we have WSGI servers and WSGI applications? It is just another example of decoupling:
- Applications focus on how to get things done (e.g. business logic, updating databases, serving dynamic content, etc.)
- Servers focus on how to route requests, handle simultaneous connections, optimise computing resources, etc.
As a Web application developer, you can focus on developing the functions and features, without worrying about how to interface with the Web server

Communication between Server and App
When a new request comes to the WSGI server:
1. The server invokes the corresponding function in the application
2. Parameters are passed to the application using environment variables (the environ argument)
3. The server also provides a callback function to the application
4. The application processes the request
5. The application uses the callback function to send the status and headers, and returns the response body to the server

Example
A simple WSGI-compatible application that returns Hello World:

    def application(environ, start_response):
        # Send the status line and headers through the server's callback
        start_response('200 OK', [('Content-Type', 'text/plain')])
        # The response body is an iterable of byte strings
        yield b'Hello World\n'

- environ contains parameters that the server passes to the application (e.g. parameters in the query string)
- start_response is a callback function provided by the server; the application uses it to start the HTTP response (status and headers)
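The exchange between server and application can be tried without a real server. The following is a minimal sketch, assuming Python 3 and only the standard library; call_app is a hypothetical helper that plays the role of a toy WSGI server for a single request, and it is not how Gunicorn or any production server is implemented.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Hello World application, as on the slide."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    yield b'Hello World\n'


def call_app(app, path='/'):
    """Hypothetical helper: act as a toy WSGI server for one request."""
    environ = {}
    setup_testing_defaults(environ)  # fill in a minimal WSGI environ
    environ['PATH_INFO'] = path

    captured = {}

    def start_response(status, headers, exc_info=None):
        # The callback that the server hands to the application.
        captured['status'] = status
        captured['headers'] = headers

    # Iterating over the result runs the application and collects the body.
    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body


status, headers, body = call_app(application)
print(status, headers, body)
```

The helper only captures the status, headers and body; a real server would also write them to the client socket.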

Developing WSGI Applications
- You do not need to implement the WSGI interface directly in your application, as there are many frameworks that help you develop an application more easily
- In Assignment 1, you will build one using the Flask framework (http://flask.pocoo.org/)
  - Relatively easy to pick up
  - Debug mode that assists your development
  - Many plugins and modules

Developing WSGI Applications
Other options:
- Django (https://www.djangoproject.com/): a comprehensive Web framework following the model-view-controller (MVC) architectural pattern
- Bottle (http://bottlepy.org/): a micro-framework like Flask, but more lightweight and with no dependencies on other modules
For more, see https://wiki.python.org/moin/webframeworks/

WSGI Middleware
WSGI applications can be stacked. (Why do we want to do that?)
[Figure: a WSGI server, a WSGI middleware layer, and several WSGI applications]
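One way to picture stacking is a middleware object that is itself a WSGI application and wraps another one. The sketch below assumes Python 3; UppercaseMiddleware, HeaderMiddleware and the X-Served-By header are made-up names for illustration only.

```python
def hello_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello World\n']


class UppercaseMiddleware:
    """Hypothetical middleware: upper-cases the body of the wrapped app."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        for chunk in self.app(environ, start_response):
            yield chunk.upper()


class HeaderMiddleware:
    """Hypothetical middleware: appends one header to every response."""

    def __init__(self, app, name, value):
        self.app = app
        self.name = name
        self.value = value

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            extra = [(self.name, self.value)]
            return start_response(status, list(headers) + extra, exc_info)

        return self.app(environ, patched_start_response)


# Stack: server -> HeaderMiddleware -> UppercaseMiddleware -> hello_app
stacked = HeaderMiddleware(UppercaseMiddleware(hello_app), 'X-Served-By', 'demo')

seen = {}


def fake_start_response(status, headers, exc_info=None):
    # Stand-in for the callback a real WSGI server would provide.
    seen['status'] = status
    seen['headers'] = headers


body = b''.join(stacked({}, fake_start_response))
print(seen, body)
```

Each layer sees the same (environ, start_response) interface, so layers for logging, authentication or compression can be added without changing the innermost application.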

Load Balancing

Load Balancing
The act of distributing workloads across multiple computing nodes
Why?
- Avoid overloading a single node
- Maximise utilization of different nodes
- Optimise usage of resources
- Increase reliability and availability

Load Balancing - Overview
How should we implement this?
[Figure: Clients -> Load Balancer -> Servers]

Load Balancing
There are different ways to implement a load balancer:
- DNS / Hardware / Software
- Implement it on different layers
- Random / Round-robin / dynamic scheduling

Load Balancing Algorithms
- Random
- Round-Robin: distribute load equally by using a rotating scheme
- Weighted Round-Robin: a performance weight is assigned to each server
- Least Connections: send each request to the server with the fewest connections
- Fastest Response Time: select the server that responded in the shortest time
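The algorithms listed above can each be expressed in a few lines. This is an illustrative sketch only: the server names, weights and connection counts are made up, and the weighted variant uses a naive expansion rather than the smoother interleaving that real load balancers typically apply.

```python
import itertools
import random


def pick_random(servers):
    """Random: choose any server with equal probability."""
    return random.choice(servers)


def round_robin(servers):
    """Round-robin: iterate over the servers in a fixed rotating order."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Naive weighted round-robin: weight w means w turns per cycle."""
    expanded = [s for s, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def least_connections(active_connections):
    """Least connections: server with the fewest open connections."""
    return min(active_connections, key=active_connections.get)


def fastest_response(last_response_ms):
    """Fastest response time: server that answered most quickly."""
    return min(last_response_ms, key=last_response_ms.get)


rr = round_robin(['s1', 's2', 's3'])
rr_order = [next(rr) for _ in range(4)]

wrr = weighted_round_robin({'s1': 2, 's2': 1})
wrr_order = [next(wrr) for _ in range(4)]

print(rr_order, wrr_order)
print(least_connections({'s1': 5, 's2': 2, 's3': 7}))
print(fastest_response({'s1': 120.0, 's2': 85.5, 's3': 90.1}))
```

Random and round-robin need no state about the servers, whereas least connections and fastest response time require the balancer to track each server continuously.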

Server Health Checking
A load balancer might need to check whether the servers are operating normally and are able to give responses.
How can we check if one of the servers has died?
1. Observe the network traffic
2. Probe the server for a quick response
What are the pros and cons of these two? How can health checking be done?

Server Health Checking
- Ping: send an ICMP message to the server and check for a response
- TCP Connection: attempt to establish a TCP connection to the server (on port 80)
- HTTP GET (Header): check the status code of a GET request
- HTTP GET (Content): check the content returned by the server for a GET request
Question: What are the limitations of the first three methods?
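Two of the probes above (TCP connection and HTTP GET) can be sketched with the Python 3 standard library. The function names, timeouts and the expected_text parameter are assumptions for illustration; ICMP ping is left out because it normally needs raw-socket privileges.

```python
import socket
import urllib.error
import urllib.request


def tcp_check(host, port=80, timeout=2.0):
    """TCP check: healthy if a connection can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(url, expected_text=None, timeout=2.0):
    """HTTP GET check: status must be 200; optionally inspect the content."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            if resp.status != 200:
                return False
            if expected_text is None:
                return True  # header-only check
            page = resp.read().decode('utf-8', errors='replace')
            return expected_text in page  # content check
    except (urllib.error.URLError, OSError, ValueError):
        return False
```

A balancer would call such checks periodically, for example tcp_check('s1.myserver.com', 80), and take a server out of rotation after a few consecutive failures. The content check is the only one that notices a server which accepts connections but returns an error page with status 200.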

DNS Load Balancing

DNS Load Balancing
DNS - Domain Name System: a directory service of the Internet
[Figure: (1) the client asks the DNS server for www.cuhk.edu.hk; (2) the DNS server, which has an A record mapping www.cuhk.edu.hk to 137.189.11.173, answers 137.189.11.173; (3) the client sends an HTTP request to the CUHK Web server at 137.189.11.173]

DNS Load Balancing
A simple way of implementing load balancing:
- Create two or more A records in the DNS zone
- The DNS server sends the client a list of records in random order or in a round-robin fashion
- The client attempts to connect to the application server using the first IP address in the list

DNS
What are A records?
- Address record: maps a domain name to an IPv4 address
Other records:
- AAAA: domain name to IPv6 address
- CNAME: alias of a domain name
- MX: mail exchange (identifies the mail server of the domain)
- NS: name server record

DNS Load Balancing
Round-robin DNS Load Balancing
[Figure: the DNS server has two A records mapping www.cuhk.edu.hk to 137.189.11.173 and 137.189.11.171. Client A and Client B each receive both addresses, in a different order, and connect to CUHK Web Server 1 or CUHK Web Server 2 accordingly]
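The rotation can be simulated in a few lines. The sketch below is a toy model, not a DNS implementation: RoundRobinDNS is a hypothetical class that stores A records in memory and rotates the list after every query, with clients always using the first address returned.

```python
from collections import deque


class RoundRobinDNS:
    """Toy name server that rotates its A records on every query."""

    def __init__(self):
        self.zone = {}  # domain name -> deque of IPv4 addresses

    def add_a_record(self, name, ip):
        self.zone.setdefault(name, deque()).append(ip)

    def resolve(self, name):
        records = self.zone[name]
        answer = list(records)  # full list, in the current order
        records.rotate(-1)      # the next query starts with the next address
        return answer


dns = RoundRobinDNS()
dns.add_a_record('www.cuhk.edu.hk', '137.189.11.173')
dns.add_a_record('www.cuhk.edu.hk', '137.189.11.171')

answer_a = dns.resolve('www.cuhk.edu.hk')  # query from client A
answer_b = dns.resolve('www.cuhk.edu.hk')  # query from client B

# Each client connects to the first address in its answer.
print('Client A ->', answer_a[0])
print('Client B ->', answer_b[0])
```

In practice resolvers cache answers, so many clients behind the same resolver may see the same ordering until the record's TTL expires.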

DNS Load Balancing

    $ dig www.youtube.com

    ; <<>> DiG 9.9.5-3ubuntu0.5-Ubuntu <<>> www.youtube.com
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 33788
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 1

    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 512
    ;; QUESTION SECTION:
    ;www.youtube.com.               IN      A

    ;; ANSWER SECTION:
    www.youtube.com.          21423 IN      CNAME   youtube-ui.l.google.com.
    youtube-ui.l.google.com.  146   IN      A       64.233.187.190
    youtube-ui.l.google.com.  146   IN      A       64.233.187.136
    youtube-ui.l.google.com.  146   IN      A       64.233.187.91
    youtube-ui.l.google.com.  146   IN      A       64.233.187.93

DNS Load Balancing
Simple, but with a few limitations:
- Stickiness: it takes time to propagate changes in DNS records, so adding or removing servers can be slow
- Load cannot be balanced accurately due to DNS caching (e.g. when a lot of users are from the same ISP)
- It does not take into account transaction time, server load, network congestion, etc.
- No fault tolerance (the DNS server does not know whether a server is operating or not)

DNS Load Balancing
- Despite these limitations, DNS load balancing is an important method for achieving availability
- It is the only way to divert traffic to different data centres

Hardware / Software Load Balancing

Load Balancing on Different Layers
Load balancing can be done at different layers of the OSI model
Layer 4 (Transport Layer) Load Balancing
- Relatively simple
- Balances server load without inspecting the content of messages
- Distributes traffic based on the servers' response time
- Routing is based on inspecting the first packet of the data stream

Layer 4 Load Balancing
The load balancer forwards requests to one of the servers, and the servers send responses via the load balancer
[Figure: Layer 4 Load Balancer]

Layer 4 Load Balancing
- The client sends requests to an IP address (identifying the load balancer)
- The load balancer translates the destination address of the request to the IP address of one of the servers, and translates it back in the response (bi-directional NAT)
[Figure: Layer 4 Load Balancer]
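The bi-directional NAT described above can be modelled with dictionaries standing in for packets. This is a simplified sketch: Layer4Balancer is a hypothetical class, all addresses are examples, and a real balancer rewrites actual IP/TCP headers, usually in the kernel or in dedicated hardware.

```python
import itertools


class Layer4Balancer:
    """Toy model of bi-directional NAT on a virtual IP (VIP)."""

    def __init__(self, vip, servers):
        self.vip = vip
        self._next_server = itertools.cycle(servers)  # round-robin choice
        self.connections = {}  # (client ip, client port) -> server ip

    def inbound(self, packet):
        """Client -> server: rewrite the destination address."""
        key = (packet['src'], packet['sport'])
        if key not in self.connections:
            # Only the first packet of a connection triggers a decision.
            self.connections[key] = next(self._next_server)
        return dict(packet, dst=self.connections[key])

    def outbound(self, packet):
        """Server -> client: rewrite the source address back to the VIP."""
        return dict(packet, src=self.vip)


lb = Layer4Balancer('203.0.113.10', ['10.0.0.1', '10.0.0.2'])

request = {'src': '198.51.100.7', 'sport': 40000,
           'dst': '203.0.113.10', 'dport': 80}
forwarded = lb.inbound(request)
again = lb.inbound(request)  # later packet of the same connection

response = {'src': forwarded['dst'], 'sport': 80,
            'dst': '198.51.100.7', 'dport': 40000}
returned = lb.outbound(response)

print(forwarded['dst'], again['dst'], returned['src'])
```

The connection table is what keeps all packets of one TCP connection on the same server, while the client only ever sees the VIP.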

Layer 4 Load Balancing
What are the advantages of Layer 4 Load Balancing?
- Simple: easy to implement
- Efficient: the load balancer only inspects the first packet
However:
- It cannot maintain application session information
- It cannot route requests to different servers based on their content (e.g. requests for static content vs. dynamic content)

Layer 7 Load Balancing
Layer 7 (Application Layer) Load Balancing
- Application-level load balancing
- Parses requests at the application layer (e.g. HTTP), and distributes them to servers based on the content of the request (e.g. the URL or the cookie)
- Relatively high overhead in parsing the metadata
- Mostly HTTP (because of the popularity of Web apps)
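Content-based routing can be illustrated by choosing a server pool from the URL in the HTTP request line. The sketch below is a toy: Layer7Router, the URL prefixes and the server names are all hypothetical, and a real balancer would parse the complete request, including headers and cookies.

```python
import itertools


class Layer7Router:
    """Toy content-based router: pick a pool by URL prefix."""

    def __init__(self, pools, default_pool):
        # Check longer prefixes first so the most specific rule wins.
        ordered = sorted(pools.items(), key=lambda kv: len(kv[0]), reverse=True)
        self.rules = [(prefix, itertools.cycle(servers))
                      for prefix, servers in ordered]
        self.default = itertools.cycle(default_pool)

    def route(self, request_line):
        """request_line looks like 'GET /path HTTP/1.1'."""
        _method, path, _version = request_line.split(' ', 2)
        for prefix, pool in self.rules:
            if path.startswith(prefix):
                return next(pool)  # round-robin inside the matching pool
        return next(self.default)


router = Layer7Router(
    {'/static/': ['static1'], '/api/': ['app1', 'app2']},
    default_pool=['web1'],
)

choices = [
    router.route('GET /static/logo.png HTTP/1.1'),
    router.route('GET /api/users HTTP/1.1'),
    router.route('GET /api/orders HTTP/1.1'),
    router.route('GET / HTTP/1.1'),
]
print(choices)
```

Because the decision depends on the request content, the balancer must read and parse the request before choosing a server, which is where the extra overhead comes from.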

Layer 7 Load Balancing
HTTP requests are terminated at the Layer-7 switch
Ref: Cardellini, Casalicchio, Colajanni, and Yu. 2002. The state of the art in locally distributed Web-server systems. ACM Computing Surveys, 34, 2 (June 2002), 263-311.

Layer 7 Load Balancing
Characteristics of Layer 7 Load Balancing:
- More CPU-intensive (e.g. for parsing HTTP content)
- Can also apply compression, encryption or caching to the content to be delivered
- Does not require all servers in the backend to serve the same content (compare with Layer 4 load balancing)
- Layer 7 load balancers are also called Application Delivery Controllers

Reverse Proxy
Layer 7 load balancing is also called reverse proxying. Why?
What is a proxy?
[Figure: a proxy server, which sends requests to another server on behalf of the client]

Reverse Proxy
A reverse proxy retrieves resources from different servers on behalf of a client.
[Figure: a reverse proxy (load balancer) retrieving resources from different servers at the backend]

Abilities of Layer 7 Load Balancers
One important ability of a layer 7 load balancer is SSL termination
- The load balancer has to be able to decrypt the request in order to inspect the content
- It therefore must be configured with a valid SSL certificate
[Figure: HTTPS requests arrive at the load balancer, which passes unencrypted requests on to the servers]

Common Load Balancers
Many solutions are available in the market nowadays:
- Hardware LBs: products of F5, Citrix, Cisco, etc.
- Software LBs: Apache, Nginx, HAProxy, Squid, etc.
- Cloud-based services: Amazon Elastic Load Balancer (load balancing as a service)

Load Balancing With Nginx

Load Balancing using Nginx
- Nginx is a Web server but can also be configured as a reverse proxy server
- It can proxy requests to another HTTP server or a non-HTTP server
- It supports the following non-HTTP protocols: FastCGI, uwsgi, SCGI, memcached
- It can buffer responses from servers to improve performance (when the client is slow)

Configuring Nginx
- Nginx is configured by editing its configuration files
- In Ubuntu, configuration files are usually stored under /etc/nginx/
  - A main configuration file named nginx.conf
  - One or more configuration files for each of the sites hosted by the server (see /etc/nginx/sites-available and /etc/nginx/sites-enabled)
Ref: http://nginx.org/en/docs/http/load_balancing.html

Configuring Nginx
Basic structure of a configuration file for Nginx:

    # The main context
    user www-data;
    worker_processes 4;
    pid /run/nginx.pid;

    # The events context
    events {
        ...
    }

    # The http, server and location contexts
    http {
        server {
            location {
                ...
            }
        }
        ...
    }

Configuring Nginx
A simple setup for a Web server serving content from a particular directory:

    server {
        listen 80 default_server;
        listen [::]:80 default_server ipv6only=on;

        location / {
            root /data/www;
        }
    }

Assume that the server is up at http://www.myserver.com/; then a request to http://www.myserver.com/images/x.jpg will retrieve an image named x.jpg from the directory /data/www/images

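Configuring Nginx
Serving files from different directories:

    server {
        listen 80 default_server;
        listen [::]:80 default_server ipv6only=on;

        location / {
            root /data/www;
        }

        location /images/ {
            root /data/images;
        }
    }

Note that root is prepended to the full request URI, so with this configuration a request for /images/x.jpg is served from /data/images/images/x.jpg.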

Configuring Nginx
Configuring Nginx as a reverse proxy load balancer is simple. However, before that we introduce the term upstream.
[Figure: Load Balancer and Upstream Servers, with the downstream and upstream directions marked]

Configuring Nginx
Load balancing example in Nginx:

    http {
        upstream myservers {
            server s1.myserver.com;
            server s2.myserver.com;
            server s3.myserver.com;
        }

        server {
            listen 80;

            location / {
                proxy_pass http://myservers;
            }
        }
    }

Note: When no load balancing method is specified, the round-robin method is used

Configuring Nginx
The example in the previous slide is illustrated in the following diagram:
[Figure: Nginx forwarding requests to s1, s2 and s3]

Configuring Nginx
Load balancing using the least connections method:

    http {
        upstream myservers {
            least_conn;
            server s1.myserver.com;
            server s2.myserver.com;
            server s3.myserver.com;
        }

        server {
            listen 80;

            location / {
                proxy_pass http://myservers;
            }
        }
    }

Persistence
Persistence (Stickiness)
- Very often, we need the same server to serve the same client for a series of requests (why?)
- The round-robin and least connections methods do NOT guarantee that the same client will be served by the same server
- Persistence (or stickiness) refers to the ability of the load balancer to forward requests from the same client to the same server

Configuring Nginx
Use IP hashing as the load balancing method to achieve persistence in Nginx:

    http {
        upstream myservers {
            ip_hash;
            server s1.myserver.com;
            server s2.myserver.com;
            server s3.myserver.com;
        }

        server {
            listen 80;

            location / {
                proxy_pass http://myservers;
            }
        }
    }
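The idea behind IP hashing can be sketched as follows. This illustrates the general technique only and is not Nginx's actual hash function; pick_server is a hypothetical name, and SHA-256 is used simply because it is available in the Python standard library.

```python
import hashlib


def pick_server(client_ip, servers):
    """Map a client IP address to a server deterministically."""
    digest = hashlib.sha256(client_ip.encode('ascii')).digest()
    index = int.from_bytes(digest[:4], 'big') % len(servers)
    return servers[index]


servers = ['s1.myserver.com', 's2.myserver.com', 's3.myserver.com']

first = pick_server('198.51.100.7', servers)
second = pick_server('198.51.100.7', servers)
print(first, second)  # the same client always maps to the same server
```

Because the index is taken modulo the number of servers, adding or removing a server changes the mapping for many clients, which is one reason persistence is sometimes implemented with cookies instead.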

Configuring Nginx
Other functions include:
- Health checks of servers
- Buffering server responses
- Routing requests to applications (e.g. to a Python Web app)
You will explore these features in the assignments.
Ref: https://www.nginx.com/resources/admin-guide/load-balancer/

Assignments & Project

Assignment Plan
There will be a total of 5 assignments:
1. Building a simple Web application using Nginx, Python and Flask
2. Implementing a load balancer and experimenting with different settings
3. Supporting your application with databases and caches
4. Building an asynchronous task queue
5. Load testing your application

Assignment Plan
[Figure: system architecture (Client, Load Balancer, Nginx Web Server, Python Application, Database Server, Asynchronous Task Queue), with components labelled by assignment number]

Project
In the project, you will extend the system you built in the assignments into a functional service. You can choose from the following two topics:
- Personal notebook: for storing notes, articles, messages, etc. for users
- Personal album: for storing photos or images for users
You have flexibility in which functions you implement (more on the next slide)

Project
- In the project, there will be a list of required functions and a list of optional functions
- You must implement the required functions (e.g. user account, searching, add/remove)
- You get bonus points for implementing the optional functions (or functions not listed in the project specification)
- In addition, pick one function of your application, discuss its potential scalability problem, and implement a solution for it

End of Lecture 4