Load-balancing web servers
Presented at AAU by Peter Dolog, Fall 2009, lecture 5, Web Engineering
Scalable Internet Services, Fall 2006, Thorsten von Eicken, Department of Computer Science, University of California at Santa Barbara

Problem statement
- One web server isn't enough
  - Scaling performance
  - Tolerating failures
  - Rolling upgrades
- Making many web servers look like one
  - Users can't tell the difference
  - Search engines can't tell the difference
  - (Servers can't tell the difference)
- Why it is hard
  - Keeping the data replicated and consistent
  - Redundant sites have multiple locations

Multiple concerns
- Directing traffic globally to datacenters
- Directing traffic locally to servers
- Managing data replication and consistency

Solution #1: redirect
- Idea: redirect to aux servers
  - Each server has its own name (www1.foo.com, www2.foo.com, etc.)
  - www.foo.com redirects to one of the others
- Example:

    [buddy /] telnet foo.com 80
    Trying 216.64.159.149...
    Connected to foo.com.
    Escape character is '^]'.
    GET / http/1.0

    HTTP/1.1 301 Moved Permanently
    Date: Thu, 13 Apr 2000 06:13:48 GMT
    Server: Apache/1.3.9 (Unix) secured_by_raven/1.4.1 ApacheJServ/1.1b1
    Location: http://www1.foo.com/index.html
    Connection: close
    Content-Type: text/html

    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <HTML><HEAD>
    <TITLE>301 Moved Permanently</TITLE>
    </HEAD><BODY>
    <H1>Moved Permanently</H1>
    The document has moved <A HREF="http://www1.foo.com/index.html">here</A>.<P>
    <HR>
    <ADDRESS>Apache/1.3.9 Server at 216.64.159.149 Port 80</ADDRESS>
    </BODY></HTML>
    Connection closed by foreign host.

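The redirect idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the back-end list reuses the www1.foo.com naming pattern from the slide, the round-robin choice is one possible policy, and `redirect_location`, `Redirector`, and `serve` are hypothetical names, not part of any product.

```python
import itertools
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumed back-end names, following the www1.foo.com pattern from the slide.
BACKENDS = ["www1.foo.com", "www2.foo.com", "www3.foo.com", "www4.foo.com"]
_next_backend = itertools.cycle(BACKENDS)


def redirect_location(path):
    """Return the absolute URL for `path` on the next back-end, round-robin."""
    return "http://%s%s" % (next(_next_backend), path)


class Redirector(BaseHTTPRequestHandler):
    """Answers every GET with a 301, as in the telnet session on the slide."""

    def do_GET(self):
        self.send_response(301)
        self.send_header("Location", redirect_location(self.path))
        self.send_header("Connection", "close")
        self.end_headers()


def serve(port=8080):
    """Run the redirector until interrupted (port number is an assumption)."""
    HTTPServer(("", port), Redirector).serve_forever()
```

Calling `serve()` would start the redirector. Any other selection policy can be swapped into `redirect_location`, which is the customization advantage of this approach.
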
Network design
- Assumptions
  - Co-location facility offers Ethernet uplink
  - Each server has one Ethernet interface
- [Diagram: switch; hosts www, www1, www2, www3, www4; label: redirector]

Redirect
- Advantages
  - Easy to implement
  - Can customize load balancing algorithm
  - Location independent
- Disadvantages
  - Which machine is www? What if it goes down?
  - Visible to user: bookmarks, search engines, ...

Solution #2: round-robin DNS
- Idea: round-robin DNS
  - Each web server has its own IP address
  - Map www.yahoo.com to a different IP each time
- Example:

    [buddy /] host www.yahoo.com
    www.yahoo.com CNAME www.yahoo.akadns.net
    www.yahoo.akadns.net A 204.71.200.68
    www.yahoo.akadns.net A 204.71.202.160
    www.yahoo.akadns.net A 204.71.200.74
    www.yahoo.akadns.net A 204.71.200.75
    www.yahoo.akadns.net A 204.71.200.67
    [buddy /] host www.yahoo.com
    www.yahoo.com CNAME www.yahoo.akadns.net
    www.yahoo.akadns.net A 204.71.200.75
    www.yahoo.akadns.net A 204.71.200.67
    www.yahoo.akadns.net A 204.71.200.68
    www.yahoo.akadns.net A 204.71.202.160
    www.yahoo.akadns.net A 204.71.200.74
    [buddy /]

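The rotation visible in the two `host` answers can be mimicked with a small sketch. It assumes the server rotates the record list by one position per query. The real step size is a server policy (the two answers above differ by three positions), and `RoundRobinZone` is a hypothetical name.

```python
from collections import deque


class RoundRobinZone:
    """Holds the A records for one name and rotates them on every answer."""

    def __init__(self, addresses):
        self._records = deque(addresses)

    def answer(self):
        """Return all A records, then rotate so the next answer starts one later."""
        result = list(self._records)
        self._records.rotate(-1)  # move the first record to the end
        return result


# Addresses taken from the example output on the slide.
zone = RoundRobinZone([
    "204.71.200.68", "204.71.202.160", "204.71.200.74",
    "204.71.200.75", "204.71.200.67",
])
first = zone.answer()
second = zone.answer()
```

A client that always connects to the first record therefore lands on a different server after each resolution, as long as no cache in between replays an old answer.
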
Network design
- Assumptions
  - Co-location facility offers Ethernet uplink
  - Each server has one Ethernet interface
- [Diagram: switch; hosts ns1, www (1), www (2), www (3), www (4); label: DNS server]

Round-robin DNS
- Advantages
  - Easy
  - Cheap
  - Can customize DNS
- Disadvantages
  - Caching of DNS resolutions
    - DNS records carry a TTL (time to live) field, in seconds
    - A low TTL means many DNS resolutions/sec
  - Proxies
    - Affect many users at once (lowers load balancing granularity)

Browser/OS DNS behavior
- Problem
  - gethostbyname (and other DNS APIs) don't return TTLs (or any other info beyond IP address)
  - Windows 9x doesn't have a DNS resolver cache
- If multiple DNS A records are provided
  - Order of multiple A records is preserved (netlib)
  - If IP address currently in use fails, use next IP address. Repeat.
  - If all IP addresses fail, produce "server not responding" error

Caching & round-robin
- All DNS records have a TTL (time to live)
  - All DNS responses contain the TTL
  - Caching DNS servers compute the remaining TTL
- DNS caches / caching DNS servers
  - Some ISPs used to override TTLs (e.g. force to 60 minutes)
  - Very uncommon now: too many web sites depend on low TTLs
  - DNS traffic is very low anyway
- Windows
  - Win2000 & XP contain a DNS caching service that honors TTLs
    - ipconfig /displaydns
    - ipconfig /flushdns
  - But pre-SP2 does not re-resolve within 2 minutes, and this is reset for every refresh
  - Windows 9x had no DNS cache
- Internet Explorer
  - IE6 on Win2000 & XP does not cache DNS A records
    - Caches CNAME records
  - IE4, 5, 6 on Win9x cache A records for 30 minutes irrespective of TTL
  - IE3 caches DNS records for 24 hours irrespective of TTL
- Firefox/Mozilla/Netscape
  - Cache DNS records for 1 minute irrespective of TTL
  - Earlier versions cached DNS records for 15 minutes irrespective of TTLs

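The client-side behavior with multiple A records (try the addresses in order, move on when one fails, give up with "server not responding") can be sketched as follows. The function name and the injectable `connect` parameter are assumptions made so the logic can be exercised without a network. Real browsers implement this internally.

```python
import socket


def connect_first_available(addresses, port,
                            connect=socket.create_connection, timeout=3.0):
    """Try each address in the order given.

    Returns (address, connection) for the first address that accepts the
    connection, and raises ConnectionError if every address fails.
    """
    last_error = None
    for address in addresses:
        try:
            return address, connect((address, port), timeout)
        except OSError as exc:  # refused, timed out, unreachable, ...
            last_error = exc
    raise ConnectionError("server not responding") from last_error
```

Because the order of the records decides which server is tried first, anything that reorders or caches the list changes where the load goes.
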
Solution #3: load-balancing switch
- AKA: TCP load balancing
- Idea:
  - Rewrite TCP packets to direct them to one of many back-end servers
  - Smart NAT device
- Implementation
  - Products:
    - Cisco Content Services Switch (formerly Arrowpoint)
    - Citrix Netscaler
    - F5 Big IP
  - Use ASICs to perform packet rewriting (for performance)
- Note: most of these products can act as a load balancing proxy as well

Network design
- Assumptions
  - Co-location facility offers Ethernet uplink
  - Each server has one Ethernet interface
  - Load-bal switch has 2 Ethernet interfaces
  - Load-bal switch uses NAT (network address translation)
- [Diagram: public IP on the load bal switch; servers www1, www2, www3, www4, www5 behind it]

NAT: network address translation
- Purpose: change IP address of source/dest
  - Home/office use: allow many hosts to share public IP address
  - Datacenter use: hide many servers behind public IP addresses
- NAT device changes headers on the fly:
  - Server IP address
  - Server ports
  - Server TCP sequence numbers
- How to load balance based on HTTP header info
  - E.g.: break up URI namespace, session persistence, ...
  - HTTP URI arrives in 3rd packet from client (typically)
  - Solution: load balancer accepts connection and later NATs through

TCP connection set-up w/ NAT

client -> switch
    2.94950 216.64.159.149 -> 208.50.157.136 IP D=208.50.157.136 S=216.64.159.149 LEN=60, ID=48397
    2.94950 216.64.159.149 -> 208.50.157.136 TCP D=80 S=1421 Syn Seq=899863543 Len=0 Win=32120
switch -> client
    2.95125 208.50.157.136 -> 216.64.159.149 IP D=216.64.159.149 S=208.50.157.136 LEN=48, ID=26291
    2.95125 208.50.157.136 -> 216.64.159.149 TCP D=1421 S=80 Syn Ack=899863544 Seq=1908949446 Len=0
client -> switch
    2.98324 216.64.159.149 -> 208.50.157.136 IP D=208.50.157.136 S=216.64.159.149 LEN=40, ID=48400
    2.98324 216.64.159.149 -> 208.50.157.136 TCP D=80 S=1421 Ack=1908949447 Seq=899863544 Len=0
client -> switch
    2.98395 216.64.159.149 -> 208.50.157.136 IP D=208.50.157.136 S=216.64.159.149 LEN=154, ID=48401
    2.98395 216.64.159.149 -> 208.50.157.136 TCP D=80 S=1421 Ack=1908949447 Seq=899863544 Len=114
    2.98395 216.64.159.149 -> 208.50.157.136 HTTP GET /eb/images/ec_home_logo_tag.gif HTTP/1.0
switch -> server
    0.00000 216.64.159.149 -> 10.16.100.121 IP D=10.16.100.121 S=216.64.159.149 LEN=48, ID=26292
    0.00000 216.64.159.149 -> 10.16.100.121 TCP D=80 S=1421 Syn Seq=899863543 Len=0 Win=32120 Options
server -> switch
    0.00001 10.16.100.121 -> 216.64.159.149 IP D=216.64.159.149 S=10.16.100.121 LEN=44, ID=22235
    0.00001 10.16.100.121 -> 216.64.159.149 TCP D=1421 S=80 Syn Ack=899863544 Seq=2156657894 Len=0
switch -> server
    0.00131 216.64.159.149 -> 10.16.100.121 IP D=10.16.100.121 S=216.64.159.149 LEN=154, ID=48401
    0.00131 216.64.159.149 -> 10.16.100.121 TCP D=80 S=1421 Ack=2156657895 Seq=899863544 Len=114
    0.00131 216.64.159.149 -> 10.16.100.121 HTTP GET /eb/images/ec_home_logo_tag.gif HTTP/1.0
server -> switch
    0.00134 10.16.100.121 -> 216.64.159.149 IP D=216.64.159.149 S=10.16.100.121 LEN=40, ID=22236
    0.00134 10.16.100.121 -> 216.64.159.149 TCP D=1421 S=80 Ack=899863658 Seq=2156657895 Len=0
switch -> client
    2.98619 208.50.157.136 -> 216.64.159.149 IP D=216.64.159.149 S=208.50.157.136 LEN=40, ID=22236
    2.98619 208.50.157.136 -> 216.64.159.149 TCP D=1421 S=80 Ack=899863658 Seq=1908949447 Len=0
server -> switch
    0.00298 10.16.100.121 -> 216.64.159.149 IP D=216.64.159.149 S=10.16.100.121 LEN=1500, ID=22237
    0.00298 10.16.100.121 -> 216.64.159.149 TCP D=1421 S=80 Ack=899863658 Seq=2156657895 Len=1460
    0.00298 10.16.100.121 -> 216.64.159.149 HTTP HTTP/1.1 200 OK
switch -> client
    2.98828 208.50.157.136 -> 216.64.159.149 IP D=216.64.159.149 S=208.50.157.136 LEN=1500, ID=22237
    2.98828 208.50.157.136 -> 216.64.159.149 TCP D=1421 S=80 Ack=899863658 Seq=1908949447 Len=1460
    2.98828 208.50.157.136 -> 216.64.159.149 HTTP HTTP/1.1 200 OK

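The trace shows why the switch must rewrite sequence numbers: it answered the client's SYN with its own initial sequence number (1908949446) before the server picked a different one (2156657894). A minimal sketch of the translation, using only numbers from the trace, follows. The function names are hypothetical, and a real device also rewrites addresses and checksums.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits

SWITCH_ISN = 1908949446  # Seq in the switch -> client SYN-ACK
SERVER_ISN = 2156657894  # Seq in the server -> switch SYN-ACK

# Constant offset between the two sequence spaces for this connection.
OFFSET = (SERVER_ISN - SWITCH_ISN) % MOD


def server_seq_to_client(seq):
    """Rewrite a Seq from the server into the numbering the client expects."""
    return (seq - OFFSET) % MOD


def client_ack_to_server(ack):
    """Rewrite an Ack from the client into the server's numbering."""
    return (ack + OFFSET) % MOD


# The 200 OK leaves the server with Seq=2156657895 and reaches the client
# with Seq=1908949447, matching the last two packets of the trace.
rewritten = server_seq_to_client(2156657895)
```

The client's own sequence numbers (899863543 onward) pass through unchanged, because the switch replays the client's original SYN to the server.
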
Solution #4: load-balancing proxy
- AKA: Layer 7 load balancing
- Idea: use a reverse proxy in front of web servers
  - Terminate HTTP requests: act like a web server
  - Issue back-end HTTP requests to real web servers to get responses
- Pros/cons:
  - Allows for clean implementation, not stitching connections together
  - Requires more resources
- Implementations:
  - Many hardware products: Netscaler, BigIP F5, ...
    - Use ASICs for SSL acceleration
  - Many web server proxy modules: apache, lighttpd, ...

Network design
- Assumptions
  - Co-location facility offers Ethernet uplink
  - Each server has one Ethernet interface
- [Diagram: proxy and switch; servers www (1), www (2), www (3), www (4)]
- Actually, (some) servers could be remote!

Connection pooling
- Idea: multiplex many client connections onto few server connections
  - In addition, buffer responses
- [Diagram: clients, lb, server]
- Benefits:
  - Avoid TCP (& SSL) set-up
  - Reduce idle connection state on servers
  - Reduce write-out time on servers

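A rough sketch of the pooling idea: a fixed set of back-end connections is opened once and shared by all client requests. `BackendPool`, `open_connection`, and `send` are hypothetical names, and the connection objects are whatever the caller supplies. A production proxy would also handle broken connections and response buffering.

```python
import queue


class BackendPool:
    """Keeps `size` back-end connections open and lends them out per request."""

    def __init__(self, open_connection, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(open_connection())  # set-up cost is paid once here

    def acquire(self, timeout=None):
        """Borrow an idle connection, blocking until one is free."""
        return self._idle.get(timeout=timeout)

    def release(self, connection):
        """Return a connection so another client request can reuse it."""
        self._idle.put(connection)

    def forward(self, request, send):
        """Send one client request over a pooled connection."""
        connection = self.acquire()
        try:
            return send(connection, request)
        finally:
            self.release(connection)
```

With this structure the number of TCP (and SSL) set-ups toward a server equals the pool size, no matter how many client connections arrive.
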
Detecting server failures
- Observing traffic
  - Are requests being serviced?
  - Problems: some requests simply take long (e.g. back-end connection to remote service)
- Probing the server
  - Various protocols (what do they check?):
    - ICMP ping: test network & kernel
    - TCP connection set-up: process is running
    - HTTP HEAD (or GET): is serving pages
    - SNMP metrics: server load
  - Probe parameters
    - Interval
    - Failure count
    - Failure retry

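The probe parameters can be made concrete with a small state machine. This sketch makes two assumptions that the slide does not spell out: "failure count" is read as the number of consecutive failed probes before a server is marked down, and "failure retry" as the number of consecutive successful probes before it is readmitted. `tcp_probe` and `HealthTracker` are hypothetical names. The caller is expected to run the probe once per interval.

```python
import socket


def tcp_probe(host, port, timeout=1.0):
    """TCP connection set-up probe: True if a process accepts the connection."""
    try:
        with socket.create_connection((host, port), timeout):
            return True
    except OSError:
        return False


class HealthTracker:
    """Tracks one server's up/down state from a stream of probe results."""

    def __init__(self, failure_count=3, retry_count=2):
        self.failure_count = failure_count  # consecutive failures to mark down
        self.retry_count = retry_count      # consecutive successes to mark up
        self.up = True
        self._failures = 0
        self._successes = 0

    def record(self, probe_ok):
        """Feed one probe result and return the resulting up/down state."""
        if probe_ok:
            self._failures = 0
            if not self.up:
                self._successes += 1
                if self._successes >= self.retry_count:
                    self.up, self._successes = True, 0
        else:
            self._successes = 0
            if self.up:
                self._failures += 1
                if self._failures >= self.failure_count:
                    self.up, self._failures = False, 0
        return self.up
```

Requiring several consecutive failures keeps one slow response from ejecting a healthy server, at the cost of a longer detection time (roughly interval times failure count).
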
Load balancing algorithms
- Measuring the load: proxy sees all requests/responses
  - Number of active requests per server
  - Number of requests per second per server
  - Avg response time per server
  - Bandwidth per server
- Load balancing algorithms:
  - Balancing the above metrics
  - Admin can dial in server load ratios
    - Differing server hardware
    - Ramp server down, ramp server up
  - Based on URI (e.g. /images, or /cgi-bin)

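One way to combine a measured metric with admin-set ratios is a weighted least-active-requests choice, sketched below. The weighting scheme and the name `pick_server` are assumptions for illustration. Products offer several variants, and the slide does not prescribe one.

```python
def pick_server(active_requests, weights):
    """Pick the server with the fewest active requests per unit of weight.

    active_requests: dict of server name -> requests currently in flight.
    weights: dict of server name -> admin-assigned ratio; a weight of 0
             (or a missing entry) means the server is ramped down.
    """
    candidates = [s for s in active_requests if weights.get(s, 0) > 0]
    if not candidates:
        raise LookupError("no server is accepting traffic")
    # Ties are broken by name so the choice is deterministic.
    return min(candidates, key=lambda s: (active_requests[s] / weights[s], s))


choice = pick_server(
    {"www1": 10, "www2": 12, "www3": 0},
    {"www1": 1, "www2": 2, "www3": 0},
)
```

Here www2 is chosen: it has more requests in flight than www1 but twice the weight (12 / 2 = 6 against 10 / 1 = 10), and www3 is skipped because its weight has been dialed down to 0.
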
Session persistence
- Idea
  - Always direct a user to the same back-end server
- Typical purposes
  - Per-user session state: shopping cart
  - Improve caching
- Recognize user based on:
  - IP address (not a solution)
    - Can change
    - Can be the same for many users
  - Cookie (HTTP)
    - Can be turned off
  - URL encoding
    - Hard to parse in load balancer (http://.../.../...?...&...&SID=01234&...)
  - SSL session
    - Not guaranteed to stay the same for successive requests (it's just a performance optimization, not an HTTP session)

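Cookie-based persistence can be sketched as follows: the load balancer reads a cookie naming the back-end, and issues one when it is missing or stale. The cookie name `lb_server` and the `route` function are hypothetical. Real products use their own cookie formats and usually obscure the server name.

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "lb_server"  # assumed name, not taken from any product


def route(cookie_header, servers, choose):
    """Return (server, set_cookie) for one request.

    cookie_header: value of the request's Cookie header ("" if absent).
    servers: names of back-ends currently in service.
    choose: fallback policy used when the user has no valid cookie.
    set_cookie is None when the existing cookie is honored.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get(COOKIE_NAME)
    if morsel is not None and morsel.value in servers:
        return morsel.value, None
    server = choose(servers)
    return server, "%s=%s; Path=/" % (COOKIE_NAME, server)
```

A user whose browser rejects cookies is re-balanced on every request, which is the "can be turned off" drawback noted above.
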
TLS 1.0 / SSL 3
- Idea:
  - Majority of protocols use a byte stream
  - Provide encrypted byte stream transparently
- Secure Sockets Layer (SSL) initially developed by Netscape
- TLS 1.0 = SSL 3.1
  - IETF blessed standard, RFC 2246
- Socket interface:
  - connect, write, read, close
  - Record boundaries not preserved
- TLS interfaces:
  - Provides socket interface
  - Uses socket interface: sends and receives records over a reliable byte stream

TLS overview (w/ server cert)
- [Diagram: client and server]

Issues with TLS load balancing
- Inspecting HTTP headers
  - URI-based load balancing
  - Cookies
  - Other headers
- Location of SSL certificates
  - In load balancer?
- Encryption to back-end servers
  - Re-encrypt to back-end?
- Virtual hosting doesn't work!
  - Each web site requires its own IP address
  - Server certificate must be presented before HTTP headers arrive

Load balancer redundancy
- What if load-balancer fails?
- Load-balancer primary-backup fail-over
- Issues: IP address take-over, established flows, load history
- [Diagram: two load balancers and two switches in front of www1, www2, www3, www4, www5, www6]

Load balancing internet feeds
- Assumptions
  - One server farm
  - Two links, e.g. Ethernet from co-lo facility, DS-3 (45 Mbps) from ISP (Verio, Sprint, MCI, ...)
- [Diagram: Sprint and Verio uplinks; router; load bal; www1, www2, www3]

Load balancing internet feeds
- Problem: routing
- Outgoing packets: easy, pick the better uplink
  - By cost
  - By reputation of ISP
  - By analysis of AS route (e.g. directly connected), or AS hop count
  - By performance measurement
- Incoming packets: hard, need to tell clients how to route
  - Cannot tell individual clients how to reach web site!
  - Use prepending to reduce traffic on link
  - Negotiate usage of community strings to have ISP modify route propagation

Border Gateway Protocol
- Primarily based on Autonomous System (AS) numbers
  - Each network has an ASN
  - Announces to neighbors which IP prefixes it can route to, and via which AS path
- Route table
  - Maps destination IP subnet to AS route
  - E.g. 207.154.101/24 -> 3542 701 3617
  - Route metric is number of AS hops
- Route control
  - Prepending: 207.154.101/24 -> 3542 701 3617 3617 3617
  - Local pref: assign local priorities to override hop count
      207.154.101/24 -> 701 3617 / pref 100
      207.154.101/24 -> 3542 701 3617 / pref 90
  - Community strings: tag announced routes to neighbor
    - To tell neighbor what local pref to associate, by convention!

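The interplay of local pref, AS hop count, and prepending can be illustrated with a deliberately simplified selection rule: highest local pref wins, and ties go to the shorter AS path. The real BGP decision process has more tie-breaking steps. `best_route` and `prepend` are hypothetical helpers, and the AS numbers are the ones from the slide.

```python
def best_route(routes):
    """Pick one route from a list of (as_path, local_pref) tuples.

    Simplified rule: prefer the highest local_pref, then the fewest AS hops.
    """
    return min(routes, key=lambda route: (-route[1], len(route[0])))


def prepend(as_path, times):
    """Lengthen a path by repeating the origin AS (last entry) `times` times."""
    return as_path + [as_path[-1]] * times


# The two routes to 207.154.101/24 from the local-pref example.
via_701 = ([701, 3617], 100)
via_3542 = ([3542, 701, 3617], 90)
chosen = best_route([via_701, via_3542])

# Prepending as in the slide: 3542 701 3617 becomes 3542 701 3617 3617 3617.
longer = prepend([3542, 701, 3617], 2)
```

Prepending only makes a path look less attractive to neighbors that compare hop counts. A neighbor that has set a higher local pref for that path will still use it.
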
Geographic distribution
- GSLB: Global Server Load Balancing
- Wishes:
  - Serve diverse geographical regions with local servers
  - Balance load across datacenters to avoid performance issues
  - Provide disaster-tolerance (e.g. datacenter failure)
- Problems:
  - Network topology does not map well to geography
  - Routing metrics count hops
  - BGP routing metrics count Autonomous Systems (AS)

GSLB solutions
1. Client DNS query to local DNS server
2. DNS server query to authoritative DNS server (GSLB)
3. GSLB gathers status/load from each datacenter
   - Usually asynchronous
4. Probe RTT, traceroute, or BGP hop count from each datacenter back to client's DNS server
5. DNS response with best datacenter's IP address

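Steps 3 to 5 can be sketched as a selection function that the authoritative server runs when a query arrives. Everything concrete here is an assumption: the datacenter names, the documentation-range IP addresses, the load threshold of 1.0, and the name `pick_datacenter`. The status and RTT tables are presumed to be filled in asynchronously.

```python
def pick_datacenter(datacenters, rtt_ms):
    """Return the IP address to put in the DNS response.

    datacenters: dict of name -> {"ip": str, "up": bool, "load": float},
                 where load is a fraction of capacity (1.0 = full).
    rtt_ms: dict of name -> measured round-trip time to the client's
            DNS server; missing entries count as unknown (worst).
    """
    usable = [name for name, info in datacenters.items()
              if info["up"] and info["load"] < 1.0]
    if not usable:
        raise LookupError("no datacenter available")
    best = min(usable, key=lambda name: (rtt_ms.get(name, float("inf")), name))
    return datacenters[best]["ip"]


# Hypothetical status table; addresses are from documentation ranges.
status = {
    "west": {"ip": "192.0.2.10", "up": True, "load": 0.4},
    "east": {"ip": "198.51.100.10", "up": True, "load": 0.7},
}
answer = pick_datacenter(status, {"west": 80.0, "east": 25.0})
```

Note that the RTT is measured to the client's DNS server, not to the client itself, so the answer is only as good as the assumption that the two are close to each other.
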
GSLB measurements
- Use routing metrics
  - Look at TTL of incoming DNS requests
  - Look at hop counts in BGP routes
- Measure real performance
  - Typically TCP SYN-ACK to ACK delay
  - Easy for site to which client was directed
  - How about for the sites not picked?
    - Send some percentage of requests to wrong site
- Aggregate measurements over time
  - Assume things don't change that quickly
- Aggregate clients in subnets
- Use IP -> country/state/city mappings
- Use service that has a global internet performance map
  - E.g. Akamai

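The SYN-ACK to ACK measurement and the two kinds of aggregation can be combined in a small sketch. The choices here are assumptions, not something the slide specifies: clients are grouped by /24 subnet, and samples are smoothed with an exponentially weighted moving average (weight `alpha`). `SubnetRtt` is a hypothetical name.

```python
import ipaddress


class SubnetRtt:
    """Keeps a smoothed round-trip estimate per client subnet."""

    def __init__(self, prefix_len=24, alpha=0.2):
        self.prefix_len = prefix_len  # clients sharing this prefix are pooled
        self.alpha = alpha            # weight given to the newest sample
        self._avg = {}

    def _subnet(self, client_ip):
        return ipaddress.ip_network(
            "%s/%d" % (client_ip, self.prefix_len), strict=False)

    def observe(self, client_ip, synack_sent, ack_received):
        """Record one handshake: timestamps (seconds) of SYN-ACK and ACK."""
        sample = ack_received - synack_sent
        key = self._subnet(client_ip)
        old = self._avg.get(key)
        self._avg[key] = sample if old is None else (
            (1 - self.alpha) * old + self.alpha * sample)

    def estimate(self, client_ip):
        """Return the smoothed RTT for the client's subnet, or None."""
        return self._avg.get(self._subnet(client_ip))


# Timestamps from the NAT trace earlier in the lecture: the switch sent its
# SYN-ACK at 2.95125 and saw the client's ACK at 2.98324 (about 32 ms).
rtts = SubnetRtt()
rtts.observe("216.64.159.149", 2.95125, 2.98324)
```

After this single observation, any other client in 216.64.159.0/24 inherits the same estimate, which is what makes the measurement usable for clients that have never been seen before.
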
GSLB and availability don't mix
- Reference: http://www.tenereillo.com/gslbpageofshame.htm
- Availability axiom:
  - The only way to achieve high-availability for browser based clients is to include the use of multiple A records
- DNS record reordering:
  - The DNS protocol does not require DNS servers/caches to preserve the order of records (and most don't)
- Result:
  - For performance want to send browser to one datacenter
  - For availability need to send browser to multiple datacenters
  - Cannot indicate ordering
  - Unless multiple datacenters are available in each geographic area, one has to make a choice between performance and availability!

Prototypical architecture
- [Diagram: two sites, WEST COAST and EAST COAST, attached to the internet; each shows routers rt1/rt2, load balancers lb1/lb2, switches sw1/sw2, and web servers www1, www2, www4, www5, www6, www9]

Summary
- Fault-tolerance & redundancy are difficult
  - Lots of ways to overlook an important detail
  - Missing documentation on how complex systems work
  - Difficult to test
- Local load balancing is easy
  - But making it work in the app can be very hard
  - And lots of bugs in devices
- Global load balancing is hard
  - All approaches are crude
  - May or may not work depending on app