Chapter 10: Scalability
Contents: Clustering, Load balancing, DNS round robin
Introduction
Enterprise web portal applications must provide scalability and high availability (HA) for web services in order to serve thousands of users hitting a corporate web site.
Scalability: the ability to support increasing numbers of users by adding servers to the cluster.
High availability: providing redundancy in the system so that a Single Point of Failure (SPoF) cannot cause a web site outage.
Deploying a web portal application in a cluster environment aims to achieve both scalability and high availability (fault tolerance).
Load balancers sit in front of the application servers in a cluster. They distribute the load between the cluster nodes by redirecting web traffic to an appropriate cluster member, and they detect server failures.
Clustering
A cluster is a group of application servers that work closely together as if they were a single entity.
Two methods of clustering:
  vertical scaling: increasing the number of servers running on a single machine
  horizontal scaling: increasing the number of machines in the cluster
Horizontal scaling is more reliable than vertical scaling, because the cluster spans multiple machines instead of depending on a single one.
Servers in a J2EE cluster are usually configured using one of three options:
  independent approach: each application server has its own file system with its own copy of the application files.
  shared file system: the cluster uses a single storage device from which all application servers obtain the application files.
  managed approach: an administrative server controls access to application content and is responsible for "pushing" the appropriate application content, and updates to it, to the managed servers.
Clustering can be done at various tiers in a J2EE application, including the database tier. Some database vendors offer clustered databases that support data replication between multiple database servers. They provide client transparency: the client (usually a servlet container or an application server) does not have to know which database server it is connecting to in order to get the data. Examples of JDBC clustering are Oracle9i's Real Application Clusters (RAC) and Clustered JDBC (C-JDBC).
[Figure: Example of clustered web server system]
Load balancing
Load balancing is a mechanism by which the server load is distributed to different nodes within the server cluster, based on a load balancing policy. Load balancers act as single points of entry into the cluster and as traffic directors to individual web or application servers.
Load balancing: algorithms
Some algorithms used to perform load balancing:
  Round-robin
  Random
  Weight-based
  Minimum load
  Last access time
  Programmatic parameter-based (the load balancer chooses a server based on method input arguments)
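Four of the policies listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node names, weights, and active-request counts in NODES are hypothetical, and a real load balancer would measure load itself instead of reading it from a static table.

```python
import itertools
import random

# Hypothetical cluster state (not from the slides): a relative weight and
# the number of requests each node is currently serving.
NODES = {
    "app1": {"weight": 3, "active": 4},
    "app2": {"weight": 1, "active": 1},
    "app3": {"weight": 1, "active": 7},
}


def round_robin(nodes):
    """Round-robin: an endless iterator over the node names in a fixed order."""
    return itertools.cycle(sorted(nodes))


def pick_random(nodes, rng=random):
    """Random: every node is equally likely to be chosen."""
    return rng.choice(sorted(nodes))


def pick_weighted(nodes, rng=random):
    """Weight-based: a node is chosen in proportion to its weight."""
    names = sorted(nodes)
    weights = [nodes[name]["weight"] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def pick_min_load(nodes):
    """Minimum load: the node with the fewest active requests."""
    return min(sorted(nodes), key=lambda name: nodes[name]["active"])


rotation = round_robin(NODES)
print([next(rotation) for _ in range(4)])  # wraps around after the third node
print(pick_min_load(NODES))                # the least busy node
```

With the sample weights, the weight-based policy sends roughly three out of five requests to app1, which is how a larger machine can be given a larger share of the traffic.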
Load balancing: popular methods
Popular methods of load balancing in a cluster:
DNS round robin
  Provides a single logical name and returns the IP address of any of the nodes in the cluster.
  Inexpensive, simple, and easy to set up, but it provides neither server affinity nor high availability.
Hardware load balancing
  The load balancer exposes a single IP address for the whole cluster.
  The load balancer receives each request and rewrites its headers to point to one of the machines in the cluster.
  If a machine is removed from the cluster, the change takes effect immediately.
  Provides server affinity and high availability, but is very expensive and complex to set up.
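The two properties that a load balancer adds over DNS round robin, server affinity and immediate removal of a failed machine, can be illustrated with a small in-memory model. It is a sketch only: the class AffinityBalancer, its methods, and the session ids are hypothetical names, and it models routing decisions only, not the header rewriting a real device performs.

```python
class AffinityBalancer:
    """Toy model of a load balancer with server affinity and node removal."""

    def __init__(self, nodes):
        self.nodes = list(nodes)  # nodes currently considered healthy
        self.sessions = {}        # session id -> node serving that session
        self._next = 0            # counter used for round-robin assignment

    def mark_down(self, node):
        """Remove a node; the change applies to the very next request."""
        if node in self.nodes:
            self.nodes.remove(node)

    def route(self, session_id):
        """Return the node that should serve this session's request."""
        if not self.nodes:
            raise RuntimeError("no healthy nodes left in the cluster")
        node = self.sessions.get(session_id)
        if node not in self.nodes:
            # New session, or its node was removed: assign the next node.
            node = self.nodes[self._next % len(self.nodes)]
            self._next += 1
            self.sessions[session_id] = node
        return node


lb = AffinityBalancer(["app1", "app2", "app3"])
print(lb.route("alice"), lb.route("bob"), lb.route("alice"))  # alice stays on one node
lb.mark_down("app1")
print(lb.route("alice"))  # alice is moved to a node that is still healthy
```

In this sketch a session that is moved after a failure simply lands on another node. Whether its session data survives depends on where that data is kept, which the model does not cover.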
Load balancing: DNS round robin
Generally, a single IP address is mapped to a site name. Ex:
  www.loadbalancedsite.com  203.24.23.3
To balance server loads using DNS, the site is hosted on several machines, and the DNS server contains a mapping from the site name to each of their IP addresses:
  www.loadbalancedsite.com  203.34.23.3
  www.loadbalancedsite.com  203.34.23.4
  www.loadbalancedsite.com  203.34.23.5
Load balancing: DNS round robin
First request: DNS server returns 203.34.23.3
Second request: DNS server returns 203.34.23.4
Third request: DNS server returns 203.34.23.5
Fourth request: DNS server returns 203.34.23.3
In this way, requests are distributed evenly among all of the machines in the cluster.
Disadvantages:
  No support for server affinity: there is no way to direct a user's requests to a specific server, or to any server, depending on whether session information is maintained on the server or at an underlying database level.
  No support for high availability
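The rotation described above can be simulated with a few lines of Python. This is a simplified model, not a real DNS server: the class RoundRobinDns and the DOWN set are hypothetical, and the model returns exactly one address per query.

```python
from itertools import cycle

# The mappings from the example: one site name, three machine addresses.
RECORDS = {
    "www.loadbalancedsite.com": ["203.34.23.3", "203.34.23.4", "203.34.23.5"],
}


class RoundRobinDns:
    """Toy resolver that hands out a site's addresses in rotation."""

    def __init__(self, records):
        self._rotations = {name: cycle(ips) for name, ips in records.items()}

    def resolve(self, name):
        """Return the next address in the rotation for this site name."""
        return next(self._rotations[name])


dns = RoundRobinDns(RECORDS)
answers = [dns.resolve("www.loadbalancedsite.com") for _ in range(4)]
print(answers)
# ['203.34.23.3', '203.34.23.4', '203.34.23.5', '203.34.23.3']

# Hypothetical failure: the resolver has no health information, so the
# address of a machine that is down is still handed out to clients.
DOWN = {"203.34.23.4"}
print([ip for ip in answers if ip in DOWN])
# ['203.34.23.4']
```

The second print shows why this method gives no high availability: the rotation knows nothing about the state of the machines, so a failed server keeps receiving its share of clients. The lack of server affinity follows from the same design, since the address returned depends only on the position in the rotation and not on which user is asking.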