IEEE DS Online, Volume 2, Number 3, March 2001

Strategies for CORBA Middleware-Based Load Balancing

Ossama Othman, Carlos O'Ryan, and Douglas C. Schmidt
University of California, Irvine

IEEE Distributed Systems Online. Published by the IEEE Computer Society. 1541-4922/01/$17.00 © 2001 IEEE

The growth of online Internet services during the past decade has increased the demand for scalable and dependable distributed computing systems. E-commerce and online stock trading systems are examples. These systems concurrently serve many clients that transmit a large, often "bursty," number of requests and have stringent quality of service requirements. To protect hardware investments and avoid overcommitting resources, such systems scale incrementally by connecting servers through high-speed networks.

An increasingly popular and cost-effective technique to improve networked server performance is load balancing, where hardware or software mechanisms determine which server will execute each client request. Load balancing helps improve system scalability by ensuring that client application requests are distributed and processed equitably across a group of servers. Likewise, it helps improve system dependability by adapting dynamically to system configuration changes that arise from hardware or software failures. Load balancing also helps ensure all resources are used efficiently and that new servers are purchased or cycles are leased only when necessary.

This article describes the strategies and architectures for CORBA load balancing, introduces a CORBA load balancing service we designed, and provides performance benchmarks. A second installment of this article (appearing in the April 2001 issue of DS Online) provides more detail regarding the specific load balancing service we designed using TAO, The ACE (Adaptive Communication Environment) ORB.

Load balancing for distributed systems

Load balancing mechanisms distribute client workload equitably among back-end servers to improve overall system responsiveness. These mechanisms can be provided in any or all of the following layers in a distributed system:

Network-based load balancing: IP routers and domain name servers that service a pool of host machines provide this type of load balancing. For example, when a client resolves a hostname, the DNS can assign a different IP address to each request dynamically based on current load conditions. The client then contacts the designated back-end server, unaware that a different server could be selected for its next DNS resolution. Routers can also bind a TCP flow to any back-end server based on the current load conditions and then use that binding for the duration of the flow. High-volume Web sites often use network-based load balancing at the network layer (layer 3) and transport layer (layer 4). Layer 3 and 4 load balancing (referred to as switching [1]) uses the IP address/hostname and port, respectively, to determine where to forward packets. Load balancing at these layers is somewhat limited, however, because it does not take into account the content of client requests. Instead, higher-layer mechanisms (the so-called layer 5 switching described below) perform load balancing in accordance with the content of requests, such as pathname information within a URL.

OS-based load balancing: Distributed OSs provide this type of load balancing through clustering, load sharing, and process migration [2] mechanisms. (Load sharing should not be confused with load balancing; for example, processing resources can be shared among processors but not necessarily balanced.) Clustering is a cost-effective way to achieve high availability and high performance by combining many commodity computers to improve overall system processing power. Processes can then be distributed transparently among computers in the cluster. Clusters generally employ load sharing and process migration. Process migration mechanisms balance load across processors or, more generally, across network nodes by transferring the state of a process between nodes [3]. Transferring process states requires significant platform infrastructure support to handle platform differences between nodes. It can also limit applicability to programming languages based on virtual machines, such as Java.

Middleware-based load balancing: This type of load balancing is performed in middleware, often on a per-session or per-request basis. For example, layer 5 switching [1] has become a popular technique to determine which Web server should receive a client request for a particular URL. This strategy can also detect "hot spots" (frequently accessed URLs) so that additional resources can be allocated to handle the large number of requests for such URLs.

Advantages of middleware-based load balancing

Network-based and OS-based load balancing architectures suffer from several limitations, including lack of flexibility and adaptability. The lack of flexibility arises from the inability to support application-defined metrics at run time when making load balancing decisions. The lack of adaptability occurs due to the absence of load-related feedback from a given set of replicas (duplicate instances of a particular object on a server that is managed by a load balancer), as well as the inability to control if and when a given replica should accept additional requests. Neither network-based nor OS-based load balancing solutions provide as straightforward, portable, and economical a means of adapting load balancing decisions based on application-level request characteristics, such as content and duration, as middleware-based load balancing does.

Middleware-based load balancing architectures, particularly those based on standard CORBA [4], have the following advantages over network- or OS-based load balancing. Middleware-based load balancing can be used in conjunction with specialized network-based and OS-based load balancing mechanisms. It can also be applied on top of commodity off-the-shelf networks and OSs, which helps reduce cost. In addition, it can provide semantically rich customization hooks to perform load balancing based on a wide range of application-specific load balancing conditions, such as runtime I/O versus CPU overhead conditions.

This article focuses on middleware-based load balancing supported by CORBA object request brokers (ORBs). ORB middleware lets clients invoke operations on distributed objects without concern for object location, programming language, OS platform, communication protocols and interconnects, and hardware [5]. Moreover, ORBs can determine which client requests to route to which object replicas on which servers.
An example of CORBA middleware-based load balancing

To illustrate the benefits of middleware-based load balancing, consider the following CORBA-based online stock trading system (see Figure 1). A distributed online stock trading system creates sessions through which trading is conducted. This system consists of multiple back-end servers (replicas) that process the session creation requests that clients send over a network. A replica is an object that can perform the same tasks as the original object. Server replicas that perform the same operations can be grouped together into back-end server groups, which are also known as replica groups or object groups.

Figure 1. A distributed online stock trading system places heavy demands on resources while requiring instantaneous scalability and 24-7 dependability.

Figure 1 replicates a session factory [6] in an effort to reduce the load on any given factory. The load in this case is a combination of the average number of session creation requests per unit time and the total amount of resources currently employed to create sessions at a given location. Loads are then balanced across all replicas in the session factory replica group. The replicas need not reside at the same location.

The sole purpose of session factories is to create stock trading sessions. Therefore, factories need not retain state (they are stateless). Moreover, in this type of system, client requests arrive dynamically, not deterministically, and the duration of each request may not be known a priori. These conditions require that the distributed online stock trading system redistribute requests to replicas dynamically. Otherwise, one or more replicas might become overloaded while others are underutilized. In other words, the system must adapt to changing load conditions.

In theory, applying adaptivity in conjunction with multiple back-end servers can increase the system's scalability and dependability; reduce the initial investment when the number of clients is small; and let the system scale up gracefully to handle more clients and processing workload in larger configurations. In practice, achieving this degree of scalability and dependability requires a sophisticated load balancing service. Ideally, this service should be transparent to existing online stock trading components. Moreover, if incoming requests arrive dynamically, a load balancing service might not benefit from a priori QoS specifications, scheduling, or admission control and must therefore adapt dynamically to changes in runtime conditions.

Strategies and architectures for CORBA load balancing

Although CORBA provides solutions for many distributed system challenges, such as predictability, security, transactions, and fault tolerance, it still lacks standard solutions to tackle other important challenges distributed systems architects and developers face.

Strategies

We classify strategies for designing CORBA load balancing services along the following orthogonal dimensions:

Client binding granularity

A load balancer binds a client request to a replica each time it makes a load balancing decision. Specifically, a client's requests are bound to the replica the load balancer selects. Client binding mechanisms include modifications to the standard CORBA services, ad hoc proprietary protocols and interfaces, or use of the LOCATION_FORWARD message in the OMG standard GIOP protocol. Regardless of the mechanism, we can classify client binding according to its granularity as follows:

Per-session: Client requests will continue to be issued to the same replica for the duration of a session. In the context of CORBA, a session defines the period of time during which a client is connected to a given server for the purpose of invoking remote operations on objects in that server; it is usually defined by the client's lifetime [7].

Per-request: Each client request will be forwarded to a potentially different replica; that is, the client is bound to a replica each time a request is invoked.

On-demand: Client requests can be re-bound to another replica whenever the load balancer deems it necessary. This design forces a client to issue its requests to a replica other than the one to which it currently sends requests.

Balancing policy

When designing a load balancing service, it is important to select an appropriate algorithm that decides which replica will process each incoming request. For example, applications in which all requests generate nearly identical amounts of load can use a simple round-robin algorithm, while applications in which the load generated by each request cannot be predicted in advance might require more advanced algorithms. In general, we can classify load balancing policies into one of two categories:

Nonadaptive: A load balancer can use nonadaptive policies, such as a simple round-robin algorithm or a randomization algorithm, to select which replica will handle a particular request.

Adaptive: A load balancer can use adaptive policies that utilize runtime information, such as the amount of idle CPU available on each back-end server, to select the replica that will handle a particular request.

Architectures for CORBA load balancing

By combining the strategies we just described in various ways, you can create the alternative load balancing architectures we describe below. Figure 2 illustrates three of the primary architectures.
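The two policy categories classified above can be made concrete with a minimal Python sketch. The class names, the `select` method, and the `loads` mapping are hypothetical and chosen only for illustration; they are not part of CORBA or TAO. The load value is assumed to be any number where lower means less busy (for example, CPU utilization).

```python
import itertools
import random


class RoundRobinPolicy:
    """Nonadaptive: cycle through the replicas in a fixed order."""

    def __init__(self, replicas):
        self._cycle = itertools.cycle(list(replicas))

    def select(self, loads=None):
        # Runtime load is ignored on purpose.
        return next(self._cycle)


class RandomPolicy:
    """Nonadaptive: pick a replica uniformly at random."""

    def __init__(self, replicas, seed=None):
        self._replicas = list(replicas)
        self._rng = random.Random(seed)

    def select(self, loads=None):
        return self._rng.choice(self._replicas)


class LeastLoadedPolicy:
    """Adaptive: pick the replica with the lowest reported load."""

    def __init__(self, replicas):
        self._replicas = list(replicas)

    def select(self, loads):
        # `loads` maps replica name -> latest load report (hypothetical metric).
        return min(self._replicas, key=lambda r: loads[r])
```

A round-robin policy over `["a", "b", "c"]` yields `a, b, c, a, ...` regardless of load, whereas `LeastLoadedPolicy(["a", "b"]).select({"a": 0.9, "b": 0.2})` picks `b`. The adaptive policy is only as good as the freshness of the load reports it is given.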

Figure 2. Various architectures for load balancing: (a) nonadaptive per-session, (b) adaptive per-request, and (c) adaptive on-demand.

Nonadaptive per-session architectures

One way to design a CORBA load balancer is to make the load balancer select the target replica when a client-server session is first established, that is, when a client obtains an object reference to a CORBA object (namely the replica) and connects to that object, as Figure 2a shows. Note that the balancing policy in this architecture is nonadaptive, because the client interacts with the same server to which it was directed originally, regardless of that server's load conditions. This architecture is suitable for load balancing policies that implement round-robin or randomized balancing algorithms.

Different clients can be directed to different object replicas using either a middleware activation daemon, such as a CORBA Implementation Repository [8], or a lookup service, such as the CORBA Naming or Trading services. For example, ORBIX [9] provides an extension to the CORBA Naming Service that returns references to object replicas in either a random or round-robin order.

Load balancing services based on a per-session client binding architecture can satisfy requirements for application transparency, increased system dependability, minimal overhead, and CORBA interoperability. The primary benefit of per-session client binding is that it incurs less runtime overhead than the alternative architectures we describe in the following sections.

Nonadaptive per-session architectures do not, however, satisfy the requirement to handle dynamic client operation request patterns adaptively. In particular, forwarding is performed only when the client binds to the object, that is, when it invokes its first request. Therefore, overall system performance might suffer if multiple clients that impose high loads are bound to the same server, even if other servers are less loaded. Unfortunately, nonadaptive per-session architectures have no provisions to reassign their clients to available servers.

Nonadaptive per-request architectures

A nonadaptive per-request architecture shares many characteristics with the nonadaptive per-session architecture. The primary difference is that, in the nonadaptive per-request architecture, a client is bound to a replica each time a request is invoked, rather than just once during the initial request binding. This architecture has the disadvantage of degrading performance due to increased communication overhead.

Nonadaptive on-demand architectures

Nonadaptive on-demand architectures have the same characteristics as their per-session counterparts. However, nonadaptive on-demand architectures allow reshuffling of client bindings at an arbitrary point in time. Note that run-time information, such as CPU load, is not used to decide when to rebind clients. Instead, clients could be re-bound at regular time intervals, for example.
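The difference between per-session and per-request binding under a nonadaptive policy can be sketched in a few lines of Python. The `Binder` class stands in for a lookup service that returns replica references in round-robin order; its name and `resolve` method are assumptions made for this illustration and do not correspond to any real Naming Service API.

```python
import itertools


class Binder:
    """Hands out replica references in round-robin order (nonadaptive)."""

    def __init__(self, replicas):
        self._next = itertools.cycle(list(replicas))

    def resolve(self):
        return next(self._next)


def per_session(binder, n_requests):
    """Bind once, then send every request to the same replica."""
    replica = binder.resolve()
    return [replica] * n_requests


def per_request(binder, n_requests):
    """Re-bind before every request, paying one lookup per request."""
    return [binder.resolve() for _ in range(n_requests)]
```

With replicas `r1` and `r2`, a per-session client that binds first sends all of its requests to `r1`, while a per-request client that follows alternates between replicas. The extra `resolve` call on every request is the toy counterpart of the added communication overhead noted above.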
Adaptive per-session architecture

This architecture is similar to the nonadaptive per-session approach. The primary difference is that an adaptive per-session architecture can use runtime load information to select the replica, thereby alleviating the need to bind new clients to heavily loaded replicas. This strategy only represents a slight improvement, however, because the load generated by clients can change after binding decisions are made. In this situation, the adaptive on-demand architecture offers a clear advantage, because it can respond to dynamic changes in client load.

Adaptive per-request architectures

Figure 2b shows a more adaptive request architecture for CORBA load balancing. This design introduces a front-end server, which is a proxy [10] that receives all client requests. In this case, the front-end server is the load balancer. The load balancer selects an appropriate back-end server replica in accordance with its load balancing policy and forwards the request to that replica. The front-end server proxy waits for the replica's reply to arrive and then returns it to the client. Informational messages called load advisories are sent from the load balancer to replicas when attempting to balance loads. These advisories cause the replicas to either accept requests or redirect them back to the load balancer.

The primary benefit of an adaptive request forwarding architecture is its potential for greater scalability and fairness. For example, the front-end server proxy can examine the current load on each replica before selecting the target of each request, which might let it distribute the load more equitably. Hence, this forwarding architecture is suitable for use with adaptive load balancing policies.

Unfortunately, this architecture can also introduce excessive latency and network overhead because a front-end server processes each request. Moreover, it introduces two new network messages:

1. the request from the front-end server to the replica; and
2. the corresponding reply from the back-end server (replica) to the front-end server.

Adaptive on-demand architecture

As Figure 2c shows, clients receive an object reference to the load balancer initially. Using CORBA's standard LOCATION_FORWARD mechanism, the load balancer can redirect the initial client request to the appropriate target server replica. CORBA clients will continue to use the new object reference obtained as part of the LOCATION_FORWARD message to communicate with this replica directly until they are redirected again or finish their conversation.
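The message overhead of the two forwarding styles can be compared with a little arithmetic. The sketch below is a simplified counting model, not a measurement: it assumes each invocation is one request plus one reply, that a front-end proxy adds the two messages listed above to every request, and that each LOCATION_FORWARD costs one extra request/reply exchange with the load balancer. Connection setup and load-report traffic are ignored, and the function names are hypothetical.

```python
def messages_direct(n_requests):
    """Client talks to the replica directly: request + reply."""
    return 2 * n_requests


def messages_per_request_proxy(n_requests):
    """Every request passes through the front-end proxy in both directions."""
    return 4 * n_requests


def messages_forwarded(n_requests, n_forwards=1):
    """Each forward costs one extra request/reply pair with the load balancer."""
    return 2 * n_requests + 2 * n_forwards
```

Under this model, 1,000 requests cost 4,000 messages through a per-request proxy but only 2,002 messages when the client is forwarded once and then talks to the replica directly, which is why forwarding amortizes well over long conversations.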
Unlike the nonadaptive architectures described earlier, adaptive load balancers that forward requests on demand can monitor replica load continuously. Using this load information and the policies specified by an application, a load balancer can determine how equitably the load is distributed. When the load becomes unbalanced, the load balancer can communicate with one or more replicas and request them to redirect subsequent clients back to the load balancer. The load balancer will then redirect the client to a less loaded replica.

Using this architecture, the overall distributed object computing system can recover from inequitable client/replica bindings while amortizing the additional network and processing overhead over multiple requests. This strategy satisfies most of the requirements outlined previously. In particular, it requires minimal changes to the application initialization code and no changes to the object implementations (servants) themselves.

The primary drawback with adaptive on-demand architectures is that server replicas must be prepared to receive messages from a load balancer and redirect clients to that load balancer. Although the required changes do not affect application logic, application developers must modify a server's initialization and activation components to respond to the load advisory messages we mentioned.

CORBA adaptive on-demand load balancing using TAO

The CORBA-based load balancing service that TAO provides (for more information on TAO, see the related article highlighted in this issue of DS Online) lets distributed applications be load balanced adaptively and efficiently. This service increases overall system throughput by distributing requests across multiple back-end server replicas without increasing round-trip latency substantially or assuming predictable or homogeneous loads. As a result, developers can concentrate on their core application behavior, rather than wrestling with complex infrastructure mechanisms needed to make their application distributed, scalable, and dependable.

We based TAO's load balancing service implementation entirely on standard features in CORBA, which demonstrates that CORBA technology has matured to the point where many higher-level services can be implemented efficiently without requiring extensions to the ORB or its communication protocols. Exploiting the rich set of primitives available in CORBA still requires specialized skills, however, along with the use of somewhat poorly documented features.
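The on-demand rebinding cycle described above (load advisory, redirect back to the load balancer, forward to a less loaded replica) can be mimicked in a small Python simulation. This is a toy model and not TAO's implementation or API: the `LocationForward` exception merely stands in for a GIOP LOCATION_FORWARD reply, the load metric is simply the number of requests served, and all class names, methods, and the `threshold` parameter are hypothetical.

```python
class LocationForward(Exception):
    """Stand-in for a LOCATION_FORWARD reply carrying a new target."""

    def __init__(self, target):
        super().__init__("forward")
        self.target = target


class Replica:
    def __init__(self, name, balancer):
        self.name = name
        self.balancer = balancer
        self.load = 0             # toy metric: requests served so far
        self.redirecting = False  # set by a load advisory

    def invoke(self, request):
        if self.redirecting:
            # Advisory in effect: send the client back to the load balancer.
            raise LocationForward(self.balancer)
        self.load += 1
        return f"{self.name} handled {request}"


class LoadBalancer:
    def __init__(self, threshold):
        self.replicas = []
        self.threshold = threshold

    def register(self, replica):
        self.replicas.append(replica)

    def invoke(self, request):
        # Never serves requests itself; forwards to the least loaded
        # replica that is currently accepting requests.
        accepting = [r for r in self.replicas if not r.redirecting]
        raise LocationForward(min(accepting, key=lambda r: r.load))

    def rebalance(self):
        # Issue load advisories: replicas too far above the minimum redirect.
        lowest = min(r.load for r in self.replicas)
        for r in self.replicas:
            r.redirecting = (r.load - lowest) > self.threshold


class Client:
    def __init__(self, balancer):
        self.target = balancer  # initial reference points at the balancer

    def call(self, request):
        while True:
            try:
                return self.target.invoke(request)
            except LocationForward as fwd:
                self.target = fwd.target  # transparently rebind and retry
```

In this sketch a client that has sent five requests to replica `a` is moved to replica `b` on its next call once `rebalance()` has flagged `a`. The client code never changes, which mirrors the transparency property the architecture aims for; the replica-side `redirecting` check is the kind of server-side change that the drawback above refers to.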
We believe that further research and documentation of the effective architectures and design patterns used in the implementation of higher-level CORBA services is required to advance the state of the practice and to let application developers make better decisions when designing their systems.

To overcome the drawbacks related to the adaptive on-demand architecture, we applied standard CORBA portable interceptors [11]. Likewise, we implemented our load balancing solution based on the patterns [12] in the CORBA component model (CCM) [13] to avoid changing application code. In the CCM, a container is responsible for configuring the portable object adapter [5] that manages a component. Thus, TAO's adaptive on-demand load balancer just requires enhancing standard CCM containers so they support load balancing and does not require changes to application code. More extensive discussion of our design for an adaptive CORBA load balancing service using TAO will be included in the April 2001 issue of DS Online.

Performance results

For load balancing to improve the overall performance of CORBA-based systems significantly, the load balancing service must incur minimal overhead. This section describes the design and results of several experiments we performed to measure the benefits of TAO's strategy empirically, as well as to demonstrate limitations with the alternative load balancing strategies. The first set of experiments shows the overhead incurred by the request forwarding architectures described in this article. The second set of experiments demonstrates how TAO's load balancer can maintain balanced loads dynamically and efficiently, whereas alternative load balancing strategies cannot.

Hardware/software platform

We ran benchmarks using three 733 MHz dual CPU Intel Pentium III workstations and one 400 MHz quad CPU Intel Pentium II Xeon workstation, all running Debian GNU/Linux "potato" (glibc 2.1) with Linux kernel version 2.2.16. GNU/Linux is an open source OS that supports kernel-level multitasking, multithreading, and symmetric multiprocessing. All workstations are connected through a 100 Mbps Ethernet switch. We ran all benchmarks in the POSIX real-time thread scheduling class [14]. This scheduling class improved the integrity of our results by ensuring the threads created during the experiment were not preempted arbitrarily during their execution.

Benchmark tests

The core benchmarking software is based on the "Latency" performance test distributed with the TAO open source software release. (See $TAO_ROOT/performance-tests/Latency/ in the TAO release for the source code of this benchmark.) All benchmarks use one of the following variations of the Latency test:

1. Classic Latency test: In this benchmark, we use high-resolution OS timers to measure the throughput, latency, and jitter of requests made on an instance of a CORBA object that verifies a given integer is prime. Prime number factorization provides a suitable workload for our load balancing tests, because each operation runs for a relatively long time. In addition, it is a stateless service that shields the results from transitional effects that would otherwise occur when transferring state between load balanced stateful replicas.

2. Latency test with a nonadaptive per-request load balancing strategy: This variant of the Latency test was designed to demonstrate the performance and scalability of optimal load balancing using per-request forwarding as the underlying request forwarding architecture. This variant added a specialized "forwarding server" to the test, whose sole purpose was to forward requests to a target server at the fastest possible rate. No changes were made to the client.

3. Latency test with TAO's adaptive on-demand load balancing strategy: This variant of the Latency test added support for TAO's adaptive on-demand load balancer to the classic Latency test. The Latency test client code remained unchanged, thereby preserving client transparency. This variant quantified the performance and scalability impact of TAO's adaptive on-demand load balancer.

Benchmarking the overhead of load balancing mechanisms

These benchmarks measure the degree of end-to-end overhead incurred by adding load balancing to CORBA applications. The overhead experiments presented in this article compute the throughput, latency, and jitter incurred to communicate between a single-threaded client and a single-threaded server (that is, one replica) using the following four request forwarding architectures:

No load balancing: To establish a performance baseline without load balancing, the Latency performance test was first run between a single-threaded client and a single-threaded server (one replica) residing on separate workstations. These results reflect the baseline performance of a TAO client-server application.
A nonadaptive per-session client binding architecture: We then configured TAO's load balancer to use the nonadaptive per-session load balancing strategy when balancing loads on a Latency test server. We added the registration code to the Latency test server implementation, which causes the replica to register itself with the load balancer so that it could be load balanced. No changes to the core Latency test implementation were made. Because the replica sends no feedback to the load balancer, this benchmark establishes a baseline for the best performance achievable by a load balancer that uses a per-session client binding granularity.

A nonadaptive per-request client binding architecture: Next, we added a specialized nonadaptive per-request forwarding server to the original Latency test. This server just forwards client requests to an unmodified back-end server. The forwarding server resided on a different machine than either the client or back-end server, which themselves each ran on separate workstations. Because the forwarding server is essentially a lightweight load balancer, this benchmark provides a baseline for the best performance a load balancer can achieve using a per-request client binding granularity.

An adaptive on-demand client binding architecture: Finally, we included TAO's adaptive on-demand client binding granularity in the experiment by adding the load monitor to the Latency test server. This enhancement let TAO's load balancer react to the current load on the Latency test server. TAO's load balancer, the client, and the server each ran on separate workstations (three workstations were involved in this benchmark). No changes were made to the client portion of the Latency test, nor were any substantial changes made to the core servant implementation.

The overhead benchmark results illustrated in Figure 3 quantify the latency imposed by adding load balancing (specifically, request forwarding) to the Latency performance test. All overhead benchmarks were run with 200,000 iterations.

As shown in this figure, a nonadaptive per-session approach imposes essentially no latency overhead on the classic Latency test. In contrast, the nonadaptive per-request approach more than doubles the average latency. TAO's adaptive on-demand approach adds little latency. The slight increase in latency that TAO's approach incurred is caused by the additional processing resources the load monitor needs to perform load monitoring and by the resources used when sending periodic load reports to the load balancer, that is, "push-based" load monitoring.
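The three metrics reported by these benchmarks can be derived from per-request timings. The following Python sketch is only an analogue of such a measurement loop, not the TAO Latency test itself: it times a local primality check rather than a remote CORBA invocation, and it assumes that jitter means the standard deviation of latency, which the article does not define. The function names and the default operand are hypothetical.

```python
import statistics
import time


def is_prime(n):
    """Trial division; stands in for the benchmark's CPU-bound operation."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def summarize(latencies_s):
    """Return (mean latency, jitter, throughput) from latencies in seconds."""
    mean = statistics.fmean(latencies_s)
    jitter = statistics.pstdev(latencies_s)  # assumption: jitter = std deviation
    throughput = len(latencies_s) / sum(latencies_s)  # requests/s, one client
    return mean, jitter, throughput


def run_benchmark(iterations, operand=1_000_003):
    """Time `iterations` back-to-back calls with a high-resolution timer."""
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        is_prime(operand)
        latencies.append(time.perf_counter() - start)
    return summarize(latencies)
```

Calling `run_benchmark(1000)` returns the three figures for a local run. Comparing the same loop with and without a forwarding hop in the call path would give the kind of overhead comparison described above, although absolute numbers depend entirely on the machine.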

Figure 3. Load balancing latency overhead.

These results clearly show that it is possible to minimize latency overhead, yet still provide adaptive load balancing. As Figure 3 shows, the jitter did not change appreciably between each of the test cases, which illustrates that load balancing hardly affects the time required for client requests to complete.

Figure 4 shows how the average throughput differs between each load balancing strategy. Again, we used only one client and one server for this experiment.

Figure 4. Load balancing throughput overhead.

Not surprisingly, the throughput remained basically unchanged for the nonadaptive per-session approach because only one out of 200,000 requests was forwarded. The remaining requests were all sent directly to the server, so all requests were running at their maximum speed.

Figure 4 illustrates that throughput decreases dramatically in the per-request strategy because the forwarding server forwards requests on behalf of the client and forwards replies received from the replica to the client, thereby doubling the communication required to complete a request. This architecture is clearly not suitable for throughput-sensitive applications.

In contrast, the throughput in TAO's load balancing approach only decreased slightly with respect to the case where no load balancing was performed. The slight decrease in throughput can be attributed to the same factors that caused the slight increase in latency described above: the additional resources the load monitor used and the communication between the load balancer and the load monitor.

Load balancing strategy effectiveness

The following benchmarks quantify how effective each load balancing strategy is at maintaining balanced load across a given set of replicas. In all cases, we used the Latency test from the overhead benchmarks for the experiments. The goal of the effectiveness benchmark was to overload certain replicas in a group and then measure how different load balancing strategies handled the imbalanced loads. We hypothesized that loads across replicas should remain imbalanced when using nonadaptive per-session load balancing strategies. Conversely, when using adaptive load balancing strategies, such as TAO's adaptive load balancing strategy, loads across replicas should be balanced shortly after imbalances are detected.

To create this situation, we registered our four Latency test server replicas, each with a dedicated CPU, with TAO's load balancer during each effectiveness experiment. We then launched eight Latency test clients. Half the clients issued requests at a higher rate than the other half.
For example, the first client issued requests at a rate of 10 requests per second, the second client issued requests at a rate of five requests per second, the third at 10 requests per second, and so forth. The actual load was not important for this set of experiments. Instead, it was the relative load on each replica that was important. In other words, a well-balanced set of replicas should have relatively similar loads, regardless of the actual values of the load.

For testing nonadaptive per-session load balancing effectiveness, TAO's load balancer was configured to use its round-robin load balancing strategy. This strategy does not perform any analysis on reported loads but simply forwards client requests to a given replica. The client then continues to issue requests to the same replica over the lifetime of that replica. The load balancer thus applies the nonadaptive per-session strategy; that is, it is only involved during the initial client request.

Figure 5 illustrates the loads incurred on each of the Latency server replicas using nonadaptive per-session load balancing. The results quantify the degree to which loads across replicas become unbalanced by using this strategy. Because there is no feedback loop between the replicas and the load balancer, it is not possible to shift load from highly loaded replicas to less heavily loaded replicas.

Figure 5. Effectiveness of nonadaptive per-session load balancing.

Two of the replicas (3 and 4) had the same load. The line representing the load on replica 4 obscures the line representing the load on replica 3. In addition, each client issued the same number of iterations. Because some clients issued requests at a faster rate (10 Hz), however, those clients completed their execution before the clients with the lower request rates (5 Hz). This difference in request rate accounts for the sudden drop in load halfway through the run, before the slower (low-load) clients completed their execution.

The test of TAO's adaptive load balancing strategy effectiveness was meant to demonstrate the benefits of an adaptive load balancing strategy. We therefore increased the load each client imposed and increased the number of iterations from 200,000 to 750,000. Four clients running at 100 Hz and another four running at 50 Hz were started and ended simultaneously. We increased client request rates to exaggerate load imbalance and to make the load balancing more obvious as it progresses. It was necessary to increase the number of iterations in this experiment because of the higher client request rates. If the number of iterations were capped at the 200,000 used in the overhead experiments, the experiment could have ended before balancing the loads across the replicas.

As Figure 6 illustrates, the loads across all four replicas fluctuated for a short period of time until reaching an equilibrium load of 150 Hz. (The 150 Hz equilibrium load corresponds to one 100 Hz client and one 50 Hz client on each of the four replicas.) The initial load fluctuations result from the load balancer periodically rebinding clients to less loaded replicas. By the time a given rebind completed, the replica load had become imbalanced, at which point the client was rebound to another replica.

Figure 6. Effectiveness of adaptive on-demand load balancing.

The load balancer required several iterations to balance the loads across the replicas (to stabilize). Had it not been for the dampening built into TAO's adaptive on-demand load balancing strategy, it is likely that replica loads would have oscillated for the duration of the experiment. Dampening prevents the load balancer from basing its decisions on instantaneous replica loads, and forces it to use average loads instead.

It is instructive to compare the results in Figure 6 to the nonadaptive per-session load balancing architecture results in Figure 5. Loads in the nonadaptive approach remained imbalanced. Using the adaptive on-demand approach, the overhead is minimized and loads remained balanced.

After it was obvious that the loads were balanced (that is, equilibrium was reached), we terminated the experiment. This accounts for the uniform drops in load depicted in Figure 6. Contrast this to the nonuniform drops in the load that occurred in the overhead experiments, where clients were allowed to complete all iterations. In both cases, the number of iterations is less important than the fact that the iterations were executed to illustrate the effects of load balancing and to ensure that the overall results were not subject to transient effects, such as periodic execution of operating system tasks.

The actual time required to reach the equilibrium load depends greatly on the load balancing strategy. The example above was based on the minimum dispersion strategy. This strategy minimizes the differences in relative loads between replicas. We could have employed a more sophisticated adaptive load balancing strategy to improve the time to reach equilibrium. Regardless of the complexity of the adaptive load balancing strategy, these results show that adaptive load balancing strategies can maintain balanced loads across a given set of replicas.

As the results of our benchmarks show, CORBA load balancing services using TAO provide many advantages and increase system scalability and dependability. TAO and TAO's load balancing service have been applied to a wide range of distributed applications, including many telecommunication systems, aerospace and military systems, online trading systems, medical systems, and manufacturing process control systems. All the source code, examples, and documentation for TAO, its load balancing service, and its other CORBA services are freely available at http://www.cs.wustl.edu/~schmidt/tao.html.

Acknowledgments

Automated Trading Desk, BBN, Cisco, DARPA contract 9701516, and Siemens MED provided part of the funding for this work.

References

1. E. Johnson and ArrowPoint Communications, "A Comparative Analysis of Web Switching Architectures," 1998, http://www.arrowpoint.com/solutions/white_papers/ws_archv6.html (current 6 Mar. 2001).
2. G. Coulouris, J. Dollimore, and T. Kindberg, Distributed Systems: Concepts and Design, Pearson Education, Ltd., Harlow, England, 2001.
3. F. Douglis and J. Ousterhout, "Process Migration in the Sprite Operating System," Proc. Int. Conf. Distributed Computing Systems, IEEE CS Press, Los Alamitos, Calif., 1987, pp. 18-25.
4. Object Management Group, The Common Object Request Broker: Architecture and Specification, 2.3 ed., OMG, Needham, Mass., June 1999.
5. M. Henning and S. Vinoski, Advanced CORBA Programming With C++, Addison-Wesley, Reading, Mass., 1999.
6. E. Gamma et al., Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley, Reading, Mass., 1995.
7. N. Pryce, "Abstract Session," Pattern Languages of Program Design, Addison-Wesley, Reading, Mass., 1999.
8. M. Henning, "Binding, Migration, and Scalability in CORBA," Comm. of the ACM, vol. 41, no. 10, Oct. 1998.
9. S. Baker, CORBA Distributed Objects using Orbix, Addison-Wesley, Reading, Mass., 1997.
10. F. Buschmann et al., Pattern-Oriented Software Architecture: A System of Patterns, John Wiley & Sons, New York, 1996.
11. Adiron, LLC, et al., Portable Interceptor Working Draft Joint Revised Submission, Object Management Group, OMG Document orbos/99-10-01 ed., Oct. 1999.
12. D.C. Schmidt et al., Pattern-Oriented Software Architecture: Patterns for Concurrency and Distributed Objects, Volume 2, John Wiley & Sons, New York, 2000.
13. BEA Systems et al., CORBA Component Model Joint Revised Submission, Object Management Group, OMG Document orbos/99-07-01 ed., July 1999.
14. S. Khanna et al., "Realtime Scheduling in SunOS 5.0," Proc. USENIX Winter Conf., USENIX Association, 1992, pp. 375-390.

Ossama Othman is a research assistant at the Distributed Object Computing Laboratory in the Department of Electrical and Computer Engineering, University of California at Irvine. He is currently pursuing his PhD studies and research in the field of secure, scalable, and high-availability CORBA-based middleware. As part of this work, he is one of the core development team members for the open source CORBA ORB, TAO. Contact him at The Department of Electrical and Computer Engineering, 355 Engineering Tower, The University of California at Irvine, Irvine, CA 92697-2625, ossama@uci.edu.

Carlos O'Ryan is a research assistant at the Distributed Object Computing Laboratory and a graduate student in the Department of Computer Engineering, University of California at Irvine. He participates in the development of TAO, an open source, real-time, high-performance, CORBA-compliant ORB. He obtained a BS in mathematics from the Pontificia Universidad Católica de Chile and an MS in computer science from Washington University. Contact him at The Department of Electrical and Computer Engineering, 355 Engineering Tower, The University of California at Irvine, Irvine, CA 92697-2625, coryan@uci.edu.

Douglas C. Schmidt is an associate professor in the Department of Electrical and Computer Engineering at the University of California at Irvine. He is currently serving as a program manager at the DARPA Information Technology Office, where he is leading the national effort on distributed object computing middleware research. His research focuses on patterns, optimization principles, and empirical analyses of object-oriented techniques that facilitate the development of high-performance, real-time distributed object computing middleware on parallel processing platforms running over high-speed ATM networks and embedded system interconnects. Contact him at The Department of Electrical and Computer Engineering, 616E Engineering Tower, The University of California at Irvine, Irvine, CA 92697-2625, schmidt@uci.edu.