
A LOAD BALANCING EXTENSION FOR THE PVM SOFTWARE SYSTEM

by

CHRISTOPHER WADE HUMPHRES

A THESIS

Submitted in partial fulfillment of the requirements for the degree of Master of Science in the Department of Electrical Engineering in the Graduate School of The University of Alabama

TUSCALOOSA, ALABAMA

1995

Submitted by Christopher Wade Humphres in partial fulfillment of the requirements for the degree of Master of Science specializing in Electrical Engineering.

Accepted on behalf of the Faculty of the Graduate School by the thesis committee:

Patrick Gaughan, Ph.D.
Tracy K. Camp, Ph.D.
David J. Jackson, Ph.D., Chairperson

Russell L. Pimmel, Ph.D., Department Chairperson
Ronald Rogers, Ph.D., Dean of Graduate School

To my wife, Michelle

Acknowledgments

I would like to express my sincere gratitude and appreciation to Dr. David Jeff Jackson for his guidance and support during this research and throughout my academic program at The University of Alabama. I would also like to thank Dr. Tracy Camp and Dr. Patrick Gaughan for serving on my thesis committee. I would like to express my appreciation to the Department of Electrical Engineering and the Graduate School at The University of Alabama for providing the resources which allowed me to fulfill this academic goal. I would also like to express my gratitude to the many friends and graduate students who have provided support and encouragement throughout the past two years. Most importantly, I would like to express my deepest appreciation to my parents and my wife, Michelle, for their enduring love, support, and encouragement.

Table of Contents

Dedication
Acknowledgments
List of Tables
List of Figures
Abstract of Thesis
Chapter 1. Introduction
    1.1 PVM
    1.2 Load Balancing in PVM
    1.3 Objective of the Thesis
Chapter 2. Load Slave Process Design
    2.1 Introduction
    2.2 Machine Load
    2.3 Load Slave
Chapter 3. Load Master Process Design
    3.1 Introduction
    3.2 Load Master Initialization
    3.3 Master Load Acquisition
    3.4 PVM Integration
Chapter 4. PVM Group Library Extension
    4.1 Introduction
    4.2 Implementation of pvm_recvload()
    4.3 Implementation of pvm_freeload()
    4.4 Implementation of pvm_loadspawn()
Chapter 5. Application Results
    5.1 Introduction
    5.2 Results
    5.3 Conclusion
Chapter 6. Conclusions
References
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E

List of Tables

Table 5.1: Performance of the Load Balancing Extension (SUN Network)
Table 5.2: Performance of the Load Balancing Extension (IBM Network)

List of Figures

Figure 1.1: An Example PVM Virtual Configuration
Figure 2.1: State Diagram for the Load Slave Process
Figure 2.2: Output Formats for Two vmstat Commands
Figure 3.1: State Diagram for the Load Master Process
Figure 3.2: Format of Host Information Data Structure
Figure 3.3: PVM Message Format for a Load Master Reply
Figure 4.1: Format of pvm_recvload()
Figure 4.2: Format of the load_struct Structure
Figure 4.3: Format of pvm_freeload()
Figure 4.4: Format of pvm_loadspawn()
Figure 5.1: Comparison of pvm_spawn() and pvm_loadspawn() for the SUN Network (10Mb Ethernet)
Figure 5.2: Comparison of pvm_spawn() and pvm_loadspawn() for the IBM Network (16Mb Token Ring)
Figure 5.3: Vector-Matrix Multiply Enhancement (SUN Network)
Figure 5.4: Matrix-Matrix Multiply (SUN Network)
Figure 5.5: Matrix-Matrix Multiply (IBM Network)
Figure 5.6: Static Fractal Image Compression
Figure 5.7: Pool of Tasks Fractal Image Compression

Abstract of Thesis
The University of Alabama Graduate School

Degree: Master of Science
Major Subject: Electrical Engineering
Name of Candidate: Christopher Wade Humphres
Title of Thesis: A Load Balancing Extension for the PVM Software System

In this thesis, a dynamic load balancing extension for the PVM software system is developed. This extension includes a slave process on each host in the virtual machine for obtaining host CPU load information, and a master process for gathering the load information. In addition, a programming interface is included to provide the PVM user with dynamic CPU load information and a load balancing process spawn. For a range of applications, the load balancing extension exhibits considerable improvement in parallel runtime performance. Results include parallel runtime comparisons for a vector-matrix multiply, a matrix-matrix multiply, and a fractal image compression application. Application results are presented for networks of SUN and IBM workstations.

Abstract Approved:
David J. Jackson, Ph.D., Chairperson of Thesis Committee
Russell L. Pimmel, Ph.D., Head of Department
Ronald Rogers, Ph.D., Dean of the Graduate School

Chapter 1. Introduction

In recent years, the increased computational size and time required for the solution of scientific and engineering problems, combined with the high cost of vectorized computers, has led to the need for faster and more cost-effective computing. One solution which not only makes use of available resources but also reduces cost is the parallel utilization of networked heterogeneous computers. One software package currently available for the interconnection of heterogeneous computers for parallel processing is Parallel Virtual Machine (PVM) [2].

1.1 PVM

PVM is a software system which allows a collection of heterogeneous computers to be used as a single parallel computational resource. This allows large computational problems to be solved by using the aggregate power of several, possibly heterogeneous, networked computers. These networked computers may be shared- or local-memory multiprocessors, scalar workstations, or vector supercomputers. PVM supports heterogeneity at the application, machine, and network layers [6]. At the application layer, PVM supports exploitation of the architecture best suited for solution of a particular task. At the machine layer, PVM performs all necessary data conversions between machines supporting differing data formats. Finally, at the network layer, PVM supports a virtual machine which may be connected by a variety of different networks.

Under PVM, a user-defined collection of computers appears as one large distributed-memory computer, known as the virtual machine. A host is defined as a member computer of the virtual machine, and a task is a unit of computation, often a UNIX process [2]. PVM provides a C, C++, or Fortran77 programming interface for starting tasks on the virtual machine and for communication and synchronization between tasks. PVM is a message-oriented system which implements process communication and synchronization through the sending and receiving of messages between tasks [5]. PVM is composed of two parts, the PVM daemon (pvmd) and the library of interface functions. The PVM daemon is a process which resides on all hosts within the virtual machine and is responsible for maintaining the host and configuration information of the entire virtual machine. In addition, the PVM daemon usually handles spawning of new tasks on the virtual machine, communication between tasks, and information requests about the virtual machine from tasks. The original PVM daemon, also called the master PVM daemon, handles all dynamic reconfiguration of the virtual machine [6]. The second component of PVM, the library of interface functions, is composed of user-callable functions for requesting message passing, spawning of processes, coordination of tasks, and modification of the virtual machine [6].
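To make this programming model concrete, the following is a minimal sketch of a PVM master written against the standard PVM 3 C interface; the "worker" executable name and the message tag value are illustrative, not part of the thesis.

    #include <stdio.h>
    #include "pvm3.h"

    int main(void)
    {
        int tid, result;

        /* Ask the pvmd to start one copy of "worker" anywhere in the
           virtual machine (round-robin placement under PvmTaskDefault). */
        if (pvm_spawn("worker", (char **)0, PvmTaskDefault,
                      (char *)0, 1, &tid) < 1) {
            fprintf(stderr, "spawn failed\n");
            pvm_exit();
            return 1;
        }
        pvm_recv(tid, 1);              /* block until a tag-1 message arrives */
        pvm_upkint(&result, 1, 1);     /* unpack one integer from the buffer */
        printf("worker 0x%x returned %d\n", tid, result);
        pvm_exit();                    /* detach from the virtual machine */
        return 0;
    }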

1.2 Load Balancing in PVM

In a multiuser network environment, it has been found that load balancing can be the single most important performance enhancement [3]. Load balancing with PVM, however, has not proven to be a trivial task. Current load balancing policies can be classified as either static or dynamic [1]. Static load balancing policies depend on pre-execution information about the system for static allocation of the workload across the virtual machine. Static policies are very effective for systems in which the communication overhead and CPU load can be accurately estimated in advance. However, in the case of a multiuser network where system load varies unpredictably, static load balancing is not a viable policy. Dynamic load balancing policies, on the other hand, attempt to take into consideration dynamic changes in system load, and often machine availability as well. Although dynamic policies are more difficult to implement than static policies, the potential performance improvement of the system often outweighs the complexity associated with implementation [1].

Previously, dynamic load balancing in PVM has usually depended on the Pool of Tasks paradigm [3]. In the Pool of Tasks paradigm, a task to be completed is decomposed, typically in the data domain, into many smaller subtasks which a master process distributes to idle slave processes until all task assignments are completed. This scheme provides a more evenly balanced load between processors by keeping all processors busy as long as subtasks remain in the pool. PVM works well with the Pool of Tasks model when using the entire virtual machine, provided the end user employs an appropriate task decomposition. Many times, however, only a small portion of the entire virtual machine is used in a multiuser environment. In the event the entire virtual machine is not available or is not required, it is advantageous to use only those workstations with the lowest CPU load for solution of a problem. Unfortunately, in PVM, the spawning process occurs in a round-robin order without regard to the CPU load of the workstations in the virtual machine. As task assignments may be made to heavily loaded processors, significant load imbalance, resulting in larger execution times, may occur. In addition, PVM does not provide dynamic information on CPU loads in the virtual machine.

1.3 Objective of the Thesis

The purpose of this research is to develop an extension to the PVM software system which provides the dynamic information needed for load balancing in a multiuser environment. The programming tools in this extension include a slave process on each host in the virtual machine for obtaining host CPU load information, and a master process for gathering the load information. An example PVM virtual configuration is given in Figure 1.1, which illustrates the communication structure between the various components of the system. Also included is a programming interface which provides the PVM user with dynamic CPU load information and a load balancing spawn to start slave processes on the lowest loaded machines, as an alternative to the static round-robin spawning process. Additionally, these tools operate seamlessly within PVM's dynamically reconfigurable environment.

[Figure 1.1: An Example PVM Virtual Configuration. The master host runs the user master (UM), the PVM group server (GS), the load master (LM), and a load slave (LS); slave hosts 1 through n each run a user slave (US) and a load slave (LS).]

Chapter 2. Load Slave Process Design

2.1 Introduction

The load slave code is part of a master/slave model. A load slave process resides on each host in the virtual configuration to obtain load information. This information is forwarded to a load master process at regular intervals. The load master process tracks all load information and, upon request, provides this information to the user of the PVM system. The following is a discussion of the load slave process design. The state diagram for the load slave is shown in Figure 2.1.

Two requirements for the load slave software are established. First, the software must execute under any valid user login. Second, the software must be portable across various architectures and operating systems. Both are requirements of the PVM software system [6]. Therefore, all load slave software must conform to this PVM standard.

2.2 Machine Load

For the UNIX operating system, the /dev/kmem file contains an image of the kernel virtual memory where the information necessary for determining machine load is located. Access to this file, however, is denied to all processes except those with root permissions. In addition, the format of the file is operating system dependent. Therefore, direct access of load information via kmem is not further considered.

There are many approaches for determining the load of a computer under a valid login without root privileges. All of the approaches discussed below are system commands which access the /dev/kmem file.

[Figure 2.1: State Diagram for the Load Slave Process. After initialization, the load slave loops: wait for input from vmstat, parse the vmstat output, and send the percent idle time to the load master.]

The machine load can be based on the average length of the run queue as returned by the UNIX uptime or w commands [5]. However, this approach may not provide an accurate picture of the load. For instance, a very computationally intense process may produce a high CPU load but allow a very small average run queue. The load may also be based on information returned by the UNIX ps command. Information returned can include the percentage of CPU time for each process and the accumulated CPU time used by each process. Unfortunately, this does not produce a total for all activity on the machine. In addition, summation of the percentage of CPU time may lead to a sum of more than one hundred percent due to the variation of start times for each process.

Another approach is the use of the iostat command. The iostat command returns output which includes the percentage of CPU time for all activity on a machine, broken down into user time, system time, and idle time. Large system-to-system variations in output formats, however, preclude the use of the iostat command. Finally, the vmstat command provides the percentage of CPU time for all processes, broken down into user, system, and idle time. The user time is the percentage of CPU utilization observed at the application level during a specified time interval. The system time is the percentage of CPU utilization by the kernel during a specified time interval. The idle time is the percentage of time the CPU is observed to be idle during a specified time interval. On some systems, there is an additional I/O wait field which contains the percentage of time the CPU is idle with outstanding I/O requests. The information returned upon first invocation of vmstat is a summary of information since system boot time. After a header line of descriptive field information, the vmstat command summarizes CPU activity over a specified interval number of seconds. Overall, vmstat gives a more accurate representation of machine load than the uptime, w, or ps commands. Based on this information, the vmstat system command is chosen as the mechanism for load acquisition.

2.3 Load Slave

Next, a programming interface is developed for the load slave process to access the load information through the vmstat command. The UNIX popen() and pclose() functions provide such an interface. However, the popen() and pclose() functions do not provide a method for terminating the vmstat process.

When pclose() is called, the calling process waits for a natural termination of the vmstat process, which never occurs. Therefore, two new functions are developed. The popenk() function, which is very similar to the UNIX popen() function, is developed to execute the vmstat command and establish a pipe for reading the output of vmstat. The popenk() function creates a pipe for unidirectional interprocess communication, calls fork() and execl() to start the new vmstat process, and redirects stdout from the vmstat process to the pipe. The pclosek() function performs a system call to kill the vmstat process, waits for vmstat termination, and closes the load slave end of the pipe. After a call to popenk(), the load slave process performs UNIX read() calls to obtain the output of the vmstat command.

As the load slave reads the output of the vmstat command, the output is parsed. Due to minor variations in the format of the output from vmstat across various systems, the load slave must dynamically locate the CPU idle time. Figure 2.2 illustrates two vmstat formats as produced by a SUN SPARCstation and an IBM RS/6000. During the initial pass through the vmstat information, the header is parsed and the column location of the id field is determined. Once the id field is located, all header information is discarded and the first line of information is read. Based on the column number of the id field, the percentage of CPU idle time is determined.

The percentage of CPU idle time is chosen over the user or system times for several reasons. First, user or system time alone may not give an accurate picture of the machine load; a machine may be heavily loaded by either intensive user or intensive system time processes. Also, the addition of the user and system times unnecessarily complicates the parsing algorithm. Finally, the idle time provides an accurate representation of how much time remains available to a prospective user, which is the primary requirement for load balancing software.
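As a sketch of the popenk()/pclosek() behavior just described (the argument list, the global PID variable, and the use of execlp() rather than execl() are illustrative choices, not the thesis's code):

    #include <signal.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static pid_t vmstat_pid = -1;          /* PID of the spawned vmstat */

    /* Start "vmstat <interval>" and return a descriptor for reading its
       standard output through a pipe, or -1 on failure. */
    int popenk(int interval)
    {
        int fd[2];
        char arg[16];

        if (pipe(fd) < 0)
            return -1;
        sprintf(arg, "%d", interval);
        vmstat_pid = fork();
        if (vmstat_pid < 0)
            return -1;
        if (vmstat_pid == 0) {             /* child: become vmstat */
            close(fd[0]);                  /* child does not read the pipe */
            dup2(fd[1], STDOUT_FILENO);    /* stdout now feeds the pipe */
            close(fd[1]);
            execlp("vmstat", "vmstat", arg, (char *)0);
            _exit(127);                    /* exec failed */
        }
        close(fd[1]);                      /* parent keeps only the read end */
        return fd[0];
    }

    /* Kill the vmstat process, wait for it to terminate, and close the
       load slave's end of the pipe. */
    int pclosek(int fd)
    {
        if (vmstat_pid > 0) {
            kill(vmstat_pid, SIGTERM);
            waitpid(vmstat_pid, (int *)0, 0);
            vmstat_pid = -1;
        }
        return close(fd);
    }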

    procs memory page disk faults cpu
    r b w avm fre re at pi po fr de sr s0 d1 d2 d3 in sy cs us sy id

    procs memory page faults cpu
    r b avm fre re pi po fr sr cy in sy cs us sy id wa

Figure 2.2: Output Formats for Two vmstat Commands (SUN SPARCstation above, IBM RS/6000 below).

After the percent idle time is found, it is sent to the load master process. The load slave communicates with the load master via a UDP datagram socket to avoid the limits on scalability which TCP stream sockets incur. This form of communication, however, is unreliable and requires some form of error checking. To handle the unreliable communication, the load slave transmits the percent idle time to the load master and waits for a confirmation message from the load master. If confirmation does not arrive within a reasonable amount of time, currently defined as ten seconds, the load slave repeats the process. This continues until a confirmation is received or fifteen messages have been sent without confirmation. If no confirmation is received, the load slave presumes the load master to be dead, performs cleanup, and exits. After confirmation, the load slave waits for the next output from the vmstat command and repeats the process.

A load slave process may shut down for several reasons: a SIGTERM signal indicating deletion of the host from the PVM virtual configuration, I/O errors, or a failure of the load master process to respond. Before the load slave process exits, it writes an explanation of the shutdown to a log file. In addition, the load slave calls the function pclosek(), which terminates the vmstat process and closes the open pipe.
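The send-and-confirm protocol described above lends itself to a short sketch. The wire format (a single integer) and the helper's name are assumptions; the ten-second timeout and fifteen-message limit are the values given in the text.

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/time.h>
    #include <netinet/in.h>
    #include <unistd.h>

    #define CONFIRM_TIMEOUT 10     /* seconds to wait for confirmation */
    #define MAX_TRIES       15     /* sends before declaring master dead */

    /* Send the idle percentage to the load master and wait for the echoed
       confirmation.  Returns 0 on success, -1 if the master appears dead. */
    int send_idle(int sock, struct sockaddr_in *master, int idle)
    {
        int attempt, reply;
        fd_set rfds;
        struct timeval tv;

        for (attempt = 0; attempt < MAX_TRIES; attempt++) {
            sendto(sock, (char *)&idle, sizeof(idle), 0,
                   (struct sockaddr *)master, sizeof(*master));
            FD_ZERO(&rfds);
            FD_SET(sock, &rfds);
            tv.tv_sec = CONFIRM_TIMEOUT;
            tv.tv_usec = 0;
            if (select(sock + 1, &rfds, (fd_set *)0, (fd_set *)0, &tv) > 0 &&
                recv(sock, (char *)&reply, sizeof(reply), 0) > 0 &&
                reply == idle)     /* confirmation echoes the value sent */
                return 0;
        }
        return -1;                 /* presume the load master is dead */
    }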

Chapter 3. Load Master Process Design

3.1 Introduction

The load master software is the second component of the master/slave model employed to provide load balancing extensions for PVM applications. The load master gathers load information from the load slaves and supplies the load information to PVM users upon request. The load master and the PVM group server should generally reside on the same host as the master pvmd for fault tolerance purposes. Like the load slave, the load master must be executable under any valid user login and portable across a variety of architectures and operating systems to maintain compatibility with the PVM software system [6].

The load master program design is discussed in three parts. The first section presents all aspects of initialization of the load master process. The second section covers the components of the load master program which relate to gathering and storage of load information. Finally, the third section discusses all components related to the integration of the load master process with the PVM software system. The state diagram for the load master is shown in Figure 3.1.

3.2 Load Master Initialization

Several actions are taken before the load master process can be utilized for load balancing by the PVM user. The first step in this initialization process is creation of an error log file.

[Figure 3.1: State Diagram for the Load Master Process. After initialization, the load master loops on its inputs (socket input from a load slave, a load request, a dynamic host add, or a dynamic host delete) and periodically updates the load information file.]

The file is used to log messages for all errors which occur within the load master process. Next, a datagram socket is created and bound to a port using the UNIX socket() and bind() functions. The third initialization step is the spawning of slave processes. The pvm_config() function is used to obtain the list of all hosts in the virtual machine configuration. For each host in the virtual machine, a data structure is inserted into a linked list to hold all host load information, a slave process is spawned using pvm_spawn() to gather the CPU load information, and a pvm_notify() call is made to instruct the PVM daemon to notify the master if the host is deleted.

The data structure used by the load master contains all of the information returned by the pvm_config() function plus two additional integers. The first integer, h_perc, contains the percentage of CPU idle time. The other integer, h_up, indicates whether the percentage of CPU idle time is a current or an old value. After a specified interval of seconds, h_up is incremented. This interval, currently defined as thirty seconds, corresponds to the interval used in the slave to determine how often an update should occur. If h_up is ever larger than one, the host information is considered invalid. Since h_up is reset to zero every time a new value for the idle time is received, the host information is only invalidated by failure of a slave. Figure 3.2 presents the data structure used for storage of the host information.

    struct h_list {
        char *h_name;
        char *h_arch;
        int h_speed;
        int h_tid;
        int h_perc;
        int h_up;
        struct h_list *next;
    };

Figure 3.2: Format of Host Information Data Structure
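A sketch of this third initialization step, using the PVM 3 calls named above (the slave executable name, the tag value, and the list-insertion helper are illustrative, not the thesis's identifiers):

    #include "pvm3.h"

    #define HOST_DELETE_TAG 901    /* illustrative tag value */

    extern void add_host_entry(struct pvmhostinfo *hi);  /* builds h_list node */

    /* Spawn one load slave per host and request deletion notification. */
    void start_slaves(void)
    {
        struct pvmhostinfo *hosts;
        int nhost, narch, i, tid;

        pvm_config(&nhost, &narch, &hosts);
        for (i = 0; i < nhost; i++) {
            add_host_entry(&hosts[i]);     /* linked-list node for this host */
            pvm_spawn("loadslave", (char **)0, PvmTaskHost,
                      hosts[i].hi_name, 1, &tid);
            pvm_notify(PvmHostDelete, HOST_DELETE_TAG, 1, &hosts[i].hi_tid);
        }
    }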

Next, the PVM daemon is requested to notify the load master process of any host additions to the virtual configuration. Notification permits the load master to be aware of any reconfiguration of the virtual machine. Finally, a signal handler is established for the SIGTERM signal to allow a graceful shutdown. The SIGTERM signal is used by the PVM daemon to terminate all PVM processes in the case of a PVM shutdown or reset.

3.3 Master Load Acquisition

After initialization, the load master enters a state in which it checks for messages from slave processes, PVM users, or the pvmd. As stated previously, the load master and load slave processes communicate via datagram sockets. PVM users and the pvmd, however, communicate with the load master via PVM messages. Therefore, the load master must check for messages from the load slaves, PVM users, and the pvmd in a non-blocking fashion.

For socket communication with the load slaves, the UNIX select() function call is used. The select() call examines the socket and returns a positive value if the socket is ready to be read. If, after a specified interval, no information is available on the socket, the select() call returns zero. When a select() call returns a positive value, the load master executes a UNIX recvfrom() function. The recvfrom() function returns the value of the percentage of CPU idle time sent by the slave as well as the socket address of the slave process. The UNIX function gethostbyaddr() returns the hostname of the slave process based on the socket address. The load master then attempts an update of the load information for the sending host. Although the update will normally be successful, it will fail if the hostname of the slave process is not found in the linked list of load information. A slave process will not appear in the linked list if the host has been removed from the virtual machine but the load slave on the host has not yet been terminated.

If the update is successful, the load master sends a confirmation to the slave process. The confirmation consists of the value of the percentage of CPU idle time received.

3.4 PVM Integration

The load master process handles dynamic reconfiguration of the virtual machine as well as provides load information to PVM processes on request. During initialization, the load master process requests notification from the pvmd of any reconfiguration of the virtual machine, whether a deletion or an addition of a host. Notification consists of the pvmd sending a PVM message with the requested message tag. Different message tags are used for additions and deletions to allow the load master process to distinguish between the two virtual configuration changes. The load master process uses PVM non-blocking receive calls, pvm_nrecv(), to test for the arrival of a PVM message with a specified message tag.

When the load master receives a message from the pvmd indicating one or more hosts have been added to the virtual configuration, the message contains the number of added hosts and an integer task identifier, TID, for each host. Upon receipt of the message, the load master obtains the current virtual configuration via pvm_config(). From the pvm_config() information, the load master determines the hostname for each host added according to its TID, requests notification of future deletion of each new host, and spawns a load slave on each new host. Subsequently, each host is added to the linked list of host load information with h_up set to two, signifying the load information for this host is currently invalid. When current load information arrives from the load slave, h_up is reset to zero, indicating valid load information.
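Pulling Sections 3.3 and 3.4 together, one pass of the load master's input check might be sketched as follows; the tag values and the four handler functions are placeholders for logic described in the surrounding text, and the single-integer wire format is an assumption.

    #include "pvm3.h"
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/time.h>
    #include <netinet/in.h>
    #include <netdb.h>

    #define HOST_ADD_TAG    900    /* illustrative tag values */
    #define HOST_DELETE_TAG 901
    #define L_MSG_TAG       902

    extern int  update_host(char *name, int idle);   /* update linked list */
    extern void handle_host_add(void);
    extern void handle_host_delete(void);
    extern void handle_load_request(void);

    void poll_inputs(int sock)
    {
        fd_set rfds;
        struct timeval tv;
        struct sockaddr_in from;
        socklen_t len = sizeof(from);
        struct hostent *hp;
        int idle;

        FD_ZERO(&rfds);
        FD_SET(sock, &rfds);
        tv.tv_sec = 1;             /* bound the wait on the socket */
        tv.tv_usec = 0;
        if (select(sock + 1, &rfds, (fd_set *)0, (fd_set *)0, &tv) > 0) {
            recvfrom(sock, (char *)&idle, sizeof(idle), 0,
                     (struct sockaddr *)&from, &len);
            hp = gethostbyaddr((char *)&from.sin_addr,
                               sizeof(from.sin_addr), AF_INET);
            if (hp != 0 && update_host(hp->h_name, idle) == 0)
                sendto(sock, (char *)&idle, sizeof(idle), 0,
                       (struct sockaddr *)&from, len);    /* confirm */
        }
        if (pvm_nrecv(-1, HOST_ADD_TAG) > 0)     /* non-blocking receives */
            handle_host_add();
        if (pvm_nrecv(-1, HOST_DELETE_TAG) > 0)
            handle_host_delete();
        if (pvm_nrecv(-1, L_MSG_TAG) > 0)
            handle_load_request();
    }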

If one or more hosts are deleted from the virtual configuration, the pvmd sends the load master a message for each host deleted. The message body contains the TID of the host which has been deleted from the virtual configuration. The load master searches for and removes the host associated with the TID from the linked list of host load information. Consequently, the load information available to the PVM user immediately reflects the current virtual configuration.

The final step in integrating the load master with the PVM software system is to provide a method of access to the load information. Access is provided in two ways. First, the load information is written to the file .host_load.out every specified interval of seconds, corresponding to the interval between load information updates. The file can be accessed and the information used by any process with access to the file system of the load master process. Second, PVM users may utilize the PVM group library extensions or construct their own code to request the current load information from the load master. The requests are PVM messages with the message tag L_MSG_TAG, which is defined in the load.h header file. The load master checks for receipt of PVM messages with the L_MSG_TAG message tag within the main program loop. The message body consists of the PVM TID of the requesting process. When a message with the L_MSG_TAG message tag arrives, the load master process packs the number of valid hosts for which load information is available and the load information for the valid hosts into a PVM message buffer. Finally, the reply is sent to the requesting process with the L_MSG_TAG message tag. The format of the reply is shown in Figure 3.3.

    Number of Valid Hosts
    Hostname
    Host Architecture
    Host Speed
    Host Percentage of CPU Idle Time
    Hostname
    Host Architecture
    Host Speed
    Host Percentage of CPU Idle Time
    ...
    Host Percentage of CPU Idle Time

Figure 3.3: PVM Message Format for a Load Master Reply
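A sketch of building the Figure 3.3 reply with the standard PVM packing calls, assuming the h_list structure of Figure 3.2 and the L_MSG_TAG definition from load.h:

    #include "pvm3.h"
    #include "load.h"              /* h_list and L_MSG_TAG */

    /* Pack the reply of Figure 3.3 and send it to the requesting task.
       'head' is the linked list of Figure 3.2; entries with h_up > 1 are
       stale and are skipped (nvalid counts the remaining entries). */
    void send_load_reply(int requester_tid, struct h_list *head, int nvalid)
    {
        struct h_list *h;

        pvm_initsend(PvmDataDefault);
        pvm_pkint(&nvalid, 1, 1);              /* number of valid hosts */
        for (h = head; h != 0; h = h->next) {
            if (h->h_up > 1)
                continue;                      /* load information invalid */
            pvm_pkstr(h->h_name);              /* hostname */
            pvm_pkstr(h->h_arch);              /* architecture class */
            pvm_pkint(&h->h_speed, 1, 1);      /* relative speed */
            pvm_pkint(&h->h_perc, 1, 1);       /* percent CPU idle time */
        }
        pvm_send(requester_tid, L_MSG_TAG);
    }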

Chapter 4. PVM Group Library Extension

4.1 Introduction

The PVM software system supports the use of dynamic process groups with the PVM group library. Group functions are not performed by the pvmd, but rather by the PVM group server (pvmgs). The group library functions provide access to information about group members maintained by the pvmgs, as well as synchronization and data exchange functionality [2].

The PVM group library extension includes three functions which provide the PVM user with easy access to the load information maintained by the load master. The first function, pvm_recvload(), returns the load information for the virtual machine in an array of structures. The second function, pvm_freeload(), releases the memory allocated for the array of structures. The third function, pvm_loadspawn(), performs a load balanced spawn as an alternative to the round-robin spawning process provided by PVM.

4.2 Implementation of pvm_recvload()

The pvm_recvload() function retrieves the load information from the load master, inserts the load information into a dynamically allocated structure array, and returns the structure array and the number of hosts included in the structure array to the user. In addition, an integer status code is returned to indicate success or failure of the function. The format for pvm_recvload() is shown in Figure 4.1.

    int stat = pvm_recvload( int *num_hosts, struct load_struct **hostp )

    num_hosts - A pointer to an integer.
    hostp     - A pointer to a pointer for a load_struct structure.
    stat      - A status code returned by the routine. A value of zero is
                returned on success and a value less than zero on failure.

Figure 4.1: Format of pvm_recvload()

The first step in retrieving the load information is to determine the TID of the load master, which is required to send a PVM message to the load master. The load master is the initial member of the PVM_LOAD_BALANCE group and thus has group instance number zero. Since the load master always has group instance number zero, the TID of the load master is obtained by executing the PVM group library function pvm_gettid() with the group PVM_LOAD_BALANCE and the instance number zero as arguments. Next, the pvm_recvload() function requests the load information from the load master by sending a PVM message with the message tag L_MSG_TAG. The body of the message consists of the requesting process's TID as returned by pvm_mytid(). A pvm_recv() call is used to receive the reply from the load master. Upon receipt, the reply message is unpacked and stored in a dynamically allocated array of structures of type load_struct. The format of the load_struct structure is shown in Figure 4.2.

Finally, if the pvm_recvload() function is successful, a status code of zero is returned along with the load information. If an error occurs in contacting the load master or in dynamically allocating the array of structures, the status code is returned with a value less than zero, the number of hosts is set to zero, and the array of load information is considered invalid.
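Following the steps just described, the core of pvm_recvload() might be sketched as below (load_struct is the structure shown in Figure 4.2, which follows; buffer sizes, the use of strdup(), and the error paths are guesses):

    #include <stdlib.h>
    #include <string.h>
    #include "pvm3.h"
    #include "load.h"              /* load_struct and L_MSG_TAG */

    int pvm_recvload(int *num_hosts, struct load_struct **hostp)
    {
        int master, mytid, i, n;
        struct load_struct *h;
        char name[256], arch[64];

        *num_hosts = 0;
        master = pvm_gettid("PVM_LOAD_BALANCE", 0);   /* instance 0 */
        if (master < 0)
            return -1;
        mytid = pvm_mytid();
        pvm_initsend(PvmDataDefault);
        pvm_pkint(&mytid, 1, 1);   /* tell the master where to reply */
        pvm_send(master, L_MSG_TAG);
        if (pvm_recv(master, L_MSG_TAG) < 0)
            return -1;
        pvm_upkint(&n, 1, 1);      /* number of valid hosts */
        h = (struct load_struct *)calloc(n, sizeof(*h));
        if (h == 0)
            return -1;
        for (i = 0; i < n; i++) {  /* unpack the fields of Figure 3.3 */
            pvm_upkstr(name);  h[i].h_name = strdup(name);
            pvm_upkstr(arch);  h[i].h_arch = strdup(arch);
            pvm_upkint(&h[i].h_speed, 1, 1);
            pvm_upkint(&h[i].h_perc, 1, 1);
        }
        *num_hosts = n;
        *hostp = h;
        return 0;
    }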

    struct load_struct {
        char *h_name;
        char *h_arch;
        int h_speed;
        int h_perc;
        int h_used;
        int h_ntask;
        int *h_tids;
    };

Figure 4.2: Format of the load_struct Structure

4.3 Implementation of pvm_freeload()

The pvm_freeload() function is used to free memory allocated by a pvm_recvload() function call. The function implementation consists of a loop in which the memory used for the h_name and h_arch character strings is released, followed by a call to free the array of structures. An integer status code is returned by pvm_freeload() to indicate success or failure of the function. If the function call is successful, the status code returned is zero. Otherwise, a value less than zero is returned. The format for pvm_freeload() is shown in Figure 4.3.

    int stat = pvm_freeload( int num_hosts, struct load_struct **hostp )

    num_hosts - An integer specifying the number of hosts in the structure
                array.
    hostp     - A pointer to a pointer to an array of load_struct structures.
    stat      - A status code. A value of zero is returned on success and a
                value less than zero on failure.

Figure 4.3: Format of pvm_freeload()
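The description of pvm_freeload() translates almost directly into code; a sketch, assuming the strings were allocated with strdup() as in the pvm_recvload() sketch above:

    #include <stdlib.h>
    #include "load.h"              /* load_struct definition */

    int pvm_freeload(int num_hosts, struct load_struct **hostp)
    {
        int i;

        if (hostp == 0 || *hostp == 0)
            return -1;
        for (i = 0; i < num_hosts; i++) {
            free((*hostp)[i].h_name);      /* per-host strings first */
            free((*hostp)[i].h_arch);
        }
        free(*hostp);                      /* then the array itself */
        *hostp = 0;
        return 0;
    }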

4.4 Implementation of pvm_loadspawn()

The pvm_loadspawn() function is provided as an alternative to the round-robin pvm_spawn() provided by the PVM software system. The pvm_loadspawn() function performs spawns by choosing the machines with the lowest load. The machines with the lowest loads are chosen based on either the percentage of idle time alone or the percentage of idle time combined with the architectural speed. A status code is returned to indicate success or failure. A description of the pvm_loadspawn() function is given in Figure 4.4.

    int stat = pvm_loadspawn( int *num_hosts, struct load_struct **hostp,
                              char *task, char **argv, int flag,
                              int ntask, int *tids )

    num_hosts - A pointer to an integer.
    hostp     - A pointer to a pointer for a load_struct structure.
    task      - A pointer to the executable file name of the process to be
                started.
    argv      - A pointer to an array of command line arguments for the
                process to be started. The end of the array must be a NULL.
    flag      - An integer specifying loadspawn options. The flag should be
                one of the following:
                    PvmTaskDefault  0     Machines with highest percentage
                                          of idle time.
                    PvmLoadSpeed    1024  Machines with highest percentage
                                          of idle time times architectural
                                          speed.
    ntask     - An integer specifying the number of processes to start.
    tids      - A pointer to an integer array of at least ntask integers.
                On return, the array will hold the task identifiers of the
                PVM processes successfully started. If an error occurs, the
                array will contain the associated error code for each
                process which failed.
    stat      - A status code returned by the routine. A value of zero is
                returned on success and a value less than zero on failure.

Figure 4.4: Format of pvm_loadspawn()

A host is chosen for spawning based on one of two algorithms. First, if the function arguments specify PvmTaskDefault, the host with the highest percentage of idle time is chosen. When a host is chosen, the h_used variable is set to one to prevent the host from being chosen again. Second, if PvmLoadSpeed is specified, the host with the highest value of the percentage of idle time multiplied by the architectural speed is chosen.

Again, the h_used variable is set to one to prevent reuse of a host.

The first step in the pvm_loadspawn() function is a pvm_recvload() function call which acquires the current load information for the virtual machine. From this point, there are two possible paths of execution. If the number of hosts for which load information is available is less than the number of tasks to be spawned, a sequence of instructions is used which permits multiple tasks to be spawned on a single host. This is accomplished by calculating the number of tasks per host, with the lowest loaded hosts receiving any extra tasks to be spawned. A pvm_spawn() call is then executed for each host. If the number of hosts for which load information is available is greater than or equal to the number of tasks to be spawned, tasks are spawned on the lowest loaded hosts, one task each, until all of the requested tasks have been spawned. In either case, for every host, the h_ntask integer is set to the number of tasks spawned and the h_tids array is set to the TIDs of those tasks. After attempts to spawn all of the tasks are complete, the function returns the number of tasks successfully spawned, or a value less than zero if an error occurred. An error will occur only if pvm_recvload() fails.
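The two selection algorithms reduce to a single scoring loop; a sketch (the function name is illustrative, and the long score guards against overflow of the idle-times-speed product):

    #include "load.h"              /* load_struct and PvmLoadSpeed */

    /* Return the index of the best unused host under the selection policy
       described above, marking it used; -1 if every host is already used. */
    static int pick_host(struct load_struct *h, int n, int flag)
    {
        int i, best = -1;
        long score, best_score = -1;

        for (i = 0; i < n; i++) {
            if (h[i].h_used)
                continue;                  /* already assigned a task */
            if (flag & PvmLoadSpeed)
                score = (long)h[i].h_perc * h[i].h_speed;
            else
                score = h[i].h_perc;       /* PvmTaskDefault: idle time only */
            if (score > best_score) {
                best_score = score;
                best = i;
            }
        }
        if (best >= 0)
            h[best].h_used = 1;            /* never choose the same host twice */
        return best;
    }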

Chapter 5. Application Results

5.1 Introduction

To evaluate the performance of the load balancing tools, several applications are considered. First, a comparison of the execution times of the pvm_spawn() and pvm_loadspawn() function calls is performed. In addition, three algorithms are selected to test the load balancing tools: a parallel vector-matrix multiply algorithm, a parallel matrix-matrix multiply algorithm, and a parallel fractal image compression algorithm [7]. An analysis of the application results is presented. The testbeds for these results are networks of SUN and IBM workstations. In general, the results presented here are obtained with an approximately even distribution of heavily loaded (0-35% idle), moderately loaded (36-70% idle), lightly loaded (71-90% idle), and unloaded (91-100% idle) processors. However, all test cases are executed in a multiuser environment in which load distribution fluctuations occur.

5.2 Results

The overhead associated with the use of the pvm_loadspawn() function, which replaces the original pvm_spawn() function, is first determined. Figures 5.1 and 5.2 display the spawn time as a function of the number of tasks spawned for both the original and modified spawn functions. The results shown in Figure 5.1 are for executions on a network of five SUN SPARC workstations. The results presented in Figure 5.2 are for executions on a network of 18 IBM RS/6000 workstations.

Both graphs illustrate the small performance cost of using the pvm_loadspawn() function in place of the pvm_spawn() function.

[Figure 5.1: Comparison of pvm_spawn() and pvm_loadspawn() for the SUN Network (10Mb Ethernet). Spawn time in milliseconds versus number of tasks.]

[Figure 5.2: Comparison of pvm_spawn() and pvm_loadspawn() for the IBM Network (16Mb Token Ring). Spawn time in milliseconds versus number of tasks.]

Next, a parallel implementation of a vector-matrix multiply algorithm based on a master/slave model is used to evaluate the performance of the load balancing tools. The vector-matrix multiply program involves passing the entire vector, 500 elements, and a particular submatrix to each slave. Upon receipt of the vector and its submatrix, the slave performs the vector-matrix multiply calculations to determine the results for which it is responsible and returns the results to the master. The graph in Figure 5.3 displays the execution time on the SUN network as a function of the number of processors involved in the parallel implementation of the algorithm.

Two load balancing mechanisms are integrated into the original vector-matrix multiply program, VM. First, the original PVM spawn is replaced with a load balanced spawn. The load balanced spawn considers both CPU idle time and architectural speed in selecting hosts. Second, the workload allocation for each slave is calculated based on the CPU idle time and the architectural speed of each slave. The SUN network is speed heterogeneous (SS1 and SS2 processors). The architectural speed for each host is determined by executing a sequential matrix-matrix multiply on each host. One host is assigned an architectural speed of 1000 and is used as the basis for calculating the architectural speeds of the remaining hosts based on a comparison of their sequential execution times.
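Although the thesis does not print the formulas, a natural reading of this calibration and allocation scheme is as follows. If the basis host completes the sequential multiply in time T_b and host i completes it in time T_i, then

    speed_i = 1000 * (T_b / T_i)

so a host twice as fast as the basis host is rated 2000. With N total rows of work and idle percentage idle_i on host i, the load balanced allocation assigns

    rows_i = N * (idle_i * speed_i) / sum_j (idle_j * speed_j)

that is, each slave receives work in proportion to the computing capacity it can actually deliver.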

[Figure 5.3: Vector-Matrix Multiply Enhancement (SUN Network). Average slave time in milliseconds versus number of processors for VM and VMLSBS.]

Figure 5.3 provides a comparison of the execution times of the original program and the enhanced program, VMLSBS, which utilizes the load balanced spawn and a load balanced workload allocation considering CPU idle time and architectural speed. This illustrates the additional benefit which results from the use of these load balancing mechanisms. The enhancement is greater when only two processors out of the available five are used. This is because the load balanced spawn may choose any two of the five processors. In comparison, when four processors are used, the load balanced spawn must choose four of the five processors, which increases the probability of selecting a more heavily loaded machine.

A straightforward parallel implementation of a matrix-matrix multiply algorithm based on a master/slave model [4] is also used to evaluate the performance of the load balancing tools. Based on the equation Y = AX, each slave receives the entire square matrix X, of size 500x500, and the submatrix of A allocated to it.

When a slave completes the calculation of the results for which it is responsible, the slave returns the results to the master. The graphs in Figures 5.4 and 5.5 illustrate the execution time as a function of the number of processors used by the matrix-matrix multiply algorithm.

[Figure 5.4: Matrix-Matrix Multiply (SUN Network). Average slave time in seconds versus number of processors for MM, MML, MMLB, and MMLSBS.]

[Figure 5.5: Matrix-Matrix Multiply (IBM Network). Average slave time in seconds versus number of processors for MM, MML, and MMLB.]

Several modifications are made to the original program, MM; each modification is embodied in a program which integrates one or more of the load balancing mechanisms. In the first program, MML, the original PVM spawn is replaced with a load balanced spawn based exclusively on CPU idle time. Figures 5.4 and 5.5 reflect the enhancement which results from the incorporation of the alternate spawning technique. Figures 5.4 and 5.5 also depict the further improvement which results from the second program, MMLB. This program incorporates not only the load balanced spawn based on CPU idle time, but also allocates work assignments to the slave processes based on their individual CPU idle times.

In the final program, MMLSBS, full load balancing capabilities are used to implement the matrix-matrix multiply algorithm. Specifically, MMLSBS incorporates a load balanced spawn based on the architectural speed and CPU idle time as well as a workload allocation based on the architectural speed and CPU idle time of each slave. Since architectural speed differences exist only on the SUN network, MMLSBS is not executed on the IBM RS/6000 network. Figure 5.4 clearly demonstrates the improved performance of the program which utilizes full load balancing capabilities.

The final applications used to evaluate the load balancing tools are parallel implementations of a fractal image compression algorithm for IBM RS/6000s [7]. The original fractal image compression software consists of two versions based on a master/slave model. One version, stat_pack, uses a static allocation of workload to slave processes, whereas the other version, dyn_pack, uses a dynamic load balancing Pool of Tasks paradigm for workload allocation.

Both versions of the software are modified to use the load balanced spawn. For the original and modified versions, Figures 5.6 and 5.7 show the execution time as a function of the number of processors involved in the parallel implementation of the algorithm.

[Figure 5.6: Static Fractal Image Compression. Execution time in seconds versus number of processors for the original version and the load balanced spawn version.]

In stat_load, the static version of the parallel image compression algorithm is modified to use the load balanced spawn. The graph in Figure 5.6 depicts a marked improvement in efficiency when stat_pack is modified to use the load balanced spawn. The degree of enhancement depends on the number of processors available for spawning. Therefore, when 16 of the 18 available processors are utilized, the improvement associated with the load balanced spawn is reduced due to the use of heavily loaded processors. The increase in execution time with 16 processors is a result of the use of hosts with less than 10% CPU idle time.

[Figure 5.7: Pool of Tasks Fractal Image Compression. Execution time in seconds versus number of processors for the original version and the load balanced spawn version.]

Similarly, the algorithm using the Pool of Tasks paradigm is modified to incorporate the load balanced spawn in dyn_load. Figure 5.7 presents the improvement associated with the load balanced spawn. The Pool of Tasks paradigm provides a very efficient load balancing strategy for the fractal image compression software. Thus, the improvement resulting from the load balanced spawn is smaller for the Pool of Tasks fractal image compression algorithm than for the static load distribution case. A significant improvement, however, is shown when utilizing only a small portion of the virtual machine. For example, the four processor case for the Pool of Tasks implementation is able to select high CPU idle time processors and provides over an 18% reduction in execution time.

5.3 Conclusion

The overhead involved in utilization of the load balancing tools presented is minimal. The load balanced spawn overhead is less than half a second greater than the original PVM spawn for up to 16 hosts. In comparison with the execution time of most parallel programs, this overhead is negligible. With this minimal overhead comes significant gain. Tables 5.1 and 5.2 demonstrate the enhancement provided by the load balancing extension in terms of the percent reduction in execution time for the modified applications compared to the original applications. Utilization of the load balancing spawn on small portions of the virtual machine yields an 18-55% reduction in execution time. Even when using a large portion of the virtual machine, the load balancing spawn provides a reduction in execution time of 4-9%.

The additional gain associated with load balanced workload allocation is also substantial. Without architectural speed differences, the percent decrease in execution time when compared to the original program ranges from 23-52% for utilization of small portions of the virtual machine and from 8-36% for utilization of large portions of the virtual machine. The improvement is even more significant when architectural speed differences are present in the virtual machine. When architectural speed differences are considered, a 32-57% reduction in execution time is seen when utilizing few processors in the virtual machine. In comparison, a reduction of 22-30% is observed when utilizing a majority of the processors available in the virtual machine.

[Table content: columns Application, Number of Processors, and % Reduction in Execution Time; rows for VMLSBS (3 processor counts), MML (3), MMLB (4), and MMLSBS (4); the numeric entries did not survive in this copy.]

Table 5.1: Performance of the Load Balancing Extension (SUN Network)

[Table content: columns Application, Number of Processors, and % Reduction in Execution Time; rows for MML (5 processor counts), MMLB (5), stat_load (4), and dyn_load (4); the numeric entries did not survive in this copy.]

Table 5.2: Performance of the Load Balancing Extension (IBM Network)

Chapter 6. Conclusions

An extension to the PVM software system, which provides the dynamic information needed for load balancing in a multiuser environment, is developed. All software developed in this extension conforms to two PVM requirements [6]. First, the software will execute under any valid user login. Second, the software is portable across a variety of architectures and operating systems. The extension includes the development of a load master program, a load slave program, and a programming interface. The load master is responsible for dynamic reconfiguration of the virtual machine, gathering load information from the load slaves, and providing load information to PVM users upon request. The responsibilities of the load slave include gathering load information from the host on which it resides and reliably transmitting the load information to the load master. The programming interface allows a PVM user to obtain the load information held by the load master and to perform a load balanced spawn. The load balanced spawn may use CPU idle time only, or CPU idle time and architectural speed, in the host selection process.

Evaluation of the load balancing extension demonstrates the efficiency with which these tools can be used. In addition, the use of the load balancing tools introduces negligible overhead to a computationally intense process. Overall, the implementation of the load balancing extension permits faster and more cost-effective computing. Thus, the extension is a significant step toward parallel utilization of networked heterogeneous computers.


More information

Chapter 7 Load Balancing and Termination Detection

Chapter 7 Load Balancing and Termination Detection Chapter 7 Load Balancing and Termination Detection Load balancing used to distribute computations fairly across processors in order to obtain the highest possible execution speed. Termination detection

More information

System Calls and Standard I/O

System Calls and Standard I/O System Calls and Standard I/O Professor Jennifer Rexford http://www.cs.princeton.edu/~jrex 1 Goals of Today s Class System calls o How a user process contacts the Operating System o For advanced services

More information

Load Balancing and Termination Detection

Load Balancing and Termination Detection Chapter 7 slides7-1 Load Balancing and Termination Detection slides7-2 Load balancing used to distribute computations fairly across processors in order to obtain the highest possible execution speed. Termination

More information

Architectural Patterns. Layers: Pattern. Architectural Pattern Examples. Layer 3. Component 3.1. Layer 2. Component 2.1 Component 2.2.

Architectural Patterns. Layers: Pattern. Architectural Pattern Examples. Layer 3. Component 3.1. Layer 2. Component 2.1 Component 2.2. Architectural Patterns Architectural Patterns Dr. James A. Bednar jbednar@inf.ed.ac.uk http://homepages.inf.ed.ac.uk/jbednar Dr. David Robertson dr@inf.ed.ac.uk http://www.inf.ed.ac.uk/ssp/members/dave.htm

More information

CS 377: Operating Systems. Outline. A review of what you ve learned, and how it applies to a real operating system. Lecture 25 - Linux Case Study

CS 377: Operating Systems. Outline. A review of what you ve learned, and how it applies to a real operating system. Lecture 25 - Linux Case Study CS 377: Operating Systems Lecture 25 - Linux Case Study Guest Lecturer: Tim Wood Outline Linux History Design Principles System Overview Process Scheduling Memory Management File Systems A review of what

More information

Last Class: Communication in Distributed Systems. Today: Remote Procedure Calls

Last Class: Communication in Distributed Systems. Today: Remote Procedure Calls Last Class: Communication in Distributed Systems Structured or unstructured? Addressing? Blocking/non-blocking? Buffered or unbuffered? Reliable or unreliable? Server architecture Scalability Push or pull?

More information

Implementing and testing tftp

Implementing and testing tftp CSE123 Spring 2013 Term Project Implementing and testing tftp Project Description Checkpoint: May 10, 2013 Due: May 29, 2013 For this project you will program a client/server network application in C on

More information

A Performance Study of Load Balancing Strategies for Approximate String Matching on an MPI Heterogeneous System Environment

A Performance Study of Load Balancing Strategies for Approximate String Matching on an MPI Heterogeneous System Environment A Performance Study of Load Balancing Strategies for Approximate String Matching on an MPI Heterogeneous System Environment Panagiotis D. Michailidis and Konstantinos G. Margaritis Parallel and Distributed

More information

Programma della seconda parte del corso

Programma della seconda parte del corso Programma della seconda parte del corso Introduction Reliability Performance Risk Software Performance Engineering Layered Queueing Models Stochastic Petri Nets New trends in software modeling: Metamodeling,

More information

IBM Tivoli Monitoring Version 6.3 Fix Pack 2. Infrastructure Management Dashboards for Servers Reference

IBM Tivoli Monitoring Version 6.3 Fix Pack 2. Infrastructure Management Dashboards for Servers Reference IBM Tivoli Monitoring Version 6.3 Fix Pack 2 Infrastructure Management Dashboards for Servers Reference IBM Tivoli Monitoring Version 6.3 Fix Pack 2 Infrastructure Management Dashboards for Servers Reference

More information

Chapter 12 File Management

Chapter 12 File Management Operating Systems: Internals and Design Principles, 6/E William Stallings Chapter 12 File Management Dave Bremer Otago Polytechnic, N.Z. 2008, Prentice Hall Roadmap Overview File organisation and Access

More information

Chapter 12 File Management. Roadmap

Chapter 12 File Management. Roadmap Operating Systems: Internals and Design Principles, 6/E William Stallings Chapter 12 File Management Dave Bremer Otago Polytechnic, N.Z. 2008, Prentice Hall Overview Roadmap File organisation and Access

More information

IOS Server Load Balancing

IOS Server Load Balancing IOS Server Load Balancing This feature module describes the Cisco IOS Server Load Balancing (SLB) feature. It includes the following sections: Feature Overview, page 1 Supported Platforms, page 5 Supported

More information

File Transfer Protocol (FTP) Chuan-Ming Liu Computer Science and Information Engineering National Taipei University of Technology Fall 2007, TAIWAN

File Transfer Protocol (FTP) Chuan-Ming Liu Computer Science and Information Engineering National Taipei University of Technology Fall 2007, TAIWAN File Transfer Protocol (FTP) Chuan-Ming Liu Computer Science and Information Engineering National Taipei University of Technology Fall 2007, TAIWAN 1 Contents CONNECTIONS COMMUNICATION COMMAND PROCESSING

More information

Tivoli IBM Tivoli Web Response Monitor and IBM Tivoli Web Segment Analyzer

Tivoli IBM Tivoli Web Response Monitor and IBM Tivoli Web Segment Analyzer Tivoli IBM Tivoli Web Response Monitor and IBM Tivoli Web Segment Analyzer Version 2.0.0 Notes for Fixpack 1.2.0-TIV-W3_Analyzer-IF0003 Tivoli IBM Tivoli Web Response Monitor and IBM Tivoli Web Segment

More information

Load Balancing Techniques

Load Balancing Techniques Load Balancing Techniques 1 Lecture Outline Following Topics will be discussed Static Load Balancing Dynamic Load Balancing Mapping for load balancing Minimizing Interaction 2 1 Load Balancing Techniques

More information

MOSIX: High performance Linux farm

MOSIX: High performance Linux farm MOSIX: High performance Linux farm Paolo Mastroserio [mastroserio@na.infn.it] Francesco Maria Taurino [taurino@na.infn.it] Gennaro Tortone [tortone@na.infn.it] Napoli Index overview on Linux farm farm

More information

Middleware and Distributed Systems. Introduction. Dr. Martin v. Löwis

Middleware and Distributed Systems. Introduction. Dr. Martin v. Löwis Middleware and Distributed Systems Introduction Dr. Martin v. Löwis 14 3. Software Engineering What is Middleware? Bauer et al. Software Engineering, Report on a conference sponsored by the NATO SCIENCE

More information

Sensors and the Zaurus S Hardware

Sensors and the Zaurus S Hardware The Zaurus Software Development Guide Robert Christy August 29, 2003 Contents 1 Overview 1 2 Writing Software for the Zaurus 2 3 Using the bathw Library 3 3.1 Using the fans.............................

More information

A Comparison of Distributed Systems: ChorusOS and Amoeba

A Comparison of Distributed Systems: ChorusOS and Amoeba A Comparison of Distributed Systems: ChorusOS and Amoeba Angelo Bertolli Prepared for MSIT 610 on October 27, 2004 University of Maryland University College Adelphi, Maryland United States of America Abstract.

More information

Workflow Templates Library

Workflow Templates Library Workflow s Library Table of Contents Intro... 2 Active Directory... 3 Application... 5 Cisco... 7 Database... 8 Excel Automation... 9 Files and Folders... 10 FTP Tasks... 13 Incident Management... 14 Security

More information

CIT 470: Advanced Network and System Administration. Topics. Performance Monitoring. Performance Monitoring

CIT 470: Advanced Network and System Administration. Topics. Performance Monitoring. Performance Monitoring CIT 470: Advanced Network and System Administration Performance Monitoring CIT 470: Advanced Network and System Administration Slide #1 Topics 1. Performance monitoring. 2. Performance tuning. 3. CPU 4.

More information

Parallel Computing. Benson Muite. benson.muite@ut.ee http://math.ut.ee/ benson. https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage

Parallel Computing. Benson Muite. benson.muite@ut.ee http://math.ut.ee/ benson. https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage Parallel Computing Benson Muite benson.muite@ut.ee http://math.ut.ee/ benson https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage 3 November 2014 Hadoop, Review Hadoop Hadoop History Hadoop Framework

More information

Performance Monitoring User s Manual

Performance Monitoring User s Manual NEC Storage Software Performance Monitoring User s Manual IS025-15E NEC Corporation 2003-2010 No part of the contents of this book may be reproduced or transmitted in any form without permission of NEC

More information

Parallelization: Binary Tree Traversal

Parallelization: Binary Tree Traversal By Aaron Weeden and Patrick Royal Shodor Education Foundation, Inc. August 2012 Introduction: According to Moore s law, the number of transistors on a computer chip doubles roughly every two years. First

More information

Job Reference Guide. SLAMD Distributed Load Generation Engine. Version 1.8.2

Job Reference Guide. SLAMD Distributed Load Generation Engine. Version 1.8.2 Job Reference Guide SLAMD Distributed Load Generation Engine Version 1.8.2 June 2004 Contents 1. Introduction...3 2. The Utility Jobs...4 3. The LDAP Search Jobs...11 4. The LDAP Authentication Jobs...22

More information

ELIXIR LOAD BALANCER 2

ELIXIR LOAD BALANCER 2 ELIXIR LOAD BALANCER 2 Overview Elixir Load Balancer for Elixir Repertoire Server 7.2.2 or greater provides software solution for load balancing of Elixir Repertoire Servers. As a pure Java based software

More information

NetFlow Aggregation. Feature Overview. Aggregation Cache Schemes

NetFlow Aggregation. Feature Overview. Aggregation Cache Schemes NetFlow Aggregation This document describes the Cisco IOS NetFlow Aggregation feature, which allows Cisco NetFlow users to summarize NetFlow export data on an IOS router before the data is exported to

More information

Event Logging and Distribution for the BaBar Online System

Event Logging and Distribution for the BaBar Online System LAC-PUB-8744 August 2002 Event Logging and Distribution for the BaBar Online ystem. Dasu, T. Glanzman, T. J. Pavel For the BaBar Prompt Reconstruction and Computing Groups Department of Physics, University

More information

AXIGEN Mail Server Reporting Service

AXIGEN Mail Server Reporting Service AXIGEN Mail Server Reporting Service Usage and Configuration The article describes in full details how to properly configure and use the AXIGEN reporting service, as well as the steps for integrating it

More information

Socket Programming in the Data Communications Laboratory

Socket Programming in the Data Communications Laboratory Socket Programming in the Data Communications Laboratory William E. Toll Assoc. Prof. Computing and System Sciences Taylor University Upland, IN 46989 btoll@css.tayloru.edu ABSTRACT Although many data

More information

IOS Server Load Balancing

IOS Server Load Balancing IOS Server Load Balancing This feature module describes the Cisco IOS Server Load Balancing (SLB) feature. It includes the following sections: Feature Overview, page 1 Supported Platforms, page 5 Supported

More information

Energy Efficient MapReduce

Energy Efficient MapReduce Energy Efficient MapReduce Motivation: Energy consumption is an important aspect of datacenters efficiency, the total power consumption in the united states has doubled from 2000 to 2005, representing

More information

Andrew McRae Megadata Pty Ltd. andrew@megadata.mega.oz.au

Andrew McRae Megadata Pty Ltd. andrew@megadata.mega.oz.au A UNIX Task Broker Andrew McRae Megadata Pty Ltd. andrew@megadata.mega.oz.au This abstract describes a UNIX Task Broker, an application which provides redundant processing configurations using multiple

More information

Dynamic Load Balancing in a Network of Workstations

Dynamic Load Balancing in a Network of Workstations Dynamic Load Balancing in a Network of Workstations 95.515F Research Report By: Shahzad Malik (219762) November 29, 2000 Table of Contents 1 Introduction 3 2 Load Balancing 4 2.1 Static Load Balancing

More information

Study of Various Load Balancing Techniques in Cloud Environment- A Review

Study of Various Load Balancing Techniques in Cloud Environment- A Review International Journal of Computer Sciences and Engineering Open Access Review Paper Volume-4, Issue-04 E-ISSN: 2347-2693 Study of Various Load Balancing Techniques in Cloud Environment- A Review Rajdeep

More information

6.828 Operating System Engineering: Fall 2003. Quiz II Solutions THIS IS AN OPEN BOOK, OPEN NOTES QUIZ.

6.828 Operating System Engineering: Fall 2003. Quiz II Solutions THIS IS AN OPEN BOOK, OPEN NOTES QUIZ. Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.828 Operating System Engineering: Fall 2003 Quiz II Solutions All problems are open-ended questions. In

More information

CSC 2405: Computer Systems II

CSC 2405: Computer Systems II CSC 2405: Computer Systems II Spring 2013 (TR 8:30-9:45 in G86) Mirela Damian http://www.csc.villanova.edu/~mdamian/csc2405/ Introductions Mirela Damian Room 167A in the Mendel Science Building mirela.damian@villanova.edu

More information

EView/400i Management Pack for Systems Center Operations Manager (SCOM)

EView/400i Management Pack for Systems Center Operations Manager (SCOM) EView/400i Management Pack for Systems Center Operations Manager (SCOM) Concepts Guide Version 6.3 November 2012 Legal Notices Warranty EView Technology makes no warranty of any kind with regard to this

More information

Teldat Router. DNS Client

Teldat Router. DNS Client Teldat Router DNS Client Doc. DM723-I Rev. 10.00 March, 2003 INDEX Chapter 1 Domain Name System...1 1. Introduction...2 2. Resolution of domains...3 2.1. Domain names resolver functionality...4 2.2. Functionality

More information

Multi-Channel Clustered Web Application Servers

Multi-Channel Clustered Web Application Servers THE AMERICAN UNIVERSITY IN CAIRO SCHOOL OF SCIENCES AND ENGINEERING Multi-Channel Clustered Web Application Servers A Masters Thesis Department of Computer Science and Engineering Status Report Seminar

More information

EVALUATION. WA1844 WebSphere Process Server 7.0 Programming Using WebSphere Integration COPY. Developer

EVALUATION. WA1844 WebSphere Process Server 7.0 Programming Using WebSphere Integration COPY. Developer WA1844 WebSphere Process Server 7.0 Programming Using WebSphere Integration Developer Web Age Solutions Inc. USA: 1-877-517-6540 Canada: 1-866-206-4644 Web: http://www.webagesolutions.com Chapter 6 - Introduction

More information

EWeb: Highly Scalable Client Transparent Fault Tolerant System for Cloud based Web Applications

EWeb: Highly Scalable Client Transparent Fault Tolerant System for Cloud based Web Applications ECE6102 Dependable Distribute Systems, Fall2010 EWeb: Highly Scalable Client Transparent Fault Tolerant System for Cloud based Web Applications Deepal Jayasinghe, Hyojun Kim, Mohammad M. Hossain, Ali Payani

More information

Running a Workflow on a PowerCenter Grid

Running a Workflow on a PowerCenter Grid Running a Workflow on a PowerCenter Grid 2010-2014 Informatica Corporation. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise)

More information

Module 15: Network Structures

Module 15: Network Structures Module 15: Network Structures Background Topology Network Types Communication Communication Protocol Robustness Design Strategies 15.1 A Distributed System 15.2 Motivation Resource sharing sharing and

More information

Understanding TCP/IP. Introduction. What is an Architectural Model? APPENDIX

Understanding TCP/IP. Introduction. What is an Architectural Model? APPENDIX APPENDIX A Introduction Understanding TCP/IP To fully understand the architecture of Cisco Centri Firewall, you need to understand the TCP/IP architecture on which the Internet is based. This appendix

More information

FileMaker Server 7. Administrator s Guide. For Windows and Mac OS

FileMaker Server 7. Administrator s Guide. For Windows and Mac OS FileMaker Server 7 Administrator s Guide For Windows and Mac OS 1994-2004, FileMaker, Inc. All Rights Reserved. FileMaker, Inc. 5201 Patrick Henry Drive Santa Clara, California 95054 FileMaker is a trademark

More information

Chapter 11 I/O Management and Disk Scheduling

Chapter 11 I/O Management and Disk Scheduling Operatin g Systems: Internals and Design Principle s Chapter 11 I/O Management and Disk Scheduling Seventh Edition By William Stallings Operating Systems: Internals and Design Principles An artifact can

More information

IBM Tivoli Composite Application Manager for Microsoft Applications: Microsoft Hyper-V Server Agent Version 6.3.1 Fix Pack 2.

IBM Tivoli Composite Application Manager for Microsoft Applications: Microsoft Hyper-V Server Agent Version 6.3.1 Fix Pack 2. IBM Tivoli Composite Application Manager for Microsoft Applications: Microsoft Hyper-V Server Agent Version 6.3.1 Fix Pack 2 Reference IBM Tivoli Composite Application Manager for Microsoft Applications:

More information

Load Balancing and Termination Detection

Load Balancing and Termination Detection Chapter 7 Slide 1 Slide 2 Load Balancing and Termination Detection Load balancing used to distribute computations fairly across processors in order to obtain the highest possible execution speed. Termination

More information

UNISOL SysAdmin. SysAdmin helps systems administrators manage their UNIX systems and networks more effectively.

UNISOL SysAdmin. SysAdmin helps systems administrators manage their UNIX systems and networks more effectively. 1. UNISOL SysAdmin Overview SysAdmin helps systems administrators manage their UNIX systems and networks more effectively. SysAdmin is a comprehensive system administration package which provides a secure

More information

Practice #3: Receive, Process and Transmit

Practice #3: Receive, Process and Transmit INSTITUTO TECNOLOGICO Y DE ESTUDIOS SUPERIORES DE MONTERREY CAMPUS MONTERREY Pre-Practice: Objective Practice #3: Receive, Process and Transmit Learn how the C compiler works simulating a simple program

More information

Audit Trail Administration

Audit Trail Administration Audit Trail Administration 0890431-030 August 2003 Copyright 2003 by Concurrent Computer Corporation. All rights reserved. This publication or any part thereof is intended for use with Concurrent Computer

More information

Load Balancing in cloud computing

Load Balancing in cloud computing Load Balancing in cloud computing 1 Foram F Kherani, 2 Prof.Jignesh Vania Department of computer engineering, Lok Jagruti Kendra Institute of Technology, India 1 kheraniforam@gmail.com, 2 jigumy@gmail.com

More information

Computer Systems II. Unix system calls. fork( ) wait( ) exit( ) How To Create New Processes? Creating and Executing Processes

Computer Systems II. Unix system calls. fork( ) wait( ) exit( ) How To Create New Processes? Creating and Executing Processes Computer Systems II Creating and Executing Processes 1 Unix system calls fork( ) wait( ) exit( ) 2 How To Create New Processes? Underlying mechanism - A process runs fork to create a child process - Parent

More information

PART IV Performance oriented design, Performance testing, Performance tuning & Performance solutions. Outline. Performance oriented design

PART IV Performance oriented design, Performance testing, Performance tuning & Performance solutions. Outline. Performance oriented design PART IV Performance oriented design, Performance testing, Performance tuning & Performance solutions Slide 1 Outline Principles for performance oriented design Performance testing Performance tuning General

More information

CHAPTER 15: Operating Systems: An Overview

CHAPTER 15: Operating Systems: An Overview CHAPTER 15: Operating Systems: An Overview The Architecture of Computer Hardware, Systems Software & Networking: An Information Technology Approach 4th Edition, Irv Englander John Wiley and Sons 2010 PowerPoint

More information

Chapter 14: Distributed Operating Systems

Chapter 14: Distributed Operating Systems Chapter 14: Distributed Operating Systems Chapter 14: Distributed Operating Systems Motivation Types of Distributed Operating Systems Network Structure Network Topology Communication Structure Communication

More information

Virtual Servers. Virtual machines. Virtualization. Design of IBM s VM. Virtual machine systems can give everyone the OS (and hardware) that they want.

Virtual Servers. Virtual machines. Virtualization. Design of IBM s VM. Virtual machine systems can give everyone the OS (and hardware) that they want. Virtual machines Virtual machine systems can give everyone the OS (and hardware) that they want. IBM s VM provided an exact copy of the hardware to the user. Virtual Servers Virtual machines are very widespread.

More information

FPGA area allocation for parallel C applications

FPGA area allocation for parallel C applications 1 FPGA area allocation for parallel C applications Vlad-Mihai Sima, Elena Moscu Panainte, Koen Bertels Computer Engineering Faculty of Electrical Engineering, Mathematics and Computer Science Delft University

More information