Design of a High-Availability Multimedia Scheduling Service using Primary-Backup Replication


Goudong (Shawn) Liu, Vivek Sawant, Mark R. Lindsey

December 12, 2001

This work was undertaken for COMP 243: Distributed Systems, at the University of North Carolina at Chapel Hill, fall 2001.

Abstract

The design of a system for highly-available client/server applications is presented, together with an example application for scheduling video services.

1 Introduction

A conventional, non-replicated client/server application is susceptible to numerous types of failures, including:

1. Server crash failure
2. Client crash failure
3. Client-to-server communication failure

We wished to provide a system which would address the problem of server crashes. A server, as used here, is a computing system which provides some service to a number of clients (i.e., users). A server crash occurs when a server ceases to operate, such that all application state which was present in volatile storage (e.g., high-speed and virtual memory) at the time of the crash is no longer available in any form, even after server recovery (i.e., when the server has started again and is again available to provide service to clients). The standard approach to avoid the loss of application state is to store some of the application state in non-volatile storage (e.g., on disk). Thus, even when the server crashes, data which was present in non-volatile storage at the time of the crash is available. However, during the failure period (i.e., the period of time between the server crash and the server recovery), no service is available to clients.

We wish to continue to provide service during the period after a given server has crashed and before it is recovered. We employ spares to accomplish this task, and assume that each server fails independently. A spare server is not needed to provide the service when no failures have occurred: it is provided strictly to support operation of the service during a failure period.
Providing a single spare can, ideally, allow the service to operate uninterrupted as long as no more than one server is failing at a time. If the probability of failure of the primary server P during a given time interval t is P(P), and the probability of failure of the spare server S during an equivalent time interval is P(S), then the probability that both will fail simultaneously is P(S) · P(P). For example, if t = 1 hour, P(P) = 0.0003, and P(S) = 0.0004, then the mean time to failure (MTTF) of the primary alone is 1/P(P) = 3,333 hours, while the MTTF of the primary and the spare together is 1/(P(P) · P(S)) = 8,333,333 hours, a drastic improvement. Further, supposing a mean time to repair (MTTR) of 24 hours, the availability of running the primary server only is 3,333/(3,333 + 24) ≈ 99.3%, while the expected availability of running the primary together with the spare is as high as 8,333,333/(8,333,333 + 24) ≈ 99.999%.¹

In general, if a service can operate on a single server, then to survive the failure of f machines the system should include f + 1 servers, f of which may be considered as spares. But, in order to approach the availability gains described above, a fundamental issue must be resolved: how can a service which has been programmed to operate on a single server be made to operate in a cluster of servers as if it were operating on a single server? Each server in the cluster must be capable of providing the service; the cluster is thus said to be composed of a set of replica servers. However, a naive duplication of the service is not sufficient: simply running multiple instances of the service will not produce the same results as a single server providing the service.

The primary-backup approach to replication addresses this problem by designating that each server participating in the cluster has a role at any instant in time, either as primary or as backup. At any instant, the primary server provides the service to the clients, while the backups stand by as spares. When a backup fails, the service is not affected; but when the primary fails, exactly one of the backups must assume the role of primary. Clients communicate only with the current primary. The implementation of such replication is nontrivial, and requires that the following issues be addressed:

1. How does the client determine which replica is the primary?
2. How does a backup know that the primary has failed?
3. When the primary has failed, how does a particular backup know whether it should take over as primary? (Recall that exactly one of the backups should assume the role of primary when the primary fails.)
4. How is application state replicated from the primary server to the backups?

¹ This analysis assumes an idealized case in which the spare could provide the service seamlessly starting at the instant that the primary fails.
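These figures can be checked with a few lines of Java. The per-hour failure probabilities below are the values implied by the MTTF figures above (1/3,333 ≈ 0.0003 per hour for the primary, and 1/8,333,333 divided by that for the spare):

```java
public class Availability {
    // MTTF in hours, given a per-hour probability of failure
    public static double mttf(double pFailPerHour) {
        return 1.0 / pFailPerHour;
    }

    // Steady-state availability = MTTF / (MTTF + MTTR)
    public static double availability(double mttfHours, double mttrHours) {
        return mttfHours / (mttfHours + mttrHours);
    }

    public static void main(String[] args) {
        double pP = 0.0003, pS = 0.0004, mttr = 24.0;
        System.out.println(mttf(pP));                          // primary alone: ~3,333 hours
        System.out.println(mttf(pP * pS));                     // primary + spare: ~8,333,333 hours
        System.out.println(availability(mttf(pP), mttr));      // ~0.993
        System.out.println(availability(mttf(pP * pS), mttr)); // ~0.99999
    }
}
```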
In this paper, we describe a primary-backup replication system implementation which addresses these questions, and which can be used to develop highly-available applications. We also present an example application which demonstrates this functionality.

2 Replication Service Overview

At any point during operation of the service, each replica in the cluster must have:

- Current application state, such that the replica could assume the primary role.
- Knowledge of its current role, so that the application can function properly according to its role. For example, a backup should refuse service requests which would modify the application state, while a primary should service such requests.

We provide a replication system which can ensure that each application has this information.

2.1 Application/Replication System Interaction

The replication system interface (FTServer) provides a set of procedure calls for use by the application:

Start Fault-Tolerant Server instructs the replication system to join the cluster; the calling application is a replica when this call returns.
    FTServer(AppEventListener) (constructor)

Am I primary allows a replica to determine whether it is the primary.
    boolean isPrimary()

Broadcast application state is provided only to the primary, and instructs the replication system to distribute an updated version of the application state. Note that this procedure does not allow the primary to determine whether the application state was properly received by any of the backup replicas; as we shall see, this is a property of the non-blocking protocol used.

This method does nothing on a backup replica, ensuring that the replicated system state remains consistent.
    void bcastStateUpdate(Object)

Get connected replicas provides a replica with a list of all other active members of the cluster.
    InetAddress[] getServerList()

To use the replication system, an application must provide an Application Event Listener (AppEventListener) to the Start Fault-Tolerant Server procedure. The replication system uses a type of callback to inform the application of some events in the cluster by calling these procedures in the AppEventListener:

Add Server informs the primary replica that a new server has joined the cluster.
    void addServer(InetAddress)

Remove Server informs the primary replica that a server which was in the cluster has left the cluster (e.g., by failing).
    void removeServer(InetAddress)

Update State informs a backup replica that new application state is available. The application, while running in the backup role, contracts to store the current application state.
    void updateState(Object)

2.2 Application-State Distribution Mechanism

We considered two mechanisms for distribution of the application state:

1. Distribute only updates as they occur
2. Distribute the entire application state as necessary

Distributing only the updates would allow the system to support applications in which the replicated application state is arbitrarily large. However, the integration of a new replica, or re-integration of a recovered replica, would require that the updates be replayed in order to the joining replica (assuming that no version of the application state is copied to non-volatile storage). Alternately, if a copy of the entire application state can be distributed each time, then no such roll-forward protocol is required. We selected this second option for its simplicity, and built a communication infrastructure to support it.
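The interface above can be summarized in a Java sketch. The AppEventListener methods are those given in the text; FTServerSketch, its boolean constructor argument, and currentState() are illustrative additions for exercising the role semantics. The actual constructor is FTServer(AppEventListener), and the actual class runs the replication protocol described in section 4.

```java
import java.net.InetAddress;

// Callbacks the application supplies to the replication system.
interface AppEventListener {
    void addServer(InetAddress server);     // primary role: a new server joined the cluster
    void removeServer(InetAddress server);  // primary role: a server left the cluster
    void updateState(Object newState);      // backup role: store this new application state
}

// Illustrative stub of the FTServer call surface (not the real implementation).
class FTServerSketch {
    private final AppEventListener listener;
    private final boolean primary;   // in the real system, the role is decided by the protocol
    private Object state;            // last application state broadcast (primary only)

    FTServerSketch(AppEventListener listener, boolean primary) {
        this.listener = listener;
        this.primary = primary;
    }

    public boolean isPrimary() { return primary; }

    // "This method does nothing on a backup replica."
    public void bcastStateUpdate(Object newState) {
        if (!primary) return;
        state = newState;            // the real version also broadcasts to all backups
    }

    public Object currentState() { return state; }

    public InetAddress[] getServerList() { return new InetAddress[0]; } // stub: empty cluster
}
```

The stub preserves the one behavioral rule the text specifies: a state-update broadcast from a backup is silently ignored.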
2.3 Replication System Options

Several replication methodologies have been described in the literature, and were considered:

State-machine approach. In the state-machine approach the client presents service requests to every member of the cluster, and collects their responses. If a sufficient number of replicas send an equivalent response, then this response is taken as the true response to the service request. This technique requires that each client be programmed to communicate with the cluster; i.e., each client must be aware of the replication system. We wished to provide a system which was decoupled from the client/server application itself, so this technique was rejected.

Single primary, single backup. This approach uses a cluster of exactly two replicas, in which one must be dedicated as the primary and the other as the backup at any time. The backup changes roles to become the primary only when the primary has failed; as such, the protocol used to determine when to change roles is straightforward. This technique does allow the replication subsystem to operate independently of the client/server application. However, we wished to provide a system which would use all available resources and provide greater availability than a two-replica cluster can provide.

Single primary, multiple backups. This approach is a generalization of the single-primary/single-backup approach, as it uses an arbitrary number of backups. It can provide greater availability, because a failure of the service requires that every replica fail simultaneously. This approach does introduce additional complexities, the principal of which is the distributed consensus required to decide which of the replicas will take over as primary when the primary has failed. Variations of this technique exist; among them are blocking systems and non-blocking systems. The blocking time in such a system is the worst-case delay between the receipt of a client request by the primary and the response to the client in a failure-free execution. Non-blocking systems have zero blocking time, and provide the fastest-possible response to the user. However, a non-blocking system does not allow the primary to confirm that any of the backup replicas has properly received a state change; therefore, there is a non-zero probability that an acknowledgement of an operation will be transmitted by the primary to the client, and that the primary will crash before any of the backups have successfully received the updated state. This is a lost-update failure.

We chose to implement a non-blocking primary-backup system supporting an arbitrary number of replicas, as it satisfies our goals for application interaction, and provides for quick responses to the client.

3 Replication System Structure

3.1 Replication Manager Thread, PBServer

Each replica runs an instance of the PBServer thread, which manages the replica's interaction with the other members of the cluster. This thread interacts with the application as specified in 2.1.

3.2 Communications Substrate, objecttransfer

The replication system was implemented in Java 2, using the Sun J2SDK 1.3. To support the non-blocking primary-backup protocol, the communication system needed to provide two fundamental services:

1. Send a message to a recipient, but do not wait to ensure that its transmission was completed.
2. Deliver received messages to the replication system as they are available, and do not force the replication system to block until messages are available.

The system required the use of several types of messages, including Heartbeat messages and application-state transfers. Java provides a straightforward, blocking mechanism for transfer of objects across TCP channels.

3.2.1 Message Encapsulation

Each type of message to be transmitted in this system was encapsulated as a Java class, and the contents were chosen carefully so as to ensure that the class could be marked serializable, as required for transfer by Java's default serialization protocol.

3.2.2 Non-Blocking Object Transmission

For each server to which a replica wishes to transfer messages, the replica constructs an ObjectSender. When the replica needs to send a message, it invokes ObjectSender.send(Object), which returns immediately. send() starts a short-lived thread for sending the message, and discards any error results. A message can be transmitted to many recipients with a call to ObjectSenderGroup.broadcast(Object), where an ObjectSenderGroup is a collection of ObjectSenders.

3.2.3 Non-Blocking Object Reception

The primary-backup technique requires that failures of a replica are detected by the absence of I-am-alive

Heartbeat messages; this implies that the replication system must continue to make progress through its processing even when no messages have been received. Java (as of J2SDK 1.3) does not provide a non-blocking I/O interface, such as select(). Thus, we developed a multi-threaded mechanism for receiving objects: any ObjectReceiver which wishes to receive a certain type of message (i.e., a certain class of objects) registers as an Observer with the ReceivedObjectMediator. An ObjectListener runs as a thread and receives all messages from a single replica. It then forwards received objects to the ReceivedObjectMediator, which forwards the objects on to the registered receiver.

4 Replication Operation

The operation of the replication system is based on the non-blocking protocol described by Budhiraja, Marzullo, et al. (cite: Optimal Primary-Backup Protocols).

4.1 Replicated State

Each replica maintains two objects which must be equivalent on every replica for correct operation:

Application State with Version Number is explicitly copied from the primary to the backups each time it is updated. It can be any serializable object. After startup of the cluster (i.e., after the first server has started), the application running on the primary can update the application state at any time; when it does so, the replication system assigns it a version number which is 1 greater than the previous version number, and transmits the updated state with its new version number to all of the other replicas using the Application State Update protocol. Until the first version of the state is distributed, every replica considers the version to be 0 (zero), and the application state to be undefined. This is an acceptable configuration.

Version Vector records, for each replica, its status (primary, backup, or faulty), and the version of the application state which that replica is known to have. The Version Vector is maintained by all of the protocols described below:

1. Every message between replicas includes the sender's ssVersion and isPrimary status, as described in 4.2.1. Each message, thus, can be used to update the sender's entry in the version vector. Specifically, the Heartbeat and Join Response messages are used to determine which replica is primary, and to inform other replicas that new state has been received.
2. The absence of a Heartbeat from a replica ρ can indicate that ρ is faulty.

4.2 Protocols

Four related protocols are provided to support operation of the cluster. In each protocol, the messages transferred between replicas are encapsulated as serializable Java objects, and encoded using Java's default serialization protocol. (Message type is encoded by the Java serialization protocol, since each type of message is a distinct class.)

4.2.1 Protocol Unit Header

Each message (protocol unit) includes at least two fields:

int ssVersion: the current version of the application state held by the sender when the message was constructed.
boolean isPrimary: true iff the sender was primary when the message was constructed.

4.2.2 Heartbeat

The Heartbeat or I-am-alive message is sent by a replica in order to:

1. Inform other replicas that it is still functioning
2. Inform other replicas of its version of the application state

3. Inform other replicas when a role-change to primary has occurred.

Each Heartbeat message consists only of the header contents, described above.

Liveness Monitoring. Upon joining the cluster, a replica ρ has a version vector which includes all of the active replicas at the time of joining. Every HeartbeatSendRate milliseconds, ρ broadcasts a Heartbeat message (i.e., it transmits a Heartbeat to every non-faulty replica, except itself). During operation, ρ checks every HeartbeatCheckRate milliseconds (500 ms in our experiments) to determine whether any new Heartbeats have been received. If, after HeartbeatTimeout milliseconds, a Heartbeat has not been received from another replica φ, then ρ updates its version vector to indicate that φ is faulty.

State-Version Changes. If a replica ρ receives an Application State Update immediately after ρ has broadcast a Heartbeat, then under the mechanism described above, every other replica's version vector will have an incorrect version number recorded for ρ, even though ρ does have the latest version of the state. Because only a replica with the current state can take over as primary, every other replica may incorrectly conclude that ρ is not a candidate to take over as primary. Thus, a failure of the primary before the next Heartbeat broadcast from ρ may cause contention to become the primary. Fundamentally, the problem is that of an inconsistent version vector. To remedy this, ρ broadcasts an extra Heartbeat immediately after it receives an Application State Update. This Heartbeat includes ρ's updated version number, so that each replica has a consistent version vector, as required for the takeover by distributed consensus (see 4.2.5). Incidentally, while we observed this problem in development, it does not appear to be mentioned in the original protocol specification (cite: Optimal Primary-Backup Protocols).

4.2.3 Join

Upon starting, a replica ρ transmits a Join Request message to each member on its list Replicas. This list contains an entry for each replica in the cluster, and indicates the replica's network address (IP address, in our case) and its rank. Normally, the list Replicas will be distributed before cluster startup. The Join Request message contains only the fields of the header. It is sent in an attempt to discover the current primary. Each active replica φ responds with a Join Response message, which contains the fields of the header, plus a field int result which resolves to one of the values:

JOIN_OK indicates that φ is the primary server, and that φ has recorded ρ as a functioning backup.
JOIN_LOCAL_ERR is not used.
JOIN_FAILED indicates that φ is not the primary server, but that φ has recorded ρ as a functioning backup.

When a primary replica φ receives a Join Request from ρ, it responds with result = JOIN_OK as described above. It also broadcasts the current version of the Application State to all active replicas, using the Application State Update protocol. Even before the joining replica ρ receives the Application State, it is a functioning replica, but it transmits all messages with ssVersion = 0. If the current Application State version is not zero, then ρ is ineligible to take over as primary; this is simply a specific case of the takeover by distributed consensus, described in 4.2.5.

4.2.4 Application State Transmission

Only the primary replica may transmit Application State messages. Each Application State message consists of the header plus a single field, appState, which is the entire contents of the application state encoded as a serializable object.²
² To be precise, Java transfers the application state as an Object Graph of serialized objects, so that references within the object can be followed to other objects. This allows for conventional object-oriented techniques to be used in the software design of the application.

When the application running on the primary replica calls FTServer.bcastStateUpdate(), the primary increments by one the recorded Application State Version number and broadcasts the new version of the Application State to all active backups. It does not wait to determine whether any of the backups receive the updated Application State; the conventional Heartbeat protocol described above is used by the primary to maintain its version vector.

When a backup replica receives the Application State α, it stores α locally, and updates its own Application State Version (as used in outgoing messages) to α's ssVersion. It then calls AppEventListener.updateState(α) to inform the application running on the backup that new application state is available. This mechanism allows a backup replica to perform non-modifying operations on the Application State; for example, in the multimedia scheduling application described in section 5, clients may view the Application State on any backup, but may only modify it on the primary.

4.2.5 Takeover

When the primary replica fails, exactly one of the backups must take over as primary. Our implementation of this technique makes use of the Version Vector maintained by the message exchanges described above.

Distributed Consensus. In a single-backup cluster, the backup replica can always take over as primary immediately. However, in our cluster, any of the backup replicas could potentially take over as primary. Thus, we adapted a distributed consensus protocol to determine which one of the backups should take over as primary. When a backup detects that the primary has failed, it consults an algorithm boolean canTakeOverAsPrimary() to determine whether it must assume the role of primary. This algorithm consists of the following:

1. If (version[self] > version[i], for all other i in Non-Faulty) then return canTakeOverAsPrimary := true;
2. Else if (for all other j in Non-Faulty, version[self] >= version[j], and rank[self] > rank[j] whenever version[self] == version[j]) then return canTakeOverAsPrimary := true;
3. Else return canTakeOverAsPrimary := false;

This algorithm ensures that a replica with the latest application state available among the non-faulty servers takes over as primary; if there are multiple replicas with the latest application state, then the tie is broken by the rank, which is guaranteed through configuration to be unique to each server.

Fail-over time. The fail-over time is the period during which no server is the primary, such that the service is unavailable. For the implementation described, this time is bounded by the Heartbeat parameters: at most the message delay δ, plus the HeartbeatSendRate interval, plus the HeartbeatTimeout that must elapse before the failed primary is marked faulty.

4.3 Failure Handling

The replication system presented is designed to provide higher availability than could be achieved with a single server. The behavior of the system under various fault conditions is described below.

4.3.1 Crashed Server

We wished to provide proper, uninterrupted operation from the time that the cluster starts, as long as any one replica is functioning, provided that each server functions long enough to join the cluster and receive the Application State from the current

primary. This requires that we handle server crash failures and re-integration. As described above in 4.2.5, if the primary server fails during otherwise-normal operation, then exactly one of the non-faulty backup replicas will take over as primary. Thus, service continues to be available. When a server joins the cluster (either after recovering from failure, or when starting for the first time), it employs the Join protocol to become a replica. When it has received the Application State, it is a fully-functional backup, and can subsequently take over as primary. Experiments have demonstrated that our implementation performs this operation reliably.

4.3.2 Missed Message

The response of the system to a missed message depends on the type of message that was missed.

Missed Heartbeat. If replica ρ misses a Heartbeat from replica φ, then ρ will mark φ as faulty, and ρ will cease to transmit any messages to φ until φ sends ρ another Join Request.

Missed Application State. When a replica ρ misses an updated Application State (i.e., a message carrying a version of the Application State which is newer than any previously-distributed version), then it will no longer be eligible to take over as primary as long as another replica with a newer version is non-faulty.

Missed Application State + Primary Crash. The presented replication system does not attempt to recover from the following multiple-failure scenario:

1. A client κ makes a request to the primary φ
2. φ modifies the application state
3. φ broadcasts the updated application state to all backup replicas, but every backup misses the update
4. φ responds to κ with an indication that the update has been made
5. φ crashes

In this case, the client believes that the change has been made, but it was actually lost. One of the backup replicas will take over as primary, but the new primary will not have the change made by κ.
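The takeover decision of 4.2.5, on which the crash handling above relies, can be sketched in a few lines of Java. The array-based representation and the names are illustrative; version and rank are indexed over the non-faulty replicas, with self as this replica's index:

```java
// Sketch of the takeover rule: a backup takes over iff it holds the newest
// application state among the non-faulty replicas, with ties broken by the
// configured, unique rank.
class Takeover {
    static boolean canTakeOverAsPrimary(int self, int[] version, int[] rank) {
        for (int i = 0; i < version.length; i++) {
            if (i == self) continue;
            if (version[i] > version[self]) return false;  // a peer holds newer state
            if (version[i] == version[self] && rank[i] > rank[self]) {
                return false;                              // tie lost on rank
            }
        }
        return true;  // newest state, and highest rank among any ties
    }
}
```

Because ranks are unique, exactly one non-faulty replica returns true, which is the property the protocol requires.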
This scenario describes a disadvantage of every non-blocking primary-backup protocol.

4.3.3 Network Partition

In a network partition, the replica cluster is divided into multiple groups of servers; each server within a partition can communicate only with the others in its partition. In this case, a primary will take over within each partition, and clients within a partition can communicate only with that partition's primary. This will cause inconsistent application state to be maintained within each partition. When the network partition is repaired, the servers within each of the previous partitions will continue to communicate with each other only. A new server ρ, however, will contact all of the servers, and will join the primary φ whose Join Response message it receives first. It will establish contact with all of the non-faulty replicas in each of the partitioned clusters, and ultimately may elect to take over as primary within one of the clusters. The long-term results of running in such an arrangement are undefined. Clearly, such a degenerate configuration is undesirable. It can be repaired, in only some cases, by stopping every server in every cluster except for the one server which has a desirable version of the Application State (if any such version exists), then starting all of the other servers. Supplemental mechanisms are required to provide proper operation in the presence of network partitions.

4.3.4 Link Failure

A link failure occurs when a pair of servers φ and ρ cannot communicate with each other, but both of

which can communicate with some common set of other servers. While not supported in the version of the replication system presented here, we have done work to develop a protocol providing safe operation in the presence of link failures. It would consist of a can-you-see-the-primary protocol, used as follows:

1. If the primary φ loses contact with a backup, then the primary marks the backup as faulty and proceeds as usual.
2. If the backup ρ loses contact with the primary, then the backup polls each of the other backup replicas to determine whether any of them can see a primary. If any one of them can see a primary, then ρ halts. Otherwise, the takeover protocol (see 4.2.5) is invoked.

The implementation of this protocol is left as future work.

5 Distributed Video Scheduling Service, MESS

To demonstrate this replication system in a useful application, we developed the Multimedia Entertainment Super Server, MESS.

5.1 Overview of Service

MESS provides a highly-available scheduling service for streaming video. A client connects to the MESS primary server to submit a request to watch a particular television channel, and the MESS server attempts to satisfy the request using one of the available video servers. If the request can be satisfied, then the assigned video server tunes to the appropriate television channel, and transmits the video back to the requesting client. Each MESS server is both a member of the application cluster and a video server. The backup MESS servers cannot be used for scheduling, but they do provide a read-only view of the schedule to users. A user communicates with the MESS servers, to view the schedule and to request a specific channel, through a web interface (i.e., HTML and CGI over HTTP). The MESS server assigned to transmit video to a particular client uses an Open Mash program developed by Ketan Mayer-Patel³ to transmit the video.

5.2 System Structure

The Application Layer stands at the top of the MESS system.
It has the following goals:

1. Provide a scheduling facility for a client to schedule viewing of an entertainment event (watching a TV channel) via any of the available servers.
2. Provide multimedia streaming via video servers.

5.2.1 Scheduler

As figure 2 shows, the Scheduler maintains a global schedule, implemented as an in-memory database storing all the schedule information of the whole MESS system. The Global Schedule consists of a set of Server Schedules. Each Server Schedule has a set of Schedule Entries. Each Schedule Entry has a client identifier (the Internet address of the client), the requested television channel, the time to start playing, and the time to stop playing.

5.2.2 Replication Mediator

The Replication Mediator provides for interaction between the video-scheduling application layer and the replication layer. It initiates the interaction with the cluster by constructing an FTServer(), and receives and stores each new version of the Global Schedule from either the application (when the replica is a primary) or the replication system (when the replica is a backup). Each time the Global Schedule changes, the Replication Mediator sends the schedule to the Video Player.

³ University of North Carolina.
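The Global Schedule described above (and shown in figure 2) might be represented as follows. The field names are illustrative, but the structure, and the requirement that the replicated state be serializable, follow the text:

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One Schedule Entry: client identifier, requested channel, start and stop times.
class ScheduleEntry implements Serializable {
    final String clientAddress;   // Internet address of the client
    final int channel;            // requested television channel
    final long startTime;         // time to start playing (ms)
    final long stopTime;          // time to stop playing (ms)

    ScheduleEntry(String clientAddress, int channel, long startTime, long stopTime) {
        this.clientAddress = clientAddress;
        this.channel = channel;
        this.startTime = startTime;
        this.stopTime = stopTime;
    }
}

// The replicated Application State: a set of Server Schedules,
// each a list of Schedule Entries, keyed by server address.
class GlobalSchedule implements Serializable {
    final Map<String, List<ScheduleEntry>> serverSchedules =
            new HashMap<String, List<ScheduleEntry>>();

    void addEntry(String server, ScheduleEntry entry) {
        List<ScheduleEntry> schedule = serverSchedules.get(server);
        if (schedule == null) {
            schedule = new ArrayList<ScheduleEntry>();
            serverSchedules.put(server, schedule);
        }
        schedule.add(entry);
    }
}
```

Because the whole object graph is serializable, a single GlobalSchedule instance can be handed to bcastStateUpdate() for distribution to the backups.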

Figure 1: Layers of the MESS application, together with the replication system. (The layers, top to bottom: MESS Application, Replication System, Communication System.)

Figure 2: Global Schedule Structure. A single GlobalSchedule is the replicated Application State for MESS; it contains a set of Server Schedules, each of which contains a set of Schedule Entries.

Figure 3: Communication paths. The clients communicate only with the primary, via HTTP. Each MESS server can stream video to one client, but each client may receive multiple streams. (The figure shows several MESS servers, one primary and the rest backups, a CATV source, and several clients.)

5.2.3 Video Player

The Video Player module interprets the Global Schedule to drive the streaming-video subsystem. When a MESS server receives an updated Global Schedule indicating that it should stream video to a client, it starts playing video. When a server crashes, its entries are not removed from the schedule. When a server joins the cluster, it receives a version of the Global Schedule; if this schedule indicates that the newly-joining server needs to start streaming video, it starts streaming immediately. This provides a certain sort of recovery for the service after a server crash.

5.2.4 User Interface

The User Interface uses a servlet running in the Apache Tomcat 4 (Catalina) servlet engine. It displays the global schedule, as retrieved from the Replication Mediator. The current status of each server for which videos have been scheduled is also shown to the user. On the primary server, the client can make a request to view a channel. When the client makes a new request, the primary will immediately show whether there is a conflict with the schedule that the client has requested. If there is no conflict, it will add the job to the schedule, and show the change to the user. The schedule is then immediately propagated to all other members of the cluster.
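The conflict check performed when a client makes a new request can be sketched as an interval-overlap test on a server's schedule. The paper does not give the actual algorithm, so the rule below is an assumption: two requests conflict on a server iff their [start, stop) intervals overlap.

```java
// Illustrative conflict test: a new request [aStart, aStop) conflicts with
// an existing entry [bStart, bStop) on the same server iff the half-open
// intervals overlap.
class ConflictCheck {
    static boolean overlaps(long aStart, long aStop, long bStart, long bStop) {
        return aStart < bStop && bStart < aStop;
    }
}
```

A request ending exactly when another begins is treated as conflict-free, which is one reasonable choice for back-to-back channel viewings.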


More information

SAN Conceptual and Design Basics

SAN Conceptual and Design Basics TECHNICAL NOTE VMware Infrastructure 3 SAN Conceptual and Design Basics VMware ESX Server can be used in conjunction with a SAN (storage area network), a specialized high speed network that connects computer

More information

SanDisk ION Accelerator High Availability

SanDisk ION Accelerator High Availability WHITE PAPER SanDisk ION Accelerator High Availability 951 SanDisk Drive, Milpitas, CA 95035 www.sandisk.com Table of Contents Introduction 3 Basics of SanDisk ION Accelerator High Availability 3 ALUA Multipathing

More information