OWTWARE'S iconn SERVER
Delivering High Availability Out of the Box
CONTENTS
Summary
Background
A Bold New Approach
Conclusion
SUMMARY

High availability (HA), which ensures uninterrupted access to services, applications and data, is highly desirable in any compute environment. It is especially important for organizations considering or making the transition to a virtual desktop infrastructure (VDI), where achieving and maintaining HA has historically been costly and complicated.

Not any more. Owtware's innovative iconn Server addresses the need for high availability in a VDI by making it simple, automatic and comprehensive, all at a dramatic cost savings compared to traditional approaches.
BACKGROUND

Implementing a virtual desktop infrastructure offers many advantages to both system administrators and end-users. But those advantages can be jeopardized if access to data, applications or services is interrupted. Establishing and sustaining HA is critical.

Few system administrators would disagree that high availability is a desirable attribute. The harder question is: at what cost, in terms of both money and time? Redundancy can mean high costs for backup hardware, storage and networking. It can also greatly add to the complexity of system administration through increased demands for configuration tuning and the implementation of additional protocols and procedures.

Because of the potential costs and complexity of delivering high availability, many system administrators choose to limit investment in HA solutions to their most critical systems, or to forgo HA entirely. Such decisions, of course, leave a VDI more vulnerable to downtime.

The optimum situation is for high availability to be an inherent feature of VDI installations, not a costly add-on. This "if only" situation is a reality today with the iconn server from Owtware.

A BOLD NEW APPROACH

Owtware has addressed the two most significant barriers to ensuring high availability in VDI environments: cost and complexity. Owtware's approach to high availability is to make it a simple, automatic and continuous process that is built into the servers themselves.

Owtware's approach builds on the commonly used elements that are fundamental to any HA solution: redundant power supplies,
disks and network ports. In addition, virtual machines are automatically backed up for recovery or archiving purposes.

Where Owtware's solution differs significantly from other approaches is in the system-level management of the failover process. It employs four major principles:

1. The process is entirely software-based
2. Most steps are implemented automatically
3. Replication of critical data occurs continuously
4. All elements are designed to emphasize simplicity

The result is a VDI that delivers high availability at significantly lower cost and with sharply reduced demands for sys-admin intervention.

A key to achieving simplicity is the architecture of the iconn server itself. Two aspects of that architecture are particularly important: an all-in-one design, and what Owtware calls one-port/one-protocol.

All-in-one: Every iconn server is equipped to deliver the complete VDI experience. Each node can execute all functions.

One-port/one-protocol: One of the challenges in deploying a VDI is managing the ports, protocols and links used for client-to-server and server-to-server communication. This is a complex task, both initially and on an ongoing basis. With legacy VDI approaches, a client may touch as many as four other machines and protocols before it reaches the resources and services it needs. This means complex configurations, and it increases the number of potential points of failure.

To virtually eliminate this complexity, Owtware multiplexes all the required protocols so clients do not have to be configured for specific ports. Each client provides a header that tells what protocol it is using. Owtware's protocol multiplexing routes the client to the appropriate server. The process conforms to HTTP specifications, so any client with a browser can use the system.

Another advantage of the one-port/one-protocol approach can be seen in the process of adding remote users. In most situations, allowing remote users access to a system requires complex configurations. With Owtware, an administrator only needs to add one rule to the firewall, and that will deliver full VDI functionality to remote as well as in-house users.

With one-port/one-protocol, the system will always route a client to the right server for whatever function it wants, even if the network configurations are not 100% correct. This means that in the case of a failure, a client virtual machine can be directed to any server in the cluster, and it will then be automatically connected to the right server for the resources it wants. No special configuration is required in advance to make this happen. It's built into the software in every iconn server.

What all-in-one and one-port/one-protocol mean is that each and every node can perform a control function in case of a failover, and no servers need to be specially equipped for that task. This peer-based approach is one of the ways that the Owtware solution lowers the cost and complexity of HA.

THE APPROACH IN ACTION

Let's look at how Owtware's VDI solution delivers uninterrupted service in the face of a component, node or system failure.

The starting point is to note that Owtware's approach is peer-based, software-based and highly automated. To a degree nowhere else available, iconn servers are pre-configured for high availability right out of the box.

The simplicity and effectiveness of Owtware's HA solution begins long before there is even a hint of trouble. That's because Owtware automatically identifies and configures all nodes that are participating in a cluster. Any node can be designated as a leader in case of failure, and there are as many potential backup leaders as there are servers in the cluster. This reflects Owtware's belief that an HA solution can't just be about redundant hardware; high availability must be built into the control component of the system.

Some approaches to HA require complex, multi-step procedures for configuring the nodes that participate in redundancy, including database replication, network connections, domain names, and heartbeat settings. This process is eliminated with Owtware because all nodes are capable of performing all functions.

Key to making this simple, peer-based approach work is the use of an election system for determining the leader in case of a failure. Through the election process (described below), one node is chosen as the leader. It is responsible for coordinating and delivering the top-level control functions and arbitrating access to a directory containing user names, passwords, machine configurations and other information needed to connect users to the resources they need.

The leader automatically selects candidate nodes to act as active standby servers in case of failure. These candidate nodes automatically and continuously replicate all essential state to prevent data loss. Candidates continuously monitor the health of the leader,
and if the leader is identified as unhealthy by at least two candidates, then a peer-to-peer election is held to select a new leader. Much like a political election, candidates declare their desire for office, and once a majority cast their votes for one candidate, that node becomes the new leader.

Because all the candidate nodes have been continuously replicating system data, the time required to identify and recover from a failure can be as little as 5 seconds, a remarkably short time. This short failover time is possible because the candidate nodes are continuously updated with all the information needed to control the system, such as the log-in portal, the database of user accounts, and the index of services provided to users. Because this information is always on the candidate backup servers, the only change that occurs at the time of failure is reconfiguring the new leader.

To implement the peer-election process, Owtware uses the Raft consensus algorithm, a widely used means for delivering high availability by helping multiple servers reach a decision.

[Figure: Leader failover using the Raft consensus algorithm. A master is backed by two slave (standby) nodes. When the master becomes unavailable, the slaves negotiate and select a new master (e.g., Slave 2 becomes the new master). The new standby nodes are selected from the active available nodes in the cluster.]

The advantage of using this method to select leaders is twofold. First, the process is conducted automatically. No human intervention is needed to recognize a failure and act to remedy it. Second, the process is conducted purely in software. (Some systems employ hardware components to do active heartbeat monitoring, which raises costs and requires rigid configuration of servers.)

The Raft-based peer approach is necessary to ensure that the remedy isn't as random as the failure probably was. Take a case in which one node lost contact with the leader, but not because there was anything wrong with the leader. Maybe it was just a momentary loss of communication. In such a case, the one node out of touch with the leader cannot initiate a change in leadership. Only when two or more nodes concurrently agree that the leader is not performing will an election take place for a new leader.

In Owtware's peer-based consensus approach, identification and remedy of a failure occur automatically and fluidly across the server cluster, and are not fixed to certain participants. In contrast, many HA approaches configure a pair of servers to deliver control functions while all the other servers work on other tasks. This means that if there is a problem with one of the two control servers, then there is only one control server left, with no backup, during the time it takes to remedy the problem. With Owtware's approach, if there is a problem with the leader, a new one is automatically selected within seconds, ensuring redundancy at all times. In fact, a system could maintain high availability through the failure of all but the last two servers in a cluster.

At the virtual machine level, Owtware also applies a different approach that saves money and simplifies the process of ensuring high availability. It concerns how virtual machines are backed up. Most systems use storage-area networks (SANs) to continually back up VMs. But SANs are specialized, expensive pieces of equipment that can add to administrative overhead. In contrast,
Owtware uses software to continuously copy a virtual machine's state between two nodes. If there is a failure on one node, a backup is available on the other without the need for a SAN. This improves linear scalability and lowers the total cost of deployment.

This is another example of how Owtware uses software to replace many of the complicated and expensive hardware components that are used in other approaches to deliver high availability for VDI.
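The two decision rules described in this section (an election is held only when at least two candidates report the leader as unhealthy, and a node becomes leader once a majority votes for it) can be illustrated with a minimal Python sketch. This is not Owtware's implementation and not a complete Raft implementation. The function names, the node names, and the choice to measure the majority against the whole electorate are assumptions made purely for illustration.

```python
from collections import Counter
from typing import Dict, Optional

# From the text: at least two candidates must agree the leader is unhealthy.
UNHEALTHY_QUORUM = 2


def should_hold_election(health_reports: Dict[str, bool]) -> bool:
    """Decide whether an election is warranted.

    health_reports maps a candidate's name (hypothetical) to True if that
    candidate currently sees the leader as healthy, False otherwise.
    A single candidate that lost contact cannot trigger an election.
    """
    unhealthy = sum(1 for healthy in health_reports.values() if not healthy)
    return unhealthy >= UNHEALTHY_QUORUM


def elect_leader(votes: Dict[str, str], electorate_size: int) -> Optional[str]:
    """Return the winning node, or None if nobody has a majority yet.

    votes maps each voter to the candidate it voted for.
    Assumption: "majority" means more than half of electorate_size.
    """
    if not votes:
        return None
    candidate, count = Counter(votes.values()).most_common(1)[0]
    return candidate if count > electorate_size // 2 else None


if __name__ == "__main__":
    # Two of three candidates report the leader as unhealthy.
    reports = {"node-b": False, "node-c": True, "node-d": False}
    if should_hold_election(reports):
        ballots = {"node-b": "node-c", "node-c": "node-c", "node-d": "node-b"}
        print("new leader:", elect_leader(ballots, electorate_size=3))
```

Keeping the trigger rule separate from the vote count mirrors the text: a momentary loss of contact seen by one node changes nothing, while agreement among two or more nodes starts an election that only a majority can settle.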
CONCLUSION

By emphasizing simplicity, cost containment and processes that run automatically and continuously, Owtware eliminates the traditional barriers to ensuring high availability in a virtual desktop environment. Owtware's iconn servers are configured for high availability right out of the box, and impose almost no additional demands on system administrators, in either time or money.