INUVIKA TECHNICAL GUIDE
FILE SERVER HIGH AVAILABILITY
OVD Enterprise - External Document
Version 1.0

Passing on or copying of this document, and use and communication of its content, are not permitted without Inuvika's written approval.
PREFACE

This document describes the architecture that can be employed to implement a high availability solution for the OVD File Server, together with an example implementation.
HISTORY

Version | Date (mm-dd-yy) | Author          | Comments
1.0     | 11-12-15        | Sarina Dhaliwal | Initial Document
TABLE OF CONTENTS

1. INTRODUCTION
   1.1 What is High Availability?
       1.1.1 Single Points of Failure
       1.1.2 Resiliency
   1.2 Solution Overview
       1.2.1 Terminology
       1.2.2 Configurations
   1.3 Pre-Requisites
       1.3.1 General
       1.3.2 Minimum Requirements
       1.3.3 Recommended Requirements
2. HIGH AVAILABILITY FOR OVD FILE SERVERS
   2.1 External Components
       2.1.1 Data Replication
       2.1.2 Virtual IP
   2.2 Session Manager Role
   2.3 OVD Sample Configurations
       2.3.1 Load Balancer without External SAN or NAS
       2.3.2 Using External NAS or SAN and Load Balancer
3. GLUSTERFS AND CTDB IMPLEMENTATION
   3.1 Overview
   3.2 Requirements
   3.3 Basic System Configuration
   3.4 Scripted Implementation
   3.5 Session Manager Configuration
   3.6 Troubleshooting
       3.6.1 GlusterFS
       3.6.2 CTDB
4. TESTING FILE SERVER HIGH AVAILABILITY
   4.1 General
   4.2 Test Scenarios
       4.2.1 Test 1
       4.2.2 Test 2
       4.2.3 Test 3
       4.2.4 Test 4
       4.2.5 Test 5
5. UPGRADE OVD AND ENABLE HIGH AVAILABILITY
   5.1 Upgrade to OVD Version 1.3
   5.2 Migrate Load Balancing Setup to a High Availability Cluster
6. GENERAL TROUBLESHOOTING
   6.1 OVD File Server
       6.1.1 General Information
       6.1.2 Service Logs
   6.2 OSM/OAC
   6.3 Other
       6.3.1 Firewall
7. EXTENDING THE DESIGN
   7.1 External Data Storage
CONVENTIONS

The table below shows the typing conventions used in this document. These conventions denote a special type of information.

Typing conventions used: Bold-face text, Italics, Double Quotes
Information types denoted: Dialog Fields, Commands, Buttons, File names, Document titles, Document references, Menu Options
1. INTRODUCTION

This document describes the architecture that can be employed to implement a high availability solution for the OVD File Server (OFS). Such an implementation allows continued access to OVD user profiles and shared folders should one of the OVD File Servers fail. The document contains the details of a sample implementation for an Ubuntu OVD server farm using two open source tools (GlusterFS and CTDB).

1.1 WHAT IS HIGH AVAILABILITY?

High availability refers to a system's ability to continue normal operation in the event of a system or component failure. A highly available system uses groups or clusters of computers/components monitored by high availability software so that if a critical component fails, normal operations are restarted on the backup components in a process known as failover. When the primary system is restored, normal operation can be shifted back in a process known as failback. The main purpose of a high availability system is to minimize system downtime and data loss.

1.1.1 SINGLE POINTS OF FAILURE

To be considered highly available, a system should minimize the number of single points of failure (SPOFs). A SPOF is a critical system component that will cause system failure or data loss if it goes down. Eliminating SPOFs requires redundancy and replication. Redundancy involves providing backup components that the system can switch over to if a critical component fails; replication involves ensuring the backup system has access to the same data as the primary system.

1.1.2 RESILIENCY

Highly available systems should strive for resiliency: system failures should be handled quickly and effectively, and failover events (switching to backup/primary systems or components) should be as seamless and as quick as possible so as to minimize the impact on users. Resiliency allows you to guarantee a minimum uptime for your system.
1.2 SOLUTION OVERVIEW

The OVD File Server is a component that stores user profiles and session data, as well as folder data shared among users who log in to OVD. It is used by the Session Manager and persists program preferences, caches, any user-specific saved data, and data created by the user. The data is stored in a secure manner and is only accessible by that user the next time they log into OVD. The OFS can also share folders across user groups, which allows multiple users to see and manage the same data for collaboration.

This means that the OFS is a critical component of an OVD farm. Any failure involving it will result in user data being unavailable, leaving affected users unable to log in and start sessions. OVD File Server High Availability aims to make the OFS highly available so that, in case of failure, the data is still accessible and service is not interrupted.

1.2.1 TERMINOLOGY

For the duration of this document, a cluster will refer to a collection of hardware components that work together to provide high availability, and node will refer to each individual component in the cluster. For OVD File Server High Availability (OVD FS HA), each OFS is considered a node and the collection of them is a file server cluster.

1.2.2 CONFIGURATIONS

1.2.2.1 ACTIVE/PASSIVE HA CONFIGURATION

In an active/passive high availability configuration, redundancy is used to ensure high availability. A redundant instance of a resource is maintained which remains idle, listening to the primary system. If a failure occurs in the primary node, the idle node activates: the process of failover relocates resources and connections from the failed primary node to the backup node. For this configuration, backup instances have identical state through a data replication and synchronization process (see section 2.1 External Components), so a request to one instance is the same as to another, meaning a switch to a backup system will still give end users access to the same data. This configuration can also be set up to failback, either automatically or manually.
For automatic failback, once the failed node is back online, resources and connections are moved back to it, reinstating it as the primary node, and the secondary node returns to idle, listening mode. For manual failback, the administrator can switch service back to the primary node, or let the secondary node function as the new primary node and use the old primary node as the new backup.
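The failover decision described above can be sketched in a few lines of shell. This is a conceptual illustration only, not part of the OVD tooling: `primary_ok` is a hypothetical stand-in for a real health check (for example, probing the service port on the primary node).

```shell
#!/bin/sh
# Conceptual active/passive failover sketch (illustrative, not OVD code).
# primary_ok is a hypothetical stand-in for a real health probe.
primary_ok() {
    false   # simulate a failed primary; a real check would probe the service
}

if primary_ok; then
    target="primary"
else
    target="secondary"   # failover: route connections to the backup node
fi
echo "routing to $target"
```

In a real cluster this decision is made continuously by the VIP management component (a load balancer, Pacemaker, or CTDB), not by a one-shot script.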
1.2.2.2 ACTIVE/ACTIVE HA CONFIGURATION

In an active/active high availability configuration, redundancy is still used to ensure high availability (each service has a backup that the system can fall back on in case of failure), but in this setup all systems are online and working concurrently. In the case of a failure, failover relocates resources and connections to a system that is already working: instead of activating and taking the load on, the backup system already has its own load and simply takes on more. In this case it is not true failover but rather resource reallocation/load balancing. For this configuration, all instances have identical state through a data replication and synchronization process (see section 2.1 External Components), so a request to one instance is the same as to another, meaning each running instance can take on another's load with no difference in service to the end user. This configuration can also be set up to failback automatically. Since an active/active configuration has all instances operating at once, a load balancer is typically used to allocate resources across all instances. Once the failed node is back online, the load balancer will redistribute resources and connections between all instances, including the newly repaired one.

1.3 PRE-REQUISITES

1.3.1 GENERAL

OVD File Server High Availability is an enterprise feature, so you must be using OVD Enterprise, version 1.3 or higher.
This feature is not a standalone OVD capability and requires:

- Investigation of the existing IT setup in order to determine which existing components can be used and which components will need to be installed
- Implementation of a data replication mechanism (follow the guidelines in section 2.1 External Components)
- Implementation of a Virtual IP (VIP) mechanism (follow the guidelines in section 2.1 External Components)
  - The VIP is an IPv4 address that is dedicated and allocated to be used by the OVD FS HA mechanism to determine the OFS nodes' availability. This IP must not already be used in your LAN, should be statically assigned, and must be in the same subnet as the two File Servers.
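As a quick sanity check on the VIP requirement above, the sketch below verifies that a candidate VIP shares a subnet with both file servers. It is illustrative only and assumes a /24 netmask for simplicity; the addresses are the example ones used later in section 3.3.

```shell
#!/bin/sh
# Hedged sketch: confirm a candidate VIP sits in the same /24 as both OFS nodes.
# Assumes a 255.255.255.0 netmask; adapt the comparison for other prefix lengths.
same_24() {
    # compare the first three octets of two dotted-quad addresses
    [ "${1%.*}" = "${2%.*}" ]
}

vip=192.168.56.20
node1=192.168.56.21
node2=192.168.56.22

if same_24 "$vip" "$node1" && same_24 "$vip" "$node2"; then
    echo "VIP is in the same subnet as both nodes"
else
    echo "VIP is NOT in the same subnet"
fi
```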
1.3.2 MINIMUM REQUIREMENTS

- 2 dedicated OVD File Servers (OFS). For High Availability the file server must not act as an OVD Application Server, Session Manager, or any other such service. It should be set up with only the File Server role.
- An understanding of network and system administration

1.3.3 RECOMMENDED REQUIREMENTS

- An external load balancer to manage the VIP, such as F5
- An external NAS to store the data
- All OVD FS servers should run the same Linux distribution and have the same version and system architecture
- 2 NICs / VLANs
  - One dedicated VLAN (and NIC if possible; recommended) for Heartbeat/Gluster (data replication and synchronization) management
- Redundant power supplies which power your external storage device(s) and the servers
2. HIGH AVAILABILITY FOR OVD FILE SERVERS

Inuvika OVD provides two types of storage:

- User Profile (includes session data)
- Shared Folder

The OVD File Server (OFS) component provides centralized access to user data for both Linux and Windows Application Servers. For instance, a user who is running a session with both Linux and Windows can first create a file with a Linux application, and can then open the same file with a Windows application without requiring any permission or dealing with any cache issues.

The goal of having a highly available setup for your OVD Farm is to allow servers of the same role to carry out the tasks when another server of that role goes down, with minimized or no negative consequences. A highly available OFS setup provides:

- Availability of your data
- Availability of data access (CIFS & WebDAV)

In other words, if one of your File Servers crashes:

- No data will be lost
- Running sessions will remain alive after failover (potentially up to 1 minute I/O freeze)
- New sessions can still be started after the failover has completed (delay of less than 1 minute)

2.1 EXTERNAL COMPONENTS

OVD File Server High Availability is not a standalone OVD feature. The following external components are required in order to set up a highly available OFS.

2.1.1 DATA REPLICATION

In order to ensure users still have access to their OVD data (Shared Folders and User Profiles) when failover occurs, the secondary system must have the same available data as the primary system. This can be achieved through replication, a process by which data is synchronized between the primary and secondary systems. Additional benefits of this synchronization are:

- No data is lost if a crash occurs
- The Samba configuration will be replicated between the OFS servers
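The location that must be replicated is defined in the slave server configuration on each node. The fragment below is an illustrative excerpt only; the exact layout of /etc/ovd/slaveserver/slaveserver.conf may vary between OVD versions, so treat it as a sketch rather than a definitive reference.

```ini
# /etc/ovd/slaveserver/slaveserver.conf (illustrative excerpt)
# If data_dir is not set, the OFS uses /var/lib/ovd/slaveserver/fs.real.
# Whichever path is in effect is the one that must be replicated between nodes.
data_dir = /var/lib/ovd/slaveserver/fs.real
```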
Figure 1: Data replication between two OVD File Servers

All the OVD File Servers in your highly available cluster must have synchronized data in the /var/lib/ovd/slaveserver/fs.real folder or, if a custom location is configured, in the data_dir path defined in the /etc/ovd/slaveserver/slaveserver.conf file.

The data replication / synchronization mechanism can be one of the following:

- An external NAS (CIFS, NFS)
- GlusterFS
- XtreemFS
- Inotify + SSH + Rsync / Unison

The Administrator must choose an option to implement from this list. If your system already has a suitable NAS, we strongly recommend you use it. Otherwise, section 3 will guide you on how to set up a GlusterFS volume. For the remainder of this document all examples will use GlusterFS for the data replication component.

2.1.2 VIRTUAL IP

Virtual IP addresses (VIPs) are a core component of high availability. A VIP is an IP address mapping: it is used in place of an actual IP address so that the same VIP can be used to point to many different IPs. For a high availability cluster, VIPs are used to access critical components so that if a component fails, the load balancer can change the mapping to have the VIP point to a secondary component.

This design is used in a standard active/passive (see section 2.3) setup. The active node is accessed using the VIP. In the case of a failover, the passive node is promoted and the VIP is shifted to point to it instead. Applications using the primary node have their connections broken and reconnected, with the VIP now pointing to the new setup.

For a highly available OVD Farm, the VIP will be used for all CIFS/WebDAV connections to the OFS. This VIP must always target an online node. In the diagram below, the VIP directs traffic to OVD FS 1 until failure occurs. When it does, traffic shifts to OVD FS 2 instead, until OVD FS 1 is restored.

Figure 2: Two OVD File Servers in a highly available cluster.

The VIP management solution can be one of the following:

- An external Load Balancer such as F5
- Pacemaker + Heartbeat / Corosync
- Samba CTDB
The Administrator must choose an option to implement from this list. If your system already has a suitable Load Balancer, we strongly recommend you use it. Otherwise, section 3 will guide you on how to set up Samba CTDB. For the remainder of this document all examples will use CTDB for the VIP management component.

2.2 SESSION MANAGER ROLE

The OVD Session Manager (OSM) currently provides support for one High Availability File Server cluster. The OSM is aware of each File Server in the cluster and acts as the user management component, system management component, and primary configuration for the entire OVD system. You can configure the OSM through the OVD Administration Console (OAC) to implement high availability for the file servers (see section 3.5 Session Manager Configuration).

2.3 OVD SAMPLE CONFIGURATIONS

Depending on the data replication and virtual IP solution, the following configurations are supported for File Server High Availability.

Note: Inuvika strongly recommends that you set up the VIP as a failover solution, i.e. Active/Passive.
2.3.1 LOAD BALANCER WITHOUT EXTERNAL SAN OR NAS

Figure 3: Load Balancer without External SAN or NAS
2.3.2 USING EXTERNAL NAS OR SAN AND LOAD BALANCER

Figure 4: External NAS or SAN and Load Balancer
3. GLUSTERFS AND CTDB IMPLEMENTATION

This section will walk you through the set-up and configuration of GlusterFS and CTDB (Clustered Trivial Database) to provide highly available file storage.

3.1 OVERVIEW

High availability in OVD requires two external components: data replication and VIPs. In this section, we will use two open source tools for these components: GlusterFS for data replication and CTDB for VIP management.

GlusterFS is a clustered file system that can provide network storage that can be made redundant, fault-tolerant, and scalable. It can be used to provide high availability of storage by replicating it between servers. Visit the official GlusterFS website to learn more.

CTDB is a thin and efficient database that allows for the creation of a clustered Samba. It makes it possible for Samba to serve data simultaneously from different hosts in a network and provides high availability features such as node monitoring (for the purposes of failover and failback), node failover, and IP takeover (reassigning VIP addresses during failover and failback). Visit the official CTDB website to learn more.

Figure 5: A highly available OVD File Server setup using GlusterFS and CTDB. Note this implementation does not use a hardware load balancer. The VIP is a software bound interface.
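For orientation, the commands below show the general shape of how a two-node replicated GlusterFS volume is created. This is a hedged sketch printed rather than executed: the volume name is an assumption, while the brick path (/mnt/data/gluster/vol1) and mount point (/mnt/gluster-volume1) match the paths used elsewhere in this section. The setup script in section 3.4 is the authoritative reference for what actually runs.

```shell
#!/bin/sh
# Dry-run sketch: print (do not run) the typical commands behind a two-node
# replicated GlusterFS volume. The volume name "vol1" is illustrative.
cmds='gluster peer probe node2
gluster volume create vol1 replica 2 node1:/mnt/data/gluster/vol1 node2:/mnt/data/gluster/vol1
gluster volume start vol1
mount -t glusterfs localhost:/vol1 /mnt/gluster-volume1'
printf '%s\n' "$cmds"
```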
3.2 REQUIREMENTS

- Two OVD File Servers that both:
  - Run the same Linux distribution and version, and have the same system architecture.
  - Have a dedicated data partition mounted in /mnt/data. We recommend this partition be hosted on a RAID 5.
- Two NICs/VLANs
  - One dedicated VLAN (and NIC if possible) for CTDB and Gluster management.
- The OVD File Server (OFS) role is running on the server and no other roles are active on that server (it is not possible to mix several roles in this configuration).
- Download the OVD File Server High Availability package, ovd-fs-ha-helper_<version>.zip, from the Inuvika archives (version 1.3 and higher), in the packages folder. Extract the following files onto each OFS node before you begin installation:
  - glusterfs-server.init.sh
  - glusterfs-client.init.sh
  - ctdb.init.patch
  - ovd-fs-ha-setup-node
- Change the permissions on the three script files from the point above to ensure they are executable (use the command: chmod +x).

! This example setup assumes your file servers are running Ubuntu 14.04. If you are using a different Linux distribution, you can study the scripts used in this example to understand the necessary commands and modify them to work for your distribution.

3.3 BASIC SYSTEM CONFIGURATION

Network configuration

NOTE: the Ubuntu script used in section 3.4 includes an interactive guide that prompts for your IP choices and then checks that they match the current LAN and subnet mask. For the sake of providing a concrete example, the following IPs assume a class C (/24) network with a 255.255.255.0 subnet mask.

- Dedicated VLAN: 192.168.33.0/24
  - node1: 192.168.33.11
  - node2: 192.168.33.12
- LAN: 192.168.56.0/24
  - node1: 192.168.56.21
  - node2: 192.168.56.22
  - VIP: 192.168.56.20 (NOTE: be sure this IP is not in the DHCP range if a DHCP server is running on the LAN)
Hostname resolution

- On node1, modify /etc/hostname to contain the name node1 and /etc/hosts to resolve the node2 FQDN on the dedicated LAN. Similarly, on node2, the hostname is node2 and the hosts file contains an entry for node1 on the dedicated LAN.

For example, on node1 the /etc/hosts entries should be:

127.0.0.1 node1
192.168.33.12 node2

And on node2 the /etc/hosts entries should be:

127.0.0.1 node2
192.168.33.11 node1

3.4 SCRIPTED IMPLEMENTATION

Run the ovd-fs-ha-setup-node script from the OVD File Server High Availability package (see section 3.2 Requirements) on each File Server. This script will install and configure GlusterFS, CTDB, and the OFS. If you would like to see exactly how the system is set up, you can read the script for all the steps. You may also modify it for custom installations (e.g. using different open source tools for data replication and VIP management).

This script can be run automatically or in an interactive mode. For our GlusterFS + CTDB example, run it in automatic mode on each OFS with the following commands (change the IPs to your own; the values below match the example addresses from section 3.3):

On node1, run:

NODE_ID=1 NODE2_IP=192.168.56.22 VIP=192.168.56.20 bash ovd-fs-ha-setup-node

On node2, run:

NODE_ID=2 NODE1_IP=192.168.56.21 bash ovd-fs-ha-setup-node
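For reference, a CTDB deployment of this kind is typically driven by two small files on each node. The fragments below are illustrative assumptions based on the example addresses from section 3.3; the setup script generates the real files, and the interface name eth1 is a placeholder for whichever NIC carries your LAN.

```
# /etc/ctdb/nodes (illustrative): one line per node, on the dedicated VLAN
192.168.33.11
192.168.33.12

# /etc/ctdb/public_addresses (illustrative): the VIP that CTDB moves between nodes
192.168.56.20/24 eth1
```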
If you are going to run the script in interactive mode instead, use the command:

./ovd-fs-ha-setup-node

and provide any information the script prompts you for.

! If you are going to run the script in interactive mode, make sure you know the IP addresses of both OFS nodes before beginning, as you will be prompted for them during the installation process.

3.5 SESSION MANAGER CONFIGURATION

Once the ovd-fs-ha-setup-node script completes, the system is ready and you must now configure the OVD Session Manager (OSM).

Log in to the OVD Administration Console (OAC), go to the Servers tab and select File Server Clusters. Here you can create your new cluster: give it a name and click Add. You will be redirected to the management page for your new cluster, where you can define the different components. Here you should add the OVD File Servers that are part of your cluster and set the IP of your VIP.

In order to have File Server High Availability, a File Server Cluster requires at least two OVD File Servers, and both should be online and in production mode. If either OFS is offline or in maintenance mode, the one that is still available will be used, but the system will not have high availability.

NOTE: If you set any of your file servers to maintenance mode, the servers will still remain synchronized, as data synchronization is handled by GlusterFS + CTDB, which remains functional on servers in maintenance mode.

When your cluster is fully defined, uncheck the Maintenance option and save to switch the cluster to production.
Figure 6: Cluster management page in the OAC.

! The OVD FS HA feature does not actively alert the administrator of a failback or a fault in the high availability. It is up to the administrator to put monitoring in place (hardware or their own detection system) to ensure that the services and the hardware are running and working. However, OVD will still log the event of a failback or a resiliency FS switchover, so an administrator can regularly monitor the logs for these notifications.
3.6 TROUBLESHOOTING

3.6.1 GLUSTERFS

For general GlusterFS troubleshooting, please refer to the GlusterFS documentation. If your issue is not addressed there, follow these suggestions to investigate and identify any issues:

3.6.1.1 GENERAL STATUS INFORMATION

Check the peer status in GlusterFS using the command:

gluster peer status

If nothing looks amiss, you can then check the GlusterFS volume status and info using the command:

gluster volume info all

3.6.1.2 CLIENT-SIDE MOUNT POINT

Confirm that GlusterFS directories have been mounted successfully using the command:

mount | grep gluster

The resulting output should show that the directories were successfully mounted on /mnt/gluster-volume1. You can navigate to this directory and list the contents to verify that it has been mounted.

3.6.1.3 VERIFY DATA REPLICATION

If the check above verified the directories were successfully mounted on /mnt/gluster-volume1, you can check that the data replication and synchronization is working as expected. Check the /mnt/gluster-volume1/ovd folder on each OFS node. Both folders should contain identical data. If there are any differences, the data replication and synchronization processes are not functioning correctly.
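The comparison described above can be scripted with a recursive diff. The sketch below is illustrative only: it demonstrates the check on two throwaway directories standing in for /mnt/gluster-volume1/ovd as seen from each node.

```shell
#!/bin/sh
# Illustrative sync check: diff two directory trees and report any drift.
# The temp dirs stand in for /mnt/gluster-volume1/ovd on node1 and node2.
dir_a=$(mktemp -d)
dir_b=$(mktemp -d)
echo "profile data" > "$dir_a/user1.prf"
cp "$dir_a/user1.prf" "$dir_b/"      # replication working: trees are identical

if diff -r "$dir_a" "$dir_b" > /dev/null; then
    result="in sync"
else
    result="DRIFT detected"
fi
echo "$result"
rm -rf "$dir_a" "$dir_b"
```

On a real cluster, replace the temp dirs with the /mnt/gluster-volume1/ovd path on each node (e.g. comparing a local mount against one reached over SSH).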
3.6.1.4 SERVICE STATUS

Check the GlusterFS service status using the command:

service glusterfs-server status

3.6.1.5 SERVICE LOGS

Look through the GlusterFS service logs to see if any unexpected behavior has been logged. The logs are located in /var/log/glusterfs.

3.6.1.6 GLUSTERFS PROCESSES

Check which, if any, GlusterFS processes are currently running on your system using the command:

ps auxf | grep gluster

3.6.1.7 DATA VOLUME PATH

Ensure that the data volume path is correct. It should be /mnt/data/gluster/vol1. Navigate to it and check that the contents are correct.

3.6.2 CTDB

For general CTDB troubleshooting, please refer to the CTDB documentation. If your issue is not addressed there, follow these suggestions to investigate and identify any issues:

3.6.2.1 GENERAL STATUS INFORMATION

Check the status of CTDB to see if there is any anomalous behavior. Check basic information and the status of your cluster nodes using the command:

ctdb status

The output will inform you of the status of each node (whether they are okay or not) and the status of your system (Recovery mode should be NORMAL if the system is okay).

Check the status of the IP addresses using the command:

ctdb ip

Each IP that is being served will be listed along with the cluster node serving it.
3.6.2.2 SERVICE STATUS

Check the statuses of the services used by CTDB and ensure they are all running.

Check CTDB using the command:

service ctdb status

Check Samba using the command:

service smbd status

Check the NetBIOS name server using the command:

service nmbd status

All three of these services should be running in order for CTDB to function properly.

3.6.2.3 SERVICE LOGS

Look through the CTDB service log to see if any unexpected behavior has been logged. The log is located at /var/log/ctdb/log.ctdb.
4. TESTING FILE SERVER HIGH AVAILABILITY

The reason to create a highly available OFS system is so there is little to no impact on the environment if one server crashes, so it is important to simulate a server crash and ensure the feature is working as intended. Once your highly available system is set up, you can carry out the tests in this section to validate that the system is working correctly.

4.1 GENERAL

Scenario 1: Network Down
You can shut down the network your master virtual machine is running on and then check the system's behavior. Some suggestions for what to look for are below.

Scenario 2: Power Down
You can shut down the master virtual machine and then check the system's behavior. Some suggestions for what to look for are below.

System Checks:

- Are running sessions impacted?
  - In a highly available setup, there will be up to a 1 minute I/O freeze for running sessions. Any other impact is not expected.
- Did the VIPs switch between servers?
  - If a server goes down, the VIP should shift to one of the backups.
- What happens when the network is enabled again?
  - Does the system automatically failback to the primary nodes if it has been set up to do that? The only situation in which it won't is if you are using an active/active configuration with a manual failback switch.

4.2 TEST SCENARIOS

For more thorough testing, you can follow the test matrices described in this section. Each test details a scenario ("State") and the status of each OFS. Carry out the scenarios with the File Servers in the specified state. If you get the results listed in the test matrix, the test passed and you can proceed. If your results differ from the ones listed, there may be a problem with your system setup.
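A simple way to observe the VIP switching during these tests is to poll the VIP while you take a node down. The sketch below is illustrative: `probe_vip` is a hypothetical stand-in for a real reachability check such as `ping -c 1 -W 1 192.168.56.20`.

```shell
#!/bin/sh
# Illustrative VIP watch: report whether the VIP answers a probe.
# probe_vip is a stand-in; in a real test, replace its body with e.g.:
#   ping -c 1 -W 1 192.168.56.20 > /dev/null 2>&1
probe_vip() {
    false   # simulate the brief window during failover when the VIP is unreachable
}

if probe_vip; then
    state="VIP up"
else
    state="VIP down"
fi
echo "$state"
```

Running the real probe in a loop (e.g. `while true; do ...; sleep 1; done`) lets you measure how long the VIP is unreachable during failover, which should stay within the 1 minute window described above.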
4.2.1 TEST 1

4.2.2 TEST 2

4.2.3 TEST 3

4.2.4 TEST 4

4.2.5 TEST 5
5. UPGRADE OVD AND ENABLE HIGH AVAILABILITY

This section will guide you in upgrading to a newer version of OVD without any data loss and in converting from load balancing, if applicable, to a highly available file server setup.

5.1 UPGRADE TO OVD VERSION 1.3

In order to use OVD File Server High Availability, you will need to upgrade OVD to version 1.3 or higher. The OVD software repository is available at http://archive.inuvika.com/ovd. You can upgrade without any data loss by following these steps:

1. Switch all servers to maintenance mode (you can do this from the Servers page in the OAC).
2. Switch the OSM to maintenance mode (you can do this from the main page of the OAC).
3. Upgrade the OSM and Application Servers (ApS) by following the release notes found on the OVD software repository.
4. Upgrade all File Servers by following the relevant instructions in the release notes.
5. After you are done upgrading, switch all servers back to production mode.
6. Switch the OSM to production mode.

5.2 MIGRATE LOAD BALANCING SETUP TO A HIGH AVAILABILITY CLUSTER

If you have multiple file servers and have been using the load balancing feature, you may continue to use the load balancing configuration, or you can update the environment to use high availability instead. You can do this by following these steps:

1. Stop the OVD service if it is already installed and running:

service ovd-slaveserver stop

2. Follow section 3 to set up a highly available system using GlusterFS and CTDB, or modify the script provided in section 3 to use any other tools you may prefer.
NOTE: The script provided in section 3 will ensure no data is lost. It does this by executing the following command:

rsync -avp /var/lib/ovd/slaveserver/fs.real/./ /mnt/gluster-volume1/ovd/./

By executing that command on each existing OFS in the OVD farm, the script migrates the data from each server to the cluster, ensuring each server will have all data.

! If you are implementing your setup in a different way and not using the script provided in section 3, you must execute the rsync command manually to migrate the existing data to the new folder within your high availability setup.
6. GENERAL TROUBLESHOOTING

When encountering problems with your highly available setup, the path for investigation should be: Data Replication (e.g. GlusterFS) > VIP Management (e.g. CTDB) > OFS > OSM/OAC. If you are using GlusterFS or CTDB, see section 3.6 for troubleshooting tips. If you are using any other tools for data replication and/or VIP management, please consult the official documentation for the tool. If there are no issues at the data replication or VIP management levels, read on for troubleshooting tips for the OFS and OSM/OAC.

6.1 OVD FILE SERVER

6.1.1 GENERAL INFORMATION

Check which, if any, OFS processes are currently running on your system using the command:

ps auxf | grep ovd-slaveserver

6.1.2 SERVICE LOGS

Look through the OFS service logs to see if any unexpected behavior has been logged. The logs are located at /var/log/ovd/slaveserver.log and /var/log/ovd/rufs.log. Check slaveserver.log first, as it will contain the most pertinent information. rufs.log can be checked if additional details are required (NOTE: this log file contains many debugging messages and may be difficult to extract meaningful information from - use it as a last resort).

6.2 OSM/OAC

Check the general status of the cluster through the OAC. Log in to the OAC, go to the Servers tab and select File Server Clusters. Check that the cluster is defined; if it is, there should be an entry for it on this page. Click on the cluster to go to its management page. Here you should check that:

- The cluster is in production mode
- The VIP is valid
- All nodes are online and in production mode
If everything is fine, you can check the OSM logs for any anomalous behavior. Go to the Status tab and select Logs.

6.3 OTHER

6.3.1 FIREWALL

Check your firewall to make sure it is not interfering with the cluster. Monitor the traffic and ensure CIFS/WebDAV communication is targeting the appropriate VIP and that the OSM is also able to communicate with the VIP.
7. EXTENDING THE DESIGN

7.1 EXTERNAL DATA STORAGE

OVD File Server High Availability ensures the servers are highly available, as well as the files on each separate file system. There is an alternative to this setup, though: you can use a separate NFS server to serve the files instead of using GlusterFS and replicating them between the servers. The script provided in section 3 does not cover this hardware-based file server high availability; its research and implementation are left to the reader.

Figure 7: OVD File Server High Availability using an external NAS. The mount point in the figure is from the example setup in section 3. It would now point to your external storage instead.

The advantages of making the file mount point highly available using a hardware RAID device are:

- Speed: persisting data on the file server itself may be slower than using an NFS server with built-in RAID
- Reduced file server disk space requirements: offloading the profiles and session data onto an NFS server equipped for bigger files is more efficient. Keeping them on the file server itself duplicates data and may (unnecessarily) dramatically increase the size of the partitions
- Convention: it is normal convention to have one centralized external mount point for a file share instead of duplicating the synchronization
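If you adopt this external-NAS variant, each node mounts the share at the same path the section 3 example used. The fstab line below is an illustrative assumption: the server name and export path are placeholders, and the mount options you need may differ for your environment.

```
# /etc/fstab (illustrative): replace the GlusterFS mount with an external NAS.
# Server name and export path are placeholders; adjust options to your site.
nas.example.com:/export/ovd  /mnt/gluster-volume1  nfs  defaults,_netdev  0  0
```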