Open Source High Availability on Linux

Similar documents

P03 Linux on POWER OSS High Availability

High Availability Solutions for MySQL. Lenz Grimmer DrupalCon 2008, Szeged, Hungary

MySQL High Availability Solutions. Lenz Grimmer OpenSQL Camp St. Augustin Germany

Red Hat Enterprise linux 5 Continuous Availability

High Availability Solutions for the MariaDB and MySQL Database

Astaro Deployment Guide High Availability Options Clustering and Hot Standby

Clustering in Parallels Virtuozzo-Based Systems

High-Availability Using Open Source Software

Parallels. Clustering in Virtuozzo-Based Systems

Veritas Cluster Server from Symantec

Linux High Availability

Skelta BPM and High Availability

High Availability Solutions with MySQL

High Availability with Postgres Plus Advanced Server. An EnterpriseDB White Paper

High Availability Storage

Executive Brief Infor Cloverleaf High Availability. Downtime is not an option

Multiple Public IPs (virtual service IPs) are supported either to cover multiple network segments or to increase network performance.

A SURVEY OF POPULAR CLUSTERING TECHNOLOGIES

Ingres High Availability Option

High Availability and Disaster Recovery for Exchange Servers Through a Mailbox Replication Approach

Building Highly Available OpenZFS Storage Appliances Grenville Whelan

High Performance Cluster Support for NLB on Window

High Availability Low Dollar Clustered Storage

Simple Introduction to Clusters

Vess A2000 Series HA Surveillance with Milestone XProtect VMS Version 1.0

High Availability of VistA EHR in Cloud. ViSolve Inc. White Paper February

High Availability and Disaster Recovery Solutions for Perforce

Deploying Global Clusters for Site Disaster Recovery via Symantec Storage Foundation on Infortrend Systems

Veritas InfoScale Availability

High Availability Design Patterns

HA for Enterprise Clouds: Oracle Solaris Cluster & OpenStack

High Availability Database Solutions. for PostgreSQL & Postgres Plus

WHITE PAPER. Best Practices to Ensure SAP Availability. Software for Innovative Open Solutions. Abstract. What is high availability?

Based on Geo Clustering for SUSE Linux Enterprise Server High Availability Extension

Real-time Protection for Hyper-V

OVERVIEW. CEP Cluster Server is Ideal For: First-time users who want to make applications highly available

Maximizing Data Center Uptime with Business Continuity Planning Next to ensuring the safety of your employees, the most important business continuity

Availability Digest. Redundant Load Balancing for High Availability July 2013

Top 10 Reasons why MySQL Experts Switch to SchoonerSQL - Solving the common problems users face with MySQL

Three Ways Enterprises are Protecting SQL Server in the Cloud

Best Practices for Implementing High Availability for SAS 9.4

WHITE PAPER: HIGH CUSTOMIZE AVAILABILITY AND DISASTER RECOVERY

Fault Tolerant Servers: The Choice for Continuous Availability on Microsoft Windows Server Platform

Architectures Haute-Dispo Joffrey MICHAÏE Consultant MySQL

Mirror File System for Cloud Computing

Protecting Citrix XenServer Environments with CA ARCserve

Protecting Microsoft Hyper-V 3.0 Environments with CA ARCserve

IBM Storwize Rapid Application Storage solutions

Cisco Active Network Abstraction Gateway High Availability Solution

Informatica MDM High Availability Solution

COMPARISON OF VMware VSHPERE HA/FT vs stratus

Peter Ruissen Marju Jalloh

Eliminate SQL Server Downtime Even for maintenance

HIGH AVAILABILITY LINUX ARCHITECTURE FOR MISSION CRITICAL WORKLOADS

Symantec Cluster Server powered by Veritas

How To Backup A Virtualized Environment

IBM Virtualization Engine TS7700 GRID Solutions for Business Continuity

SCALABILITY AND AVAILABILITY

Executive Summary WHAT IS DRIVING THE PUSH FOR HIGH AVAILABILITY?

RED HAT ENTERPRISE VIRTUALIZATION FOR SERVERS: COMPETITIVE FEATURES

VMware System, Application and Data Availability With CA ARCserve High Availability

Integrated Application and Data Protection. NEC ExpressCluster White Paper

A Continuous-Availability Solution for VMware vsphere and NetApp

A High Availability Clusters Model Combined with Load Balancing and Shared Storage Technologies for Web Servers

High Availability & Disaster Recovery Development Project. Concepts, Design and Implementation

PolyServe Matrix Server for Linux

Open-Xchange Server High availability Daniel Halbe, Holger Achtziger

Comparing TCO for Mission Critical Linux and NonStop

SAP Solutions High Availability on SUSE Linux Enterprise Server for SAP Applications

Hyper-V 3.0: Creating new virtual data center design options Top four methods for deployment

Pacemaker. A Scalable Cluster Resource Manager for Highly Available Services. Owen Le Blanc. I T Services University of Manchester

Enabling Database-as-a-Service (DBaaS) within Enterprises or Cloud Offerings

HA clustering made simple with OpenVZ

Achieving High Availability

Red Hat Cluster Suite

Veritas Storage Foundation High Availability for Windows by Symantec

CA arcserve r16.5 Hybrid data protection

HP StorageWorks Data Protection Strategy brief

Fault Tolerant Servers: The Choice for Continuous Availability

Siemens PLM Connection. Mark Ludwig

High Availability with Elixir

Red Hat Global File System for scale-out web services

High Availability Infrastructure for Cloud Computing

Avaya Aura Virtualized Environment

IdP Clustering. You want to prevent service outages. High Availability and Load Balancing. Possible problems: HW failures

SOLUTION BRIEF: CA ARCserve R16. Leveraging the Cloud for Business Continuity and Disaster Recovery

HP Serviceguard Cluster Configuration for HP-UX 11i or Linux Partitioned Systems April 2009

Implementing Storage Concentrator FailOver Clusters

Citrix XenServer Industry-leading open source platform for cost-effective cloud, server and desktop virtualization. citrix.com

Red Hat Enterprise Linux as a

Integration of PRIMECLUSTER and Mission- Critical IA Server PRIMEQUEST

A Project Summary: VMware ESX Server to Facilitate: Infrastructure Management Services Server Consolidation Storage & Testing with Production Servers

Transcription:

Open Source High Availability on Linux Alan Robertson alanr@unix.sh OR alanr@us.ibm.com

Agenda - High Availability on Linux HA Basics Open Source High-Availability Software for Linux Linux-HA Open Source project DRBD Open Source Project Linux Virtual Server (LVS) Project Slide 2

The Desire for HA Systems Who wants low availability systems? Why are so few systems High Availability? Slide 3

Barriers to HA Systems Cost Very manageable with modern hardware, OSS software Complexity Can't give away 'simplicity' good management tools help Slide 4

Slide 5

What would be the result? Increased Availability Drastically multiplying customers multiplies experience - products mature faster (especially in OSS model) OSS developers grow from customers OSS Clustering is a disruptive technology Slide 6

What is a Computer Cluster? From Wikipedia: A computer cluster is a group of loosely coupled computers that work together closely so that in many respects they can be viewed as though they are a single computer. Clusters are usually deployed to improve performance and/or availability over that provided by a single computer, while typically being much more cost-effective than single computers of comparable speed or availability. Slide 7

HA vs. HPC Clustering HPC clusters work primarily to manage and maximize the increased performance which results from having multiple computers working together High-Availability clusters primarily work to manage and maximize the increased availability which is possible when multiple computers work together These goals are not mutually exclusive Slide 8

What is an HA cluster? A group of computers which cooperate to provide a service even when system components fail When one machine goes down, others take over its work This involves IP address takeover, service takeover, etc. New work comes to the takeover machine When a service fails, it is restarted Can be restarted on the same server or a different one Slide 9

What Can HA clustering do for you? It cannot achieve 100% availability nothing can. HA Clustering primarily designed to recover from single faults It can make your outages very short From about a second to a few minutes It is like a Magician's (Illusionist's) trick: When it goes well, the hand is faster than the eye When it goes not-so-well, it can be reasonably visible A good HA clustering system adds a 9 to your base availability 99->99.9, 99.9->99.99, 99.99->99.999, etc. Complexity is the enemy of reliability! Slide 10

High Availability Approach - Redundancy Redundancy eliminates Single Points Of Failure (SPOF) Reduces cost of planned and unplanned outages Slide 11

The 3 R's of High-Availability Redundancy Redundancy Redundancy If this sounds redundant, that's probably appropriate... ;-) HA Clustering is a good way of providing and managing redundancy Slide 12

High Availability Approach - Failover Auto detect Failures (hardware, network, applications) Automatic Recovery from failures (no human intervention) Managed failover to standby systems, components Slide 13

Statistics... Counting Nines... Availability percentage Yearly downtime 100% 0 99.99999% 3s 99.9999% 30 sec 99.999% 5 min 99.99% 52 min 99.9% 9 hr 99% 3.5 day Slide 14

Two Node Active/Passive HA Cluster Shared Disk (DS4000, ESS, etc.) Slide 15

Two Node Active/Active HA Cluster Shared Disk (DS4000, ESS, etc.) Slide 16

Linux-HA ( heartbeat ) Project Open Source Project (IBM Leadership) Multiple platform solution for Linux, Solaris, *BSD, OS/X Packaged with most Linux Distributions (except Red Hat) Part of OSCAR-HA package Strong focus on ease-of-use, security, low-cost > 30K clusters in production since 1999 Equal to or superior to commercial HA packages Slide 17

What is the "Linux-HA" project? An open-community project providing basic fail over capabilities for Linux (and other OSes) Active, open development community led by IBM Wide variety of industries, applications Reference implementation for Open Cluster Framework (OCF) standards Simple to understand and easy to install No special hardware requirements; no kernel dependencies, all user space All releases tested by automatic test suites http://linux-ha.org/ Slide 18

"Linux-HA" Successes FexEx used in truck scheduling The Weather Channel (weather.com) BBC internet infrastructure CERN grid services Los Alamos National Laboratories badge readers Sony - manufacturing processes United Nations Intuit (Quicken, TurboTax, etc.) use it for firewalls Agilent Technologies in Fort Collins 3 clusters ISO New England manages the New England power grid using 12 "Linux HA" clusters University of Toledo 20K user WebCT System Emageon medical imaging services ADC telco provisioning manager product (w/ x330/335) Incredimail uses "Linux HA" on IBM hardware Bavarian Radio Station (Munich) used "Linux HA" and xseries for coverage of 2002 Olympics in Salt Lake City More listed at: http://linux-ha.org/successstories Slide 19

Linux-HA Capabilities Supports n-node clusters where 'n' is currently <= something like 16 Active/Passive or full Active/Active Can use UDP bcast, mcast, ucast comm. Fails over on node failure, or on service (resource) failure Fails over on loss of IP connectivity, or arbitrary criteria Support for the OCF resource management standard Sophisticated dependency model with rich constraint support (resources, groups, incarnations, master/slave) XML-based resource configuration Configuration and monitoring GUI Support for OCFS2 cluster filesystem others coming Slide 20

Linux-HA futures being considered Business Continuity support (in source control now) Specific virtualization support Transparent migration Containerized resources (peek inside client VM via proxy) Increase number of nodes directly supported Loosen cluster definition to manage many more nodes through hierarchical proxies Integration with provisioning software Slide 21

DRBD Distributed Replicating Block Device RAID1 over the LAN DRBD is a block-level replication technology it works underneath any (non-clustered) filesystem Every time a block is written on the master side, it is copied over the LAN and written on the slave side It is extremely cost-effective common with xseries Typically, a dedicated replication link is used Also used with slower links for Business Continuity Worst-case around 10% throughput loss typically negligible Current versions have very fast full resync Slide 22

Two Node Active/Passive HA Cluster Real-Time Disk Replication (DRBD) DRBD = Distributed Replicating Block Device Slide 23

Linux Virtual Server (LVS) Project Linux Virtual Server (LVS/ipvs) comes with Linux, very widely used IP sprayer type of load balancer Commonly used in server farm type arrangements Integrates well with Linux-HA Used in many mission-critical applications (like medical imaging, credit card authorization, nuclear facilities) Some customers perform stateful load-balancer failover in less than.5 seconds Support for stateful active/active load balancer clusters Slide 24

LVS In Action Slide 25

Plays Well With Others Each of these independent services can work together to scale to large systems All single points of failure can be eliminated High-Availability, Load Balancing work together nicely Slide 26

Linux-HA, DRBD and LVS Working Together Slide 27

References http://linux-ha.org/ http://www.drbd.org/ http://www.linuxvirtualserver.org/ Slide 28

Legal Statements IBM is a trademark of International Business Machines Corporation. Linux is a registered trademark of Linus Torvalds. Other company, product, and service names may be trademarks or service marks of others. This work represents the views of the author and does not necessarily reflect the views of the IBM Corporation. Slide 29