High Availability Architectures For Linux on IBM System z




March 31, 2006

Contents

Abstract
Introduction. Definition of High Availability
Chapter 1: Introduction to High Availability with z/VM and LPARs
Chapter 2: Scenarios
Chapter 3: Reference Architecture: Non-WebSphere application
  Scenario Being Solved
  Architecture Principles
  Reference Architecture
  First Example: Tivoli System Automation
  Second Example: Linux Virtual Server and Linux-HA
Chapter 4: Reference Architecture: WebSphere with DB2 database on Linux
  Scenario Being Solved
  Architecture Principles
  Reference Architecture
  Architectural Decisions
  Key problem areas
  Solution estimating guidelines
Chapter 5: Reference Architecture: WebSphere with Oracle database on Linux
  Scenario Being Solved
  Architecture Principles
  Reference Architecture
  Architectural Decisions
  Key problem areas
  Solution estimating guidelines
Chapter 6: Reference Architecture: WebSphere with DB2 database on z/OS
  Scenario Being Solved
  Architecture Principles
  Reference Architecture
  Architectural Decisions
  Solution estimating guidelines
Chapter 7: Reference Architecture: WebSphere with DB2 database on z/OS, in separate cities (GDPS/PPRC Multiplatform Resiliency for System z with HyperSwap Manager)
  Scenario Being Solved
  Architecture Principles
  Reference Architecture
  Architectural Decisions
  Alternatives Considered: Active-Passive deployment
  Key restrictions or problem areas
  Solution estimating guidelines
Appendix 1: References
Appendix 2: Legal Statement

Authors

Steve Wehr, IBM System z New Technology Center
Scott Loveland, IBM System z Linux Integration Test
Harriet Morril, High Availability Center of Competence
Scott Epter, High Availability Center of Competence

Abstract

This paper combines the efforts and talents of the IBM System z New Technology Center, the High Availability Center of Competence, and Linux on System z Integration Test to produce a set of reference architectures that provide High Availability for applications running on Linux for System z. This paper focuses on architectures that:
- Cover the following scenarios: the application runs on Linux virtual servers under z/VM, and the database may be on Linux for System z or on z/OS.
- Are of highest interest to our customers.
- Are unique to System z. Not covered are scenarios and architectures that have already been documented on other distributed platforms. WebSphere HA has been extensively covered in many documents. Although our architectures will use HA features of WebSphere, we will not concentrate on documenting WebSphere. (Please see the references section for more WebSphere HA documentation.) Rather, we will concentrate on the HA aspects of the database servers that WebSphere applications would use, and how System z HA features can benefit database servers.
- Have not been documented before on System z.

This paper does not cover:
- All the details necessary to implement the reference architectures. For those details, please refer to the "System z Platform Test Report for z/OS and Linux Virtual Servers", written by the IBM Poughkeepsie Test and Integration Center for Linux.
- HA networking considerations. We cover the major components and the flow between them. We have not covered how to create a highly available network.
- How to HA-enable your storage subsystem.

Introduction. Definition of High Availability

For the purpose of this paper, we have adopted the definitions used by the HA Center of Competence in Poughkeepsie, NY.

High Availability (HA) -- Designed to provide service during defined periods, at acceptable or agreed upon levels, and mask unplanned outages from end-users. It employs Fault Tolerance; Automated Failure Detection, Recovery, Bypass Reconfiguration, Testing, Problem and Change Management.

Continuous Operations (CO) -- Designed to continuously operate and mask planned outages from end-users. It employs non-disruptive hardware and software changes, non-disruptive configuration, and software coexistence.

Continuous Availability (CA) -- Designed to deliver non-disruptive service to the end user 7 days a week, 24 hours a day (there are no planned or unplanned outages).

Our architectures strive to provide Continuous Availability. Note that in some architectures this is not possible due to delays in the automated recovery of some system components. These delays can be long enough to cause user transactions to fail and have to be re-entered.

Chapter 1: Introduction to High Availability with z/VM and LPARs

When Linux runs on distributed architectures it is often running directly on the hardware of a single server. Although pSeries servers can now have logical partitions, their virtualization capabilities are not as extensive as System z, where z/VM allows system resources to be dynamically shared. Linux on System z is always running in a logical partition (LPAR). So we have introduced two new layers between Linux and the hardware, namely z/VM and LPAR. These layers figure prominently in the availability of your applications, because they provide services that the Linux systems use.

Where are the Single Points of Failure (SPoFs)?

Consider an example where a System z server has several LPARs running z/OS, and one LPAR running z/VM to host Linux guests. You have installed an application on a single Linux server. Where are the points of failure? There are several:
- The System z hardware could experience multiple unrecoverable failures, causing the entire server to fail.
- The disk subsystem could fail. Note that this paper does not include any information on HA-enabling the disk subsystem.
- The LPAR microcode could fail.
- z/VM could fail.
- Linux could fail.
- The application could fail.

The odds of each failure are different. In this case, the probability of an application failure is highest, while the probability of a System z hardware failure is lowest. The others fall on a continuum between those extremes.

So how do we eliminate these single points of failure? An easy and effective method is to eliminate them by duplicating them. Duplicating the application is usually easy, but duplicating the System z hardware can be expensive, with the cost and difficulty of the others falling on a continuum between these extremes. The following table summarizes these points:

Single Point of Failure    Probability of Failure    Cost to fix SPoF
System z hardware          Very Low                  High
Disk Subsystem             Very Low                  Medium
LPAR                       Very Low                  Low
z/VM                       Low                       Low
Linux                      Low                       Very Low
Application                High                      Very Low

Besides hardware and software failures, the following can also cause downtime for the application:
- System z hardware upgrades requiring a Power On Reset (POR)
- LPAR configuration changes requiring a reboot of the LPAR
- z/VM maintenance
- Linux kernel maintenance that requires a reboot
- Application maintenance

There are no probabilities that can be assigned to these since they are directly under the control of the customer. The customer's policies will dictate how often these will occur.

In order of increasing availability, the following examples examine some possible architectures and their single points of failure.

Example 1: High Availability not needed

In this example, an application is installed on a single Linux server that runs under z/VM. The SPoFs for the application are:
- System z hardware
- LPAR
- z/VM
- Linux
- Application

[Diagram: zSeries System A -- z/VM running in LPAR 1, hosting virtual Linux servers such as the Dmgr, a Web server, and DB2. Each small box represents a virtual Linux server running as a guest of z/VM, in a single z/VM LPAR.]

The most likely to fail is the application. When this happens, the application can be restarted and recovery time will be a few minutes. Or perhaps Linux has to be rebooted and recovery time could be around 5 minutes. Recovery requires that some manual or automatic method be implemented to detect the failure and initiate recovery. If this recovery time is sufficient then there is nothing more that needs to be done. If not, then higher availability is needed.

Example 2: Moderate Availability Needed

In this example, the application is installed on multiple Linux servers that run under z/VM. The SPoFs for the application are reduced to:
- System z hardware
- LPAR
- z/VM

[Diagram: zSeries System A -- z/VM running in LPAR 1, hosting the Dmgr, Web servers, a WebSphere cluster, DB2 (Primary) and DB2 (Backup). Each small box represents a virtual Linux server running as a guest of z/VM, in a single z/VM LPAR.]

By simply replicating the application across two or more Linux servers, we have removed the most likely points of failure. Workload is distributed to the duplicated servers so that a failure in any server still leaves another server available to process the workload. A failure in any Linux server still allows the application to run, with the full resources available to it before, on the remaining virtual servers.

This feature is unique to Linux on System z. Because all of the CPU and memory is shared among the Linux servers under z/VM, a failure of one Linux server or application frees up its memory and CPU for use by other Linux servers. For example, if the two servers are both 80% CPU utilized (using 80% of one CPU), then if one of them fails the other can increase its CPU utilization to 80% of two CPUs. This allows the remaining server to process the full workload immediately, without having to reconfigure the server. This is very different from failover scenarios in distributed architectures, where each server must be sized to handle significantly more than its own workload so that the server has the capacity to handle additional workload if another server fails.

Why use only two WebSphere servers?

On System z there is usually little reason to use more than two production servers in a WebSphere cluster. Usually the entire workload can be accomplished with one server; the second is added only for failover. In Linux on System z, adding more virtual servers does not add any more processing resources (CPU, memory) to the application, but instead makes z/VM work harder to run all of the production servers in memory simultaneously. For these reasons we recommend only two production servers.

Some bottlenecks do exist that can be lessened by duplicating the WebSphere application server. Application servers can be cloned either horizontally (on another Linux server) or vertically (on the same Linux server). Bottlenecks that can be helped by this include:
- Not enough JVM heap to run the application at the desired workload
- Not enough connections in the connection pool

Another consideration for running two WebSphere servers is that if one is unavailable due to either a planned or unplanned outage, you have only one server left to handle the entire workload. This is usually not a problem on Linux on System z because all of the resources (CPU and memory) that were available to both servers before the failure are now available to the remaining server. This does leave you with a single point of failure, however, during the time when one of the servers is unavailable. For these reasons, it is recommended that you use three servers instead of two when higher availability is required.

Example 3: High Availability Needed

In this example the application is installed on multiple Linux servers that run under multiple z/VM systems on multiple LPARs. The SPoFs for the application are reduced to:
- System z hardware

[Diagram: zSeries System A -- z/VM running in LPAR 1 and z/VM running in LPAR 2, hosting the Dmgr, Web servers, a WebSphere cluster that spans both LPARs, DB2 (Primary) and DB2 (Backup).]

We have eliminated most of the SPoFs by creating a second LPAR that will run z/VM and Linux guests. The application is installed on Linux servers in each LPAR. The cost for doing this is still low, since both LPARs will share the same IFLs, and the real memory can be split between the two LPARs.

A failure in the application, the Linux server, VM, or LPAR still allows the application to run on the remaining virtual servers with the full resources available to it before the failure. Should one server, VM, or LPAR fail, the other LPAR can use all of the IFLs that were being shared by both. Because you are running the same software on the same number of IFLs, software costs do not increase. For these reasons, this is one of the most cost-effective High Availability architectures for Linux on System z.

Example 4: Continuous Availability Needed

In this example, the application is installed on multiple Linux servers that run under multiple z/VM systems on multiple LPARs on multiple System z servers. No SPoFs for the application remain.

[Diagram: zSeries System A with z/VM running in LPAR 1, and zSeries System B with z/VM running in LPAR 2. The Dmgr, Web servers, a WebSphere cluster spanning both systems, DB2 (Primary) and DB2 (Backup) are split across the two systems.]

We have eliminated the SPoFs by using a second System z server (System B) to host our second LPAR that will run z/VM and Linux guests. The application is installed on Linux servers in each LPAR. During normal operations, each LPAR receives 50% of the workload of the application. However, the cost of adding a second System z server is that each must be able to run 100% of the workload should the other server fail. Software costs increase because of this.

It is more cost-effective to run a System z LPAR near 100% CPU utilization. But the above architecture would run each LPAR nearer to 50% utilization, so that it has extra capacity in case of a failure of the other LPAR. Some alternatives to bring the utilization nearer to 100%:
- Configure fewer IFLs than are needed to run 100% of the workload in the LPAR. Configure other IFLs as standby IFLs that can be brought online quickly with Capacity Upgrade on Demand. When extra capacity is needed:
  1. The standby IFLs are defined as active to the LPAR.
  2. VM varies the new IFLs online.
  This process is non-disruptive and can be automated or completed manually in a few minutes.

- Run other lower-priority work in each LPAR. Configure the Linux guests so that those running the WebSphere workload have a higher priority than those running the other work. If a failover occurs, VM will give all of the system's CPU and memory to the WebSphere guests and withhold CPU and memory from the other workloads.

Summary

Examples 2-4, and all of the reference architectures in this paper, use clusters containing two members. You can always choose to instead create a cluster of three servers. In our examples that use one server type per LPAR, you could instead define three LPARs. The advantage of a three-member cluster over a two-member cluster is that should one cluster member fail, you still retain a cluster of two members, and a good degree of High Availability. With a two-member cluster, if one member fails then you are now running in non-HA mode on the single remaining member, until the failed cluster member can be brought back online. When the absolute highest levels of availability must be maintained at all times, it is recommended to use a cluster of three LPARs. Otherwise a cluster of two LPARs is sufficient.

The example above (see "Example 3: High Availability Needed") shows the most cost-effective solution for architecting LPARs and VM for High Availability. The following is recommended:
- Use a single System z server so that your z/VM and Linux LPARs can share the same IFLs.
- Use two LPARs to run your production workload. Create clusters of applications split between Linux servers running in each LPAR.
- Run your test and development Linux servers either in:
  o Their own LPAR. You can use the following LPAR weights as starting values:
    Production1: 35%
    Production2: 35%
    Test/Dev: 30%
  o One of the two production LPARs. Give that LPAR more resources than the other production LPAR. You must ensure that the production guests have priority in getting system resources. You can use the following z/VM SHARE values as starting values for the Linux guests:
    Production guests: SHARE 400 relative limitsoft
    Test guests: SHARE 200 relative limitsoft
    Development guests: SHARE 100 relative limitsoft

Chapter 2: Scenarios

The reference architectures in this document address five typical customer scenarios. Each scenario builds on the last, increasing in complexity. Note that most of these scenarios concentrate on where the data is. We have chosen to concentrate on such scenarios because:
- A key strength of System z is its ability to be a highly-available database server.
- High Availability in distributed WebSphere applications is well documented already, but a need exists for architectures where WebSphere on z/Linux is using DB2 on z/OS.

For all scenarios, our goal is to provide a reference architecture that:
- Is rapidly scalable to support increases or decreases in business volume. Many times this can be accomplished simply by bringing more IFLs online to the existing architecture.
- Provides near-instantaneous failover, with almost no loss of user transactions.

Scnario: A non-wbsphr application You hav a critical application that runs on Linux on Systm z. This application dos not us WbSphr or any databas. Th application may hav bn rittn by th customr, bought from an ISV, or b a srvr that is part of Linux, but it has no HA faturs itslf. Scnario: WbSphr ith DB2 databas on Linux You hav a critical WbSphr application that runs on Linux on Systm z. Th primary databas for this application is DB2 UDB also running on Linux on Systm z. Th databas fils ar on SCSI disks. Scnario: WbSphr ith Oracl databas on Linux You hav a critical WbSphr application that runs on Linux on Systm z. Th primary databas for this application is Oracl also running on Linux on Systm z. Th databas fils ar on Extndd Count Ky Data (ECKD) disks. Scnario: WbSphr ith DB2 databas on z/os You hav a critical WbSphr application that runs on Linux on Systm z. Th primary databas for this application is DB2 running on z/os. Som of th application logic runs as DB2 stord procdurs. Scnario: WbSphr ith DB2 databas on z/os, in sparat sits You hav svral critical WbSphr applications that run on Linux on Systm z. Th primary databas for ths applications is DB2 running on z/os. You also nd to nsur that if an ntir data cntr is lost, anothr data cntr in a sparat sit can assum th ork of th first data cntr. Chaptr 3: Rfrnc Architctur: Non-WbSphr application Scnario Bing Solvd You hav a ky application that runs in Linux on Systm z in a Non-WbSphr nvironmnt and dos not rquir a databas. This could b a homgron application. Architctur Principls This architctur is dsignd to follo ths principls: Softar is gnry considrd lss rliabl than hardar. Th Systm z hardar contains rdundant componnts, making its MTBF (Man Tim Btn Failur) in th rang of yars. Bcaus th Systm z hardar is so rliabl, o th Systm z srvr to b a singl point of failur in this architctur. W duplicat th softar nvironmnts (LPAR, VM, s, and Linux) so that non of thm is a singl point of failur. Any failur ill not b noticabl to th usr. Th currnt transaction may fail, but subsqunt transactions succd. Aftr any singl failur, transactions continu at th sam rat ith no dgradation in throughput or rspons tim. Th bas architctur anticipats a lo nough volum that it can b managd by a singl srvr, but can scal if ncssary to support incrass and dcrass in businss volum. Rfrnc Architctur Th xampl scnario hr ill us an srvr as th xampl application. W ill us a Srvic IP for availability purposs. A Srvic IP addrss is a singl IP addrss by hich th High Availability Architcturs for Linux on IBM Systm z 10

Web server is known to the outside world. In the event of a failover, this IP address must be reassigned to the new server. This choice offers the following benefits as an example:
- Web serving is a common application for Linux, yet is lightweight enough that another application could be easily substituted.
- The use of a Service IP not only illustrates solid availability, it also represents an additional required resource for the example. Even the simplest of web-serving arrangements requires at least one additional resource for dynamic content. Demonstrating the failover considerations afforded by a Service IP will provide a multi-resource example that can generalize to other examples requiring multiple resources.

Because this architecture is the simplest example in this document, we will demonstrate two different approaches to achieving high availability with a Web server and Service IP:
- Using IBM Tivoli System Automation for Multiplatforms (SA): SA allows the abstraction of resources in resource groups and has a powerful, policy-based mechanism for easily defining dependencies among resource-group elements. It is an appropriate choice for simplicity of service, as it is a fully-supported IBM product. Note that the SA example will demonstrate the use of a cold standby Web server and is thus not continuously available. In a failover event, there will be some downtime associated with bringing up the backup Web server.
- Using Open Source packages, namely Linux-HA and Linux Virtual Server (LVS): This approach allows the creation of a load-balancing cluster of Web server nodes. The Service IP in this case is associated with an LVS Director instance, which sprays incoming requests over multiple Web server instances. Because the Web server instances are clustered, one server going down will simply cause it to be fenced from the cluster. The LVS Director will not route any requests to it. Note that the LVS Director in this case is a single point of failure. We set up a second LVS Director as a standby, and use Linux-HA (also known as "Heartbeat") between the two LVS Directors. The secondary LVS Director can be live, and thus failover times would be considered within an acceptable tolerance for continuous availability of the system. Note that this example uses a mix of open source applications from IBM and non-IBM projects and could thus represent additional service overhead. IBM Support for Linux-HA is available. IBM Support is not available for Linux Virtual Server. Note also that WebSphere Edge Components provide essentially the same function as LVS. The Edge Components Load Balancer can be configured with a hot standby. See Chapter 4: Reference Architecture: WebSphere with DB2 Database on Linux in this document for more information about WebSphere Edge Components.

First Example: Tivoli System Automation

In this architecture, the Web server and Service IP are defined as virtual resources in an SA resource group. The Service IP acts as a floating IP address. Its value remains fixed even if the Linux instance to which it points changes (such as through a failover). A Service IP is a single virtual IP address by which the currently-active Web server is known to the router. The concept of a Service IP is not specific to SA, but SA can view a Service IP address as a virtual resource in an SA resource group. SA will handle the management of a Service IP by assigning the IP to the proper machine as needed. For example, the failure of the currently-active Web server will cause SA to assign the Service IP to the assigned backup. The Web server depends on the Service IP for the address by which the router knows the Web server.

The Web server and Service IP instance are each known as resources in the SA Resource Group, as shown in the diagram below. The "1,2" designation under each of the resources is a nodelist that designates the nodes on which the resource can exist. The strict Depends On relationship between

the resources will cause SA to enforce collocation of the resources. SA also supports a Depends On Any relationship, in which resources in a resource group can be located on separate nodes. The Depends On relationship also enforces a startup order between the two resources. A Service IP instance must point to a node before a Web server can be started on that node. If either instance goes down, the Service IP will be assigned to the backup node, and then the Web server will be started on the backup node.

The nodelist as mentioned above refers to the two Linux instances that can run the resources managed by SA. These instances must be set up as members of an IBM Reliable Scalable Cluster Technology (RSCT) cluster. RSCT is an IBM software package that provides cluster monitoring and management services. SA installation includes installation of the RSCT package. RSCT provides the underlying resource monitoring (heartbeating) that SA uses to keep track of the state of nodes in the cluster. After the cluster is established, command-line commands are issued to manage the resources under SA.

[Diagram: SA Resource Group "webrg" containing two resources, a Web server and a Service IP, each with nodelist 1,2; the Web server Depends On the Service IP.]

By virtue of this relationship SA will manage the IP address, associating it with the secondary server if anything goes wrong with the primary server, whether the Web server application, the Linux OS, or z/VM. After the IP address has been assigned to the backup node, SA will start the Web server on that node. The initial state will be as shown in the following diagram:

[Diagram: A router in front of two z/VM LPARs. z/VM LPAR 1 hosts the active Web server with SA MP and RSCT; the Service IP points here. z/VM LPAR 2 hosts the standby with SA MP and RSCT. RSCT heartbeating runs between the two LPARs.]

Note that the RSCT/SA combination is strictly managing the Web server and Service IP address. Neither RSCT nor SA is involved in the flow of processing actual service requests. If the RSCT/SA subsystem itself were to fail, the active Web server would continue to function, albeit without the protection afforded by RSCT and SA.
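As a rough illustration of the resource group and Depends On relationship described above, the following command-line sketch defines the resources from the SA command line. The resource names (webrg, apache, webip), the IP address, and the start/stop/monitor scripts are hypothetical placeholders; attribute names and options should be verified against your SA MP release, and the complete setup is covered in the System z Platform Test Report mentioned in the Abstract.

    # Define the Web server as an application resource that can run on either node
    mkrsrc IBM.Application Name="apache" \
        StartCommand="/etc/init.d/apache start" \
        StopCommand="/etc/init.d/apache stop" \
        MonitorCommand="/usr/local/bin/check_apache.sh" \
        UserName="root" NodeNameList="{'node1','node2'}"

    # Define the floating Service IP address
    mkrsrc IBM.ServiceIP Name="webip" IPAddress="192.0.2.10" \
        NetMask="255.255.255.0" NodeNameList="{'node1','node2'}"

    # Put both resources in one resource group and add the Depends On
    # relationship (a network-interface equivalency is typically also
    # required for the Service IP)
    mkrg webrg
    addrgmbr -g webrg IBM.Application:apache IBM.ServiceIP:webip
    mkrel -p DependsOn -S IBM.Application:apache -G IBM.ServiceIP:webip apache_on_webip

    # Bring the group online; SA chooses the node and starts the resources in order
    chrg -o Online webrg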

In the event of a failure of the primary Web server, SA will assign the Service IP address to the secondary LPAR and bring up a Web server on that LPAR. The resulting configuration is as shown:

[Diagram: After the failover, the Service IP points to the Web server in z/VM LPAR 2; SA MP and RSCT heartbeating continue to run on both LPARs.]

Note that in this scenario, the standby Web server is cold. This could result in a service interruption as the Web server is started.

Flow of requests through this architecture
1. Service IP. Requests enter a router that is aware of a single IP address for the Web server. The Service IP is associated with the active Web server. In the event of a failure on the primary, SA will automatically assign the Service IP to the secondary.
2. Web Server. The Web server serves static content. The secondary Web server is cold until started by SA.

Product Versions
- Any version of the Web server
- Tivoli System Automation for Multiplatforms, V1.2
- Any version of z/VM

Planned Outages
This section discusses how each of the components can be taken down for software upgrades or any other planned outage. SA requires managed resources to be brought up and down under its control. In this example, all resources are together in a single group, and as a result, bringing either active resource down will cause SA to fail over to the other LPAR. To bring a resource down under SA, use the rgmbrreq command. As mentioned above, SA and RSCT are in place to manage the cluster and have no impact on the serving of Web content. Either or both of the SA or RSCT software can be brought down for maintenance without impacting the operation of the Web server, except for the removal of RSCT/SA monitoring and failover protection for the Web server.
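To make that planned-outage procedure concrete, this is a minimal sketch of the commands involved. The resource names continue the hypothetical example above, and the exact operands of rgmbrreq vary by SA MP release, so treat it as an outline rather than a tested procedure.

    # Show the current state of the resource group and its members
    lssam

    # Request that the Web server member be stopped on its current node;
    # per the policy described above, SA fails over to the other LPAR
    rgmbrreq -o stop IBM.Application:apache

    # When maintenance is finished, cancel the request so SA is again
    # free to start this member wherever it is needed
    rgmbrreq -o cancel IBM.Application:apache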

What We Learned in Testing
When this architecture was set up and tested for planned and unplanned outages, we learned the following:
- Did the software fail over as expected? Yes.
- Did users experience any outage time or transactions that they needed to retry? Yes.
- Did users experience any permanent data loss? No.
- How long did the failover take? (How long did users experience outages?) Approximately 6 seconds to fail over. Failover of the virtual IP is about 2 seconds; failover of the HTTP service is around 4 more seconds. Most of this time was to bring up a new Web server.

Architectural Decisions
Architectural decisions were made based on the following key criteria:
- High Availability
- Cost
- Simplicity

Decision Point: Use two separate System z servers to host the two LPARs.
Pros/Cons: This architecture can easily be run on two separate System z servers. The resource group can simply be failed over in its entirety to an LPAR on another physical server. If the two servers are in separate data centers, or in separate cities, then this solution will cover the Disaster Recovery situation, where one data center is lost and the other can take over the workload immediately.

Decision Point: Use of a Service IP Address
Pros/Cons: The use of a Service IP was chosen for two reasons. First, it abstracts the Web server instances in the same way as Virtual IP Addressing (VIPA), resulting in a simpler failover. Second, it allows the illustration of how to handle multiple dependent resources in a group. It is expected that user applications will typically be comprised of multiple software components. The example, as illustrated, will be easier to draw upon in such situations.

Decision Point: Maintaining the Depends On relationship between resources
Pros/Cons: The Web server is dependent on the Service IP instance for obvious reasons. The choice of Depends On results in forced collocation. Depends On Any would allow the Web server to exist on a separate LPAR from the Service IP instance.

Decision Point: Choice of SA
Pros/Cons: As mentioned above, SA as an architectural choice affords a robust support structure. While IBM support is available for Linux-HA, it is not available for Linux Virtual Server. IBM has found that System z customers in general prefer supported software. In addition, while this example is simple, SA scales well and so is suitable for managing availability in very complex environments with intertwined dependencies.

Second Example: Linux Virtual Server and Linux-HA

For the open source design, we rely on Linux Virtual Server (LVS). LVS requires a single director node as well as any number of application instance, or cluster, nodes. All requests for services come through the director and are assigned to cluster nodes to process the actual workload (for

example, by round robin). High availability among the cluster nodes is implicit, since the failure of one of those nodes will cause it to be removed from consideration by the LVS director, and the remaining cluster nodes will simply assume responsibility for the workload. For the open source design, we use a passive secondary LVS director node. Linux-HA manages the director nodes, including automated failover. This failover capability eliminates the single point of failure.

[Diagram: A router in front of two z/VM LPARs. LPAR 1 hosts the Primary LVS Director (Active) and a Web server; LPAR 2 hosts the Secondary LVS Director (Passive) and a Web server. Linux-HA heartbeating runs between the two directors.]

In this architecture, the single IP address is associated with and managed by the active LVS Director. The Web server instances are both hot, which allows for load balancing of requests across them. If one of the Web server nodes goes down, the LVS Director will remove it from the list of active servers. LVS and Heartbeat can also work together to monitor the Web servers using ldirectord. An excellent description of how to set up LVS and Heartbeat in this configuration can be found at http://mail1.cula.nt/clustr/indx.html. See also the "System z Platform Test Report for z/OS and Linux Virtual Servers" for more details.

Flow of requests through this architecture
1. LVS Director. Requests enter a router that is aware of a single IP address for the Web site. The IP address is associated with the LVS Director. The LVS Director will route requests to the Web servers (for example, by round robin).
2. Web Server. The Web server serves static content. Should any of the Web servers fail, the failing server will no longer receive requests from the LVS Director. If ldirectord is used as described above, then Web servers that come back online will be automatically placed back into the cluster of active Web servers by the LVS Director.

Product Versions
- Any version of the Web server
- Linux-HA (Heartbeat) version 2.0, which includes support for many new features

- Any version of LVS
- Any version of ldirectord compatible with the Linux-HA and LVS versions chosen (ldirectord is available through the Linux-HA site.)
- Ultra Monkey version 3 (if Ultra Monkey is to be used)
- Any version of z/VM

Note that the implementation and configuration details provided at http://mail1.cula.nt/clustr/indx.html also describe the specific software packages to download. The versions are workable but rather outdated. IBM recommends that this description be used as a guideline but that more recent versions of packages be used.

Planned Outages
This section discusses how each of the components can be taken down for software upgrades or any other planned outage. There are two sets of redundant elements, namely the Web servers and the LVS Directors. In either case, the system is set up such that a single element of either set can be brought down manually without impacting the system. In the case of the Web servers, either server could be brought down and LVS would simply stop sending requests to that server. Using ldirectord, the Web server would be returned to the cluster upon startup. The LVS Directors are slightly different since only one is in active state. Bringing down the secondary is trivial because it is not active. Bringing down the primary will cause Heartbeat to fail over to the secondary. Note that multiple backup LVS Directors can be configured. If only a single backup Director is deployed, bringing down that backup will make the primary a single point of failure. A description of the management of redundant LVS Directors using Heartbeat and ldirectord can be found at http://www.austintek.com/lvs/lvs-howto/howto/lvs-howto.failover.html. The UltraMonkey project is an Open Source project that implements such management. See http://www.ultramonkey.org/ for more information about UltraMonkey.

What We Learned in Testing
When this architecture was set up and tested for planned and unplanned outages, we learned the following:
- Did the software fail over as expected? Yes.
- Did users experience any outage time or transactions that they needed to retry? No.
- Did users experience any permanent data loss? No.
- How long did the failover take? (How long did users experience outages?) No user outages were seen. Failover happened in approximately 1 second. As long as this remains below the TCP/IP timeout value, no transactions are lost.

Architectural Decisions
Architectural decisions were made based on the following key criteria:
- High Availability
- Cost
- Simplicity

Decision Point: Use of Linux Virtual Server (LVS)
Pros/Cons: LVS is a standard, robust open-source package for load balancing and high availability in Linux clusters. All requests must go through the LVS Director, making it a single point of failure without redundancy.

Decision Point: Use of Multiple LVS Directors
Pros/Cons: Simple heartbeating between the primary and secondary LVS Directors allows for a highly-available configuration.

Decision Point: Use of Linux-HA/Heartbeat
Pros/Cons: Linux-HA is a standard, robust open-source package for heartbeating in Linux clusters. The choice of LVS and Linux-HA provides the benefits of Web server load balancing. Clusters of load-balanced Web servers afford better utilization and continuous Web server availability.

Chapter 4: Reference Architecture: WebSphere with DB2 database on Linux

Scenario Being Solved
You have a key WebSphere application that runs in Linux on System z. The primary database for this application is DB2 UDB, also running on Linux on System z. The database files are on ECKD disks.

Architecture Principles
This architecture is designed to follow these principles:
- Software is generally considered less reliable than hardware. The System z hardware contains redundant components, making its Mean Time Between Failure (MTBF) in the range of decades. Because the System z hardware is so reliable, we allow the System z server to be a single point of failure in this architecture.
- We duplicate all of the software environments (LPAR, VM, Web servers, Linux, WebSphere, DB2) so that none of them is a single point of failure.
- No failure should be noticeable to the end user. The current transaction may fail, but subsequent transactions succeed.
- After any single failure, transactions continue at the same rate with no degradation in throughput or response time.
- The architecture must be rapidly scalable to support increases and decreases in business volume.

Rfrnc Architctur z/vm LPAR 1 Dmgr Primary Load Balancr DB2 (Pri) Routr HADR Backup Load Balancr DB2 (Bkup) WbSphr Clustr z/vm LPAR 2 Each box rprsnts a virtual Linux srvr running as a gust of z/vm, in to z/vm LPARs. In this architctur, softar srvrs ar duplicatd on to Logical Partitions (LPARs) on th sam Systm z srvr. Outr and innr firs form a DMZ for th Edg srvrs and srvrs. Flo of rqusts through this architctur 1. Communications ithin th LPAR. All communications btn Linux gusts ithin a z/vm LPAR ar compltd through a z/vm Virtual Sitch (vsitch). Th vsitch is a fast and scalabl communication infrastructur. W rcommnd stting up on vsitch ith to vsitch controllrs (VM usr ids) in ach z/vm LPAR. (Not that z/vm 5.2 coms ith to VSWITCH controllrs dfind. W rcommnd that you XAUTOLOG th to of thm: DTCVSW1 and DTCVSW2.) Each vsitch should b st up to us multipl ral dvics, so that if on OSA fails, th VSWITCH ill fail ovr to th othr. 2. Load Balancr. Rqusts ntr a routr that is connctd to th Virtual IP Addrss (VIPA) of th Primary Load Balancr. Should this load balancr fail, th backup load balancr ill dtct th failur and tak ovr. Th routr ill dtct this failur and rout rqusts to th backup load balancr. 3.. Th load balancr sprays rqusts btn th to srvrs. Should on of th srvrs fail, th load balancr dtcts this and ill not rout rqusts to it. Th srvr srvs static pags. It also routs WbSphr rqusts via th WbSphr plugin to th to WbSphr srvrs in th clustr. Should on of th WbSphr srvrs fail, th plugin ill dtct this and not rout rqusts to it. High Availability Architcturs for Linux on IBM Systm z 18

4. WebSphere. The application is deployed to a WebSphere cluster consisting of two nodes. WebSphere manages the deployment of the application onto the nodes of the cluster and can upgrade the application on the fly. User session data is replicated among the cluster members so that if a cluster member fails the transaction can be continued on the other cluster member. We recommend configuring WebSphere to replicate session data and hold it in memory in the cluster members. This option performs well and scales well, and is simpler to configure than using a database to hold session data.

In-flight Two Phase Commit (2PC) transactions can be recovered by the WebSphere HA manager. The HAManager is a new feature of WebSphere v6. It enhances the availability of WebSphere singleton services like transaction services or JMS message services. It runs as a service within each application server process that monitors the health of WebSphere clusters. In the event of a server failure, the HAManager will fail over the singleton service and recover any in-flight transactions. In order to do this, the transaction logs written by each application server must be on network-attached storage or a storage SAN so that they are readable by the remaining cluster members. Note that this setup is optional and is not depicted in our reference architecture. When the HAManager coordinator detects that an application server is down, it can initiate recovery of in-flight transactions from the transaction logs. This recovery will release any locks held in the database. The following steps are required to accomplish this setup:
- Make the transaction logs sharable by members of the cluster. By default these are located in the <installroot>\profiles\<profilename>\tranlog\<cellname>\<nodename>\<servername>\transaction directory, but they should be configured for another directory that you will make sharable.
- After you configure the logs, the only other setup is to "Enable high availability for persistent services" by checking the box by that name in the Cluster settings.
Refer to Redbook SG24-6392, WebSphere Application Server V6 Scalability and Performance Handbook, section 9.7 "Transaction Manager High Availability" for details about how to set up the HAManager in this way.

5. DB2 Client (JDBC). WebSphere runs the application and sends DB2 data requests to the Primary DB2 server. HADR communicates to the DB2 clients (the JDBC driver in our case) when they first connect to DB2 to inform them of the address of the backup server. If any communication to the primary DB2 server fails, the clients automatically route requests to the backup server. This is the Automatic Client Reroute feature. Configure the WebSphere connection pool settings in the DB2 data source to use a purge policy of EntirePool. This will cause WebSphere to empty out the entire pool upon a single connection failure. Without this option, WebSphere will hand out stale connections until it purges the pool of connections.

6. DB2. The DB2 HADR (High Availability Disaster Recovery) feature is used to provide failover in case of a DB2 failure. HADR uses two DB2 servers and two databases to mirror the data from the primary database to the backup. Tivoli System Automation running on both DB2 servers automatically detects a failure of the primary and issues commands on the backup for the DB2 there to become the primary. Since it has been mirroring data from the primary, the backup does not need to do any database recovery before becoming primary. So the database takeover is accomplished as fast as the failure can be detected by TSA. Details on the DB2 failover setup:

- DB2 provides the TSA MP scripts to automate the HADR takeover. Install these scripts on each TSA server. See the "Automating DB2 HADR Failover on Linux using Tivoli System Automation" whitepaper for more information.
- The WebSphere application needs to look for a specific SQL return code that indicates that the primary DB2 server has failed. This return code means that any transactions prior to COMMIT have been rolled back. The application needs to reissue the previous transactions.
- Use the DB2 "update alternate server for database" command at each DB2 node to identify the other DB2 server. With this setup, connecting clients will become aware of the backup server after connecting to the primary. This is only necessary for type 2 drivers.
- HADR is included in DB2 UDB 8.2 Enterprise Edition. HADR ACR (automatic client reroute) works with type 2 and type 4 JDBC clients, but for XA transactions the type 4 client is required. On WebSphere Linux, the XA implementation for two-phase commit is the only one available for WebSphere managed transactions. As a result, resources need to be specifically declared as XA (that is, XADataSource) if you want to participate in 2PC. The type 4 JDBC client is needed for XA-type transactions that need automatic client reroute. (XA is supported for both type 2 and type 4 clients, but XA together with ACR only works for type 4.)

7. Disk Multipathing. The database volumes are configured for multipathing, so that if one path fails, access to the device is maintained through the surviving paths. For ECKD DASD devices accessed over FICON or ESCON channels, multipathing is handled invisibly to the Linux operating system. A single device is presented to the operating system on which to do I/O operations. Multipathing happens automatically and is handled by the System z I/O subsystem. All that is required is for multiple paths to be defined to the device in the active I/O Definition File (IODF), and for those paths to be online. The complexity of choosing among the multiple paths is hidden from the Linux OS.

Product Versions
This architecture requires the following software versions:
- WebSphere Network Deployment V6.0 or newer. This architecture can be done with WebSphere v5, but it lacks some of the HA features of WebSphere v6. For these reasons we recommend v6.
- DB2 UDB 8.2 is needed for the HADR function. You also need the JDBC driver that comes with 8.2, to get the Automatic Client Reroute feature.
- Tivoli System Automation for Multiplatforms, V1.2. This is included with DB2 8.2 at no cost for use with DB2.
- z/VM 4.4 or above, to get vswitch.

Planned Outages
This section describes how each of the components can be taken down for software upgrades or any other planned outage.

Component: Load Balancer
Procedure:
1. Stop the network dispatcher component on the server you need to upgrade. The backup detects that the primary is stopped and becomes the primary. The router detects that the primary is down and routes requests to the backup.
2. After the primary has been upgraded, start the Network Dispatcher component and make it primary again.
3. Go to the backup server and apply the upgrade there.

Component: Web Server
Procedure:
1. Stop the Web server on the server you need to upgrade. The load balancer detects that this Web server is not responding and will stop sending requests there.
2. After the Web server has been upgraded, restart it. As soon as it is started, the load balancer detects that it is available and starts sending requests to it.
3. Go to the other Web server and repeat steps 1 and 2.

Component: WebSphere Application Server
Procedure:
To upgrade the application running in the cluster, simply perform the Rolling Upgrade function from the ND administrative panels. This drains each request queue, stops the application server, upgrades the application, and starts the app server again. WebSphere will make sure the upgraded app server is up and handling requests before upgrading the next cluster member. If session replication is enabled between the cluster members, then this method will cause no loss of transactions.
To apply service to WebSphere:
1. From the ND administrative panels, stop one application server in the cluster. This process ensures that no in-flight transactions are interrupted by the stop, by first notifying the WebSphere plug-in running in the Web server that the application server is going down and that it should stop sending new requests. Then WebSphere finishes bringing down the application server.
2. Upgrade or do whatever other planned outage work needs to be done on that server.
3. Start the application server again. Wait until the app server is again handling work.
4. Repeat this process on the other application server in the cluster.

Component: DB2
Procedure:
1. On the backup DB2 server, stop HADR mirroring.
2. Apply the upgrades.
3. Start HADR again and allow the DB2 server to catch up with the changes made on the primary.
4. After this process is complete, tell the backup to become the primary, using the TAKEOVER command.
5. Repeat this process on the primary DB2 server.

What We Learned in Testing
When this architecture was set up and tested for planned and unplanned outages, we learned the following:
- Did the software fail over as expected? Yes.
- Did users experience any outage time or transactions that they needed to retry? Yes.
- Did users experience any permanent data loss? No.
- How long did the failover take? (How long did users experience outages?) Approximately 1 minute for HADR to fail over and for WebSphere to stop sending requests to the failed DB2 server and redirect to the alternate.
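For reference, the following is a minimal sketch of the DB2 commands that underlie the HADR setup and the planned-outage procedure described above. The database name, hostname, and port are hypothetical placeholders; the complete configuration (HADR host, service, and synchronization parameters) is covered in the HADR whitepaper cited earlier.

    # On each DB2 server, identify the other server so that connecting
    # clients learn the alternate for Automatic Client Reroute
    db2 update alternate server for database SAMPLE using hostname db2backup port 50000

    # Start HADR on the standby first, then on the primary
    db2 start hadr on database SAMPLE as standby    # run on the backup server
    db2 start hadr on database SAMPLE as primary    # run on the primary server

    # Planned role switch, for example after upgrading the standby
    db2 takeover hadr on database SAMPLE

    # Unplanned takeover when the primary is down, as driven by the TSA scripts
    db2 takeover hadr on database SAMPLE by force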

Architectural Decisions
Architectural decisions were made based on the following key criteria:
- High Availability
- Cost
- Simplicity

Decision Point: Use two separate System z servers to host the two LPARs.
Pros/Cons: This architecture can easily be run on two separate System z servers. All communication between cluster members is through TCP/IP, and the DB2 HADR mirroring can be done in asynchronous mode so that network delays between System z systems will not be a problem. This removes the System z hardware as a single point of failure. However, it significantly increases the cost of the solution. There must be the same number of IFLs available on the second System z as there is on the first, since either must be able to run 100% of the workload without degrading response time. So the number of IFLs needed to run the workload is double the number needed to run the workload on two LPARs that share the same System z server. If the two servers are in separate data centers, or in separate cities, then this solution will cover the Disaster Recovery situation, where one data center is lost and the other can take over the workload immediately. See the architecture for Scenario 5 for more information. Another con is that encryption of all the data traffic is now required. When all of the data traffic was on the same CEC, there was no need for encryption because no communication ever left the System z machine.

Decision Point: Use WebSphere memory replication to save user session data
Pros/Cons: WebSphere gives you three levels of session persistence:
- No persistence
- Harden session data to a database
- Keep session data in memory in cluster members

Hardening session data to a database is the most robust and scalable solution, but it also requires that the database be HA-enabled itself. Memory replication means that session data from each cluster member is available to all of the other cluster members, but not hardened to disk. The session data is kept in JVM memory. This means that session data is lost only if the entire cluster fails. For moderate workloads, memory replication performs well but uses more memory than database hardening. But as WebSphere scales to very large numbers of users and large session data, memory replication can perform worse than hardening to disk because of the following:
- WebSphere spends lots of CPU time synching the session data on the app servers.
- The amount of JVM memory needed to store session data increases, leaving less for the application and requiring larger JVM heaps.

WebSphere offers several ways to set up memory replication so that either of the

following is true:
- All of the session data is in memory in each application server in the cluster.
- The session data is held in memory in a separate application server that is dedicated to this task. This configuration can work well with very high transaction rates or large amounts of session data, because all of the session data does not have to be replicated in each application server. However, this separate server becomes a new single point of failure.

WebSphere v6.0.2 increased the performance of memory-to-memory session replication to equal that of database replication. For this and the reasons above we recommend using TBW (time-based write) memory replication of session data for an HA configuration.

Decision Point: Where to run the WebSphere Deployment Manager
Pros/Cons: Refer to the June 2005 edition of the WebSphere Technical Journal, available at http://www.ibm.com/developerworks/websphere/techjournal/0506_col_alcott/0506_col_alcott.html, and the book WebSphere Deployment and Advanced Configurations for an excellent article on this subject. In WebSphere v6, the deployment manager is only used for deployment and configuration, and does not perform any HA monitoring of the cluster like it did in v5. So we feel that it is not required to HA-enable it. Therefore we recommend you run the Dmgr in a separate Linux guest on either of the Linux LPARs.

Decision Point: Use TSA alone (without DB2 HADR) to provide failover for DB2
Pros/Cons: Tivoli System Automation can be set up to manage the availability of the DB2 instance in a 2-node cluster. A whitepaper and TSA scripts can be downloaded for this solution from http://www.ibm.com/software/tivoli/products/sys-autolinux/downloads.html. In the event of a database failure, TSA activates the backup DB2 instance (which is an active but idle Linux guest with DB2 already started) and triggers the DB2 recovery mechanisms. DB2 provides the TSA scripts to do this. The DB2 transaction log is used to replay committed transactions and undo in-flight transactions, in order to bring the on-disk image to a consistent state in the event of a crash. Because the DB2 server node and the spare node are both connected to the same disk subsystem, when DB2 on the spare node is activated during failover, it can simply start the recovery process using the database storage, transaction logs, and configuration information written to disk by the original DB2 node. No transfer of logs or database images is required between the failing system and the spare system, enabling faster recovery time. An issue with this is the speed of recovery. Using HADR, failover from the primary to the secondary DB2 server takes only a few seconds, and client requests will not time out due to this delay. Using TSA alone to kick off DB2 recovery can take up to 60 seconds to start and recover DB2. HADR can also easily handle rolling upgrades to DB2. Simply upgrade the backup server, and then use the DB2 Control Center to switch the backup to become the primary, and upgrade the other server.

Decision Point: Use SCSI devices instead of ECKD
Pros/Cons: SCSI devices accessed over FCP paths can be used instead. The FCP/SCSI approach is described in Chapter 5. Each is used once in this document in order to articulate the differences in configuring multipathing and device sharing for each.

Key problem areas
Non-functional requirements:
- Data integrity. No data loss is permitted, even in a disaster (site fail-over) scenario.
- Security. The architecture can be used for systems with stringent security requirements. It provides layered defenses against internal and external threats.
- Performance and Scalability. This architecture supports systems with medium throughput requirements and allows for additions to scale to greater volumes.

Solution estimating guidelines
Use the WebSphere on Linux on System z Sizing process to estimate the number of IFLs required for the Web server, WebSphere, and DB2.

Chapter 5: Reference Architecture: WebSphere with Oracle database on Linux

Scenario Being Solved
You have a key WebSphere application that runs in Linux on System z. The primary database for this application is Oracle, also running on Linux on System z. The database files are on SCSI disks.

Architecture Principles
This architecture is designed to follow these principles:
- Software is generally considered less reliable than hardware. The System z hardware contains redundant components, making its MTBF in the range of years. Because the System z hardware is so reliable, we allow the System z server to be a single point of failure in this architecture.
- We duplicate all of the software environments (LPAR, VM, Web servers, Linux, WebSphere, Oracle) so that none of them is a single point of failure.
- Failure should not be noticeable to the end user. The current transaction may fail, but subsequent transactions succeed.
- After any single failure, transactions continue at the same rate with no degradation in throughput or response time.
- The architecture must be rapidly scalable to support increases and decreases in business volume.

Rfrnc Architctur z/vm LPAR 1 Dmgr Primary Load Balancr Oracl Routr Shard Disk Backup Load Balancr Oracl WbSphr Clustr z/vm LPAR 2 Each box rprsnts a virtual Linux srvr running as a gust of z/vm, in to z/vm LPARs. In this architctur, softar srvrs ar duplicatd on to Logical Partitions (LPARs) on th sam Systm z srvr. Outr and innr firs form a DMZ for th Load Balancrs and srvrs. Flo of rqusts through this architctur 1. Load Balancr. Th routr and Load Balancr ar configurd th sam as it is dscribd on pag 18. 2.. Th srvr is configurd th sam as it is dscribd on pag 18. 3. WbSphr. WbSphr is configurd th sam as it is dscribd on pag 19. 4. Oracl Clint (JDBC). WbSphr runs th application and snds rqusts to th Oracl databas srvr (Oracl cs this an instanc ). Communication ith Oracl is via th Oracl Thin Clint, hich is a typ 4 JDBC drivr. It uss a slf containd, light ight vrsion of SQL*Nt to communicat ovr TCP/IP ith th Oracl databas instanc. Th WbSphr datasourc lists th instancs that mak up th Oracl RAC clustr. If th primary instanc is unavailabl, th drivr ill us an altrnat instanc. Th thin clint should b dfind to us sssion failovr, maning that if a connction is lost from th clint to th instanc, a n sssion is automaticy cratd ith anothr instanc (sssion failovr dos not attmpt to rcovr slcts.) Sinc th thin clint handls th rstablishmnt of connctions to an altrnat instanc, no purging of th WbSphr connction pool is ncssary. 5. Oracl. Th srvrs ar configurd using Oracl Ral Application Clustr (RAC) in an activ/activ configuration ith to instancs. This mans that ach instanc is activ and updats a common databas. Should on of th Oracl instancs fail, th in-flight transactions ar lost, but th othr instanc in th RAC can rciv JDBC rqusts immdiatly. Configuration data is in th tnsnams.ora fil on ach instanc. High Availability Architcturs for Linux on IBM Systm z 25

Product Versions

This architecture requires the following software versions:
- WebSphere Network Deployment V6.0 or newer. This architecture can be done with WebSphere v5 but lacks some of the HA features of WebSphere v6. For these reasons we recommend v6.
- Oracle Database 10g Release 1 (10.1.0.3) with the RAC feature.
- Oracle thin client JDBC type 4 driver.

Planned Outages

This section discusses how each of the components can be taken down for software upgrades or any other planned outage.

Component: Procedure
Load Balancer: Same as in previous architecture.
HTTP Server: Same as in previous architecture.
WebSphere Application Server: Same as in previous architecture.
Oracle: Use the SHUTDOWN TRANSACTIONAL command with the LOCAL option to make a RAC instance shut down after active transactions on the instance have either committed or rolled back. Transactions on other instances do not block this operation.

What We Learned in Testing

When this architecture was set up and tested for planned and unplanned outages, we learned the following:

Did the software fail over as expected? Yes.
Did users experience any outage time or transactions that they needed to retry? Yes.
Did users experience any permanent data loss? No.
How long did the failover take? (How long did users experience outages?): Approximately 2 minutes for RAC to fail over and for WebSphere to stop sending requests to the failed Oracle instance and redirect to the alternate. The active/passive configuration of RAC may fail over faster.
Problems were experienced during failback (when the primary instance was restarted and the alternate stopped). Transaction throughput dropped significantly or stopped. We assumed this was a configuration error that we could not find.

Architectural Decisions

Architectural decisions were made based on the following key criteria: High Availability, Cost, Simplicity. Since this architecture is so similar to the previous one, we have not duplicated all of the discussion of the architectural decisions here. Only those that are unique to this architecture are described below.

Decision Point: Using Oracle DataGuard
Pros/Cons: While RAC provides a cluster of Oracle instances sharing one database, DataGuard provides for mirroring of the database itself using a primary and a standby database. To duplicate both the instances and the database requires both RAC and DataGuard. Using DataGuard with transaction integrity, if there is a disk failure then the standby database will be current to the last committed transaction. If there is a disk failure when using RAC only, then a database recovery is needed (Oracle 10g keeps recent transactions in a flash-back area, making that kind of recovery very quick). DataGuard is a good solution to provide high availability for RAC clusters on separate System z servers. The dataflow between the clusters is minimal because redo log shipping is used. There are several choices as to how current to keep the backup cluster, including transactional integrity.

Decision Point: Using an Active/Passive configuration instead of Active/Active
Pros/Cons: There is a high CPU cost for an active/active RAC configuration. Using active/passive RAC uses less CPU, and allows faster failovers, but does not support load balancing since only one RAC instance is active.

Decision Point: Use ECKD devices instead of SCSI
Pros/Cons: ECKD devices accessed over FICON or ESCON paths can be used instead. The ECKD approach is described in the previous architecture. Each is used once in this document in order to articulate the differences in configuring multipathing and device sharing for each.

Key problem areas

Non-functional requirements are the following: Data integrity. No data loss is permitted, even in a disaster (site fail-over) scenario. Security. The architecture can be used for systems with stringent security requirements. It provides layered defenses against internal and external threats. Performance and scalability. This architecture supports systems with medium throughput requirements and allows for additions to scale to greater volumes. Cost. Oracle RAC is quite costly, but provides good function.

Solution estimating guidelines

Use the WebSphere on Linux on System z Sizing process to estimate the number of IFLs required for the HTTP Server, WebSphere, and Oracle. The sizing actually uses DB2, but the cycles for DB2 and Oracle are similar enough that the sizing will be valid.

Chapter 6: Reference Architecture: WebSphere with DB2 database on z/OS

Scenario Being Solved

You have a key WebSphere application that runs in Linux on System z. The primary database for this application is a DB2 data sharing group running in a Parallel Sysplex on z/OS.

Architecture Principles

This architecture is designed to follow these principles: Software is generally considered less reliable than hardware. The System z hardware contains redundant components, making its MTBF in the range of years. Because the System z hardware is so reliable, we allow the System z server to be a single point of failure in this architecture. We duplicate all of the software environments (VM, HTTP Servers, Linux, WebSphere, DB2) so that none of them is a single point of failure. No failure should be noticeable to the end user. The current transaction may fail, but subsequent transactions succeed. After any single failure, transactions continue at the same rate with no degradation in throughput or response time. The architecture must be rapidly scalable to support increases and decreases in business volume. The architecture should leverage the high availability capabilities and features of z/OS Parallel Sysplex.

Reference Architecture

[Diagram: A router in front of two z/VM LPARs (primary and backup Load Balancers, the WebSphere cluster, and the Dmgr) and two z/OS LPARs, each z/OS LPAR running a DB2 data sharing member behind its DVIPA, with Sysplex Distributor (SD) in z/OS LPAR 1 and a backup SD in z/OS LPAR 2.]

Each box on the left represents a virtual Linux server running as a guest of z/VM, in two z/VM LPARs. On the right are two z/OS LPARs. The boxes within them represent z/OS processes. This architecture introduces the use of a DB2 z/OS data sharing group as the data store. The data sharing function of DB2 z/OS enables applications that run on different DB2 subsystems to read and write the same data concurrently. DB2 subsystems that share data must belong to a DB2 data sharing group, which runs in a Parallel Sysplex cluster. A data sharing group is a collection of one or more DB2 subsystems that access shared DB2 data. Each DB2 subsystem that belongs to a particular data sharing group is a member of that group. All members of a data sharing group use the same shared DB2 catalog and directory, share user data, and behave as a single logical server with the benefit of higher scalability and availability. The maximum number of members in a data sharing group is 32.

Flow of requests through this architecture

1. Load Balancer. The router and Load Balancer are configured the same as described on page 18.
2. HTTP Server. The HTTP server is configured the same as described on page 18.
3. WebSphere. WebSphere is configured the same as described on page 19.
4. JDBC Type 4 Driver. Each WebSphere server is configured to use a pure Java driver for connectivity to DB2 UDB on z/OS, called the DB2 Universal JDBC Driver type 4 (JDBC T4). JDBC T4 is sysplex-aware and can intelligently route workload across a DB2 data sharing group in a Parallel Sysplex. On z/OS, Sysplex Distributor provides an initial-contact single cluster IP address (known as a group Dynamic VIPA) for the data sharing group, which has built-in redundancy across network adapters and z/OS images. Each DB2 data-sharing group member is also addressable by its own Dynamic VIPA (Virtual IP Address) to insulate it from outages of any individual network adapter, and from potential restarts of the DB2 group member on other z/OS images in the Parallel Sysplex.

z/OS WLM (Workload Manager) works with DB2 z/OS and JDBC T4 to direct subsequent traffic to the DB2 group member with the most available capacity on a transaction-by-transaction basis.
5. DB2. Should a DB2 z/OS group member fail, in-flight work to that DB2 will fail and be backed out, but subsequent transactions will be automatically rerouted to surviving DB2 group members. z/OS ARM (Automatic Restart Manager) can automatically restart a failed DB2, either in place if its host z/OS is still available, or on another z/OS system in the sysplex if its original z/OS host is no longer active (based upon whatever restart policy the user has previously specified). If an entire z/OS host has either failed or appears hung, z/OS SFM (Sysplex Failure Management) can perform system isolation to cleanly remove the z/OS system from the sysplex, either automatically or based upon operator request (based on whatever policy the user has previously specified).

More detail on these components

Let's look at each of the components in this architecture in more detail. The focus here is on the flow from Linux on System z to DB2 on z/OS. A comprehensive view of z/OS Parallel Sysplex high availability configurations and options is beyond the scope of this document, though a few aspects of z/OS Parallel Sysplex high availability that are particularly relevant to this discussion are covered.

DB2 UDB for z/OS: Each DB2 member in the Parallel Sysplex interfaces with z/OS WLM to obtain a prioritized list of DB2 data sharing group members based on their availability and usable capacity. After a sysplex-aware JDBC T4 client has made initial contact with DB2 UDB for z/OS, DB2 returns this list to the JDBC T4 client.

JDBC T4: JDBC T4 is configured to be sysplex-aware, and to use connection concentrator in addition to the connection pooling available within WebSphere. This function provides improved load balancing in Parallel Sysplex DB2 data sharing configurations. With only the standard connection pooling included in WebSphere, an application must disconnect before another one can reuse a pooled connection. In a Connection Concentrator implementation, without any change in application behavior, the physical database connection is released at the end of a transaction (commit or rollback), instead of at the closeConnection() call. This allows increased connection sharing and higher connection utilization because connections are not held by applications while not actively participating in a DB2 unit of work. The JDBC T4 sysplex workload balancing feature also implements a sophisticated scheduling algorithm that uses z/OS WLM information (passed back by DB2 UDB for z/OS) to distribute workload across members of the DB2 data sharing group at the transaction boundary. In addition, connection concentrator will detect failed database connections and allocate a valid database connection when the application begins a new unit of work. This availability and workload management capability enables JDBC T4 to transparently relocate work away from failed or overloaded members to those that are up and underused, and to do so on a transaction basis, rather than on a connection basis. Also, unlike ordinary connection pooling, when an outage occurs on one of the members in the DB2 data sharing group, only those client connections that were actually processing transactions in that failing member receive connection failures. With ordinary connection pooling, client connections to that member would receive connection failures regardless of whether the clients were active in the database server. Note that JDBC T4 sysplex and connection concentrator support is distinct from WebSphere Application Server connection pooling. Both can be used simultaneously as they don't conflict with one another.
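For illustration, the two driver features described above are switched on through properties of the DB2 Universal JDBC driver. In WebSphere they would normally be set as custom properties on the Universal (type 4) datasource; the stand-alone sketch below uses DriverManager only to keep the example self-contained. The host name, location name, credentials, and property values are placeholders, and the property names should be confirmed against the JCC driver level in use (2.7 or later, per the Product Versions section below).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

public class Db2SysplexConnect {
    public static Connection connect() throws Exception {
        // Group DVIPA host name and DRDA port (446) of the data sharing group,
        // plus the DB2 location name -- all placeholders here.
        String url = "jdbc:db2://db2grp.example.com:446/DB2LOC";

        Properties props = new Properties();
        props.setProperty("user", "appuser");
        props.setProperty("password", "password");
        // Sysplex workload balancing and connection concentrator, as described
        // above; these property names come from the DB2 Universal JDBC (JCC)
        // driver documentation of this era -- verify against your driver level.
        props.setProperty("enableSysplexWLB", "true");
        props.setProperty("enableConnectionConcentrator", "true");
        props.setProperty("maxTransportObjects", "80");

        Class.forName("com.ibm.db2.jcc.DB2Driver");  // JDBC T4 driver class
        return DriverManager.getConnection(url, props);
    }
}
```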

Dynamic VIPA (DVIPA): Each link to an IP network must have an IP address. However, if a server's physical adapter or the link associated with a source or destination IP address fails, TCP/IP relationships are broken until the adapter (and its IP address) is restored to service. A virtual IP address, or VIPA, provides an IP address that is owned by a TCP/IP stack, but not associated with any particular physical adapter. Because the VIPA is associated with a virtual device, it is always available as long as the TCP/IP stack is functioning and accessible. This is useful for large systems such as System z machines that have multiple IP adapters, including OSA-Express and HiperSockets. As long as one IP adapter is working and connected to the IP network, others can fail without disrupting service to the DB2 data-sharing group. But what happens if the TCP/IP stack itself becomes unavailable (for example, if a system loses power)? In this case, the VIPA goes away with the TCP/IP stack and is unavailable until the TCP/IP stack can be restarted. Unlike a static VIPA, a DVIPA can move from one TCP/IP stack in the sysplex to another, automatically. TCP/IP stacks exchange information about DVIPAs, their existence, and their current location, and the stacks are continuously aware of whether their partner sysplex stacks are still alive and functioning so that they can back each other up. There are two different types of DVIPAs: multiple instance and application-specific. DB2 exploits both types: the multiple instance DVIPA for DB2 group-wide workload requests and the application-specific DVIPA for member-specific requests. The member-specific DVIPA is used by JDBC T4 to route workload to a specific member, and a second member-specific port is used to allow two-phase commit failure and recovery processing. With a member-specific DVIPA, if a DB2 data sharing group member is started on any system in the sysplex, the TCP/IP stack on that system will detect DB2 trying to bind a socket to the DVIPA. The properly configured z/OS TCP/IP stack (through the BIND parameter on the PORT statement) is smart: it will verify that the DVIPA is not active elsewhere in the sysplex, then will activate the DVIPA automatically before reporting successful completion back to DB2. If the DB2 member later fails, TCP/IP will deactivate the DVIPA. If that DB2 member is restarted on another TCP/IP stack in the sysplex that has been similarly configured to allow this dynamic activation, then the DVIPA will be activated there automatically so that the DB2 restart can proceed successfully. In other words, the member-specific DVIPA will follow its associated DB2 member around the sysplex. Each DB2 data-sharing group member will have a unique DVIPA.

Sysplex Distributor: Sysplex Distributor is the strategic IBM solution for IP connection workload balancing in the Parallel Sysplex. It is similar in concept to the IBM WebSphere Edge Components Load Balancer or the Cisco Multi-Node Load Balancer. It provides a cluster IP address for the entire DB2 UDB for z/OS data sharing group. Since it is built on Dynamic VIPAs, it is called a Distributed DVIPA. DB2 refers to this as the location or group Dynamic VIPA. When coupled with JDBC T4 and its sysplex workload balancing support, the Sysplex Distributor IP address is only used for the initial connection to DB2. All subsequent JDBC T4 requests use the member-specific IP addresses returned to the JDBC T4 client by DB2 UDB for z/OS. The group DVIPA is defined on a primary TCP/IP stack (or routing stack) in the sysplex through coding of the VIPADEFINE and VIPADISTRIBUTE profile statements. Information about the DVIPA is distributed automatically to designated candidate target TCP/IP stacks in the sysplex, and on these target stacks the same IP address is activated as a hidden DVIPA. The address is hidden in the sense that the target stacks don't advertise the presence of this IP address to the network. Only the routing stack advertises ownership of the group DVIPA to the world.

Then, the routing stack waits to receive connection requests. When DB2 data sharing group members start up, they bind their contact port (446) to the group DVIPA on their local (target) stack. When this happens, the target stack notifies the routing stack. The routing stack then knows it can send future DB2 connection requests to that target stack. If the DB2 member should fail, its target stack is immediately aware of this, and notifies the routing stack. These updates to the routing stack are virtually instantaneous (sub-second), and do not rely on any sort of advisor function to make periodic queries for availability status. When the group DVIPA routing stack receives a TCP connection request, it does the following: 1. Consults z/OS Workload Manager (WLM) to find the relative available capacities on the z/OS nodes hosting the target stacks and DB2 members. 2. Consults the z/OS Communications Server Service Policy Agent for network performance and defined policies that might affect the distribution decision. 3. Selects a target stack and forwards the request for processing. Should a target TCP/IP stack fail, in-flight connections also fail. The routing stack immediately sends out a connection reset to avoid the delay associated with TCP connection timeouts, and reconnect requests are sent to one of the surviving target stacks. New connection requests are not routed to the failed target stack or system. Backup routing stacks may also be configured by coding the sysplex cluster IP address on a VIPABACKUP profile statement on the backup TCP/IP stack(s). If the routing stack suffers an outage, the backup routing stack takes over global responsibility for the group DVIPA. Each target stack is aware of the takeover, and sends the new routing stack its current connection routing table. The new routing stack consolidates all of the individual tables into its overall connection routing table. Any in-flight inbound TCP data lost with the failing routing stack is retransmitted by the individual client TCP stacks, and aside from a momentary pause for retransmission, clients and surviving servers continue without connection outage. This is all handled by the collaborating Sysplex Distributor TCP/IP stacks; DB2 is entirely unaware of the failure of the former routing stack. When the original routing stack is restarted, it non-disruptively takes over its duties from the backup stack.

WLM: z/OS WLM allows you to define performance goals and assign a business importance to each goal. You define the goals for work in business terms, and WLM decides how much resource, such as CPU and memory, should be given to the application to meet the goal. WLM will constantly monitor the system and adapt processing to meet the goals. It can not only help with directing new work to available resources within a Parallel Sysplex, but it also directs resources to where they are needed to process existing work. When workload arrives at the JDBC T4 client, a workload classification occurs which can be based upon a number of factors including authorization ID, accounting strings, client information, stored procedure invocation, and so on.

ARM: Restarting a DB2 UDB for z/OS data sharing group member after it fails is important, as this will help free up any locks it holds as soon as possible and thereby minimize the impact of the failure on other members of the DB2 data sharing group. The z/OS Automatic Restart Manager (ARM) helps ensure this restart process is carried out quickly and efficiently, based on a user-defined policy.

SFM: z/OS Sysplex Failure Management (SFM) allows you to optionally define a sysplex-wide policy that specifies actions z/OS is to take when certain failures occur in the sysplex. The goals of failure management in a sysplex are to minimize the impact that a failing system might have on the sysplex workload, so that work can continue with little or no operator intervention. One of several functions it can perform is to detect and isolate a system that has failed or is no longer responsive (that is, a hung system), remove it cleanly from the sysplex, and free up resources while ensuring that data integrity in the sysplex is preserved. This can be done either entirely automatically without any human intervention, or by prompting an operator for their go-ahead before proceeding.

Flow: The DRDA (Distributed Relational Database Architecture) protocol used by JDBC T4 provides an initial contact port and a resync port. For this architecture, JDBC T4's initial contact port (DB2 port 446) is bound to the group DVIPA. To address the case where a DB2 member terminates and is restarted on another image (possibly alongside another DB2 member), each DB2 member configures its restart address with a resync port number unique to that member. Each member's resync port needs to be reachable through a specific IP address to ensure correct resynchronization after a failure, so it is bound to a member-specific DVIPA. After an initial connection is established, JDBC T4 can perform its own load balancing for subsequent work among the available DB2 data sharing group members. This is accomplished by the DB2 member sending back a list of IP addresses, one for each DB2 member in the data sharing group. Because the initial contact listening socket is listening on a group DVIPA, each DB2 member opens an additional listening socket on the same port as the group listening socket (446), but bound to its member-specific DVIPA. It is these DVIPA addresses that are sent back to the JDBC T4 client. JDBC T4 then balances across these member-specific DVIPAs, based upon information provided to it through WLM. To summarize, the connection flow goes as follows: 1. JDBC T4 running within a WebSphere Application Server contacts the group DVIPA to make an initial connection. This connection request may travel over any of several physical network adapters owned by the TCP/IP stack with which the Distributed DVIPA is associated. 2. Sysplex Distributor fields the request. It consults with WLM and the Service Policy Agent to decide which DB2 data sharing group member to send that connection request to, then sends it to the group member that has the most available capacity (per WLM goals, and so on) to handle this first request. 3. The receiving DB2 member handles the request and responds back to JDBC T4 directly. Also included in its response is a list of member-specific DVIPAs, one for each DB2 member in the sysplex, and WLM recommendations on where to send the next request. After this point, Sysplex Distributor is no longer involved. 4. JDBC T4 performs its own load balancing across the member-specific DVIPAs, based on WLM data it receives from the DB2 data sharing group members. It frequently receives updates on the status of members of the DB2 data sharing group that it can use to make load balancing decisions. Because connection concentrator is being used, each routing decision is made on a transaction basis, rather than a connection basis. Because DVIPAs are used, connections to any given DB2 member can travel on any of several network adapters owned by the TCP/IP stack with which the DVIPA is associated. 5. If a DB2 data sharing group member (or its host TCP/IP stack or z/OS) fails, WLM will be aware of it, so JDBC T4 will be notified and will route new transactions to surviving DB2 members, which share concurrent read/write access to the same data as the failed member (by virtue of their participation in the Parallel Sysplex data sharing group).

Based on user-defined policy, z/OS ARM may restart the failed DB2 member in place or on another z/OS system in the sysplex. Because each DB2 member has its own unique resync port, the DB2 member can even be restarted on a z/OS image in which another DB2 member is already active. When the DB2 member restarts, it will free up any held locks on shared data, its member-specific DVIPA will be re-enabled, and it will become accessible as before. If an entire z/OS system failed or is not responding, then based on user-defined policy, z/OS SFM can also perform system isolation to cleanly remove the system from the sysplex. 6. If a network adapter fails, surviving adapters will automatically be used to access the affected member-specific or group DVIPAs. 7. If the routing TCP/IP stack that owns the group DVIPA fails, another stack in the sysplex will take over ownership of the group DVIPA and become the new routing stack. Any in-flight inbound TCP data lost with the failing routing stack will be retransmitted by the individual client TCP stacks, and aside from a momentary pause for retransmission, clients and surviving servers continue without connection outage.

Product Versions

This architecture requires the following software versions:
- JDBC type 4 driver level 2.7 or above is required to get the sysplex workload balancing and connection concentrator functions. This ships with DB2 V8.2 FP3 (a.k.a. DB2 V8.1 FP10) or above.
- DB2 Distributed Data Facility for z/OS V6 and V7 support for DVIPA and Dynamic DVIPA is provided through APAR PQ46659.
- Base Dynamic VIPA support is available with OS/390 V2R8 and higher.
- Base Sysplex Distributor support is available with OS/390 V2R10 and higher.
- Support for integration of Sysplex Distributor with the Cisco MNLB function (not shown in this architecture) is available with z/OS V1R2 and higher.
- Fast connection reset support is available with z/OS V1R2 and higher.

Planned Outages

This section discusses how each of the components can be taken down for software upgrades or any other planned outage.

Component: Procedure
Load Balancer: Same as in previous architecture.
HTTP Server: Same as in previous architecture.
WebSphere Application Server: Same as in previous architecture.
DB2 z/OS: 1. Stop the DB2 z/OS data sharing group member you wish to upgrade or service. It will quiesce current work and notify the remaining DB2 group members and WLM that it is no longer available to process new requests. The JDBC Type 4 driver and Sysplex Distributor will be told to route new requests to the surviving DB2 group members. 2. Perform whatever maintenance is desired on the DB2 member. 3. Restart the DB2 member. Sysplex Distributor and the JDBC Type 4 driver will be notified that the DB2 member is available and will begin routing work to it. 4. Repeat with the remaining DB2 z/OS data sharing group members, one at a time.

Sysplex Distributor: When the TCP/IP routing stack is brought down, its backup will take over. The takeover occurs non-disruptively, and when the stack is restarted it automatically reclaims status as the routing stack.

What We Learned in Testing

When this architecture was set up and tested for planned and unplanned outages, we learned the following:
Did the software fail over as expected? Yes.
Did users experience any outage time or transactions that they needed to retry? No.
Did users experience any permanent data loss? No.
How long did the failover take? (How long did users experience outages?): Instantaneous. Users do not perceive an outage.
Details: Stopping one of the members of the data sharing group did not impact the performance of the workload coming to WebSphere. The workload continued to run at the same pace without any transactions ending in error or timing out. We saw the following message in the WebSphere SystemOut.log: "A connection failed but has been re-established. The hostname or IP address is "J90VIPA.pdl.pok.ibm.com" and the service name or port number is 446. Special registers may or may not be re-attempted." This indicates that WebSphere recognized the failed connection and was able to re-establish a connection to a surviving member of the data sharing group, through the JDBC type 4 driver. After restarting the failed data sharing group member, it started accepting work again, without any interruption to WebSphere. Refer to the System z Platform Test Report for z/OS and Linux Virtual Servers for details on how to set up the JDBC type 4 driver for Sysplex awareness. There are some configuration errors that can make the Sysplex data sharing failover not work, or perform poorly. Please refer to the System z Platform Test Report for z/OS and Linux Virtual Servers for details.

Architectural Decisions

DB2 Connect EE vs. JDBC Type 4: Rather than using the JDBC Type 4 driver to contact DB2 UDB for z/OS in the Parallel Sysplex directly, each WebSphere server could be configured to have its JDBC driver (type 2 or type 4) go through DB2 Connect EE. A common, intermediate DB2 Connect EE server running in its own Linux guest image under z/VM could be configured. DB2 Connect would be used to communicate with the DB2 UDB for z/OS data sharing group members, utilizing capabilities similar to JDBC T4 for the sysplex workload balancing and the connection concentrator functions. A backup could be configured for the DB2 Connect EE server. Failover could be configured for the DB2 Connect server, such that if the TCP/IP connection to the DB2 Connect server is lost, the client would automatically attempt to re-establish the connection. The client would first attempt to re-establish the connection to the original server. If the connection is not re-established, the client would fail over to an alternate DB2 Connect server. (Note that this client-level reroute to alternate DB2 Connect servers is not supported for two-phase commit (XA) workload from WebSphere. XA workload requires that the failed DB2 Connect server be restarted because transaction status information is stored within the DB2 Connect SPM log.)

However, with the recent introduction of the sysplex workload balancing and connection concentrator functions in JDBC T4, it becomes the preferred means for driving traffic from WebSphere on Linux to DB2 in a Parallel Sysplex. Removing DB2 Connect from the picture in favor of JDBC T4 eliminates another potential point of failure, reduces administrative burden, and will improve performance for average workloads. Note that while JDBC T4 removes the need for DB2 Connect to be installed, a DB2 Connect EE license is still required in order to enable JDBC T4 to work directly with DB2 UDB for z/OS.

DNS/WLM vs. Sysplex Distributor: DNS/WLM solves the same problem as Sysplex Distributor, distributing workload across the sysplex by distributing client connections among active server instances. With the DNS/WLM approach, intelligent sysplex distribution of connections is provided through cooperation between WLM and DNS (Domain Name Service). For customers who elect to place a name server in a z/OS sysplex, the name server can use WLM to determine the best system to service a given client request. In general, DNS/WLM relies on hostname-to-IP-address resolution as the mechanism by which to distribute load among target servers. As a result, the single system image provided by DNS/WLM is that of a specific hostname. Note that the system most suitable to receive an incoming client connection is determined only at connection setup time. After the connection is made, the system being used cannot be changed without restarting the connection. This solution is only available with the BIND 4.9.3 name server and not with the BIND 9 name server. DNS/WLM works for both TCP and UDP; Sysplex Distributor works only for TCP. Note also that many clients and intermediate DNSs cache IP addresses, so the client might continue to try to reconnect to an IP address that is unavailable until the owning and supporting TCP/IP stack is restarted, thus interfering with the availability provided by this approach. Sysplex Distributor is IBM's strategic solution for connection workload balancing in a sysplex, so it was chosen over the DNS/WLM approach for this architecture.

Inclusion of Sysplex Distributor: In this architecture, Sysplex Distributor is only used for the initial contact between the JDBC T4 client and DB2 z/OS. After that contact, it steps out of the picture and allows those two entities to communicate directly. You can choose to not implement Sysplex Distributor. In this case, the choice of the initial DB2 z/OS DVIPA to be contacted could be predefined to JDBC T4. This approach would work, but requires that initial DB2 z/OS member to be active in the Parallel Sysplex during initialization of the workload. Alternatively, some other workload routing technology (such as Cisco's Multinode Balancer) could be used instead of Sysplex Distributor to make the initial routing decision.

DB2 Connection Pooling vs. DB2 Connect Connection Concentrator: You could choose to use ordinary DB2 connection pooling rather than connection concentrator. But connection pooling has a drawback when used in conjunction with Parallel Sysplex. With Parallel Sysplex awareness activated in JDBC T4, JDBC T4 processes information about the availability of the members in a DB2 data sharing group only on the creation of a new connection to DB2 z/OS. With connection pooling also activated, there is a chance that connections would remain with a particular member of the data sharing group even if that member had problems. JDBC T4 would also not use WLM information to determine which of its pooled connections would be the best connection to be reused for a new request.

Connection concentrator builds on the features of connection pooling, preserving and enhancing connection pooling's performance benefits while eliminating the above weaknesses. Therefore, it was chosen for this architecture. Note however that applications that do not release resources, such as CURSOR WITH HOLD, TEMP TABLES, or packages bound with KEEPDYNAMIC(YES), do not release the physical connection.

Key restrictions or problem areas

There is no means of telling JDBC T4 to give preference for routing work to DB2 members located on the same physical System z server that are reachable over high-speed HiperSockets connections. JDBC T4's workload balancing decision is focused on the ability of the DB2 data sharing group members on z/OS to process the work, with no consideration being given to how long it will take to transport the request/response data between Linux and z/OS. As a result, work may frequently be sent to a DB2 data sharing group member on a different physical System z server, in which case an OSA connection, rather than a lower-latency HiperSockets connection, will be used for communication.

Solution estimating guidelines

Use the WebSphere on Linux on System z Sizing process to estimate the number of IFLs required for the HTTP Server and WebSphere. Use the Size390 process for sizing the number of standard CPUs required for DB2 z/OS.

Chapter 7: Reference Architecture: WebSphere with DB2 database on z/OS, in separate cities (GDPS/PPRC Multiplatform Resiliency for System z with HyperSwap Manager)

Scenario Being Solved

You have a key Web-connected WebSphere application that runs in Linux on System z. The primary database for this application is DB2 for z/OS deployed in a data sharing group running in a Parallel Sysplex on IBM System z9 or System z CPUs. The deployment aims not only to provide near-continuous availability by protecting from the failure of any one component; it also aims to protect against, and to recover from, a Site 1 failure in which the applications and the production data are lost. In this architecture the appliances, the application, z/VM, DB2, and z/OS components are spread across two active sites that share the processing of the application's work. System A is in Site 1. System B is in Site 2. The primary production data runs in Site 1 and is mirrored to the secondary disks in Site 2. If there is a Site 1 failure, processing will continue at Site 2 using the mirrored copy of the data. After the disaster, once Site 1 is restored, automation will restart the software, resync the primary and secondary databases, and switch network and data access back to the original configuration in Site 1. Automation will also restore Site 2 to the original configuration. From a software perspective, the front-end, HTTP Server, Linux for System z, and z/VM software deployments are described in Chapter 4 and the DB2 deployment is described in Chapter 6. Added to the picture are the components of the GDPS/PPRC Multiplatform Resiliency for System z with HyperSwap offering, which provide the monitoring, operating system and application restarts, and the data mirroring and switching technologies required in the event of a site failure or planned downtime and recovery.

The key elements that this architecture adds to the architecture in Chapter 6 are the following:
Two data centers that are: o far enough apart to not be influenced by a single disaster o within 100 km to allow for synchronous data replication by way of two diverse fibre routes.
Production workloads configured to run at both sites with no primary or backup site.
Primary database storage ESS/DS replicated to the second site using synchronous PPRC, maintaining data consistency between the primary and secondary devices at the two sites.
The HyperSwap function in the operating systems that enables the surviving z/OS and z/VM systems to switch to the secondary disks. The switchover is transparent so that the applications do not need to be quiesced. Note that for z/OS the switchover comes from a dynamic re-direction of I/O (UCB) pointers. For z/VM the HyperSwap function allows the dynamic swapping of virtual devices associated with one real disk to another real disk.
Network connectivity and routing capability at both sites.
GDPS 3.1 Control System software running at both sites to monitor systems and control failover and fallback of the network, systems, and data.
IBM Tivoli System Automation Multiplatform (SA MP) running in Linux for zSeries and on z/VM. This monitors the functioning of the Linux front-end servers and applications and can restart them, either in place or on another system in the same or a different site.
Required IGS Services. (These ensure that the original setup is efficiently and accurately done and that the client staff is properly trained to become self-sufficient. It also leads to long-term cost efficiencies, including the fact that GDPS warranty costs are less than standard.)

Architecture Principles

This architecture is designed to follow these principles:
Any failures in the servers that make up this architecture will not be noticeable to the user. The current transaction may fail, but subsequent transactions succeed. After any single failure, transactions may continue at the same rate with only a brief, or possibly no, degradation in throughput or response time.
The architecture must be rapidly scalable to support increases and decreases in business volume.
No single point of failure: o Hardware: There should be no component within a site that would result in loss of end-user function if it failed. There are exceptions, however. Where components are hardened from an availability point of view (usually by having internal redundancy like dual power supplies and so on), then a single component is allowed. o Facilities: There should be two geographically distant sites. o Software: There should be fallback application server(s) with enough capacity to carry on the essential work in the event of a site failure. This requires geographically dispersed operating systems, appliances, applications, and database environments.
Data management with data integrity: o Data mirroring between sites is required. o Data mirroring and site switchover should both be performed at the lowest level (hardware) for speed, accuracy, and independence of any application. o Data consistency is required for fast workload restart. o Data mirroring freeze is required to prevent rolling database corruption.
Automation: o End-to-end automation is required. Humans cannot be guaranteed to provide adequate monitoring and response, especially in times of disaster.

Reference Architecture

[Diagram: Two sites up to 100 km apart, each with a router and an HMC. zSeries 1 at Site 1 runs z/VM 1 (primary Load Balancer, SA MP, Dmgr, part of the WebSphere cluster) and z/OS 1A (DVIPA, DB2, Sysplex Distributor), plus z/OS 1B running GDPS K2. zSeries 2 at Site 2 runs z/VM 2 (backup Load Balancer, SA MP, part of the WebSphere cluster) and z/OS 2A (DVIPA, DB2, backup SD), plus z/OS 2B running GDPS K1. The primary disks at Site 1 are PPRC-copied to Site 2.]

GDPS/PPRC Multiplatform Resiliency for System z with HyperSwap Setup

The normal running configuration of GDPS/PPRC Multiplatform Resiliency for System z with HyperSwap when there is no disaster is:
A GDPS/PPRC V3 system is running GDPS automation based upon Tivoli NetView and System Automation for z/OS at both sites. The primary Controlling System (K1) is running in Site 2: o Monitors the Parallel Sysplex cluster, coupling facilities, and storage subsystems and maintains GDPS status o Invokes HyperSwap o Invokes network switching, based on user-defined automation scripts.
System Automation Multiplatform (SA MP) proxy servers run in each VM partition and SA MP agents run in each Linux software component (firewall, load balancer, HTTP server, WebSphere). These provide GDPS automation with awareness of the state of the software running in the Linux guests and work with GDPS to restart them during the recovery phase.
All of the operating systems must run on System z servers that are connected to the same Hardware Management Console (HMC) Local Area Network (LAN) as the Parallel Sysplex cluster images.
All critical data resides on storage subsystem(s) in Site 1 (the primary copy of data) and is mirrored to Site 2 (the secondary copy of data) through PPRC synchronous remote copy.
The Site 1 system has connectivity to the secondary disks.
A Parallel Sysplex with DB2 production data-sharing systems is running in both Sites 1 and 2.

Flow of requests through this architecture

Site 1 Failure: If there is a failure at Site 1, the GDPS K1 system in Site 2 (using sysplex communications) detects this and automates the failover process:
1. Invoke HyperSwap to switch the surviving z/OS and z/VM systems at the secondary site to use the secondary disks, which contain the mirrored data.
2. Switch network connectivity to the Site 2 router, based on customer-provided scripts.
3. Take steps to ensure that 100% of the incoming workload can be handled by Site 2. Use one or all of the following options: Stop expendable work in Site 2 LPARs to accommodate the primary site workloads. Or invoke On/Off Capacity Backup (O/O CBU) to increase the number of IFLs available to the VM LPAR and the number of CPUs available to the z/OS partition. Note, as discussed in the section "Alternatives Considered" below, IFLs, CPUs, and memory cannot be shared between the servers, so if it is required that failover take place immediately and without any interruption to the users, then each server at both sites must have sufficient IFLs, CPUs, and memory to run 100% of the workload should the other server fail. This means that each server would run its production workload at less than 50% utilization. Software costs increase because of this. The software costs can be reduced by using O/O CBU to bring duplicate CPUs and IFLs online at the secondary site, should the primary site fail. However, it means that for the first few minutes after a failure, only 50% of the needed IFLs and CPUs are available to handle 100% of the workload. It will take approximately 5 minutes to bring the new IFLs online to z/VM. This solution gives lower cost as long as reduced performance can be tolerated until the capacity upgrade CPUs are brought online.
4. DB2 non-disruptively continues processing on z/OS 2A in Site 2.

Site 1 Recovery: After the site and the System z CPUs are restored:
1. Site 1 systems (z/OS, z/VM, Linux, HTTP servers, and applications) are restarted as needed.
2. DB2 synchronizes data between the Site 2 and Site 1 disks. Note that this can take anywhere from hours to days, depending upon the amount of data changed. GDPS/PPRC with HyperSwap v3.2 adds exploitation of Metro Mirror Failover/Failback, which essentially tracks changes made to the data on the secondary disks so that when failing back only the changed data needs to be copied. This greatly reduces the failback time.
3. GDPS restart of z/OS and its workloads (DB2, Sysplex Distributor), and of z/VM and the Linux guests. Stop unneeded systems in Site 2.
4. SA MP restart of the Linux guest workloads. Stop unneeded workloads in Site 2.
5. HyperSwap: z/OS and z/VM (and Linux) switch connectivity back to the primary disks.
6. Switch network connectivity, if needed, based on customer-provided scripts.
7. Release On/Off Capacity Backup.
8. DB2 non-disruptively continues processing using the restarted Site 1 resources.

Product Versions

This architecture requires the following software versions:
- z/OS: currently supported releases
- DB2: currently supported releases
- GDPS V3.2 or higher, or V3.1 with the enabling APAR for GDPS/PPRC Multiplatform Resiliency for System z
- z/VM v5.1 with the HyperSwap function
- Tivoli NetView for z/OS V5.5 or higher
- Tivoli System Automation for z/OS V2.2 or higher
- Tivoli System Automation for Multiplatforms R2 o Fixpack (1.2.0-ITSAMP-FP02) for IBM Tivoli System Automation for Multiplatforms
- Linux for System z o SuSE SLES 8 refresh (4/2004 or newer)

Planned Outages

This section discusses how each of the components can be taken down for software upgrades or any other planned outage.

Component: Procedure
Load Balancer: Same as in previous architecture.
HTTP Server: Same as in previous architecture.
WebSphere Application Server: Same as in previous architecture.
DB2 z/OS: Same as in previous architecture.
Sysplex Distributor: Same as in previous architecture.
GDPS Control System: Invoked by operator commands.
Disk Subsystems: The planned HyperSwap function provides the ability to transparently switch primary disk subsystems with the secondary subsystems for planned reconfigurations. This enables planned site maintenance without requiring any applications to be quiesced.

What We Learned in Testing

The GDPS/PPRC Multiplatform Resiliency for System z with HyperSwap Manager has been tested at the Linux and z/OS operating system level and the failover and fallback are shown to perform as described. The HTTP server and other Linux-based applications described in the preceding scenarios have not been specifically tested with GDPS/PPRC Multiplatform Resiliency for System z with HyperSwap Manager, but their failover and fallback should not be impacted by this automated, cross-site scenario. HyperSwap from primary to secondary disks occurs at the VM operating system level and is transparent to the applications. IBM recommends that the failover and fallback of applications deployed in a GDPS/PPRC HyperSwap Manager environment be verified by testing on a regular basis.

Architectural Decisions

The key architectural components selected to achieve the goal of cross-site disaster recovery are the GDPS control system, PPRC (Metro Mirror), and HyperSwap.

GDPS Control System: Geographically Dispersed Parallel Sysplex (GDPS) is an automated operations solution that runs in z/OS (the LPAR is often referred to as the K system, or Control System). It must maintain its viability and thus IBM recommends that it always run in its own z/OS LPAR in the Parallel Sysplex. Using Tivoli System Automation for z/OS and its own local disks, GDPS can monitor planned and unplanned exception conditions and manage a controlled site switch for outages of z/OS and Parallel Sysplex, z/VM, Linux for System z (as a guest under z/VM), and VSE/ESA systems and the applications that run in them. In this scenario, GDPS/PPRC with Multiplatform Resiliency for System z will manage the HyperSwap to the secondary disks in Site 2, the optional activation of resources in Site 2 to provide capability equivalent to that of the lost Site 1, and the recovery of work into Site 1 after the facilities and CPU systems have been restored. GDPS includes standard actions to: Stop: quiesce a system's workload and remove the system from the Parallel Sysplex cluster (stop the system prior to a change window). Start: IPL a system (start the system after a change window). Recycle: quiesce a system's workload, remove the system from the Parallel Sysplex cluster, and re-IPL the system (recycle a system to pick up software maintenance). Manage Parallel Sysplex Coupling Facility structures. The standard actions can be initiated against a single system or a group of systems. Additionally, user-defined actions are supported (for example, a planned site switch in which the workload is switched from processors in Site 1 to processors in Site 2). GDPS unplanned reconfiguration support not only automates procedures to handle site failures, but will also minimize the impact and potentially mask a z/OS system or processor failure based upon GDPS policy: If a z/OS image fails, the failed system will automatically be removed from the Parallel Sysplex cluster and the workload restarted on the second system in the sysplex. (See the illustration above: if z/OS 1A fails, its work can be restarted on z/OS 2A.) If the hardware fails (zSeries 1 at Site 1, for example), the z/OS image will be started up (IPLed) at the second site (on zSeries 2 at Site 2), and its workload restarted.

PPRC (also known as Metro Mirror): Peer-to-Peer Remote Copy (PPRC) manages the ongoing synchronous remote data copy, freeze, and FlashCopy functions that make it possible to switch sites with no data loss, and with full data integrity across multiple volumes and storage subsystems.

HyperSwap: The HyperSwap function provides for a site switch of disk subsystems that eliminates the need for an IPL at the recovery site. In this architecture it substitutes the PPRC secondary for the primary device and is controlled by GDPS automation. The GDPS/PPRC HyperSwap architecture can manage very small to very large DASD environments. For example, the HyperSwap component can manage an unplanned disk reconfiguration for 6,545 volume pairs (19.6 terabytes) in 15 seconds.

Note that microcode-related disk failures are among the most common causes of system outages. GDPS/PPRC HyperSwap addresses this in a scenario where all of the production is in Site 1 with disk mirroring to Site 2. Any disk failure in Site 1 will trigger a swap to a secondary disk in Site 2 and the Site 1 systems will continue to run. If all of Site 1 is gone, PPRC freeze will be invoked and the environment can continue to run in Site 2.

Alternatives Considered: Active-Passive deployment

Because IFLs and memory cannot be shared across System z systems and sites, each server must have sufficient IFLs and memory to run the workload if the other server fails. This can be costly to configure. A potentially less expensive architecture that provides near-continuous availability is depicted in the following diagram. It assumes an active-passive model with production application servers at the primary site (Site 1) and standby server capability in the Parallel Sysplex at the secondary site (Site 2). Starting the standby servers at Site 2 lengthens the recovery time by the amount of time required to start the standby operating systems and applications. While a standby configuration lengthens application server recovery time, this design can take advantage of the System z Capacity Upgrade on Demand feature (bringing additional CPUs online at the secondary site only if needed), saving hardware and associated software costs. Note that both architectures require the same primary and secondary disk configurations at the two sites and employ PPRC and HyperSwap as described above. They both require a K1 system running at the second site.

[Diagram: Active-passive alternative. Before a Site 1 failure, zSeries 1 at Site 1 runs z/VM 1A and 1B (primary and backup Load Balancers, DB2 Connect EE, the WebSphere cluster, and the Dmgr) and z/OS 1A, 1B, and 1C (DB2 with DVIPA, Sysplex Distributor and its backup, and the GDPS K2 system), with PPRC copying the primary disks 100 km to zSeries 2 at Site 2, which runs the GDPS K1 system on z/OS 2A with Capacity Upgrade on Demand capacity in reserve. After a Site 1 failure, Site 2 runs the router, primary Load Balancer, DB2 Connect EE, Dmgr, and DB2 with Sysplex Distributor on the upgraded zSeries 2.]

Key restrictions or problem areas

IGS Services for Installation are required to ensure the proper setup and education of customer personnel. GDPS cannot be purchased independently. Achieving 100 km distance between sites requires a Sysplex Timer extender to be positioned at a third site between the two. Otherwise the Sysplex Timer allows for a 40 km separation.

Site 1 Recovery, DB2 Coupling Facility Data Considerations: In a GDPS/PPRC environment, when coupling facility structures, such as those for DB2 Group Buffer Pools, are configured for duplexing across two coupling facilities (usually to avoid SPoFs or to provide structure recovery) and a primary site failure occurs, GDPS cannot ensure that the contents of the coupling facility structures are time-consistent with the data on DASD. For Site 1 application restart and recovery processing, GDPS must discard coupling facility structures at the secondary site during the process of restarting the workload. The result is a loss of changed data in coupling facility structures. You must execute potentially very lengthy and highly variable DB2 data recovery procedures to restore this lost data. The length of time depends mainly on the number of objects that must be recovered, and the amount of log that must be processed. IBM testing and customer experiences indicate that the recovery time is anywhere from a minute or less (10-30 objects and a small amount of log) to two hours (thousands of objects and many logs).

Synchronous Data Mirroring Performance Overhead: Some transaction processing systems may not tolerate the amount of transaction processing overhead caused by PPRC data mirroring. Tests show that, for the DS8000 storage systems over fibre channels, response time is 0.4 microseconds with an additional 10 microseconds for each km of distance that the disk is from the host server. For example, if the host system is 5 km from the disk, response time increases by 50 microseconds (0.050 milliseconds). This two-site architecture can also require data encryption that is not required when a solution is deployed on a single System z server.

Solution estimating guidelines

Use the WebSphere on Linux on System z Sizing process to estimate the number of IFLs required for the HTTP Server and WebSphere. Use the Size390 process for sizing the number of standard CPUs required for DB2 z/OS.

1. Client View: The time and resources required to implement this Continuous Availability Reference Architecture depend upon the client's starting point. Typically, clients spend from 6 to 12 months, depending upon the end of the spectrum from which they are starting. The order of implementation typically follows: a. Provision a second site b. Implement PPRC c. Implement a multi-site sysplex d. Implement GDPS. Critical skills required are for Parallel Sysplex, automation, and remote copy.

Appendix 1: References

IBM System z Platform Test Report for z/OS and Linux Virtual Servers, June 2006 Edition (Part 3. Linux Virtual Servers). This report details the setup and testing of these reference architectures. www-03.ibm.com/servers/eserver/zseries/zos/integtst/
IBM Whitepaper: Automating DB2 HADR Failover on Linux using Tivoli System Automation.
IBM Redbook SG24-6392: WebSphere Application Server V6 Scalability and Performance Handbook.
IBM Redbook SG24-6688: WebSphere Application Server Network Deployment V6: High Availability Solutions.
IBM Press, ISBN 0-13-146862-6: WebSphere Deployment and Advanced Configurations, Chapters 21, 22. Barcia, Hines, Alcott, Botzum.
IBM Redbook SG24-6093: PeopleSoft V8 on zSeries Using Sysplex Data Sharing and Enterprise Storage Systems, Chapters 3 and 4, IBM, March 2004.
IBM Whitepaper: Leveraging z/OS TCP/IP Dynamic VIPAs and Sysplex Distributor for Higher Availability, Jay Aiken, July 2002.
IBM Whitepaper: Improve Your Availability With Sysplex Failure Management, Riaz Ahmad, www-1.ibm.com/servers/eserver/zseries/pso/image/sfm.pdf
IBM Document Number SC31-8775-02: z/OS V1R4 Communications Server: IP Configuration Guide.
IBM Document Number SC09-4835-01: DB2 Connect User's Guide V8.2.
IBM Redbook SG24-6517: Communications Server for z/OS V1R2 TCP/IP Implementation Guide, Volume 5: Availability, Scalability and Performance, Rodriguez, Fascini, et al, October 2002.
IBM GDPS Website: www.ibm.com/servers/eserver/zseries/gdps.html
IBM Redbook SG24-6344: Linux for zSeries: Fibre Channel Protocol Implementation Guide.
IBM Redbook SG24-6694: Linux for IBM System z9 and zSeries.
IBM Redpaper REDP-0205: Getting Started with zSeries Fibre Channel Protocol.

Comments on this paper: Please address comments to Steve Wehr, swehr@us.ibm.com.

Appendix 2: Legal Statement

Trademarks and Disclaimers

The following are trademarks of the International Business Machines Corporation in the United States and/or other countries. For a complete list of IBM Trademarks, see www.ibm.com/legal/copytrade.shtml: AIX, AIX 5L, BladeCenter, Blue Gene, DB2, e-business logo, IBM, IBM Logo, Infoprint, IntelliStation, iSeries, pSeries, OpenPower, POWER5, POWER5+, Power Architecture, TotalStorage, WebSphere, xSeries, z/OS, zSeries.

The following are trademarks or registered trademarks of other companies: Java and all Java-based trademarks and logos are trademarks of Sun Microsystems, Inc., in the United States, other countries, or both. Microsoft, Windows, Windows NT and the Windows logo are registered trademarks of Microsoft Corporation in the United States, other countries, or both. Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. UNIX is a registered trademark of The Open Group in the United States and other countries. Linux is a trademark of Linus Torvalds in the United States, other countries, or both. Other company, product, or service names may be trademarks or service marks of others.

NOTES: Any performance data contained in this document was determined in a controlled environment. Actual results may vary significantly and are dependent on many factors including system hardware configuration and software design and configuration. Some measurements quoted in this document may have been made on development-level systems. There is no guarantee these measurements will be the same on generally available systems. Users of this document should verify the applicable data for their specific environment. IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply. Information is provided "AS IS" without warranty of any kind. All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions. This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the products or services available in your area. All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. Prices are suggested US list prices and are subject to change without notice. Starting price may not include a hard drive, operating system or other features. Contact your IBM representative or Business Partner for the most current pricing in your geography. Any proposed use of claims in this presentation outside of the United States must be reviewed by local IBM country counsel prior to such use. The information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication.
IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice. Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk. IBM makes no representation or warranty regarding third-party products or services, including those designated as ServerProven, ClusterProven or BladeCenter Interoperability Program products. Support for these third-party (non-IBM) products is provided by non-IBM manufacturers. IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give you any license to these patents. Send license inquiries, in writing, to IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 USA.