DB2 z/OS Data Sharing
Stephan Arenswald, arens@de.ibm.com, 05.07.2011
About Me (Ich / Me / ぼく)
  - Developer on IBM Tivoli OMEGAMON XE for DB2 Performance Expert on z/OS
  - Permanent employee for more than 2 years; before that student assistant, intern, working student, and diploma thesis student
    - Thesis topic: autonomous database tuning
  - Development with System z / z/OS
    - On z/OS with REXX, CLIST, ISPF panels/skeletons
    - Against DB2 z/OS with Java from distributed platforms
  - Supervision of students on various projects
    - Workload Service
    - Test development / test automation
First the challenges, then the solution...
  - Availability: access to data at any time of the day
  - Capacity: a single system may constrain rapid business growth, but splitting the database would introduce several problems
  - Growth: easily accommodating business targets means providing scalable, non-disruptive growth options
  - Workload Balancing: effective utilization of available processor capacity for mixed workloads, and the ability to handle unpredictable workload spikes
  - Systems Management: consolidating multiple systems allows easier systems management
Data Sharing
  The data sharing function (1) of the licensed program DB2 for z/OS (2) enables multiple applications to read from, and write to, the same DB2 data concurrently (3). The applications can run on different DB2 subsystems residing on multiple central processor complexes (CPCs) (4) in a Parallel Sysplex (5).
  (1) It is a function of DB2 for z/OS, not a separate product
  (2) Only available on DB2 for z/OS
  (3) REAL CONCURRENT access to data
REAL CONCURRENT access?
  [Figure: App 1 and App 2 on a single DBMS, compared with App 1 and App 2 on DBMS 1 and DBMS 2]
Data Sharing (continued)
  (4) Different DB2 subsystems on different CPCs; various configurations are possible
  (5) Parallel Sysplex
Parallel Sysplex
  - Base technology for several important technologies
    - Data Sharing (IBM DB2 z/OS)
    - CICSPlex (IBM CICS)
    - Shared Queues (IBM WebSphere Message Queue)
  - Parallel Sysplex exists only on System z; no comparable technology anywhere else (Oracle, Microsoft, ...)
  - Sysplex (System Complex): group of z/OS systems that communicate and cooperate with one another
    - Synchronization through a Sysplex Timer
    - Connection using ESCON, FICON
  - Parallel Sysplex = Sysplex + Coupling Facility
    - Lock processing
    - High-speed caching
Sysplex / Shared Disk
  [Figure: z/OS 1 and z/OS 2, each with an XCFAS address space, connected via XCF]
Parallel Sysplex / Shared Data
  [Figure: z/OS 1 and z/OS 2, each with XCFAS and XES, connected via XCF and to the CF]
Data Sharing Architecture
  Shared Nothing
    - Database is partitioned
    - Disks are not shared
    - Distributed commit required
    - Data repartitioning when adding nodes
    - (+) Performance
    - (-) If one node dies...
  Shared Disks
    - No partitioning required
    - Dynamic load balancing
    - (-) Inter-node communication via messages for locking, data integrity, and buffering
  Shared Data
    - Locking and caching are handled by the CF
    - Load balancing
    - Continuous availability
    - Flexible growth, no additional overhead
Shared Disk / Sysplex vs. Shared Data / Parallel Sysplex
Coupling Facility
  - Connector between z/OS systems
  - Minimal OS that contains different data structures
    - Lock structures
    - Cache structures
    - List structures
  - CF requires 1 LPAR (as a separate OS)
    - External CF: CF in a separate box
    - Internal CF (ICF): CF in the same box as DB2
  - Requires a special-purpose processor
Scalability
  - 2 critical factors for scalability
    - Concurrency control (locking)
    - Buffer coherency control (buffering, managing changed data)
  - (1) Concurrency Control: Global Locking
    - Allow multiple read operations or a single write operation in the DSG
  - (2) Coherency Control: Managing Changed Data
    - Situation in which one DB2 member changes data rows that already reside in the buffers of other members
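The "multiple reads or single write" rule can be illustrated with a minimal sketch. The class `GlobalLockTable`, the lock modes `"S"` and `"X"`, and the member and resource names are hypothetical and chosen only for illustration; this is not how IRLM or the CF lock structure is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalLockTable:
    """Toy group-wide lock table: resource -> (mode, set of holding members)."""
    holders: dict = field(default_factory=dict)

    def request(self, member: str, resource: str, mode: str) -> bool:
        """Grant 'S' (read) or 'X' (write) only if compatible with current holders."""
        if mode not in ("S", "X"):
            raise ValueError("mode must be 'S' or 'X'")
        current = self.holders.get(resource)
        if current is None:
            self.holders[resource] = (mode, {member})
            return True
        held_mode, members = current
        if mode == "S" and held_mode == "S":
            members.add(member)  # many readers may share the resource
            return True
        return False  # any combination involving a writer conflicts

    def release(self, member: str, resource: str) -> None:
        """Drop one member's hold; remove the entry when nobody holds it."""
        current = self.holders.get(resource)
        if current is None:
            return
        _, members = current
        members.discard(member)
        if not members:
            del self.holders[resource]


if __name__ == "__main__":
    table = GlobalLockTable()
    print(table.request("DB2A", "PAGE1", "S"))  # first reader
    print(table.request("DB2B", "PAGE1", "S"))  # second reader
    print(table.request("DB2C", "PAGE1", "X"))  # writer while readers hold the page
```

The sketch deliberately ignores queuing of waiters and lock upgrades; it only shows the compatibility decision.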
Global Locking & Lock Structure
  - IRLM (Internal Resource Lock Manager)
    - Separate address space responsible for lock management in DB2 and IMS!
    - Responsible for local and global locking
    - Uses the Cross-System Coupling Facility (XCF)
  - False contention
    - The hash value is the same for two different resources
    - Rare: the lock structure is in general large enough
  - No messages between IRLMs as long as the lock grant is successful
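False contention can be sketched as a hash collision in a fixed-size lock table. The CRC32 hash, the table sizes, and the resource names below are assumptions made for illustration; the actual hashing used for the lock structure is not described in these slides.

```python
import zlib


def lock_entry(resource: str, table_size: int) -> int:
    """Map a resource name to a lock table entry (hypothetical hash)."""
    return zlib.crc32(resource.encode("utf-8")) % table_size


def classify_contention(held: str, requested: str, table_size: int) -> str:
    """Return 'none', 'real', or 'false' for a held and a requested resource."""
    if lock_entry(held, table_size) != lock_entry(requested, table_size):
        return "none"
    return "real" if held == requested else "false"


def find_false_contention(resources, table_size):
    """Return the first pair of distinct resources that share a table entry."""
    seen = {}
    for name in resources:
        slot = lock_entry(name, table_size)
        if slot in seen and seen[slot] != name:
            return seen[slot], name
        seen[slot] = name
    return None


if __name__ == "__main__":
    pages = [f"PAGE{i}" for i in range(10)]
    # With 10 resources and only 4 entries, a collision is unavoidable.
    print(find_false_contention(pages, 4))
```

A larger table makes such collisions less likely, which matches the slide's remark that false contention is rare when the lock structure is large enough.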
Data Sharing Lock Types
  - Logical Locks (L-Locks)
    - Locks held by transactions
  - Physical Locks (P-Locks)
    - Only in data sharing
    - Part of the process to manage buffer coherency
    - Locks on table spaces and index spaces to discover inter-DB2 read/write interest
  - Inter-DB2 read/write interest = situation where at least one member in the DSG is reading a table space or index space and another member is writing to it
Deadlock Detection in DSG
  - Local vs. global deadlock detection
    - Local Deadlock Detector (LDD): (at least) two waiters are on the same DB2
    - Global Deadlock Manager (GDM): (at least) two waiters are on different DB2s
  - Global deadlock detection requires all IRLMs in the DSG
    - Each IRLM knows only its own local locks
  [Figure: three IRLMs, each with an LDD, and one GDM]
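The difference between local and global detection can be sketched with a wait-for graph: a cycle that spans two members is invisible to each member on its own and only appears once the edges of all members are combined. The tuple layout, transaction names, and helper functions below are hypothetical; they are not the IRLM protocol.

```python
def has_cycle(edges):
    """Detect a cycle in a wait-for graph given as (waiter, holder) pairs."""
    graph = {}
    for waiter, holder in edges:
        graph.setdefault(waiter, []).append(holder)
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # back edge: the waiter chain loops onto itself
        visiting.add(node)
        for nxt in graph.get(node, []):
            if visit(nxt):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(visit(node) for node in list(graph))


def local_edges(waits, member):
    """Edges a single member can see: waiter and holder both run on it."""
    return [(w, h) for wm, w, hm, h in waits if wm == member and hm == member]


def global_edges(waits):
    """Edges visible once every member contributes its information."""
    return [(w, h) for _, w, _, h in waits]


if __name__ == "__main__":
    # (waiter member, waiter, holder member, holder)
    waits = [
        ("DB2A", "T1", "DB2B", "T2"),
        ("DB2B", "T2", "DB2A", "T1"),
        ("DB2A", "T3", "DB2A", "T4"),
    ]
    print(has_cycle(local_edges(waits, "DB2A")))
    print(has_cycle(local_edges(waits, "DB2B")))
    print(has_cycle(global_edges(waits)))
```

This is why the global view needs input from every IRLM in the group: leaving one out could hide exactly the edge that closes the cycle.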
Group Bufferpools
  - The challenge in managing data in a DSG is that DB2 buffers data
  - Problem:
    - 1 page is in the BP of Member 1 and of Member 2
    - The page on Member 1 gets updated
    - The page on Member 2 has to be updated, too
  - A group bufferpool is a cache structure in the CF
  [Figure: DB2A and DB2B, each with BP0, BP1, BP2; CF with GBP0, GBP1, GBP2]
Group Bufferpools
  [Figure sequence: DB2A (BP4) and DB2B (BP4) connected to GBP4. Panels: initial state; inter-DB2 read/write interest; updated page in GBP4; final state]
Group Bufferpools
  - Castout
    - Writes pages from the primary GBP to disk
    - Castout responsibility is spread across all of the members
    - There is no connection between the CF and the data
  - Benefit?
    - Data is copied from buffer to buffer: performance
  [Figure: DB2A (BP4) and DB2B (BP4), each labeled CU-B, and GBP4]
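The buffer coherency flow from the group bufferpool slides can be sketched as a small simulation. The classes `DataSharingGroup` and `Member` and their methods are hypothetical names; the sketch assumes every update is written to the GBP immediately and ignores commit timing, P-locks, and GBP-dependency tracking.

```python
class DataSharingGroup:
    """Toy model: shared disk plus one group bufferpool (GBP) in the CF."""

    def __init__(self, disk):
        self.disk = dict(disk)    # page -> value on shared disk
        self.gbp = {}             # page -> value cached in the GBP
        self.gbp_changed = set()  # pages in the GBP not yet cast out
        self.members = []


class Member:
    """One DB2 member with a local bufferpool."""

    def __init__(self, name, group):
        self.name = name
        self.group = group
        self.local = {}           # page -> [value, valid flag]
        group.members.append(self)

    def read(self, page):
        entry = self.local.get(page)
        if entry is not None and entry[1]:
            return entry[0]       # valid local copy, no CF or disk access
        group = self.group
        # Refresh from the GBP if the page is there, otherwise from disk.
        value = group.gbp[page] if page in group.gbp else group.disk[page]
        self.local[page] = [value, True]
        return value

    def update(self, page, value):
        self.local[page] = [value, True]
        self.group.gbp[page] = value
        self.group.gbp_changed.add(page)
        # Cross-invalidate the copies held by the other members.
        for other in self.group.members:
            if other is not self and page in other.local:
                other.local[page][1] = False

    def castout(self):
        """Write changed GBP pages to disk through this member."""
        group = self.group
        for page in sorted(group.gbp_changed):
            group.disk[page] = group.gbp[page]
        group.gbp_changed.clear()


if __name__ == "__main__":
    group = DataSharingGroup({"P1": "v0"})
    db2a, db2b = Member("DB2A", group), Member("DB2B", group)
    db2a.read("P1")
    db2b.read("P1")
    db2a.update("P1", "v1")
    print(db2b.read("P1"))   # refreshed from the GBP, not from disk
    db2b.castout()
    print(group.disk["P1"])
```

In the sketch the castout is performed by a member, not by the CF, mirroring the slide's point that the CF has no connection to the data.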
Outages
  - Planned outages
    - Installing maintenance
    - Migration to a new version
  - Unplanned outages
    - Unwanted, of course
    - But more interesting
Planned Outages
  DB2 outage
    1. Direct any workload away from the DB2 z/OS member
    2. Shut down DB2
    3. Make the changes
    4. Restart DB2
    5. Start accepting workloads again
    6. Repeat steps 1-5 for every other member in the DSG
  This is called rolling maintenance or rolling IPLs
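The steps above can be sketched as a loop that takes one member out of service at a time. The function `rolling_maintenance`, its callback, and the member names are hypothetical; real procedures involve workload routing and operator commands that are not modeled here.

```python
def rolling_maintenance(members, apply_change):
    """Apply a change to each member in turn while the others keep working.

    Returns the smallest number of members that were accepting work at any
    point, so a caller can check that the group never went fully offline.
    """
    accepting = set(members)
    min_accepting = len(accepting)
    for member in members:
        accepting.discard(member)      # step 1: direct workload away
        if not accepting:
            raise RuntimeError("no member left to take the workload")
        min_accepting = min(min_accepting, len(accepting))
        apply_change(member)           # steps 2-4: shut down, change, restart
        accepting.add(member)          # step 5: accept workloads again
    return min_accepting               # step 6 is the loop itself


if __name__ == "__main__":
    changed = []
    remaining = rolling_maintenance(["DB2A", "DB2B", "DB2C"], changed.append)
    print(changed, remaining)
```

With a single member the sketch raises an error, reflecting that rolling maintenance needs at least one other member to absorb the workload.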
DB2 z/OS Migration
  [Diagram: migration paths V8 NFM -> V10 CM8 -> V10 ENFM8 -> V10 NFM and V9 NFM -> V10 CM9 -> V10 ENFM9 -> V10 NFM, with the additional modes V10 CM8*, V10 ENFM8*, V10 CM9*, and V10 ENFM9*]
  CM8/9
    - Just to make sure that the code runs
    - In a DSG: CM8/9 & NFM8/9 can coexist
    - No V10 new functions available
  CM8/9*
    - Same as CM8/9, but the system was already in ENFM8/9, ENFM8*/9*, or NFM
    - In a DSG: CM8/9 & NFM8/9 cannot coexist
    - Objects created in ENFM8/9 or NFM can still be used
DB2 z/OS Migration
  ENFM8/9
    - In a DSG: ENFM8/9 & NFM8/9 cannot coexist
    - No V10 new functions available
  ENFM8/9*
    - Same as ENFM8/9, but the system was already in NFM (or in CM8*/9*)
    - In a DSG: ENFM8*/9* & NFM8/9 cannot coexist
    - Objects created in NFM can still be used
    - Creating new NFM objects is not possible
DB2 z/OS Migration
  NFM
    - Catalog completely migrated
    - All new V10 functions available
  Customers typically stay in CM for weeks and in ENFM for minutes
Planned Outages
  CF outage
    - Move the structures to the secondary CF (rebuild)
    - Should be scheduled during low activity
Unplanned Outages: DB2
  - New work is automatically rerouted to another member in the DSG
  - Transactions, tasks, and queries are candidates for recovery
  [Figure: DB2A and DB2B]
  - Problem?
    - DB2A may hold locks on resources on behalf of running transactions
    - If those locks are global locks, the update locks are changed to retained locks
    - No other access to these resources is allowed until the changes are committed or backed out
  - Resolve retained locks? Restart DB2A
  Database availability:
    Configuration   Availability (%)   Outage (hr/yr)
    Sysplex         99.999             < 1
    Simplex         99.5               43.8
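The hours-per-year column follows directly from the availability percentage. A minimal sketch of the arithmetic, assuming a non-leap year of 365 days (the function name is hypothetical):

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, assuming a non-leap year


def downtime_hours_per_year(availability_percent: float) -> float:
    """Convert an availability percentage into expected outage hours per year."""
    return (100.0 - availability_percent) / 100.0 * HOURS_PER_YEAR


if __name__ == "__main__":
    for name, availability in (("Sysplex", 99.999), ("Simplex", 99.5)):
        hours = downtime_hours_per_year(availability)
        print(f"{name}: {hours:.4f} hr/yr ({hours * 60:.1f} min/yr)")
```

For 99.5 % this gives 0.005 x 8760 = 43.8 hours per year; for 99.999 % it gives about 0.0876 hours, roughly 5 minutes per year, consistent with the "< 1" entry.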
Unplanned Outages: DB2
  - Only the DB2 address space or IRLM failed?
    - Restart DB2 (e.g. using ARM)
    - During restart, retained locks are released
  - z/OS, CPC, LPAR, ... failed?
    - Problem: release retained locks as quickly as possible
    - But: an IPL takes 30 minutes
    - Solution: start DB2A on another LPAR that is online and connected to the CF (cross-system restart)
    - Then use the Restart Light option
      - Requires minimal resources
      - Shuts down immediately after releasing the locks!
Unplanned Outages: CF
  - Not a real problem
  - In high-availability scenarios the CF is mirrored
  - Switch to the secondary CF happens in the sub-second range
  - Performance may slow down a little, but DB2 still works!
Flexibility
  [Figure: Network and Transaction Managers; App 1, App 2, ..., App n; DB2A ... DB2n forming the DSG; SYSA ... SYSn forming the z/OS Parallel Sysplex]
  - Environment change? No app change
    - Portability: execute the app on more than one member
    - Frequent commits
    - Lock avoidance: return rows without holding locks
    - Concurrency: execute more than one instance of an app on different members at the same time
  - Replace/add/remove a member in DB2 data sharing
    - As shown in migration
    - No problem, except a minimal performance loss in the worst case
  - Replace/add/remove an LPAR in the Parallel Sysplex
    - Means that at least one DB2 member will change; see above
Questions? Comments?
Internships!
  DB2 workload development
    - Further development of an existing framework for workloads
    - Development of new workloads
    - Implementation of requirements within a development process
    - Required skills: Java and databases (basic knowledge)
    - Exciting, self-contained tasks to be handled independently
    - Opportunity to actively shape the work
  Test development / test automation
    - Test automation for a Java-based GUI or a command-line-based UI
    - Further development of the existing test cases
    - Implementation of new requirements for the framework
    - Required skills: Java, XML, and databases (basic knowledge)
    - Work in a complex software development project
    - Opportunity to actively shape the work
  ID: JO15356 / ID: JO12163
zsummer University
  - 2-week paid workshop
  - Everything about System z and its most important technologies: DB2 z/OS, CICS, IMS
  - Hands-on labs: work on z/OS yourself
  - 29.08.2011 - 09.09.2011
  - ibm.com/de/entwicklung