High Availability and Disaster Recovery for SAP HANA with SUSE Linux Enterprise Server for SAP Applications
Uwe Heinz, Product Manager, SAP, Uwe.Heinz@sap.com
Fabian Herschel, Senior Architect SAP LinuxLab, SUSE SAP Global Alliance, Fabian.Herschel@suse.com
Agenda
- SAP HANA typical implementations
- Outlook for the next 12 to 18 months
- Disaster Recovery Capabilities of SAP HANA
- Automate SAP HANA System Replication
- SAPHanaSR - Setup and Implementation
- SAPHanaSR - Roadmap
- Our Community
Introduction & Overview (SAP and Linux)
- 15 years of Linux as a development platform
- Timeline 1999 to 2014: foundation of the LinuxLab (CeBIT 99) and growing...; SAP NetWeaver; SAP BWA and SAP HANA; SoH and BWonH
- SAP supports only: SLES, RHEL and Oracle Linux
2014 SAP SE or an SAP affiliate company. All rights reserved. Public
Joint activities
- Development (> 15 years, 4 developers in the SAP LinuxLab): NFS local loopback, NUMA optimization, benchmarks, GCC optimization, hardening, SUSE HA, KVM/Xen, SLES4SAP, HANA for Linux on Power
- Joint support: priority support, maintenance
- Events: TechEd Berlin/Las Vegas, >50 customer WS, DSAG/ASUG, SUSE Labs conference, SUSECon
High Availability & Disaster Recovery Customer http://www.saphana.com/docs/doc-2010
High Availability / Disaster Recovery Overview (Business Continuity)
High Availability (per Data Center)
- 1 SAP HANA Host Auto-Failover (Scale-Out with Standby)
- 2 SAP HANA System Replication (Performance Optimized or Cost Optimized)
Disaster Recovery (between Data Centers)
- 3 SAP HANA Storage Replication
- 4 SAP HANA System Replication (Performance Optimized or Cost Optimized)
High Availability Options 1 Scale-Out or Host Auto-Failover
SAP HANA High Availability: Host Auto-Failover
High Availability configuration
- N active servers in one cluster
- M standby server(s) in one cluster
- Shared file system for all servers
Services
- Name and index server on all nodes
- Statistics server (only on one active server)
- Name server active on Standby
Failover
- Server X fails
- Server N+1 reads indexes from shared storage and connects to logical connection of server X
- Storage Connector API ensures remount of necessary disk areas (Note 1900823 - Storage Connector API Attachments)
Diagram: Server 1 to Server 6 plus a Standby Server, attached through the Storage Connector API to shared SAN storage.
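The failover sequence on this slide can be pictured with a small toy model. This is only a sketch under stated assumptions: the class, the host names and the role names are hypothetical, and the remount performed by the Storage Connector API is represented by a comment, not by real storage calls.

```python
from dataclasses import dataclass, field


@dataclass
class ScaleOutCluster:
    """Toy model of N active hosts plus M standby hosts (hypothetical names)."""

    active: dict[str, str]                      # host name -> logical role
    standby: list[str] = field(default_factory=list)

    def fail(self, host: str) -> str:
        """Hand the logical role of a failed host to the first free standby."""
        if host not in self.active:
            raise KeyError(f"{host} is not an active host")
        if not self.standby:
            raise RuntimeError("no standby host left, failover not possible")
        role = self.active.pop(host)
        replacement = self.standby.pop(0)
        # In a real landscape the Storage Connector API would remount the
        # failed host's disk areas at this point; the model only moves the role.
        self.active[replacement] = role
        return replacement


cluster = ScaleOutCluster(
    active={"server1": "worker-1", "server2": "worker-2"},
    standby=["server3"],
)
new_host = cluster.fail("server2")
print(new_host, cluster.active)
```

The model makes one property of the design visible: once the standby list is empty, a further host failure cannot be absorbed, which is why M is a sizing decision.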
HANA High Availability: Host Auto-Failover (standby)
Different implementation of High Availability by HW partners
- Using storage solution inside NFS/cluster fs / reallocating block devices
- Using internal disk
Diagram: three hosts, each with Name Server and Index Server, one of them marked Standby; Data Disks and Log Disks on GPFS.
High Availability Options 2 SAP HANA System Replication
SAP HANA High Availability: System Replication, Performance Optimized
Performance optimized option
- Secondary system completely used for the preparation of a possible take-over
- Resources used for data pre-load on Secondary
- Take-over time and performance ramp-up shortened as far as possible
Diagram (Data Center 1): Clients and Application Servers reach the Primary (active) via OS: DNS, hostnames, virt. IPs. Transfer by HANA database kernel to the Secondary (active, data pre-loaded). Each side runs Name Server and Index server on internal disks (Data Disks, Log Disks) and is labelled HA Solution Partner.
SAP HANA High Availability: System Replication, Cost Optimized
Cost optimized by operating non-prod systems on Secondary
- Resources freed (no data pre-load) to be offered to one or more non-prod installations
- During take-over the non-prod operation has to be ended
- Take-over performance similar to cold start-up
Diagram (Data Center 1): Clients and Application Servers reach the Primary (active) via OS: DNS, hostnames, virt. IPs. Transfer by HANA database kernel to the Secondary, where PRD runs in shadow operation (PRD Data Disks, Log Disks) next to the running QA/DEV system (QA/DEV Data Disks, Log Disks). Each side runs Name Server and Index server and is labelled HA Solution Partner.
Disaster Recovery Options 3 SAP HANA Storage Replication
SAP HANA Disaster Recovery: Storage Replication
Cluster across Data Centers with non-prod on 2nd site
- Usually offered with strong involvement of the hardware partners
- Support issues handled by/routed to HW partners
- TCO reduction by combined operation with non-prod on Secondary
- Needs another disk stack for non-prod usage load
- Cluster management often included and delivered as a whole package
Diagram: Data Center 1 holds the Primary (Clients and Application Servers connect via OS: DNS, hostnames); Data Center 2 holds the Secondary with Prod. (inactive) and QA&DEV (active). Storage Mirroring links the sites. Each host runs Name Server and Index server with Data Volumes and Log Volume attached via OS: Mounts; both sites are labelled HA Solution Partner.
Disaster Recovery Options 4 SAP HANA System Replication
SAP HANA Disaster Recovery: System Replication
Cluster across Data Centers with DB controlled transfer
- Performance optimized option
- Faster Take-Over
- Shortened Performance Ramp (seconds to a few minutes)
- SYNC & ASYNC possible
- Several cluster options
- Some HW Partners offer prepackaged options
- Step-by-Step Implementation Guide (updated recently to SPS8): https://scn.sap.com/docs/doc-47702
Diagram: Data Center 1 holds the Primary (active); Clients and Application Servers connect via OS: DNS, hostnames, virt. IPs. Transfer by HANA database kernel to Data Center 2 with the Secondary (active, data pre-loaded). Each host runs Name Server and Index server with Data Volumes and Log Volume attached via OS: Mounts; both sites are labelled HA Solution Partner.
How to configure HANA system replication
SAP HANA in Data Centers Video about SAP HANA System Replication http://www.saphana.com/docs/doc-4152 https://www.youtube.com/watch?v=obuiwmjarpc
System replication using SAP HANA studio
- Same SID
- Same system number
- Same number of services (index, name, ...)
System replication using SAP HANA studio
- Complete data backup
System replication using SAP HANA studio
- Apply a logical system name, e.g. the location
System replication using SAP HANA studio
- First stop the secondary site
- Apply a logical name, e.g. the location
System replication using SAP HANA studio
1. Start the secondary site
2. Replication starts using backup and logs
3. Takeover done using SAP HANA Studio (admin)
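The slides show these steps in SAP HANA studio. A command-line equivalent can be sketched as follows, assuming the `hdbnsutil` options `-sr_enable`, `-sr_register` and `-sr_takeover` with the flag names commonly documented for SPS8-era systems. Flag names differ between HANA revisions (newer releases use other names for the replication mode), so check them against the installed version. The site names `SITEA` and `SITEB` are hypothetical, and the sketch only builds argument lists without executing anything.

```python
def sr_enable(site_name: str) -> list[str]:
    """Command to enable system replication on the primary (run as <sid>adm)."""
    return ["hdbnsutil", "-sr_enable", f"--name={site_name}"]


def sr_register(remote_host: str, remote_instance: str,
                mode: str, site_name: str) -> list[str]:
    """Command to register the stopped secondary against the primary."""
    return [
        "hdbnsutil", "-sr_register",
        f"--remoteHost={remote_host}",
        f"--remoteInstance={remote_instance}",
        f"--mode={mode}",            # assumption: older flag name for sync/async
        f"--name={site_name}",
    ]


def sr_takeover() -> list[str]:
    """Command to promote the secondary to primary."""
    return ["hdbnsutil", "-sr_takeover"]


def replication_setup_plan(primary_host: str, secondary_host: str,
                           instance: str, primary_site: str,
                           secondary_site: str, mode: str = "sync"):
    """Ordered (host, command) pairs; a complete data backup on the primary
    is assumed to have been taken before the first step."""
    return [
        (primary_host, sr_enable(primary_site)),
        (secondary_host, ["HDB", "stop"]),
        (secondary_host, sr_register(primary_host, instance, mode, secondary_site)),
        (secondary_host, ["HDB", "start"]),
    ]


plan = replication_setup_plan("suse01", "suse02", "00", "SITEA", "SITEB")
for host, command in plan:
    print(f"{host}: {' '.join(command)}")
```

Building the commands as lists rather than strings keeps them ready for `subprocess.run` without shell quoting, should the plan ever be executed by a script.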
High Availability for SAP with SUSE
- Around 10 years of experience with High Availability for SAP NetWeaver systems
- Implementation of the HANA SR automation started around 1 year ago
- Solutions are jointly developed between SUSE, SAP, customers and partners in the SAP Linux Lab in Walldorf
SAP HANA System Replication and SLES for SAP Applications
Diagram: node 1 runs HANA PR1 primary (A) and node 2 runs HANA PR1 secondary (B), each with boxes N and M; resource failover, active / active; HANA database System Replication A -> B with HANA memory-preload.
Automate SAP HANA System Replication Service Level Agreement SAP HANA System Replication SUSE High Availability Solution
SUSE Linux Enterprise High Availability Extension Cluster for SAP HANA System Replication
Diagram sequence (System PR1 on node 1 and node 2, managed by Pacemaker):
1. node 1: SAP HANA PR1 primary with vip; System Replication to node 2: SAP HANA PR1 secondary
2. node 1: SAP HANA PR1 primary, vip no longer shown; node 2: SAP HANA PR1 secondary
3. node 1: SAP HANA PR1 [primary]; node 2: SAP HANA PR1 primary with vip
4. node 1: SAP HANA PR1 secondary; node 2: SAP HANA PR1 primary with vip
The direction of the system replication is only changed if the parameter AUTOMATED_REGISTER has been set to true.
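The effect of AUTOMATED_REGISTER after a takeover can be written down as a small decision function. This is a sketch of the rule stated on the slide, not the resource agent's code: the function name and the role label "unregistered" are hypothetical and stand for "former primary is left for the administrator to register manually".

```python
def roles_after_takeover(former_primary: str, new_primary: str,
                         automated_register: bool) -> dict[str, str]:
    """Return the role of each node once the takeover has completed."""
    roles = {new_primary: "primary"}
    if automated_register:
        # Replication direction is reversed: the former primary is
        # registered as the new secondary by the cluster.
        roles[former_primary] = "secondary"
    else:
        # Direction is not changed automatically; manual registration needed.
        roles[former_primary] = "unregistered"
    return roles


print(roles_after_takeover("node1", "node2", automated_register=True))
print(roles_after_takeover("node1", "node2", automated_register=False))
```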
From Concept to Implementation
Diagram: suse01 (SAP HANA Primary, vip) and suse02 (SAP HANA Secondary), connected by Cluster Communication and Fencing. SAPHana Master/Slave resource: Master on suse01, Slave on suse02. SAPHanaTopology Clone resource: one Clone on each node.
HANA System Replication in HAWK
SAPHanaSR Delivery Package SAPHanaSR with two resource agents: SAPHanaTopology and SAPHana Setup Guide SAPHanaSR HAWK Wizard and
Four Steps to Install and Configure
1. Install SAP HANA
2. Configure SAP HANA System Replication
3. Install and initialize SUSE Cluster
4. Configure SR Automation using HAWK wizard
SAPHanaSR HAWK Wizard Technical preview included in the shipping.
Outlook: SAPHanaSR-monitor
Internal Testing: Test-Driver with >25K Tests
The Five Interfaces
- HANA Startframework: sapstartsrv / sapcontrol / HDB (calls, output format of GetProcessList)
- HANA-Topology: landscapehostconfiguration.py (rc, output format)
- SR-Topology: hdbnsutil (calls, output format of -sr_state --sapcontrol=1)
- SAPHostagent: saphostctrl (call, output format of ListInstances)
- SR-Status: hdbsql (now) / systemreplicationstatus.py (SPS09) (rc, calls, output format)
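Because the automation depends on the output format of these tools, a parser for the SR-Topology interface is a natural illustration. The sketch below assumes that `hdbnsutil -sr_state --sapcontrol=1` prints `key=value` lines framed by marker lines; the sample text is a hypothetical stand-in and not captured output, and the real key names should be checked on an actual system.

```python
def parse_sr_state(text: str) -> dict[str, str]:
    """Collect key=value lines, ignoring marker lines that contain no '='."""
    result: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        result[key.strip()] = value.strip()
    return result


# Hypothetical sample in the assumed format (not real tool output).
SAMPLE = """\
SAPCONTROL-OK: <begin>
mode=primary
site id=1
site name=SITEA
SAPCONTROL-OK: <end>
"""

state = parse_sr_state(SAMPLE)
print(state)
```

A parser this strict about the format is also the reason the interfaces are listed explicitly: any change in the tools' output has to be caught by tests before it reaches a cluster.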
Allowed Scenarios (so far)
- Two-node clusters
- Scale-up (single-box to single-box) HANA system replication
- Single-tier System Replication (A -> B), no multi-tier
- Preferred site takeover active - there is no other SAP HANA system (like DEV, TST, QAS) on the replicating node that needs to be stopped during takeover (not a technical limit, but requires additional testing)
- Both physical and virtual SAP host names
Requirements
- Both SAP HANA instances have the same SAP Identifier (SID) and Instance Number
- Both cluster nodes are time-synchronized (NTP)
- Both nodes are in the same network segment (layer 2)
- Technical users and host names are resolved locally
- Distance / Latencies
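The first requirement lends itself to a simple pre-flight check. The following is a minimal sketch, assuming the SID and instance number of each node are already known to the caller; the class and function names are hypothetical, and the remaining requirements (time sync, network segment, name resolution, latency) are not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HanaInstance:
    host: str
    sid: str
    instance_number: str


def check_pair(a: HanaInstance, b: HanaInstance) -> list[str]:
    """Return a list of violated requirements; an empty list means OK."""
    problems = []
    if a.host == b.host:
        problems.append("both instances are on the same host")
    if a.sid != b.sid:
        problems.append(f"SID differs: {a.sid} vs {b.sid}")
    if a.instance_number != b.instance_number:
        problems.append(
            f"instance number differs: {a.instance_number} vs {b.instance_number}"
        )
    return problems


ok = check_pair(HanaInstance("suse01", "PR1", "00"),
                HanaInstance("suse02", "PR1", "00"))
bad = check_pair(HanaInstance("suse01", "PR1", "00"),
                 HanaInstance("suse02", "PR1", "10"))
print(ok, bad)
```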
Roadmap / Next Steps
Scale-Out (@A -> @B)
- Currently under development
- PoC Tests expected in Q1/2015
Multi-tier System Replication Chain Topology (A -> B -> C)
- Currently under testing
- Partner Tests expected in Q4/2014
Single-tier System Replication and DEV / TST (A -> [B] + DEV)
- Cluster configuration already available
- Partner Tests expected in Q4/2014
Outlook: HANA in a SUSE Linux Enterprise High Availability Extension Cluster
HANA Multi Node System Replication / Scale-Out
Diagram: swarm 1 runs HANA PR1 primary (A) and swarm 2 runs HANA PR1 secondary (B), each with boxes N and M; resource failover, active / active; HANA database System Replication A -> B with HANA memory-preload.
This scenario is currently in development.
Our Community
- Developed jointly in the SAP Linux Lab in Walldorf
- Integration of the solution in partner products
- Upstream open-source project
- Scoping, discussing and implementing Scale-Out
You are invited to join our community :-) Visit our booth or contact us via sapalliance@suse.com or saphana@suse.com
SUSE SAPHanaSR in 3 Facts
Reduces complexity
- provides a wizard for easy configuration with just SID, instance number and IP address
- automates the sr-takeover and IP failover ("bind")
Reduces risk
- always includes a consistent picture of the SAP HANA topology
- provides a choice for automatic registration and site takeover preference
Increases reliability
- provides short takeover times, especially for table preload scenarios
- includes the monitoring of the system replication status to increase data consistency
Find our Best Practices at: www.suse.com/products/sles-for-sap/resource-library/ Thank you.
Corporate Headquarters Maxfeldstrasse 5 90409 Nuremberg Germany +49 911 740 53 0 (Worldwide) www.suse.com Join us on: www.opensuse.org