Database high availability




Seminar in Data and Knowledge Engineering
Viktorija Šukvietytė
December 27, 2014

1. Introduction

The contemporary world is flooded with data. In 2010, Eric Schmidt, then CEO of Google, described this information flow as follows: all the information that people had created up until 2003 can now be created in just two days. Imagine what would happen if this flow of information stopped. We usually use databases to collect knowledge, and it is very important that these sources of knowledge are accessible all the time, i.e. database high availability is very important to us.

Database developers use various technologies to protect their databases from failures in every way possible. When a failure does occur, they try to bring the database back into operation as soon as possible and with the lowest possible loss of data. Maintaining high availability (HA) requires considerable additional resources and knowledge of specific database technologies.

We will present an alternative, checkpoint-based replication method for DB HA assurance that builds on live migration (the ability to migrate virtual machines from one physical host to another while they are running). We will discuss the advantages and disadvantages of this approach and compare it with the methods that the database itself uses to ensure high availability.

This paper describes the ways of assuring DB high availability in an order that shows the evolution of the ideas. In Sections 2.1 and 2.2, we present Active Data Guard [1] and Real Application Cluster [2], solutions that DB developers apply to ensure DB HA. Sections 3.1-3.3 describe the assurance of DB HA using VM technologies: the ReVirt system [3], Remus DB [4] and SecondSite [5]. In principle, this is a completely different way of ensuring DB HA; the idea, however, is borrowed from DBMS methods and applied in a different environment.

2. High availability assurance by applying DBMS technologies

DBMS technologies take care of most scenarios that might impact the availability of a database. Many things can make the resources of a database inaccessible: administrator and other human mistakes, data corruption caused by the system, software faults, complete site failure, system maintenance operations and data maintenance operations, as well as malicious activity by hackers. Database developers want protection against these and similar scenarios, so they use a variety of technologies to ensure DB high availability. We will examine two solutions that ensure DB high availability, Active Data Guard and Real Application Cluster. These methods use many DB technologies that help to solve the problems encountered. We will also provide an overall assessment of them with respect to HA assurance.

2.1 Active Data Guard (DB technology)

Active Data Guard is the most compact high-level solution for ensuring DB high availability when a DB crashes. It prevents data loss and keeps downtime, and thus the time during which the DB is unavailable to the user, to a minimum.

The principal schema of Active Data Guard is shown in Figure 1. It consists of a primary database and one or more standby databases. The primary database generates log files. These are transferred over the network to the standby database, which is always in recovery mode, i.e. the standby database applies the transferred files. Active Data Guard is responsible for:
- Monitoring the performance of both the primary and the standby DB.
- Transferring log files from the primary DB to the standby DB.
- Converting the standby DB into the primary DB if the primary crashes.
- Providing access to the database as fast as possible.

Fig. 1 Principal schema of Active Data Guard.

Intensive work with a DB (especially insert, delete and update operations) generates many changes in the DB log files. DBMS technologies are applied to reduce this amount of information, because all of it needs to be transferred over a network to the standby DB. As examples of reducing the amount of data to be transferred to the standby DB, we can mention the methods applied in Oracle DB:
- The LOGGING/NOLOGGING option can be set at the DB or session level. This allows the logging of some operations to be turned on or off. It is usually applied to temporary, easily rebuilt or generated data.
- Log files contain not the changed table records themselves, but only the changed blocks of those records.
- Log files are archived and only then transferred to the standby DB.

2.2 Real Application Cluster (DB technology)

Real Application Cluster (RAC) is a premier shared-disk database clustering technology. RAC provides customers with the highest database availability by removing the single database server as a single point of failure. In a clustered server environment, the database itself is shared across a pool of servers, which means that if any server in the pool fails, the database continues to run on the surviving servers. The principal schema of RAC is shown in Figure 2. RAC consists of several DB servers connected into one cluster. These servers serve the same shared-disk database. The DB servers participating in the cluster must be linked by a fast, high-bandwidth interconnect, because all of them use the shared cache. Users connect to the cluster transparently and thus gain access to the DB. If any DB server fails, client-server communication can survive: all connections are automatically taken over by another, less loaded DB server in the cluster, and users can continue their work without interruption. It is worth noting that the technology used in RAC makes the shared cache possible. This technology allows the work running on one server to be taken over by another server when the first one fails.
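The failover behaviour described above, in which the sessions of a failed server are taken over by the less loaded surviving servers, can be illustrated with a small sketch. This is a toy model under stated assumptions, not Oracle's implementation: the classes `DbServer` and `Cluster`, the node names and the "fewest sessions" load measure are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DbServer:
    """One database server in the cluster (hypothetical model)."""
    name: str
    alive: bool = True
    sessions: set = field(default_factory=set)


class Cluster:
    """Toy cluster in which every server serves the same shared database."""

    def __init__(self, names):
        self.servers = {n: DbServer(n) for n in names}

    def _least_loaded(self):
        alive = [s for s in self.servers.values() if s.alive]
        if not alive:
            raise RuntimeError("no surviving server in the cluster")
        # Assumption: load = number of sessions; ties are broken by name.
        return min(alive, key=lambda s: (len(s.sessions), s.name))

    def connect(self, session_id):
        """Place a new session on the least loaded surviving server."""
        server = self._least_loaded()
        server.sessions.add(session_id)
        return server.name

    def fail(self, name):
        """Mark a server as failed and move its sessions to survivors."""
        failed = self.servers[name]
        failed.alive = False
        orphaned = sorted(failed.sessions)
        failed.sessions.clear()
        for session_id in orphaned:
            self._least_loaded().sessions.add(session_id)

    def where(self, session_id):
        """Return the name of the server currently holding the session."""
        for server in self.servers.values():
            if session_id in server.sessions:
                return server.name
        return None


cluster = Cluster(["node1", "node2", "node3"])
for sid in ["s1", "s2", "s3", "s4"]:
    cluster.connect(sid)

cluster.fail("node1")  # sessions s1 and s4 are redistributed
```

In this sketch every session still has a home after `node1` fails, which mirrors the claim that users can continue working. A real cluster would additionally have to preserve the session state itself, which is where the shared cache comes in.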

Fig. 2 Principal schema of RAC.

To use the shared cache, developers have to solve a problem related to the large volume of changing memory and its transmission. For this purpose, they use various data compression and filtering techniques, etc. A fast, high-bandwidth interconnect between the cluster servers is also a requirement.

2.3 Evaluation of HA assurance by applying the DBMS technologies

Pros:
- Database developers use the DBMS technologies themselves to ensure high availability. The result is therefore effective and flexible, and helps to save resources.
- As an example of flexibility, Active Data Guard can have not only a physical but also a logical standby DB. Both use the same log shipping technology. The difference is that with a physical standby the changes made in the primary DB are replicated at the block level of the standby DB, whereas with a logical standby the changes are replicated at the level of DDL (Data Definition Language), DML (Data Manipulation Language), DCL (Data Control Language) and TCL (Transaction Control Language) statements.
- In the RAC case, effective distribution and balancing of the system load is ensured across the DB instances.
- In Active Data Guard, developers reduce the resource overhead by putting the standby DB to use: it can be opened in read-only (RO) mode, used to generate reports, etc.
- Users practically do not notice when a DB instance crashes in the RAC case. Active Data Guard, by contrast, terminates user sessions, and users must wait until they can connect to the DB again. This does not take long, but it is an unavoidable interval: while users cannot connect, the standby DB takes over the work of the failed DB.

Cons:
- Good specialists are needed who know the specific DBMS technologies required to ensure high availability. This is costly.
- The tools that ensure DB high availability are usually licensed separately, and their price is similar to the price of a production DB license.

3. DB HA assurance using the VM technologies

There is another approach, demonstrated by the developers of the ReVirt, Remus DB and SecondSite systems. Their solutions are based on virtual-machine technologies. We will examine the principles of these systems below and discuss their advantages and disadvantages.

3.1 ReVirt: enabling intrusion analysis through virtual-machine logging and replay

The ReVirt system was intended not only to support DB high availability, but also to analyse intrusions into the operating system. Running the system in a VM and logging the VM's actions moves the logging away from the target OS and allows its state to be replicated at any time and its actions to be replayed. The developers of ReVirt use VM technologies (Live Migration of Virtual Machines, Virtual-Machine Logging and Replay) that are very similar to a technology used in Data Guard, log shipping with a time delay, i.e. log files are applied not immediately upon receipt, but with a certain time lag.

3.2 Remus DB: transparent high availability for database systems (VM technology)

The developers of Remus DB were among the first to look at DB HA assurance from a different angle. They were eager to raise the question of which is better: implementing HA within the DBMS, or providing it as a service of the infrastructure below it. The first ideas for Remus DB came from its predecessor ReVirt, and VM live migration technology was chosen for DB HA assurance. The principal schema of Remus DB is shown in Figure 3. Remus DB is based on two servers that are used to provide HA for a DBMS. One server hosts the active VM, which handles the DB and all client requests. The second one hosts the standby VM in the active mode. While the active VM runs, its entire state, including memory, disk and active network connections, is continuously checkpointed to the standby VM on the second physical server. If the active VM fails, client connections to the primary DB are redirected to the DB in the standby VM. In this way, clients can continue working with the DB with minimal delay and minimal loss of data.

Fig. 3 Principal schema of Remus DB.

Here developers face challenges such as the very large amount of data passing from the active VM to the standby VM. This flow is due to the need to transfer the entire state of the active VM to the standby VM. To cope with these problems, the developers of Remus DB looked to the DBMS technologies for ensuring HA mentioned above, RAC and Active Data Guard. The idea of reducing the amount of memory that has to be passed to the standby VM was borrowed from RAC; the compression of transmitted data, from Active Data Guard. The developers of Remus DB state that their solution does not require many additional system resources: the overhead is 10-15%.
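The two ideas mentioned above, sending less memory per checkpoint and compressing what is sent, can be illustrated with a minimal sketch. It is not the Remus DB implementation: the `ActiveVM` and `StandbyVM` classes, the dictionary of 4096-byte pages, dirty-page tracking as the way to reduce transferred memory, and the use of `zlib` are all assumptions made for illustration.

```python
import zlib


class ActiveVM:
    """Toy active VM whose state is a dict of page number -> bytes."""

    def __init__(self):
        self.pages = {}
        self.dirty = set()  # pages modified since the last checkpoint

    def write(self, page_no, data):
        self.pages[page_no] = data
        self.dirty.add(page_no)

    def take_checkpoint(self):
        """Return compressed copies of the dirty pages and reset tracking."""
        delta = {n: zlib.compress(self.pages[n]) for n in self.dirty}
        self.dirty.clear()
        return delta


class StandbyVM:
    """Toy standby that holds the state as of the last applied checkpoint."""

    def __init__(self):
        self.pages = {}

    def apply_checkpoint(self, delta):
        for page_no, blob in delta.items():
            self.pages[page_no] = zlib.decompress(blob)


active, standby = ActiveVM(), StandbyVM()

active.write(0, b"A" * 4096)
active.write(1, b"B" * 4096)
first = active.take_checkpoint()    # both pages are new, both are sent
standby.apply_checkpoint(first)

active.write(1, b"C" * 4096)        # only page 1 changes in this epoch
second = active.take_checkpoint()   # only page 1 is sent
standby.apply_checkpoint(second)

active.write(0, b"D" * 4096)        # not yet checkpointed when the failure hits
state_after_failover = standby.pages
```

After the simulated failure the standby holds the state of the last completed checkpoint, so the final write to page 0 is lost. This is the "minimum loss of data" trade-off in miniature: more frequent checkpoints lose less work but transfer more data.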

3.3 SecondSite: disaster tolerance as a service (VM technologies)

SecondSite is a high-availability and disaster tolerance service for virtual machines running in cloud environments. The service is based on the Remus DB system. All VMs containing DBs are replicated to a backup image at an alternate geographic location. The SecondSite service tracks the entire process and, in the event of any VM failure, ensures that all the work done on the failed VM is transferred to the standby VM. It further ensures that a clone of the standby VM is made and that Live Migration of Virtual Machines is restored. In this way, SecondSite makes the process of ensuring HA for DBs located in the cloud completely transparent.

Fig. 4 Principal scheme of SecondSite.

Figure 4 shows the principal scheme of the SecondSite service and Network Failover without Service Interruption. SecondSite can regulate the amount of data transferred from the primary site to the backup site. This is a compromise that accepts a possible small data loss, but it is very important because the data are transferred to geographically remote servers over the WAN.

3.4 Evaluation of HA assurance by using VM technologies

Pros:
- DB HA assurance using VM technologies is independent of the DB itself. We do not need to care what kind of DB we have or what HA tools it provides.
- The time taken to regain access to the DB after a failure is relatively short.
- Using VM technologies to ensure DB HA incurs only about 10-15% performance overhead.

Cons:
- Using VM technologies for DB HA does not offer the diversity and flexibility of DBMS technologies. There is no logical standby DB, and the standby DB cannot be used for anything other than its direct purpose.
- There are strict requirements on the virtualization layer: it must capture the changes in the state of the whole VM on the active host.
- There is always some performance overhead, although a small one. It cannot be distributed across several instances as in RAC.

4. Conclusion

Step by step, we have looked at ways to ensure DB HA. One of these ways is provided by the DBMS itself; supporters of VM technologies offer alternatives. It used to be commonly assumed that only the DBMS could ensure the HA of databases. However, it can be stated that the methods offered by supporters of VM technologies are not inferior to the DBMS ones. DBMS methods can be more flexible and better optimized, but they are more expensive and require specific, solid knowledge of databases. The VM-based methods, meanwhile, are versatile (it does not matter what kind of DB we have) and can be relatively cheaper, but they also come with a greater variety of restrictions. VM technologies for ensuring DB HA have strong future prospects; the best proof of this is the SecondSite system applied to DBs in the cloud.

REFERENCES

1. Technical Comparison of Oracle Database 12c vs. Microsoft SQL Server 2012: Focus on High Availability, An Oracle White Paper, November 2013, http://www.oracle.com/technetwork/database/availability/ha-oracle12c-sqlserver2012-2049933.pdf.
2. Oracle Real Application Clusters (RAC), An Oracle White Paper, June 2013, http://www.oracle.com/technetwork/database/options/clustering/rac-wp-12c-1896129.pdf?ssSourceSiteId=ocomen.
3. George W. Dunlap, Samuel T. King, Sukru Cinar, Murtaza A. Basrai, Peter M. Chen, ReVirt: Enabling intrusion analysis through virtual-machine logging and replay, in Proceedings of the 5th Symposium on Operating Systems Design and Implementation, December 9-11, 2002, Boston, Massachusetts, ACM New York, pp. 211-224, http://dl.acm.org/citation.cfm?id=1060309.
4. Umar Farooq Minhas, Shriram Rajagopalan, Brendan Cully, Ashraf Aboulnaga, Kenneth Salem, Andrew Warfield, RemusDB: transparent high availability for database systems, The VLDB Journal, 22(1):29-45, 2013.
5. Shriram Rajagopalan, Brendan Cully, Ryan O'Connor, Andrew Warfield, SecondSite: disaster tolerance as a service, ACM SIGPLAN Notices, 47(7):97-107, 2012.