WHITE PAPER: TECHNICAL

Veritas Storage Foundation Oracle HA
Value Proposition and Planning, Installation, Implementation, and Operational Considerations

February 2008
Symantec Technical Network White Paper

Contents

Scope
Introduction
SFORA-HA Components
SFORA-HA Benefits
    Standardization
    Availability
    Scalability
    Performance
    Manageability
SFORA-HA vs. Oracle-only Stack
Planning
    What is known about the application and its environment?
    Which SFORA-HA features to use?
    Oracle Planning
    Storage Layout Planning
Pre-Requisites
    SFORA-HA Pre-Requisites
    Oracle Pre-Requisites
Installation Considerations
    SFORA-HA Installation Considerations
    Oracle Installation Considerations
Implementation/Operational Considerations
    SFORA-HA
    Oracle
    Minimizing Unplanned Downtime
    Minimizing Planned Downtime
    Performance Monitoring and Tuning
    If and when should migration from single instance to Oracle RAC take place?
Conclusion

Scope

The scope of this document is Veritas Storage Foundation for Oracle High Availability (SFORA-HA), a solution that adds significant value in Oracle Relational Database Management System (RDBMS) environments and makes Oracle more manageable. As such, much of the discussion throughout this document will be Oracle-centric, highlighting the synergy and interplay of SFORA-HA and Oracle. The focus will be on the SFORA-HA 5.0 release, Oracle 10g, and, when showing specific examples or referencing documentation, the Solaris OS. The reader should keep in mind that SFORA-HA robustly supports other releases of Oracle and multiple OS platforms. The intent of this document is to provide perspectives on the use of the Oracle database in enterprise and mission-critical environments and on how SFORA-HA adds considerable value.

This document will not go into great detail on the core Storage Foundation offering: Veritas Volume Manager (VxVM), which includes Dynamic Multi-Pathing (DMP), and Veritas File System (VxFS). However, they will be mentioned often within the context of the full SFORA-HA solution. Additionally, Veritas Storage Foundation Manager (SFM) will not be discussed in this paper, but we will point out here that it is a separately installable component (and free!) that is strongly recommended to complement the full SFORA-HA stack.

This paper is focused on Oracle single instance rather than Oracle Real Application Clusters (RAC). Symantec offers a product called SFRAC, which is a superset of the SFORA-HA product discussed here. Please note that the last section of this paper does discuss considerations surrounding migration from Oracle single instance to Oracle RAC. Additionally, the scope of this paper does not include Disaster Recovery (DR), although we will point out here that Symantec has additional products to address DR, such as Veritas Volume Replicator (VVR) and the Global Cluster Option (GCO, an additional component of Veritas Cluster Server). There will be no discussion of the capabilities of the Storage Foundation product with regard to operating in virtual environments, although we do point out that that is another viable environment for our entire stack. Finally, to keep the scope of this paper manageable, the pre-requisites and installation sections will not cover applying Maintenance Packs after the base SFORA-HA software is installed.

Introduction

There is no debate that the Oracle RDBMS dominates today's market space in open systems environments. The many thousands of Oracle RDBMS environments are characterized by:

- Enterprise customers
- Mission-critical applications
- High uptime requirements, often 24x7 web-based applications
- High performance requirements
- Complex workload patterns
- Increasing workload over time
- Increasing database size over time
- Having to react quickly to the increasing pace of change
- Doing more with fewer resources and less budget

Thus, there is a real need for software products that can enhance the Oracle value proposition by addressing the challenges enumerated above. Such products have to be, just as Oracle is, hardware agnostic: able to run across a spectrum of servers, operating systems, HBAs, SAN switches, storage arrays, etc.

With the introduction of Oracle 10g several years ago, Oracle began touting, most notably because of its Automatic Storage Management (ASM) feature, that it can provide a complete solution on the RDBMS server. However, there are additional aspects to consider when deploying Oracle in mission-critical environments. The successful implementation and operation of applications running on top of the Oracle RDBMS require the talents of more than the DBA. Other functional groups involved are System Administrators, Storage Administrators, Network Administrators, application specialists, etc. Of utmost importance is cooperation between these multiple functional groups, and products that facilitate and enhance this needed cooperation are of great value.

The purpose of this paper is to describe how one such product, Veritas Storage Foundation for Oracle High Availability (SFORA-HA), adds significant value in Oracle RDBMS environments and makes Oracle more manageable. This paper will also discuss planning, implementation, and operational considerations when using the SFORA-HA stack.

SFORA-HA Components

Let us begin our discussion by listing the components of SFORA-HA. SFORA-HA is built upon Veritas Storage Foundation (SF), which consists of the following main components:

- Volume Manager (VxVM)
- Dynamic Multi-Pathing (DMP)
- File System (VxFS)
- Storage Foundation Manager (SFM)
- Portable Data Containers (PDC)

Referring to Table 1, we can see how additional components are layered in to build the SFORA-HA product bundle. The most notable additional features are:

- Oracle Disk Manager (ODM), which provides raw-like I/O performance with the manageability of file systems
- File System Checkpoints
- Volume Snapshots
- Dynamic Storage Tiering (DST)
- Veritas Cluster Server (VCS)
- Cluster File System (CFS)

All of these features will be covered in detail in later sections of this paper.

Table 1 - SFORA-HA Packaging (Veritas Storage Foundation 5.0 Feature Comparison)

Table 1 compares feature availability across the Standard (STD), Enterprise (ENT), HA, and HA/DR editions of Storage Foundation and of Storage Foundation for Oracle. The features compared are:

- Volume Manager and File System
- Storage Foundation Manager
- Dynamic Multi-Pathing (DMP)
- Portable Data Containers (PDC)
- Dynamic Storage Tiering
- FlashSnap
- Database Accelerators
- Volume Replicator (O)
- Cluster Server
- I/O Fencing
- Cluster Server Management Console
- Database & Application/ISV Agents
- Fire Drill
- Replication Agents
- Global Clustering

O denotes an option; Volume Replicator is an option that can be purchased separately.

The result of this product bundle is a mature, flexible, complete, and robust infrastructure software set with all the features and functionality shown in Figure 1.

Figure 1: Features and Functionality of Veritas Storage Foundation for Oracle HA

SFORA-HA Benefits

Let's now discuss the SFORA-HA benefits that are realized in Oracle RDBMS environments. We will examine five main topics:

1. Standardization
2. Availability
3. Scalability
4. Performance
5. Manageability

Standardization

The typical enterprise customer using the Oracle RDBMS has a very heterogeneous IT infrastructure with multiple:

- Databases (not just Oracle, but also likely UDB, SQL Server, Sybase, etc.)
- Server platforms (e.g. Solaris, HP-UX, AIX, Linux, Windows)
- Storage arrays (e.g. EMC, Hitachi, IBM)
- Applications (in-house and 3rd-party, such as SAP, Oracle Apps, Siebel, PeopleSoft, Amdocs)
- Application tiers (web, application, DB, etc.)

Layer onto this the fact that there are multiple versions of the software and/or hardware for each of these components, and one can readily appreciate the complexity and challenges of managing such an environment. Thus, an IT infrastructure product, such as Veritas Storage Foundation, that can not only enhance the operation of the Oracle RDBMS but also add value and reduce complexity across the entire IT enterprise has great appeal. Standardization, via Storage Foundation as a data-center-wide strategy, makes sense for many reasons:

- Avoid hardware vendor lock-in (e.g. servers and storage). When hardware vendors can avoid competition by persuading a customer that their combination of hardware and software is the only acceptable solution, higher prices often follow. Uplifts of 30-100% or more are frequently achieved.
- Reduce the number of tools needed to manage the infrastructure. Storage Foundation Manager (SFM) is a great example, as it provides a single pane of glass into the Storage Foundation-enabled data center and includes over 250 guided operations to enable repeatable processes.
- The rich set of out-of-the-box Veritas Cluster Server (VCS) agents allows for incorporating high availability across multi-tiered application environments, as well as standardization of the clustering infrastructure. VCS includes the ability, via its remote group agent, to establish Service Group dependencies across multiple clusters for the multi-tier architecture (DB, application, and web layers) of today's mission-critical applications.
- DMP's wide support of servers, HBAs, and storage arrays enables the enterprise to have a single, hardware-independent, host-to-storage multi-pathing solution.
- Save money: many SF customers have realized a significant return on investment (ROI), 5-15% of aggregate storage and server costs. This is not surprising, as storage management costs may be three to four times the cost of hardware procurement. Furthermore, tangible savings can be realized by not being locked into a single storage vendor's platform. SF features such as storage array migration, Database Management System (DBMS) snapshots, and Dynamic Storage Tiering allow for moving data to less expensive storage as business requirements change, and features such as Portable Data Containers (PDC) make migration to less expensive servers very viable.
- Lower training requirements.
- Make expertise more broadly available as new challenges arise in the data center.

Availability

There are many availability challenges with the Oracle RDBMS, on which the most critical applications within enterprise IT environments depend. Considerations such as these come immediately to mind:

- Avoiding any Single Point of Failure (SPOF): software, server, HBA, storage
- Striving towards 24x7 operation
- Minimizing planned downtime: can upgrades, migrations, etc. be done non-disruptively?
- Minimizing unplanned downtime: does a software or hardware failure take the application down?
- Enabling non-disruptive online backups
- Providing local failover
- Adding storage online
- Operational simplification and consistency of application control (starting, stopping, etc.)

To ensure that these applications stay online and highly available, infrastructure surrounding and interoperating with the Oracle RDBMS must be in place, with features such as:

- HBA multi-pathing
- Adding/breaking mirrors online, which can, for example, enable non-disruptive storage array migrations
- Growing/shrinking storage volumes dynamically
- Snapshotting/cloning of the database for backup, reporting, test/dev, etc.
- Providing quick server failover in the event that a server loses any critical software or hardware component
- Providing rolling upgrades by patching the standby server, forcing failover to it, and then patching the first server

Furthermore, to be truly robust, the critical code that provides this set of capabilities needs to run in operating system kernel space, not user space. The SFORA-HA product is extremely well suited to meet this set of challenges.

Scalability

Scalability is the desirable property of a system to either handle growing amounts of work gracefully or to be readily enlarged. For example, it can refer to the capability of a system to increase total throughput under an increased load when resources (typically hardware) are added. A system whose performance improves after adding hardware, proportionally to the capacity added, is said to be a scalable system. Typical scalability challenges that arise in the Oracle RDBMS world, where the goal is to maintain consistent response times, are:

- The load for an application increases 50%
- A new application comes online
- A need to merge the operations of a newly acquired company

- The size of the DB is growing 10% every month
- Distributed databases are being consolidated onto fewer servers

Features and capabilities of SFORA-HA that address these scalability challenges are:

- The ability to dynamically add new storage
- The ability to dynamically migrate data from one storage array to another in a heterogeneous environment, using VxVM mirroring
- The ability to do hardware/software upgrades of servers in a cluster with minimal outages
- The ability to do server migrations with minimal outages, using Portable Data Containers (PDC)
- The ability to do online defragmentation of the file system
- The ability to do Dynamic Storage Tiering (e.g. move less-accessed data from tier 1 storage to tier 2 storage)
- The ability to create snapshots of databases online for reporting, backup, etc.

This set of capabilities provides a robust and full-featured IT infrastructure that allows for the agility needed to deal with today's dynamic and unpredictable real-world Oracle RDBMS environments.

Performance

Gaining and maintaining acceptable application performance is a perpetual challenge, typically in a very dynamic environment in terms of workload and transaction mix. Specific performance challenges arise, such as:

- Being able to quickly diagnose performance issues to root cause and implement corrective actions, typically on enterprise storage (EMC, Hitachi, IBM, HP), where the following questions need to be answered: Is there an I/O issue? Is it intra-server or inter-server? Is it a database issue? Is it an application issue? Is it repeatable/predictable or aberrational?
- Being able to dynamically re-balance/reorganize the DB
- Being able to proactively manage performance
- Being able to prevent response times from degrading with increasing workload (closely tied to scalability)
- Being able to do online backups without performance degradation
- Being able to prevent Decision Support users from affecting OLTP performance

To deal effectively with performance, in addition to having a good set of performance monitoring tools, one needs infrastructure with a full set of features that can be applied dynamically, in a very granular fashion as well as globally, depending on the specific problem being addressed. Being able to use striping and mirroring at one's discretion, for any or all volumes, is a good example of granular control. Also, being able to use ODM (Oracle Disk Manager) in a discretionary way is important here. These features contribute greatly to enabling very granular control of the underlying Oracle database objects (tables, indexes, redo logs, archive logs, etc.). In addition to enhancing availability, the HBA multi-pathing capability also contributes to performance by load balancing I/O across all available paths.

A valuable capability in dealing with the hard-to-know and ever-changing access patterns of the critical Oracle tables, indexes, etc. is being able to do defragmentation while online. Depending on the specifics of the environment, one might invoke this functionality on a scheduled basis and/or on demand as circumstances dictate.

Because today's enterprise-class storage arrays (e.g. EMC, Hitachi, IBM, HP) are configured with ever-increasing capacity (number of disks and individual disk capacity), more often than not storage arrays and individual disks are shared across multiple servers running multiple applications. Thus, one has to deal not only with intra-server I/O contention but with inter-server I/O contention as well. Having the ability to do deep mapping through all the layers, from Oracle object down to the partition on the individual spindle inside the storage array, is therefore a very valuable capability for diagnosing both intra-server and inter-server I/O performance issues.

As mentioned in the Availability section, to be truly robust, the critical code that provides this set of capabilities needs to run in kernel space, not user space. The benefit shows up most dramatically when servers get extremely heavily loaded; that is when kernel space processes clearly differentiate themselves from user space processes. With the functionality and features described above, the SFORA-HA product is extremely well suited to meet this set of performance challenges.

Manageability

Manageability challenges are many in enterprise Oracle RDBMS environments. The list includes:

- Doing more with fewer resources and less budget
- Minimizing the number of tools needed to manage the entire IT infrastructure
- Reducing complexity in an ever-growing environment
- File management: granular or coarse? The ability to manage all files, not just Oracle files; the ability to dynamically enable/disable host-level mirroring alongside storage array mirroring
- Storage management: the ability to flexibly stripe (varying stripe size as well as the number of members in a stripe set) or not stripe; online storage growth (auto extend); online backups; Dynamic Storage Tiering; OS migrations; storage array migrations

SFORA-HA provides a complete solution for heterogeneous online storage management that is equal to the task for the most complex Oracle environments, providing a standard set of integrated tools to centrally manage explosive data growth, maximize storage hardware investments, provide data protection, and adapt to unpredictable and changing business requirements. Because SFORA-HA is an enterprise solution, not a point solution, it enables IT organizations to manage their storage infrastructure in a consistent manner. With advanced features such as centralized storage management (SFM), online configuration and administration, Dynamic Storage Tiering, Dynamic Multi-Pathing, data migration, and local and remote replication, SFORA-HA enables organizations to reduce operational costs and capital expenditures across the data center.

SFORA-HA vs. Oracle-only Stack

As discussed in the previous sections, SFORA-HA is a complete enterprise solution for heterogeneous online storage management. As shown in Table 2 below, SFORA-HA adds great value in Oracle environments.

Table 2 - Summary of SFORA-HA vs. Oracle Functionality

Feature                                    | Oracle                | SFORA-HA
-------------------------------------------+-----------------------+-----------------------
Standardization                            |                       |
  Support beyond Oracle RDBMS              | N                     | Y
  Support beyond Oracle RDBMS tier         | N                     | Y
Availability                               |                       |
  ASM is not a SPOF                        | N                     | Y
  Multi-pathing                            | N                     | Y
  Local failover                           | Y (CRS, weak support) | Y (VCS, strong support)
  Critical code runs in kernel space       | N                     | Y
  Online array migrations                  | Y (difficult)         | Y (easy)
  Minimally disruptive server migrations   | N                     | Y
Scalability                                |                       |
  Granular control of DB objects           | N                     | Y
  Multi-pathing                            | N                     | Y
Performance                                |                       |
  Granular control of DB objects           | N                     | Y
  Multi-pathing for HBA load balancing     | N                     | Y
  Awareness/visibility of physical disk    | N                     | Y
  Critical code runs in kernel space       | N                     | Y
Manageability                              |                       |
  Manage entire infrastructure             | N                     | Y
  File management granular/full-featured   | N                     | Y
  Host-level striping non-mandatory        | N                     | Y

Planning

What is known about the application and its environment?

Perhaps the most important prerequisite for a successful SFORA-HA implementation is that the multiple cross-functional groups have good working relationships that foster effective communications. The key groups in this case are the Application Architects, DBAs, System Administrators, and Storage Administrators. As the subsequent discussion will show, tight cooperation between these groups is essential in order to provision optimal storage for the Oracle RDBMS, and it remains a persistent requirement for successful ongoing operations after implementation.

The value of doing comprehensive planning BEFORE installation of SFORA-HA and the Oracle RDBMS cannot be overstated. Of course, the more that is known up front about the application being deployed, the more fruitful and on-target the planning effort will be. The amount and quality of discovery information will vary widely depending on such factors as:

- The ability to install in Dev/Test/QA before Production. (If this is not possible, it is a very undesirable situation! The rest of this discussion assumes that there will be Dev/Test/QA environments in addition to Production.)
- Whether the application is new or existing
- The amount of time allotted for Dev/Test/QA
- The level of testing effort and how closely tests represent production workloads
- The performance monitoring tools available
- Whether the application is written in-house or provided by a 3rd party, and, if 3rd-party provided, how much customization is being done
- Whether implementation/planning/best-practice documents are provided by the vendors (e.g. Oracle, the server vendor, the application vendor)

So, let's ask the question: what is known about the application and its environment? The following list is not exhaustive, but these questions should be relevant no matter the specific details of the Oracle server environment, and they should help spur the generation of additional relevant discovery questions.

General questions:

- How closely do the Test/QA environments resemble the production infrastructure?
- Are there tools to generate load in Test/Dev/QA environments, and to what extent can the full load of the production environment be replicated?
- Will there be rogue activity, and/or can it be prevented? It is not an infrequent occurrence that unexpected/unauthorized activity on production databases significantly affects performance, e.g. a DBA doing maintenance tasks or running very resource-consumptive queries against system tables, or the discovery that test servers have active sessions on the production database.
- Are applications other than Oracle sharing the server? If so, this will make for a more challenging implementation and ongoing operation of the Oracle DB.
- Are there known extreme stress points? An example could be, for a very high-transaction OLTP application, having to write at such a high rate to the redo log file that one may have to consider running the Oracle lgwr process with realtime priority.

Database-related questions:

- Will multiple Oracle database instances be running on the same server? If so, care must be taken to isolate resources and ascertain that the total resources available are adequate.
- Are the most heavily accessed database objects known? More often than not, DB I/O activity exhibits the 80/20 rule: 80% of the activity involves just 20% of the objects. Sometimes the skew is even more extreme, e.g. 90/10. If these database objects can be identified, they obviously become candidates for the best disk real estate, and care should be taken to separate them from one another on disk. (A short sketch after these question lists makes this concrete.)
- What is the DB read vs. write activity? This will influence the decision to use or not use ODM, which is discussed in more detail in the "Which SFORA-HA features to use?" section of this paper.
- How will Binary Large Object (BLOB) data storage be managed?
- Will there be any data partitioning?
- How many MB/sec of I/O? This has a direct bearing on the number of HBAs that will be needed.
- With the Oracle cost-based optimizer in play, how intrusive will keeping statistics up to date be?
- What performance monitoring tools are available? If none are in use, this will be a serious handicap to thorough planning.
- Are database sessions established via connection pooling (e.g. via a JDBC tier), directly by the application client, or a combination of both? In the former case, connections tend to be persistent and the total number relatively low. In the latter, connections tend to be of short duration and the total number relatively high. This can have significant influence on memory requirements and Oracle SGA fragmentation.

Application-related questions:

- Are there any application tuning opportunities? This kind of information would most likely be available only if an Oracle performance monitoring tool is being used proactively (e.g. Precise i3 Indepth for Oracle, or the suite of tools that Oracle provides: ADDM, AWR, ASH, etc.).
- Are there missing or inefficient indexes? If this kind of application detail is known, there is the opportunity to tune the application. For in-house applications, there should be no constraints on acting on this information. For 3rd-party applications, there may or may not be limitations on what changes can be made, but there would still be value in knowing where the inefficiencies are and dealing with them accordingly.
- Is a SQL statement being run too many times? There could be an opportunity to modify the application to execute the statement much less frequently.
- Are literals being used instead of bind variables? There could be an opportunity to modify the application to use bind variables instead of literals.
- Are batch jobs running during peak load when they could run during off-peak hours? If so, reschedule them to off-peak times.
- Does the application request data from the DB intelligently? For example, is a report that is interested in just the last week's worth of data pulling in the entire last year's data set?
- Does the application include files external to the Oracle RDBMS? If so, are they heavily accessed? If lightly accessed, they could be candidates for Dynamic Storage Tiering (DST).
- Is the application OLTP, Decision Support/Data Warehouse, or both? Is the workload repeatable/predictable or anomalous/aberrational? OLTP applications tend to be more predictable/repeatable than Decision Support/Data Warehouse applications.
- Does any part of the application require real-time performance? If so, it is imperative that its resources be isolated from other applications, other servers, etc.
  An example would be a time and attendance system where employees clock in and clock out in very small time intervals and response time has to be optimal.
- Do end users of the application have access to tools that allow them to build queries that could potentially cause query-from-hell scenarios? If so, can safeguards be implemented to prevent this? One would expect this to be more likely in Decision Support/Data Warehouse environments, where power users have access to OLAP tools that allow them to build such queries. But it is certainly not unheard of in OLTP environments where, for example, the end user is able to populate a form with very minimal information, resulting in a query that may have no filtering at all.

Storage-related questions:

- What are the storage capacity requirements?
- Will storage be SAN, NAS, DAS, JBOD, or a combination?
- Does the storage support SCSI-3 Persistent Reservation?
- Does the storage employ intelligent software (e.g. EMC Symmetrix Optimizer)?
- Is more than one class of storage available (e.g. RAID-10 and RAID-5, 146GB disks and 300GB disks, EMC DMX and EMC CLARiiON)?
- Are PROD and TEST/DEV/QA servers sharing resources (e.g. a storage array)? If they are, recognize that it will be difficult, if not impossible, to get repeatable performance results when testing. Additionally, production performance can vary and suffer while load testing is going on.
- Are multiple PROD servers sharing resources? If so, there is the concern that if multiple applications share busy database objects on the same spindles, they will impact each other's performance, probably in unpredictable ways.
- How long does it take to provision new storage, HBAs, CPUs, memory, etc.? This can greatly influence how much growth capacity needs to be provided up front.
- What are the number of Oracle sessions and the number of Oracle files? This is important information to have when initially sizing the Oracle System Global Area (SGA).
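The 80/20 skew called out in the database-related questions above can be made concrete. Below is a minimal sketch of identifying the hot set, assuming per-object I/O counts have already been gathered (for example from Oracle's V$SEGMENT_STATISTICS view); the object names and counts are hypothetical.

```python
# Sketch: find the smallest set of database objects accounting for ~80%
# of total I/O -- the candidates for the best disk real estate, which
# should be separated from one another on disk. Counts are hypothetical.

io_by_object = {
    "ORDERS_TBL": 50_000,
    "ORDERS_IDX": 31_000,
    "LINEITEM_TBL": 9_500,
    "CUST_TBL": 6_000,
    "AUDIT_TBL": 3_200,
    "LOOKUP_TBL": 1_100,
    "CONFIG_TBL": 200,
}

def hot_set(io_counts, threshold=0.80):
    """Return objects, busiest first, that together cover `threshold` of I/O."""
    total = sum(io_counts.values())
    hot, covered = [], 0
    for name, count in sorted(io_counts.items(), key=lambda kv: -kv[1]):
        hot.append(name)
        covered += count
        if covered / total >= threshold:
            break
    return hot

print(hot_set(io_by_object))
# ['ORDERS_TBL', 'ORDERS_IDX'] -- here 2 of 7 objects carry ~80% of the
# I/O, an even sharper skew than 80/20.
```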

High Availability (HA) related questions:

Figure 2 - Example of a 4-node cluster with shared storage

- What kind of HA clustering will be used?
  - 2-node Active/Passive: the most rudimentary and most expensive.
  - 2-node Active/Active: requires each server to be over-configured and/or has built-in performance degradation when failover occurs.
  - N-to-1: has an assigned spare (for example, in Figure 2, the 4th server could be the assigned spare, which would take over whenever the 1st, 2nd, or 3rd server failed). Requires fail-back as soon as possible after the failover, which means taking another outage, because the spare assignment is static, or fixed to that server. Assumes identically configured resources on each server.
  - N+1: like N-to-1, except the spare role can roam across the cluster community, so there is no need to fail back. Assumes identically configured resources on each server. Can be used for rolling application and rolling OS upgrades.
  - N-to-N: allows any application on any server to fail over to any other server, with no need to fail back. There is no designated spare server, but each server has spare capacity. Requires very good understanding of application compatibility AND very good understanding of application performance.

All of these cluster configurations require access to shared storage, and all of them are supported by the VCS component of SFORA-HA.
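To make the N-to-N option concrete, the sketch below models the capacity check such a design implies. The node names, load numbers, and pick-the-most-headroom policy are illustrative assumptions only; VCS's actual failover policies are configurable and not reproduced here.

```python
# Sketch: choosing a failover target in an N-to-N cluster, where every
# server carries live load but keeps spare capacity. All numbers are
# hypothetical.

capacity = {"node1": 100, "node2": 100, "node3": 100, "node4": 100}
load     = {"node1": 60,  "node2": 55,  "node3": 70,  "node4": 40}

def failover_target(failed, app_load, compatible):
    """Pick the compatible surviving node with the most spare capacity."""
    candidates = [
        n for n in capacity
        if n != failed and n in compatible and capacity[n] - load[n] >= app_load
    ]
    # Most headroom first; None means no safe target exists.
    return max(candidates, key=lambda n: capacity[n] - load[n], default=None)

# node3 fails carrying an app that needs 30 units; only node2 and node4
# run a compatible stack -- this is why N-to-N "requires very good
# understanding of application compatibility AND performance".
print(failover_target("node3", 30, {"node2", "node4"}))  # node4 (60 spare)
```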

- Does the storage support SCSI-3 Persistent Reservation (PR)? If yes, SFORA-HA's robust I/O fencing can be implemented. In a server cluster environment, best practices dictate that if this feature is available, it should be taken advantage of. In this scenario, maintaining data integrity is as important as the high availability provided by clustering servers. With shared storage in place, meaning that more than one server has access to the same data, it is extremely important that the servers never change the data in an uncoordinated manner. Safeguarding against this requires I/O fencing, architected around SCSI-3 Persistent Reservation, which prevents split-brain scenarios where multiple servers touch data in an uncoordinated way. I/O fencing is implemented using coordinator disks: servers race to gain majority control whenever the cluster heartbeat goes away. The winner(s) take control of the data disks and fence them off from the loser(s), thus guaranteeing data integrity. This algorithm works regardless of whether the cause of the lost heartbeat is a server going down, a server hanging, or a network link going down.

- Will CVM/CFS be used? The most obvious value-add of the Cluster Volume Manager/Cluster File System (CVM/CFS) component of SFORA-HA is that it can provide faster server failover, because it avoids the need to import disk groups and volumes and mount file systems. Other benefits of having a clustered file system include:
  - Fewer copies of (non-Oracle) data
  - The ability to mount space-optimized snapshots on servers other than the source server
  - Having the necessary infrastructure in place to migrate to Oracle RAC at a later time

It should be pointed out that CVM/CFS does add some complexity to the storage management infrastructure. To maintain data integrity of shared files across multiple servers, the I/O fencing feature of SFORA-HA is required, which mandates that the storage be SCSI-3 PR capable. This adds some steps to installing the SFORA-HA stack, as CVM/CFS needs to be laid down and coordinator disks created. The advisability of going with CVM/CFS will depend on the comfort level of the IT staff in dealing with these aspects, the importance of faster failover, and the likelihood that the environment will move to Oracle RAC in the foreseeable future.

Even if CVM/CFS is not used, the best practice is still to employ I/O fencing so that the highest level of data integrity for the cluster can be provided. I/O corruption can occur in many different ways, even when CVM/CFS is not in play. Bottom line: because HA environments are architected with shared storage (two or more servers having access to the same disks), the potential for data corruption (accidental or intentional) always exists. With I/O fencing enabled on top of SCSI-3 PR, this scenario can be avoided.

For detailed information on Veritas Cluster Server, please refer to the Veritas Cluster Server Installation Guide 5.0, available at http://seer.entsupport.symantec.com/docs/283868.htm

For detailed information on Veritas Cluster File System, please refer to the Veritas Cluster File System Installation Guide 5.0, available at http://seer.entsupport.symantec.com/docs/283894.htm
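The coordinator-disk race described above can be sketched as a toy model. This is for intuition only and is not Symantec's implementation; the point it illustrates is that with an odd number of coordinator disks, a tie is impossible, so exactly one subcluster survives a split.

```python
# Sketch: why an odd number of coordinator disks guarantees a single
# winner when the cluster heartbeat is lost. Purely illustrative model.

COORDINATOR_DISKS = 3  # always an odd number, typically 3

def race(disks_won_by_a):
    """Each subcluster races to eject the other's registrations from the
    coordinator disks; the majority winner keeps the data disks and
    fences off the loser."""
    a = disks_won_by_a
    b = COORDINATOR_DISKS - a
    if a > b:
        return "subcluster A keeps the data disks; B is fenced off"
    return "subcluster B keeps the data disks; A is fenced off"

# With 3 disks the possible outcomes are 0..3 wins for A -- no tie exists,
# so split-brain (both sides writing) cannot occur:
for wins in range(COORDINATOR_DISKS + 1):
    print(wins, "->", race(wins))
```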

Which SFORA-HA features to use?

This section discusses considerations regarding use of the advanced features of SFORA-HA. We will be looking at Oracle Disk Manager (ODM), Database Snapshots, File System Checkpoints, Dynamic Storage Tiering, and Portable Data Containers (PDC).

ODM

ODM (Oracle Disk Manager) is an API developed by Oracle circa 1999 and initially available with Oracle 9i. At that time, the Storage Foundation product was architected to include its own ODM library to take advantage of this Oracle I/O interface when using the SF stack (volume manager and file system). The value-add of ODM, stated succinctly, is raw-like performance with the manageability of a file system. The capabilities of ODM that make this possible are:

- I/O calls are optimized according to the type of database file: redo logs, archive logs, temporary tables, undo, tables, and indexes
- Parallel updates to database files increase throughput
- Asynchronous I/O is built in
- Eliminates moving data through file system buffers
- Lets Oracle handle locking for data integrity, which avoids OS single-writer lock contention (related to OS context-switch overhead)
- Reduces file system handles by utilizing sharable file identifiers (one per Oracle file, vs. one per Oracle file per session when file system handles are used)
- Reduces system calls and context switches
- Reduces CPU utilization
- Efficient file creation and disk allocation

Thus, traditional UNIX file system overhead is eliminated. ODM provides the most performance benefit in heavy Oracle DB write environments (a rough rule of thumb is 20% or greater write activity), and most applications will benefit from it. Keep in mind that even if the overall write activity is less than 20%, there is still a high probability that there will be periods where write activity exceeds 20%, because Oracle DB write activity over time can be bursty. For heavy read environments, because there is no file-system-level caching or read-ahead with ODM, consider increasing the Oracle SGA parameters DB_BLOCK_BUFFERS, DB_CACHE_SIZE, and DB_FILE_MULTIBLOCK_READ_COUNT, which can help performance, especially in environments with a very high number of concurrent Oracle sessions. This compensates for the lack of file-system-level read-ahead with ODM.

There may be cases where you will choose not to use ODM, just the standard file system, often referred to as Buffered File System (BFS). In these cases, which would be heavy-read and much more likely Decision Support/Data Warehouse than OLTP, consider:

- Hosting Oracle redo logs on a file system mounted with Direct I/O (DIO) (convosync=direct) to improve logging performance
- File system read-ahead tuning may be required
- If the workload is OLTP (issuing mostly single-block random reads), the read_nstream parameter can be reduced
- When the file system resides on concatenated volumes, read_pref_io can be increased from the default of 64K to the maximum size supported by the SCSI driver (1MB on Solaris, 256KB on AIX, ...)
- For read-intensive workloads such as Data Warehouses, tune max_seqio_extent_size, which can substantially improve sequential read I/O performance. Set it to 10GB divided by the block size, or to the size of the largest file to be created in the file system divided by the file system block size.
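The max_seqio_extent_size guidance in the last bullet is simple arithmetic; the sketch below just evaluates the rule of thumb. The values are illustrative, and setting the tunable itself is done through the VxFS tuning utility (vxtunefs), whose exact invocation we leave to the product documentation.

```python
# Sketch: computing a suggested max_seqio_extent_size (in file system
# blocks) per the rule of thumb above: 10GB / block_size, or the largest
# expected file divided by the block size. Values are illustrative.

GB = 1024 ** 3

def suggested_max_seqio_extent_size(fs_block_size, largest_file_bytes=None):
    target = largest_file_bytes if largest_file_bytes else 10 * GB
    return target // fs_block_size  # expressed in file system blocks

# 8KB file system block, default 10GB target:
print(suggested_max_seqio_extent_size(8 * 1024))            # 1310720
# Largest file in the file system will be 50GB:
print(suggested_max_seqio_extent_size(8 * 1024, 50 * GB))   # 6553600
```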

If multiple Oracle instances run on a single server, BFS has additional appeal:

- Small Oracle SGAs can be set for the individual instances
- The file system cache can serve as a global cache for all Oracle instances
- This avoids the issue of long Oracle instance start-up times associated with the OS taking a long time to allocate contiguous physical memory for the SGA

To summarize the ODM discussion: in most cases it will improve performance, in a few cases not. The good news is that it is very easy to enable/disable and requires no data file conversion. For detailed information on ODM, please refer to the Veritas Storage Foundation for Oracle Administrator's Guide 5.0, available at http://seer.entsupport.symantec.com/docs/283896.htm

Snapshots

The snapshot capability of SFORA-HA is known as Oracle Database FlashSnap. This feature, which leverages the ability of Volume Manager to dynamically create/break mirrors, provides the ability to clone Oracle databases by capturing an online image of an actively changing database. A few high-level points:

- Database snapshots can be used on the same host as the production database, or on a secondary host that shares the same storage
- Root privileges are not required; DBA privileges are sufficient
- Uses VxVM mirroring capabilities

Typical use cases are:

- Database backup and restore
- Decision-support analysis and reporting
- Application development and testing
- Logical error recovery

Snapshots can be online, instant, or offline. Online requires the database to be put in hot backup mode, and the resulting snapshot is both re-startable and recoverable (subsequent redo logs can be applied); if your purpose is to use the snapshot for backup, or to recover the database after logical errors have occurred, choose online. Instant does not require putting the database in hot backup mode, and the resulting snapshot is re-startable but not recoverable; if you intend to use the snapshot for decision-support analysis, reporting, development, or testing, choose instant.

The above use cases often lead to the decision to use this feature. A common phenomenon today is the shrinking batch/report window. In the ubiquitous internet world of today, most applications have to be up 24x7, and there is little or no time during the 24-hour cycle when activity lightens to the point that the batch/reporting load can be executed without noticeable performance consequences for the OLTP environment. Oracle Database FlashSnap can solve this problem, with the flexibility to present the cloned instance either on the source production server or, probably more likely, on a second server. Of course, FlashSnap requires additional storage for the cloned database, and, when using a second server, shared storage would have to be available to both servers. However, the cloned database's storage can be different from (less expensive than) the source database's (e.g. the source could be EMC DMX and the clone CLARiiON). The value of being able to use a FlashSnap image for backup/restore, and to provide DB copies for Test and Dev environments, is a further reason this feature should be strongly considered.

For detailed information on Oracle FlashSnap, please refer to the Veritas Storage Foundation for Oracle Administrator's Guide 5.0, available at http://seer.entsupport.symantec.com/docs/283896.htm
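The online-vs-instant guidance above amounts to a small decision rule. The helper below merely restates it; the mode names come from the text, while everything else is an illustrative assumption.

```python
# Sketch: choosing the FlashSnap snapshot mode from the intended use,
# per the guidance above. Illustrative only.

def snapshot_mode(use_case):
    recoverable_uses = {"backup", "logical error recovery"}
    restartable_uses = {"reporting", "decision support", "dev", "test"}
    if use_case in recoverable_uses:
        # Requires hot backup mode; snapshot is re-startable AND
        # recoverable (subsequent redo logs can be applied).
        return "online"
    if use_case in restartable_uses:
        # No hot backup mode needed; re-startable but NOT recoverable.
        return "instant"
    raise ValueError(f"unknown use case: {use_case}")

print(snapshot_mode("backup"))     # online
print(snapshot_mode("reporting"))  # instant
```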

Checkpoints

The Storage Checkpoint feature provides for the efficient backup and recovery of Oracle databases. Checkpoints can be mounted, allowing regular file system operations to be performed on them. The Checkpoint feature is similar to the Volume Manager snapshot mechanism (FlashSnap): a Checkpoint creates an exact image of a database instantly and provides a consistent image of the database as of the point in time the Checkpoint was created. Veritas NetBackup, which is discussed briefly in the Backup/Restore tools/methodology section later in this paper, also makes use of Checkpoints to provide a very efficient Oracle backup mechanism.

A direct application of the Checkpoint facility is Storage Rollback. Because each Checkpoint is a consistent, point-in-time image of a file system, Rollback is the restore facility for these on-disk backups: it rolls the changed blocks contained in a Checkpoint back into the primary file system, restoring the database faster. Storage Checkpoints can thus provide online recovery from logical corruption; one can think of them as a first line of defense before having to invoke server failover in the Oracle HA environment.

The additional space requirements for Storage Checkpoints can be very modest, as only changed blocks need to be tracked. Thus, the less write-intensive the DB activity, and the more focused the write activity (e.g. hitting a relatively small set of blocks many times vs. a large set of blocks less frequently), the less space will be required. A rule of thumb is about 10% of the size of the DB being checkpointed. The performance impact for OLTP environments, when Checkpoints are in use, can range from as low as 5% when Rollback is not being used to 10-20% when Rollback is being used. The performance impact for Decision Support/Data Warehouse is typically less than for OLTP.

There is a strong synergy between Checkpoints and FlashSnap that could work in this fashion: via FlashSnap, take a snapshot to build a clone; perform a Checkpoint on the clone; from that single Checkpoint, create multiple clones. These can then be made available for Test, Dev, QA, etc., with the restriction, of course, that all the clones would be on the same server. Since the FlashSnap instance can be brought up on a server other than the one where the source DB instance resides, this allows for quite a bit of flexibility.

For detailed information on Checkpoints, please refer to the Veritas Storage Foundation for Oracle Administrator's Guide 5.0, available at http://seer.entsupport.symantec.com/docs/283896.htm
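The space rule of thumb above turns into a quick capacity estimate. A minimal sketch, with hypothetical numbers; the key input is the fraction of distinct blocks the workload changes between checkpoints, not the total write volume.

```python
# Sketch: estimating Storage Checkpoint space from the changed-block
# behavior described above. Only changed blocks are tracked, so space is
# driven by how many DISTINCT blocks the workload touches, not by how
# many times it touches them. Numbers below are hypothetical.

def checkpoint_space_gb(db_size_gb, distinct_blocks_changed_pct=10.0):
    return db_size_gb * distinct_blocks_changed_pct / 100.0

db_size = 500  # GB
print(checkpoint_space_gb(db_size))       # 50.0 GB -- the generic 10% rule
# A focused OLTP workload re-hitting a small hot set might change only
# ~3% of distinct blocks between checkpoints:
print(checkpoint_space_gb(db_size, 3.0))  # 15.0 GB
```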

Database Dynamic Storage Tiering (DBDST)

DBDST is a wrapper around the Storage Foundation Dynamic Storage Tiering capability that enables DBAs, instead of System Administrators, to use this feature directly. The key Storage Foundation capability that DBDST leverages is the ability to create multi-volume file systems. Thus, underneath a single file system, one volume could reside on one class of storage, a second volume on another class of storage, a third volume on yet another class of storage, and so on.

Obviously, the first consideration in whether or not to use this feature is whether more than one class of storage is available for the application being deployed (e.g. RAID-10 and RAID-5, 146GB disks and 300GB disks, EMC DMX and EMC CLARiiON). If not, the answer will be no. If there is more than one class of storage available, then one should closely consider the application's needs and suitability for this capability.

Figure 3 - DBDST-ready file system

Figure 3 shows an example of using DBDST in which two major advantages are depicted:

- Being able to use a single namespace (a single file system) to house all entities of a database
- Transparently mapping the busiest database objects to the highest storage tier, and the least busy to the lowest tier

For this particular example, policies would be in place to automatically handle ongoing growth of the file system: e.g. when /2007 directories get added to the /index and /data directories, the respective /2006 directories would move to lower-tiered storage.

We should point out that even if tiered storage is not available, there is a use case of DBDST, known as Load Balanced File System (LBFS), that does not require different classes of underlying storage. It provides the benefits of striping without the administration complexity: file extents are evenly distributed, and when volumes are added or removed, extents get redistributed evenly across the now-current set of volumes.

For detailed information on DBDST, please refer to the Veritas Storage Foundation for Oracle Administrator's Guide 5.0, available at http://seer.entsupport.symantec.com/docs/283896.htm
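The /2006-to-lower-tier behavior in the Figure 3 example can be modeled abstractly. This toy model only shows the shape of an age-based placement policy; DBDST expresses such policies declaratively, and its actual policy syntax is not shown here.

```python
# Sketch: an age-based placement policy like the Figure 3 example --
# when a new year's directories appear, the previous year's data and
# index files migrate to a lower storage tier. Toy model only.

TIERS = ["tier1", "tier2", "tier3"]  # fastest/most expensive first

def place(year, current_year):
    """Newest year on tier1, last year's on tier2, anything older on tier3."""
    age = current_year - year
    return TIERS[min(age, len(TIERS) - 1)]

for year in (2007, 2006, 2005, 2004):
    print(f"/data/{year} -> {place(year, current_year=2007)}")
# /data/2007 -> tier1
# /data/2006 -> tier2
# /data/2005 -> tier3
# /data/2004 -> tier3
```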

PDC

Portable Data Containers (PDC), built upon the Cross-platform Data Sharing (CDS) technology of Storage Foundation, can synergize with the Oracle Transportable Tablespaces (TTS) capability to reduce the time and resources required to move data between unlike platforms, because the copy step is eliminated. CDS technology creates portable data containers: building blocks for virtual volumes that make it possible to transport the volumes between unlike platforms. When combined with the Oracle data mobility feature of TTS, portable data containers greatly reduce the time required to move databases between unlike platforms (Solaris, HP-UX, AIX, Linux), either permanently or periodically for off-host processing.

Use cases for PDC are server migration/consolidation involving a change of OS, and providing subsets of a database to other servers/applications where the source and target OS differ. Keep in mind that as long as the underlying disks are created as CDS disks, the basic foundation is in place to utilize PDC further down the road; there does not have to be an immediate use or need for PDC.

For more detailed information on PDC and TTS, please refer to the Faster, Safer Oracle Data Migration white paper, available at http://eval.symantec.com/mktginfo/enterprise/white_papers/portable_data_containers_for_oracle.pdf

Oracle Planning

This section discusses considerations concerning Oracle RDBMS redo log mirroring and the use of Large Objects (LOBs).

Is Oracle redo log mirroring being used? The Oracle RDBMS provides the capability for redo logs to be mirrored by Oracle, adding another layer of protection on top of the storage mirroring that is being used (most commonly RAID-10 or RAID-5). This protects against situations where software corruption occurs in one set of logs and is detected in time, before it corrupts the second set of logs. This feature is typically employed in enterprise Oracle environments. If it is in play, it will double the storage requirements for redo log files, and, because redo log write performance is critical to the overall performance of the database, it will make ensuring physical storage isolation for all redo log activity more challenging. This will be discussed in further detail in the Storage Layout Planning section. And if multiple DB instances are going to be running on the same server, this adds additional complexity and concern.

Is the Large Object data type being used? If the Large Object data type (LOB) is being used to support the application, it is important to know whether the LOB data is stored in the Oracle database or as external files. The likelihood is strong that it will be external files. This obviously has implications for overall storage management and for backup/restore tasks. If the storage requirements for LOB data are substantial, this could very well be an area where the Dynamic Storage Tiering capability would be appropriate.

Backup/Restore tools/methodology

This is a broad topic that will not be covered in any detail in this paper. As mentioned above, it is important to note whether there are files external to the DB. If RMAN is being used for backup, it will not be able to back up those files: RMAN is an Oracle backup solution, but not a complete backup solution. Note this excerpt from the Oracle Database Backup and Recovery Reference, 10g Release 2 (10.2):

"Note: RMAN can only back up datafiles, controlfiles, SPFILEs, archived redo log files, as well as backups of these files created by RMAN. Although the database depends on other types of files for operation, such as network configuration files, password files, and the contents of the Oracle home, these files cannot be backed up with RMAN. Likewise, some features of Oracle, such as external tables or the BFILE datatype, store data in files other than those listed above. RMAN cannot back up those files. You must use some non-RMAN backup solution for any files not in the preceding list."

Thus, just as one needs a comprehensive solution for storage infrastructure (SFORA-HA), one also needs a comprehensive solution for backup/restore in enterprise Oracle environments. The Veritas NetBackup (NBU) platform from Symantec is one such solution. Not only does it interface with Oracle's RMAN, it is also integrated with the SFORA-HA features FlashSnap and Checkpoints. Thus, NBU provides:

- Backup/restore of Oracle
- Backup/restore of everything else in your environment, including:
  - network configuration files
  - password files
  - the Oracle home
  - external tables
  - BFILEs
  - and, of course, all non-Oracle files

Storage Layout Planning

The primary considerations when doing storage layout for an Oracle DB (logical-to-physical database design) are:

- Database object separation: make sure the redo logs and the busy tables and indexes do not share physical storage.
- The ability to take corrective action quickly and non-disruptively as the environment changes, to ensure that the object separation is maintained. These changes may be caused internally, within the immediate application environment, or externally, because of resource sharing between multiple servers/applications (e.g. enterprise storage arrays).

In recent years, with the great technological strides in enterprise-class storage, it has become much more challenging to intelligently provision storage, compared to the days when the predominant storage paradigm was JBOD. The main reasons for this challenge are:

- A single enterprise storage array provides storage for many servers and applications. As SAN connectivity and single-disk capacities continue to increase, the many will continue to grow to even more.
- Enterprise storage arrays are getting smarter, having been able for quite a few years now to do their own striping and virtual LUNs, in addition to data protection (RAID-10, RAID-5, etc.), the details of which are totally hidden from the servers to which they provide storage. For example, as shown in Figure 4, the server is presented with a 34GB LUN by an EMC DMX storage array. The server has no idea that, under the covers inside the storage array, the real physical storage is 8 partitions of 8.5GB on parts of 8 different 146GB physical disks, configured as RAID-10 (mirrored and then striped) with a stripe of 960KB.

To add to the challenge and generate consternation, Oracle touts its SAME (Stripe And Mirror Everything) approach as the single satisfactory solution for Oracle DB storage layout, while storage vendors are often heard saying "you don't have to worry about any of this, because the cache in the storage array will take care of everything." For example, EMC recently introduced two new Quality of Service features: Dynamic Cache Partitioning and Symmetrix Priority Controls. EMC recognizes that some data is so heavily accessed that it needs more than its fair share of cache and I/O priority on the disk spindles it shares with other servers/applications. This contradicts Oracle's view of the world, in which users simply Stripe And Mirror Everything and never have to worry about tuning I/O.

Figure 4 - OS LUN vs. real physical disk

Nevertheless, because of the widespread sharing of disk resources across multiple servers, and the additional layers of abstraction between the storage array and the servers, one must be very vigilant in configuring storage layout for Oracle RDBMS environments. The deep storage mapping capability of SFORA-HA can help here, but this should still raise a red flag about the need for the DBA, System Administrator, and Storage Administrator to communicate early and often on this task.

Storage Layout Considerations:

LVM and/or storage array striping? Striping of Oracle files is done to improve performance, with the intent of balancing I/O load over all the storage devices available to the instance. So, in general, one would want to see striping in play. The question then is: do I use VM striping, storage array striping, or both? The theoretical advantage of using both VM striping and storage array striping (also known as double striping or plaid striping) is that you get more disks in play. You certainly will get more LUNs in
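As an aside, the Figure 4 numbers can be verified with a little arithmetic. A minimal sketch; the four-mirrored-pair stripe layout is our inference from "mirrored and then striped" across 8 partitions, not a detail stated by the array.

```python
# Sketch: verifying the Figure 4 arithmetic. The host sees one 34GB LUN;
# physically it is 8 x 8.5GB partitions on 8 different 146GB spindles,
# RAID-10 (mirrored, then striped) with a 960KB stripe unit.

partitions = 8
partition_gb = 8.5

raw_gb = partitions * partition_gb   # 68.0 GB of physical storage consumed
usable_gb = raw_gb / 2               # mirroring halves it
print(raw_gb, usable_gb)             # 68.0 34.0 -- matches the 34GB LUN

# Presumably the 8 partitions form 4 mirrored pairs striped 4 wide with a
# 960KB stripe unit; none of this, nor the spindle sharing with other
# hosts, is visible to the server.
```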