Maximizing Oracle RAC Uptime




Maximizing Oracle RAC Uptime
Ian Cookson, Markus Michalewicz
Oracle Real Application Clusters (RAC) Product Management / Development
September 29, 2014

Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.

The System Lifecycle: Installation, Implementation, Diagnosis, Operation, Monitoring


Installation: System assumed for this presentation

Cluster nodes:
  germany    Oracle RAC, Oracle GI   HUB
  argentina  Oracle RAC, Oracle GI   HUB
  brazil     Oracle RAC, Oracle GI   HUB
  italy      Oracle GI               Leaf
  spain      Oracle GI               Leaf

Server OS: OL 6.5 UEK (other kernels are supported)

HUBs: 4GB+ memory recommended.
One HUB at a time will host the GIMR database.
Only HUBs will host (Flex) ASM instances.
Leafs can have less memory, depending on the use case.
The installer enforces the HUB minimum memory requirement.
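The HUB memory requirement can be checked before the installer is started. The following is a minimal sketch, assuming a Linux node where memory is reported in /proc/meminfo format and taking the 4 GB figure from the slide; the function names are hypothetical and are not part of any Oracle tool.

```python
import re

# 4 GB expressed in kB, the HUB figure quoted on the slide (assumption:
# the installer compares against total physical memory).
HUB_MIN_KB = 4 * 1024 * 1024


def mem_total_kb(meminfo_text: str) -> int:
    """Return the MemTotal value (in kB) from /proc/meminfo-style text."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal line not found")
    return int(match.group(1))


def meets_hub_minimum(meminfo_text: str, minimum_kb: int = HUB_MIN_KB) -> bool:
    """True when the node reports at least the HUB minimum memory."""
    return mem_total_kb(meminfo_text) >= minimum_kb
```

On a real node the text would come from reading /proc/meminfo; a Leaf node could be checked with a lower `minimum_kb`, which the slide leaves dependent on the use case.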

Installation

[root@germany ~]# uname -a
3.8.13-16.2.1.el6uek.x86_64 #1 SMP Thu Nov 7 17:01:44 PST 2013 x86_64 x86_64 x86_64 GNU/Linux

# Get the pre-install package
[root@germany Desktop]# yum list oracle-*
oracle-rdbms-server-11gR2-preinstall.x86_64    1.0-7.el6    ol6_latest
oracle-rdbms-server-12cR1-preinstall.x86_64    1.0-8.el6    ol6_latest

Installation is an infrequent task.
It should be standardized.
Follow: http://www.slideshare.net/markusmichalewicz/oracle-rac-12c-collaborate-best-practices-ioug-2014-version and come to the Oracle RAC demo booth (3787).

Tools to use:
1. Linux: pre-install package
2. Cluster Verification Utility (CVU)
3. Oracle Universal Installer (OUI)

Oracle Universal Installer (OUI)

OUI provides a simple GUI for:
  Installation and configuration
  Upgrades
OUI calls cluvfy for:
  Verification checks
  Generating fixup scripts


Implementation

Implementation is a recurring task:
  Initial implementation
  Change implementation(s) as required
Implementation tasks are system-specific.

Tools to use:
1. CVU
2. OraChk

Cluster Verification Utility (CVU): Introduction

Purpose:
  Verification of pre-install and post-install cluster setup
  Run manually (command: cluvfy) or as part of the OUI
  Available from OTN and included in Oracle Grid Infrastructure
  Supports the Oracle RAC stack since version 10g Rel. 1
What does it do?
  Runs specified verification tests and optionally generates a fixup script (to be run as root)
  Uses a stage concept, so users can run the tests needed for a given pre- or post-installation stage

What does CVU Check?

System requirements: are the installation requirements met for Clusterware or RAC?
Network and connectivity
Cluster time synchronization (CTSS or NTP)
Existence of required OS users and permissions
Prerequisites for adding nodes, etc.

CVU for Pre-Implementation Checks

Purpose: Verification of the configuration after installation and prior to implementation (is the system ready?)
Which checks to make:
  Use post checks to verify that the system is indeed ready, and
  confirm that post-installation changes made to the system will not cause problems.
Example:
  cluvfy comp healthcheck -collect cluster -mandatory -deviations -save

CVU for Pre-Implementation Checks - Example

$ cluvfy stage -post hwos -n germany,argentina -verbose

Performing post-checks for hardware and operating system setup

Checking node reachability...
Check: Node reachability from node "germany"
  Destination Node                      Reachable?
  ------------------------------------  ------------------------
  germany                               yes
  argentina                             yes
Result: Node reachability check passed from node "germany"

Checking user equivalence...
Check: User equivalence for user "grid"
  Node Name                             Status
  ------------------------------------  ------------------------
  argentina                             passed
  germany                               passed
Result: User equivalence check passed for user "grid"
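When cluvfy runs are scripted, the per-node tables in the verbose output can be turned into a simple lookup. The sketch below assumes the two-column layout shown in the example above (a header row, a dashed rule, then one row per node); the parser is hypothetical, is not part of CVU, and would need adjusting for other cluvfy checks.

```python
import re

# Sample text modelled on the slide's node reachability check.
SAMPLE = '''\
Check: Node reachability from node "germany"
  Destination Node                      Reachable?
  ------------------------------------  ------------------------
  germany                               yes
  argentina                             yes
Result: Node reachability check passed from node "germany"
'''


def parse_two_column_table(output: str) -> dict:
    """Map node name -> status for rows that follow a dashed rule."""
    rows = {}
    in_table = False
    for line in output.splitlines():
        stripped = line.strip()
        if re.fullmatch(r"-+\s+-+", stripped):
            in_table = True          # the rule marks the start of the rows
            continue
        if in_table:
            parts = stripped.split()
            if len(parts) != 2:      # e.g. the "Result: ..." line ends the table
                in_table = False
                continue
            rows[parts[0]] = parts[1]
    return rows


def failing_nodes(rows: dict, ok_values=("yes", "passed")) -> list:
    """Nodes whose status is not one of the accepted values."""
    return [node for node, status in rows.items() if status not in ok_values]
```

An empty list from `failing_nodes` corresponds to the "check passed" result lines in the output.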

OraChk

Formerly RACchk or RACcheck, aka ExaChk
RAC Configuration Audit Tool
For details see MOS note ID 1268927.1
Checks the Oracle stack:
  Standalone Database
  Grid Infrastructure & RAC
  Maximum Availability Architecture (MAA) validation
  Oracle hardware
Engineered Systems require less initial testing.

OraChk Installation and Configuration

Installation:
  Download the latest version of orachk (90 day reminder)
  Unzip in a local directory as the oracle user
  Check that permissions are 755 on orachk
Configuration:
  Run manually or in silent mode (via daemon)
Implementation:
  Run once (manually) to validate system setup, etc., prior to going live

OraChk Usage

Usage: ./orachk [-abvhpfmsuSo:c]
  -a  all checks
  -b  best practices only
  -p  patch recommendations only
  -f  offline (reports from existing data only)
  -u  pre-upgrade checks
  -S or -s  for silent installs, with or without SUDO capabilities
  -c  check individual components (e.g. orachk -a -c ASM)
  -o  invoke optional functionality (e.g. display only non-passing audit checks, verbose format, etc.)
  -m  exclude MAA checks
  -v  show the tool version

OraChk Example: Oracle orachk Assessment Report

OraChk report in HTML format; summary with links to content
System Health Score is 75 out of 100 (detail)

Database Server
Check Id | Status | Type | Message | Status On | Details
E960DB20CA5A634FE04312C0E50A62E0 | FAIL | SQL Check | Table containing SecureFiles LOB storage belongs to a tablespace with extent allocation type that is not SYSTEM managed (not AUTOALLOCATE) | All Databases | View
6580DCAAE8A28F5BE0401490CACF6186 | WARNING | OS Check | The number of async IO descriptors is too low (/proc/sys/fs/aio-max-nr) | All Database Servers | View
5ADD88EC8E0AFF2EE0401490CACF0C10 | WARNING | OS Check | net.core.wmem_max Is NOT Configured According to Recommendation | All Database Servers | View
84BE4DE1F00AD833E040E50A1EC07771 | INFO | OS Check | Kernel Parameter fs.file-max Is Lower Than The Recommended Value | All Database Servers | View
66E70B43167837ABE040E50A1EC02FEA | INFO | OS Check | ORA-00600 errors found in alert log | All Database Servers | View
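For triage it can help to tally the findings by status and look at the most severe ones first. The sketch below hard-codes the five rows from the report excerpt; the severity ordering (FAIL before WARNING before INFO) and the helper names are assumptions, and the code does not read the actual orachk HTML report or reproduce how the health score is calculated.

```python
from collections import Counter

# (status, check type, message) rows copied from the report excerpt.
FINDINGS = [
    ("FAIL", "SQL Check",
     "Table containing SecureFiles LOB storage belongs to a tablespace with "
     "extent allocation type that is not SYSTEM managed (not AUTOALLOCATE)"),
    ("WARNING", "OS Check",
     "The number of async IO descriptors is too low (/proc/sys/fs/aio-max-nr)"),
    ("WARNING", "OS Check",
     "net.core.wmem_max Is NOT Configured According to Recommendation"),
    ("INFO", "OS Check",
     "Kernel Parameter fs.file-max Is Lower Than The Recommended Value"),
    ("INFO", "OS Check", "ORA-00600 errors found in alert log"),
]

# Assumed triage order; the slide does not define one.
SEVERITY_ORDER = {"FAIL": 0, "WARNING": 1, "INFO": 2}


def count_by_status(findings):
    """Number of findings per status."""
    return Counter(status for status, _type, _message in findings)


def most_severe_first(findings):
    """Stable sort so that FAIL rows come before WARNING and INFO rows."""
    return sorted(
        findings,
        key=lambda row: SEVERITY_ORDER.get(row[0], len(SEVERITY_ORDER)),
    )
```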

OraChk Example: Oracle orachk Assessment Report (MAA Scorecard)

OraChk highlights failures. Here: Data Guard is not set up.
System Health Score is 75 out of 100 (detail)

MAA Scorecard
Status | Type | Message | Status On | Details
FAIL | OS Check | Active Data Guard is not configured | All Database Servers | View

DATA CORRUPTION PREVENTION BEST PRACTICES
FAIL | SQL Parameter Check | Database parameter DB_BLOCK_CHECKSUM is NOT set to recommended value | All Instances | View


Operation

Operation is an ongoing task.
Oracle Grid Infrastructure provides all necessary tools for normal operation.
Operation should not create extra tasks.
Automation is the key.

Tools to use:
1. CVU (periodic runs)
2. OraChk (interval runs via daemon)
3. Cluster Health Monitor (CHM/OS)

Operations: Periodic CVU Checks are the Default

[GRID]> crsctl status res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.asmnet1lsnr_asm.lsnr
               ONLINE  ONLINE       argentina                STABLE
               ONLINE  ONLINE       brazil                   STABLE
               ONLINE  ONLINE       germany                  STABLE
...
ora.cvu
      1        ONLINE  ONLINE       brazil                   STABLE
ora.germany.vip
      1        ONLINE  ONLINE       germany                  ...

[GRID]> crsctl status res ora.cvu -p
NAME=ora.cvu
TYPE=ora.cvu.type
ACL=owner:grid:rwx,pgrp:oinstall:rwx,other::r--
ACTIONS=
ACTION_SCRIPT=
ACTION_TIMEOUT=60
ACTIVE_PLACEMENT=0
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
AUTO_START=restore
CARDINALITY=1
CHECK_INTERVAL=60
CHECK_RESULTS=PRVF-4090 : Node connectivity failed for interface "*",PRVF-4090 : Node connectivity failed for interface "*",PRVF-4090 : Node connectivity failed for interface "*",PRVF-4090 : Node connectivity failed for interface "*",PRVG-1101 : SCAN name "cupscan.cupgnsdom.localdomain" failed to resolve,PRVF-4657 : Name resolution setup check for "cupscan.cupgnsdom.localdomain" (IP address: 10.1.1.55) failed,PRVF-4090 : Node connectivity failed for interface "*",PRVG-11050 : No matching interfaces "*" for subnet "172.149.0.0" on nodes "argentina,brazil,germany",PRVG-11050 : No matching interfaces "*" for subnet "172.149.0.0" on nodes "argentina,brazil,germany",PRVF-7530 : Sufficient physical memory is not available on node "germany" [Required physical memory = 4GB (4194304.0KB)],PRVF-4354 : Proper hard limit for resource "maximum open file descriptors" not found on node "germany" [Expected = "65536" ; Found = "4096"]
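The CHECK_RESULTS attribute packs every finding into one comma-separated string, and some messages contain commas themselves (for example the node list). A minimal sketch of splitting it safely, assuming each finding starts with a PRVx-nnnn code followed by " : " as in the output above; the sample string is shortened and the helper name is hypothetical.

```python
import re
from collections import Counter

# Shortened sample modelled on the CHECK_RESULTS value shown above.
CHECK_RESULTS = (
    'PRVF-4090 : Node connectivity failed for interface "*",'
    'PRVG-1101 : SCAN name "cupscan.cupgnsdom.localdomain" failed to resolve,'
    'PRVG-11050 : No matching interfaces "*" for subnet "172.149.0.0" '
    'on nodes "argentina,brazil,germany",'
    'PRVF-4090 : Node connectivity failed for interface "*"'
)


def split_findings(check_results: str) -> list:
    """Split on commas that are immediately followed by a new message code."""
    parts = re.split(r",(?=PRV[A-Z]-\d+ : )", check_results)
    findings = []
    for part in parts:
        code, _sep, message = part.partition(" : ")
        findings.append((code, message))
    return findings


findings = split_findings(CHECK_RESULTS)
# Repeated codes (such as PRVF-4090) collapse into a count per code.
code_counts = Counter(code for code, _message in findings)
```

Counting by code makes repeated findings, such as the connectivity failure reported once per interface check, easier to read than the raw attribute.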

Operations: Set Up Periodic OraChk System Checks

<<< Configure & start orachk daemon for scheduled interval runs >>>

$ ./orachk -id DBA -set \
> "NOTIFICATION_EMAIL=your.email@company.com;\
> AUTORUN_SCHEDULE=4,8,12,16,20 * * *;\
> AUTORUN_FLAGS=-profile dba;COLLECTION_RETENTION=30"
$ ./orachk -d start
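The -set argument is a single string of semicolon-separated KEY=VALUE pairs, so it can be generated from a configuration mapping. The sketch below also reads the schedule, assuming its four fields mean hour, day of month, month and day of week; that field order is an assumption not stated on the slide, and the helper names are hypothetical.

```python
# Assumed meaning of the four AUTORUN_SCHEDULE fields (not stated on the slide).
SCHEDULE_FIELDS = ("hour", "day_of_month", "month", "day_of_week")


def parse_autorun_schedule(schedule: str) -> dict:
    """Split a four-field schedule string into named fields."""
    fields = schedule.split()
    if len(fields) != len(SCHEDULE_FIELDS):
        raise ValueError("expected four whitespace-separated fields")
    return dict(zip(SCHEDULE_FIELDS, fields))


def run_hours(schedule: str) -> list:
    """Hours of the day at which a run would start under the assumption above."""
    hour_field = parse_autorun_schedule(schedule)["hour"]
    if hour_field == "*":
        return list(range(24))
    return sorted(int(hour) for hour in hour_field.split(","))


def build_set_argument(settings: dict) -> str:
    """Join KEY=VALUE pairs with semicolons, as in the -set string on the slide."""
    return ";".join(f"{key}={value}" for key, value in settings.items())


SETTINGS = {
    "NOTIFICATION_EMAIL": "your.email@company.com",
    "AUTORUN_SCHEDULE": "4,8,12,16,20 * * *",
    "AUTORUN_FLAGS": "-profile dba",
    "COLLECTION_RETENTION": "30",
}
```

Under that reading, the schedule on the slide starts a run five times a day, every four hours from 04:00 to 20:00.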

Cluster Health Monitor (CHM/OS)

Service integrated with the Oracle Clusterware stack
Introduced in 11.2.0.2 (Linux, Solaris, Windows) and 11.2.0.3 (AIX)
Gathers OS-level metrics to monitor resource degradation and failure
Stores data in a central repository (GIMR)
Runs in real time with locked-down memory for last-gasp analysis
Integration with QoS (Memory Guard) and CRS (server pool categorization)
Integrated into EM Cloud Control

[Diagram: nodes germany, argentina, brazil and italy (Oracle GI) with osysmond and OLOGGERD]

Cluster Health Monitor: Daemons / Processes

osysmond
  Function:
    Collect OS metrics
    Process raw data for a subset of processes
    Compress and send data to ologgerd
    Store/forward in case of network failures
  Managed by: ohasd
  Instances and location: Every node of the cluster (including leaf nodes)

ologgerd
  Function:
    Consume data from all active osysmonds
    Store data in the repository
    Service requests from clients
  Managed by: osysmond
  Instances and location: One per cluster (replica for 11.2.x)

oclumon
  Function:
    Display OS-level metrics in historic/real-time mode
    Perform repository management operations
  Managed by: command line utility
  Instances and location: Can be invoked from any hub node in the cluster

Cluster Health Monitor in EM Cloud Control


Cluster Health Monitor: Command Line Reporting

Command line reporting of current and historic OS metrics (oclumon) from any hub node in the cluster
Example:
[germany]: > oclumon dumpnodeview -process


Monitoring

Monitoring is an ongoing task.
Optional monitoring is available for an Oracle RAC cluster via QoS and Oracle EM.
Quality of Service Management (QoS) comes with a monitoring-only feature.
Monitoring is a proactive task.

Tools to use:
1. Oracle Enterprise Manager 12c CC
2. Oracle Quality of Service Management (Memory Guard)

Monitoring the RAC Cluster with EM Cloud Control

Quality of Service Management: Memory Guard

QoS feature externalized for general use
Memory Guard protects resources:
  Receives a stream of OS memory metrics from CHM/OS
  Issues an alert should any server be at risk
  Protects existing work and applications by automatically closing the server to new connections (i.e. stops the service on the at-risk node)
  Automatically re-opens the server to connections once the memory pressure has subsided
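The close-and-reopen behaviour described above can be pictured as a small state machine with two thresholds. This is an illustrative sketch only: the slide does not say how Memory Guard decides that a server is at risk, so the thresholds, the "used fraction" metric and the class name are all hypothetical.

```python
class ServerGate:
    """Toy model of closing a server to new connections under memory pressure."""

    def __init__(self, close_at: float = 0.90, reopen_at: float = 0.75):
        # Hypothetical thresholds: close at 90% memory used, reopen at 75%.
        # Using two different values avoids flapping around a single limit.
        self.close_at = close_at
        self.reopen_at = reopen_at
        self.accepting = True
        self.events = []

    def observe(self, used_fraction: float) -> bool:
        """Feed one memory sample; return whether new connections are accepted."""
        if self.accepting and used_fraction >= self.close_at:
            self.accepting = False
            self.events.append("closed")      # alert and stop new connections
        elif not self.accepting and used_fraction <= self.reopen_at:
            self.accepting = True
            self.events.append("reopened")    # pressure has subsided
        return self.accepting
```

Feeding the samples 0.50, 0.92, 0.85 and 0.70 in that order closes the gate at the second sample, keeps it closed at the third, and reopens it at the fourth. Existing work is untouched throughout, which matches the slide's point that only new connections are affected.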

Autonomous Computing

[Diagram: QoS, CHM, CHA, HngMgr and Policy arranged around four properties: Self-Optimizing, Self-Protecting, Self-Healing, Self-Configuring]

Enabling Autonomous Computing

Cluster Health Monitor (CHM)/OS & QoS: 11.2+
Further QoS & CHM enhancements in 12.1.0.2:
  QoS support for measure-only with performance objectives and alerts
  QoS support for measuring and monitoring admin-managed databases
Cluster Health Advisor: coming soon

[Diagram: CHM/OS with ologgerd and osysmond]


Diagnosis

Diagnosis is a recurring task.
Ideally, there will be no incidents on the system. Realistically, there will be more than one.
Diagnosis is a reactive task. It should be performed as efficiently as possible.

Tools to use:
1. Trace File Analyzer (TFA)

Trace File Analyzer (TFA): log collection in action

Improved, comprehensive first-failure diagnostics collection
Efficient collection, packaging and transfer of data
Collects for all relevant components (OS, Grid Infrastructure, ASM, RDBMS), including Exadata cell nodes
One command to collect all information from all nodes (or single-instance, single-node)
More information: MOS note ID 1513912.1

Trace File Analyzer (TFA): intelligent log collection

$ ./tfactl diagcollect
Sending diagcollect request to host : argentina
Getting list of files satisfying time range [Tue Sep 03 14:17:43 PDT 2014, Tue Sep 03 18:17:43 PDT 2014]
germany: Zipping File: /opt/oracle/oak/oswbb/archive/oswiostat/germany_iostat_14.09.03.1500.dat.gz
germany: Zipping File: /u01/app/oracle/diag/rdbms/bill/bill1/trace/alert_bill1.log
Trimming file : /u01/app/oracle/diag/rdbms/bill/bill1/trace/alert_bill1.log with original file size : 109kB
germany: Zipping File: /opt/oracle/oak/oswbb/archive/oswtop/germany_top_14.09.03.1500.dat.gz
germany: Zipping File: /opt/oracle/oak/log/germany/oak/oakd.log
Trimming file : /opt/oracle/oak/log/germany/oak/oakd.log with original file size : 9.2MB
germany: Zipping File: /u01/app/12.1.0.2/grid/log/germany/gipcd/gipcd.log
germany: Zipping File: /u01/app/12.1.0.2/grid/log/germany/agent/ohasd/oraagent_grid/oraagent_grid.log
Trimming file : /u01/app/12.1.0.2/grid/log/germany/agent/ohasd/oraagent_grid/oraagent_grid.log with original filesize 4.3MB
germany: Zipping File: /var/log/messages
germany: Zipping File: /opt/oracle/oak/oswbb/archive/oswslabinfo/germany_slabinfo_14.09.03.1800.dat
Collecting ADR incident files...
Total Number of Files checked : 10543
Total Size of all Files Checked : 3.9GB
Number of files containing required range : 68
Total Size of Files containing required range : 129MB
Number of files trimmed : 10
Total Size of data prior to zip : 144MB
Saved 63MB by trimming files
Zip file size : 8.6MB
Total time taken : 47s.
Logs are collected to:
/opt/oracle/tfa/tfa_home/repository/collection_tue_sep_3_18_17_24_pdt_2014_node_all/germany.tfa_tue_sep_3_18_17_24_pdt_2014.zip
/opt/oracle/tfa/tfa_home/repository/collection_tue_sep_3_18_17_24_pdt_2014_node_all/argentina.tfa_tue_sep_3_18_17_24_pdt_2014.zip

Slide callouts:
  One simple command
  Relevant files only (OS Watcher files, ADR incident files)
  Pruning: 144MB pruned and compressed down to 8.6MB
  47 seconds!
  1 command, 2 nodes, 4 databases, ASM, Clusterware, OS
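The efficiency claim can be checked directly from the summary lines of the collection output. A minimal sketch, assuming the "label : size" layout and the kB/MB/GB suffixes seen above; the parser is hypothetical and is not part of TFA.

```python
import re

# Size suffixes seen in the output, converted to MB (1 GB taken as 1024 MB).
UNITS_TO_MB = {"kB": 1 / 1024, "MB": 1.0, "GB": 1024.0}

# Three summary lines copied from the collection output.
SUMMARY = """\
Total Size of all Files Checked : 3.9GB
Total Size of data prior to zip : 144MB
Zip file size : 8.6MB
"""


def size_mb(text: str) -> float:
    """Convert a string such as '8.6MB' to a number of megabytes."""
    match = re.fullmatch(r"\s*([\d.]+)\s*(kB|MB|GB)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return float(match.group(1)) * UNITS_TO_MB[match.group(2)]


def parse_summary(text: str) -> dict:
    """Map each 'label : size' line to its size in MB."""
    sizes = {}
    for line in text.splitlines():
        label, sep, value = line.partition(" : ")
        if sep:
            sizes[label.strip()] = size_mb(value)
    return sizes


sizes = parse_summary(SUMMARY)
# Ratio between the data selected for collection and the final archive.
compression_ratio = sizes["Total Size of data prior to zip"] / sizes["Zip file size"]
```

With the figures from the slide the ratio comes to roughly 16.7, i.e. the 144MB of selected data shrinks to an archive about one seventeenth of its size.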

Trace File Analyzer (TFA): Efficiency from A-Z

[Diagram: logs collected from the HUB nodes germany and brazil (Oracle RAC, Oracle GI)]

Utility Cluster

Centralize and standardize storage, deployment, management and diagnostics
Architecture: an Oracle Grid Infrastructure based cluster
Solution-in-a-Box approach on ODA

[Diagram: Utility Cluster on Node1 and Node2 with Oracle Clusterware, Oracle ASM and IOsrv, providing Flex ASM Storage, Grid Home Server (Rapid Home Provisioning), Enterprise Management (EM) Server and Storage Server, alongside Application Domains and Database Domains on Node 1 and Node 2]
