VMware vSphere 5: Best Practices for Oracle RAC Virtualization Jeff Browning, Webmaster, Oracle Solutions Enablement, EMC Corporation IOUG November 8, 2012 1
Agenda Why RAC on vSphere? Support Licensing Storage Performance Version 5 new features Case study 2
Why RAC on vSphere? 3
Virtualization is on the rise Source: Toward a Smarter Information Foundation: 2010 IOUG Enterprise Platform Decision Survey Oracle Magazine July / August 2010 4
Virtualization is gaining momentum Source: 2010 IOUG Enterprise Platform Decision Survey: Toward a Smarter Information Foundation *Source: Paul Maritz keynote, VMware vSphere 5 launch Oracle Magazine July / August 2010 5
Business challenge Infrastructure silos Oracle9i Oracle10g Oracle11g Infrastructure silos lead to higher costs, lower ROI on assets, and increased environment complexity Lower server and storage utilization Difficulty meeting service levels Maintaining and managing separate environments Provisioning inefficiencies Inconsistent data protection strategies 6
VMware vSphere with Oracle RAC benefits Performance DB consolidation DB on demand Quality of service Match native performance even in consolidation scenarios 95%+ of Oracle instances match native performance on VMware VMware facilitates migration to current hardware, a major performance benefit Reduce hardware and software license costs Consolidate servers Reduce CPU burn by improving utilization $/TPS declines dramatically as utilization improves Provision databases on demand Minutes to provision a new VM from template Seconds to provision a database copy using storage semantics Rapidly provision, use, and destroy test / dev and other utility servers Increase application quality of service Scale dynamically Built-in high availability and simple disaster recovery Dynamically load balance RAC nodes across physical servers 7
Current Oracle RAC / VMware vSphere customers 8
Support 9
My Oracle Support note 249212.1 Purpose Explain to customers how Oracle supports our products when running on VMware Scope & Application For Customers running Oracle products on VMware virtualized environments. No limitation on use or distribution. Support Status for VMware Virtualized Environments -------------------------------------------------- Oracle has not certified any of its products on VMware virtualized environments. Oracle Support will assist customers running Oracle products on VMware in the following manner: Oracle will only provide support for issues that either are known to occur on the native OS, or can be demonstrated not to be as a result of running on VMware. If a problem is a known Oracle issue, Oracle support will recommend the appropriate solution on the native OS. If that solution does not work in the VMware virtualized environment, the customer will be referred to VMware for support. When the customer can demonstrate that the Oracle solution does not work when running on the native OS, Oracle will resume support, including logging a bug with Oracle Development for investigation if required. If the problem is determined not to be a known Oracle issue, we will refer the customer to VMware for support. When the customer can demonstrate that the issue occurs when running on the native OS, Oracle will resume support, including logging a bug with Oracle Development for investigation if required. NOTE: Oracle has not certified any of its products on VMware. For Oracle RAC, Oracle will only accept Service Requests as described in this note on Oracle RAC 11.2.0.2 and later releases. Source: My Oracle Support website 10
Range of possible issues: a spectrum from Oracle-centric (statement execution, Oracle ASM) to hardware / OS-centric, with an increasing propensity to require reproduction on physical hardware toward the hardware / OS end 11
VMware / EMC support for Oracle RAC on vSphere E-Labs tests and validates Oracle RAC 11g clustering on vSphere EMC Global Solutions provides comprehensive testing of Oracle RAC 11g on vSphere 12
Licensing 13
Oracle licensing is per processor core Almost all Oracle software products are licensed on the physical CPU. For most products, a processor core factor (PCF) is applied to the number of processor cores: the higher the PCF and the core count, the higher the cost of Oracle licensing. Intel and AMD are the most common processors; both carry a PCF of 0.5. Source: Oracle Technology Global Price List (May 12, 2011) Oracle Processor Core Factor Table (March 16, 2009) 14
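To make the arithmetic concrete, a small worked example (assuming the 0.5 core factor Oracle publishes for Intel / AMD x86 chips; socket and core counts are illustrative):
  2 sockets x 8 cores per socket = 16 physical cores
  16 cores x 0.5 PCF = 8 Oracle processor licenses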
Transaction cost vs. utilization Cost per TPS for a four-node Oracle RAC 11g cluster running EE Software license cost: around $2,200,000 TPS: around 4,000 at peak utilization 15
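A rough sketch of the arithmetic behind cost per TPS, using only the figures above (hardware and maintenance excluded; the lower utilization point is illustrative):
  $2,200,000 / 4,000 TPS (peak utilization) = ~$550 per TPS
  $2,200,000 / 1,000 TPS (25% of peak)      = ~$2,200 per TPS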
Oracle must be licensed on all processor cores in a VMware ESX cluster vMotion VMware ESX Server 1: Oracle license applies VMware ESX Server 2: Oracle license does not apply until an Oracle VM lands there (then it now applies) As soon as an Oracle VM hits an ESX server, the license and maintenance costs for Oracle apply There is no vCPU licensing of Oracle software These costs are high, and reducing them is very worthwhile 16
Virtualized vs. physical licensing compared Oracle RAC 11g EE running on physical hardware (11g / CRS / OS on each of 4 nodes) 4-node RAC, 4 quad-core processors per node: 4 x 4 x 4 = 64 cores; 64 x 0.5 PCF = 32 licensed cores Oracle Database 11g EE license per core: $47,500 x 32 = $1,520,000 Oracle RAC upcharge per core: $23,000 x 32 = $736,000 Total: $2,256,000 Annual maintenance on 11g EE per core: $10,450 x 32 = $334,400 / year Annual maintenance on RAC per core: $5,060 x 32 = $161,920 / year Total: $496,320 / year Source: Oracle Technology Global Price List (May 12, 2011) 17
Compatible workloads for consolidation 2 Identical 4-node RAC clusters Workload RAC1: OLTP Workload RAC2: Batch-oriented DSS RAC1 peaks during the day, RAC2 at night Peak utilization: 60%, minimum 20% (both clusters) 18
Virtualized vs. physical licensing compared Oracle RAC 11g EE running on VMware vSphere Two 4-node RAC clusters, each consisting of 4 VMs, running on a cluster of 4 ESX servers (each VM runs 8 vCPUs) 4 ESX servers, 4 quad-core processors each: 4 x 4 x 4 = 64 cores; 64 x 0.5 PCF = 32 licensed cores Oracle Database 11g EE license per core: $47,500 x 32 = $1,520,000 Oracle RAC upcharge per core: $23,000 x 32 = $736,000 Total: $2,256,000 Annual maintenance on 11g EE per core: $10,450 x 32 = $334,400 / year Annual maintenance on RAC per core: $5,060 x 32 = $161,920 / year Total: $496,320 / year Source: Oracle Technology Global Price List (May 12, 2011) 19
Well managed licensing on VMware DRS / HA cluster Oracle DRS / HA cluster Microsoft DRS / HA cluster 20
ROI on vSphere software with well-managed VMware DRS / HA cluster 3-year Oracle physical software costs: $3,744,960 3-year Oracle virtualized software costs: $1,872,480 Savings on Oracle software costs (physical vs. virtual): $1,872,480 3-year vSphere software costs: $92,384 ROI on vSphere software investment: 2,027% Assumes 2X compression Higher compression would result in higher savings 21
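The ROI figure can be checked directly from the numbers above (treating the Oracle savings as the return on the vSphere outlay):
  $1,872,480 savings / $92,384 vSphere cost = ~20.3x = ~2,027% ROI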
Poorly managed licensing on VMware DRS / HA cluster DRS / HA cluster 22
Negative ROI on vSphere software with poorly-managed VMware DRS / HA cluster 3-year Oracle physical software costs: $1,872,480 3-year Oracle virtualized software costs: $3,744,960 Wasted money on Oracle software costs (physical vs. virtual): ($1,872,480) 3-year VMware vSphere software costs: $184,768 Negative ROI on vSphere software: -1,013% Assumes Oracle VMs are 50% of the cluster Negative ROI increases as the percentage of Oracle VMs decreases 23
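The same arithmetic, applied to the poorly-managed case (the wasted Oracle spend measured against the vSphere outlay):
  -$1,872,480 wasted / $184,768 vSphere cost = ~-10.1x = ~-1,013% ROI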
Licensing summary Correctly managed, investment in vSphere software yields phenomenal ROI for Oracle RAC environments Incorrectly managed, investment in vSphere can be spectacularly wasteful 24
Storage 25
Possible storage configurations (block vs. file, physical vs. virtual layer) Config 1: Block / SAN storage with RDM LUNs mounted directly on VMs, then used to make ASM diskgroups Config 2: Block / SAN storage with LUNs mounted onto ESX and used to make VMFS volumes; .vmdk files used to create ASM diskgroups Config 3: File / NFS mounted directly on VMs, managed by Oracle using the Direct NFS Client Config 4: File / NFS mounted on ESX as datastores; .vmdk files used to create ASM diskgroups (uninteresting) 26
Storage config 1: Block / RDM -> VM / ASM: Layer diagram (Oracle / VM / ESX / LUNs) ESX is bypassed almost completely, and LUNs are mounted on VMs as RDMs SCSI disks within VMs are used to create ASM diskgroups Extremely large LUNs and I/O can be supported Minimal overhead vs. physical 27
Storage config 1: Block / RDM -> VM / ASM: Network diagram (RAC interconnect, FC switch, SAN network, 10 GbE switch, SAN storage array, physical RAC nodes, virtual RAC nodes on ESX server) Virtual and physical nodes can co-exist in the same RAC cluster Can also use storage tools which integrate with Oracle (EMC Replication Manager, RecoverPoint) vMotion is not supported, so no DRS cluster (but HA cluster still works) SCSI bus must be set to shared mode (a .vmx hack; see the sketch below) 28
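A minimal sketch of the .vmx "shared SCSI bus" hack mentioned above (a hedged example: the controller number, disk position, and RDM pointer path are placeholders; the RDM should be mapped in physical compatibility mode so physical and virtual nodes can share it):
  scsi1.present = "TRUE"
  scsi1.sharedBus = "physical"        # allow VMs on different ESX hosts to share the bus
  scsi1:0.fileName = "/vmfs/volumes/ds1/rac/asm_disk1.vmdk"   # RDM pointer file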
Storage config 2: FC / VMFS / .vmdk / ASM: Layer diagram (Oracle / VM / ESX / LUNs, with .vmdk files in between) LUNs are mounted onto ESX and formatted as VMFS volumes (must be eager zeroed thick) Shared write bit is set on .vmdk files to allow RAC clustering / shared storage (KB article 1034165; see the sketch below) .vmdk files are then used to create ASM diskgroups Maximum RAC nodes supported: 8 No VMware snaps or linked clones (same as FT) 29
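A hedged sketch of the two pieces described above: creating the eager-zeroed-thick .vmdk with vmkfstools, and setting the multi-writer flag per KB 1034165 (size, path, and disk position are placeholders):
  vmkfstools -c 20G -d eagerzeroedthick /vmfs/volumes/ds1/rac/asm_disk1.vmdk
  # .vmx entry, repeated for the shared disk on every RAC node:
  scsi1:0.sharing = "multi-writer"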
Storage config 2: FC / VMFS / .vmdk / ASM: Network diagram (RAC interconnect, FC switch, SAN network, 10 GbE switch, SAN storage array, virtual RAC nodes with .vmdk files in an HA / DRS cluster) Because Oracle datafiles are stored on VMFS, no physical nodes are allowed Not possible to use storage tools that integrate with Oracle (EMC Replication Manager, RecoverPoint) But storage replication (at the VMware layer) can still be performed if best practices are followed 30
Storage config 3: IP / NFS -> VM / dNFS: Layer diagram (Oracle / VM / ESX / NAS array) VM OS mounts NFS file systems using normal Linux / UNIX semantics; ESX is bypassed entirely Oracle then manages the NFS mount points using the Oracle Direct NFS Client (dNFS; see the sketch below) Very large databases / I/O can be supported Minimal overhead vs. physical 31
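A minimal dNFS sketch, assuming hypothetical server names, IPs, and export paths (the dnfs_on make target applies to 11gR2; earlier releases link the ODM library by hand, per MOS documentation):
  # $ORACLE_HOME/dbs/oranfstab
  server: nas1
  path: 192.168.10.21
  path: 192.168.10.22
  export: /vol/oradata mount: /u02/oradata

  # Enable the Direct NFS ODM library (run as the oracle user):
  cd $ORACLE_HOME/rdbms/lib
  make -f ins_rdbms.mk dnfs_on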
Storage config 3: IP / NFS -> VM / dNFS: Network diagram (RAC interconnect, 10 GbE switches, IP storage network, NAS storage array, physical RAC nodes, virtual RAC nodes in an HA / DRS cluster) Virtual and physical nodes can co-exist in the same RAC cluster NFS file systems are mounted on both virtual and physical nodes Can also use storage tools which integrate with Oracle (EMC Replication Manager) DRS / HA cluster works fine 32
Storage configs compared
Config 1: FC / RDM -> VM / ASM -- Network: FC; Storage: Block; Layer: VM; Physical compat: Yes; Storage tools: Yes; VMware value add: No; Scaling: No limit; vMotion: No*
Config 2: FC / VMFS / .vmdk / ASM -- Network: FC; Storage: Block; Layer: ESX; Physical compat: No; Storage tools: No**; VMware value add: Yes***; Scaling: 8 nodes; vMotion: Yes
Config 3: IP / NFS -> VM / dNFS -- Network: IP; Storage: File; Layer: VM; Physical compat: Yes; Storage tools: Yes; VMware value add: No; Scaling: No limit; vMotion: Yes
*Therefore, no DRS cluster with RDM (HA cluster still works fine) **Replication Manager / RecoverPoint replication possible at VM level, but without Oracle coordination ***Includes Storage DRS, VAAI, VMware Replication, etc. 33
Comparison between VMware RDM and VMFS for Oracle 11g databases (charts: OLTP transactions per minute, ESX server CPU utilization, database transaction response times, and disk performance in KBps over 55 minutes, VMFS vs. RDM)
RDM LUN layout: LUNs 1-12, 50 GB each, containing DATA devices 1-12; LUN 13, 50 GB, containing FRA device 1
VMFS datastore layout: Data datastore, 588 GB, contains all database DATA devices; FRA datastore, 98 GB, contains all database FRA devices; REDO datastore, 98 GB, contains all database REDO devices; TEMP datastore, 98 GB, contains all database TEMP / UNDO devices 34
VMFS / NFS Oracle RAC: Best practices for file placement Small VMFS 1 / Small NFS 1 Small VMFS 2 / Small NFS 2 Big VMFS / Big NFS 35
FAST VP vs. LUN migration (diagram: non-FAST VP storage array vs. FAST VP storage array with a FAST VP pool tiering across Tier 0: Flash, Tier 1: SAS, and Tier 2: NL-SAS) 36
FAST VP performance comparison 17% improvement in response times after FAST policy applied FAST VP moved the most active data to EFDs, ensuring improved response times for the entire storage group FAST VP also moved 30% of total allocated storage for the storage group to SATA with no negative impact on response times 37
FAST Cache 1 Page requests satisfied from DRAM if available 2 If not, FAST Cache driver checks map to determine where page is located Policy Engine DRAM Driver Page Map 3 Page request satisfied from disk drive if not in FAST Cache 4 Policy Engine copies the page to FAST Cache if it is being used frequently FAST Cache SAS Drives 5 Subsequent requests for this page satisfied from FAST Cache 6 Dirty pages are copied back to disk drives as background activity 38
FAST Cache performance comparison: dNFS (storage config 3) Difference between FAST Cache and non-FAST Cache was more than 2X Difference between virtualized and physical was 11%, at most 39
FAST Suite FAST VP + FAST Cache FAST VP Tiers across drives in pool Optimizes drive utilization Relative ranking over time 1 GB slices ideal for deterministic workload Periodic rebalance FAST Cache Copies hottest data to Flash Optimizes Flash utilization Dynamic movement in near real-time 64 KB sub-slices ideal for bursty workload or where working set varies a lot DRAM Cache FAST Cache FAST Virtual Pool 40
Will FAST Suite benefit my application? How much improvement is expected from FAST Suite? How to find out if an application benefits: monitor application performance data, then analyze it for I/O size, randomness, and locality (chart: performance vs. % of data set on flash, ranging from all-HDD performance at 0% to all-flash performance at 100%) 41
EMC PowerPath/VE Highly recommended for database VMs (diagram: VMs on an ESX cluster, IP / SAN switch, storage array) Multi-pathing and failover for virtualized environments Scale virtual machines on SAN storage over FC, iSCSI, FCoE Dynamically balance I/O requests Automatically detect and recover from I/O path failures 42
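For reference, a hedged example of checking path state once PowerPath/VE is installed, using the remote rpowermt CLI (the ESX host name is a placeholder):
  rpowermt display dev=all host=esx01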
Paravirtualized SCSI (PVSCSI) (diagram: app / OS with PVSCSI driver in the VM, ESX, storage array; see the PVSCSI Storage Performance white paper) For high I/O workloads: 2,000 IOPS or greater Reduces CPU cost for Fibre Channel and iSCSI Works with RDM and VMFS With VMware ESX 4.1, can boot from PVSCSI on Windows Server 2008, Windows Server 2003, and Red Hat Enterprise Linux 5 43
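A hedged sketch of the .vmx change that puts guest disks behind the paravirtual adapter (the controller number is a placeholder; the guest also needs the pvscsi driver, e.g. from VMware Tools):
  scsi0.virtualDev = "pvscsi"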
Performance 44
Physical & virtual Oracle RAC comparison OLTP database workload, SAN / RDM (storage config 1) Virtual environment delivered performance consistently within 4% of physical environment NFS / dNFS difference is higher (see previous slide) 45
Performance assistance for virtualization Hardware-assist virtualization: Intel VT or AMD-V; AMD RVI or Intel EPT; interrupt virtualization; directed I/O for HBAs / network cards and other devices offloads and increases the efficiency of hypervisor device management Linux Huge Pages / Windows Large Pages offer significant performance improvements CPU compatibility pools: support for multiple generations of the same CPU family within a cluster / compatibility pool Secure execution of VMs using boot profiles to prevent attacks 46
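A quick, hedged check that the hardware-assist features are exposed (run from a Linux OS on the candidate host; the features must also be enabled in the BIOS):
  grep -cE 'vmx|svm' /proc/cpuinfo    # count > 0: Intel VT-x (vmx) or AMD-V (svm) present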
Memory virtualization: Non-accelerated The guest operating system (inside the virtual machine) maps logical page numbers (LPNs) to physical page numbers (PPNs), the addresses seen by the VM (lpn2ppn) The virtual machine monitor (VMM) then maps PPNs to machine pages (ppn2ma), the physical machine addresses seen by the ESX server 47
Memory virtualization: Translation look-aside buffer A TLB with ASID (Address Space Identifier) caches translations (TLB hit vs. TLB miss) The guest handles LPN-to-PPN translation; EPT hardware assist handles PPN to MA 48
Memory virtualization: Shadow page table The shadow page table combines the guest's LPN-to-PPN mapping with the VMM's PPN-to-MA mapping, translating LPNs directly to machine pages 49
Performance boost: HugePages in Linux Default page map: 2.6 million 4 KB pages HugePages map: 5,000 2 MB pages More pages: heavyweight kernel Fewer pages: lightweight kernel My Oracle Support Note 361468.1: HugePages on 64-bit Linux My Oracle Support Note 317141.1: How to Configure RHEL/OEL 4 32-bit for Very Large Memory with ramfs and HugePages 50
Performance boost: HugePages in Linux Default page map: 2.6 million 4 KB pages -> 2 MB of TLB-addressable memory -> fewer TLB hits: worse performance HugePages map: 5,000 2 MB pages -> 16 MB of TLB-addressable memory -> more TLB hits: better performance (a hedged configuration sketch follows) 51
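A hedged Linux configuration sketch matching the HugePages example above (values assume the ~10 GB, 5,000-page case on this slide; size from your actual SGA per MOS 361468.1):
  # /etc/sysctl.conf
  vm.nr_hugepages = 5120              # ~10 GB of 2 MB pages, with slight headroom
  # /etc/security/limits.conf (values in KB, at least the SGA size)
  oracle soft memlock 10485760
  oracle hard memlock 10485760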
What is Binary Translation? Lightweight, efficient, near-native performance Ring 3 (least privileged): user mode runs at native performance Ring 1 (less privileged): un-virtualizable instructions are handled by the binary translator and translator cache Ring 0 (most privileged) 52
What is Binary Translation? Lightweight, efficient, near-native performance With Intel VT / AMD-V hardware assist: Ring 3 (least privileged): user mode, native performance Ring 0 (more privileged): guest kernel; un-virtualizable instructions trap to the hypervisor via the Virtual Machine Control Structure (VMCS) Ring -1 (most privileged): Virtual Machine Monitor / hypervisor (ESXi) 53
VMware performance best practices for Oracle RAC 11g 1 Disable dynamic coalescing on private interconnect (.vmx file hack) 2 Increase interrupt rate on network adapter to 30K 3 Keep full memory reservation (DRS or resource sharing setting at VM level) 4 Turn off Transparent Page Sharing (VM-level setting) 5 Enable hyperthreading 6 Use PreferHT=TRUE (.vmx file hack) (a sample .vmx sketch follows) 54
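A hedged .vmx sketch for the file-level items in the list above (items 1, 4, and 6; the adapter number is a placeholder, and the full memory reservation of item 3 is normally set in the vSphere client rather than in the file):
  ethernet1.coalescingScheme = "disabled"    # item 1: private-interconnect vmxnet3 adapter
  sched.mem.pshare.enable = "FALSE"          # item 4: per-VM Transparent Page Sharing off
  numa.vcpu.preferHT = "TRUE"                # item 6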
Enable CPU hyperthreading Hyperthreading adds roughly an additional 20% of CPU capacity (YMMV) Both vSphere and Oracle 11g RAC are HT-aware It costs you nothing Must be enabled from the BIOS 55
Without PreferHT (diagram: an 8-vCPU VM on a 4-socket, 4-core, 32-thread ESX server; the 8 vCPUs are scheduled onto full cores, spanning multiple sockets) 56
With PreferHT (diagram: the same 8-vCPU VM; the vCPUs are packed onto the hyperthreads of a single socket, keeping the VM within one NUMA node) 57
Version 5 new features 58
vSphere 5: Increased limits on vCPUs and memory 59
vSphere 5: Storage DRS (diagram: RAC VMs in a DRS cluster; .vmdk files in a datastore cluster spanning a 200 GB SSD datastore, a 10 TB SAS datastore, and a 100 TB NL-SAS datastore) 60
VMFS-5 vs. VMFS-3 comparison
2TB+ VMFS volumes -- VMFS-3: Yes (using extents); VMFS-5: Yes
Support for 2TB+ physical RDMs -- VMFS-3: No; VMFS-5: Yes
Unified block size (1MB) -- VMFS-3: No; VMFS-5: Yes
Atomic Test & Set enhancements (part of VAAI, locking mechanism) -- VMFS-3: No; VMFS-5: Yes
Sub-blocks for space efficiency -- VMFS-3: 64KB (max ~3k); VMFS-5: 8KB (max ~30k)
Small file support -- VMFS-3: No; VMFS-5: 1KB
NOTE: LUN count limit for a vSphere 5 cluster is still 256, but with larger datastores the norm, this should be less of an issue 61
Case study 62
Oracle 11i E-Business Suite: Replatform One of the largest single global instances of Oracle 11i Core mission-critical applications 75+ application tiers on VMware / RHEL Oracle Database 10g R2 8 TB database; 8.8 billion rows of data 52 million transactions per day 79K IOPS 40K blocks per second interconnect traffic 40,000+ named users 4,000+ peak concurrent users 63
EMC IT: Replatform Before: Sun Fire E25K server, UltraSPARC IV processors, 224 CPU cores, 80% CPU utilization, Solaris 10, Symmetrix DMX-3 storage After: Cisco Unified Computing System, UCS B440 blades, Intel Nehalem EX processors, 128 CPU cores, 10% CPU utilization, Red Hat Linux / vSphere, Symmetrix VMAX storage 64
EMC IT: Before and after architectures (layers: storage, server, database, application server, load balancer)
Before: Symmetrix DMX-3 (720 x 146 GB drives, 95 TB total raw, RAID 10 protection); Sun 25K (Solaris 10, 224 CPU cores, Veritas RAC 5.0, PowerPath 5.0.2); Oracle RAC 10g (version 10.2.0.3), 8 TB (8.8b rows), SGA 60 GB each, 2 nodes, CRS, SQL*Net / JDBC; E-Business Suite 11i, shared APPL_TOP, Linux RHEL 4, VMware vSphere; ACE load balancer
After (red in the original indicates new architecture): Symmetrix VMAX (450 x 235 GB drives, 95 TB total raw, RAID 10 protection); Cisco UCS B440 (Intel Nehalem EX, RHEL 5.4, 128 CPU cores, PowerPath 5.3); Oracle RAC 10g (version 10.2.0.4), 8 TB (8.8b rows), SGA 30 GB each, 4 nodes, CRS, SQL*Net / JDBC; E-Business Suite 11i, shared APPL_TOP, Linux RHEL 4, VMware vSphere; ACE load balancer 65
EMC IT: System performance statistics 66
EMC IT: 11i performance improvements - Online (charts: CXP transaction times (sec), DXP transaction times (sec)) 50%-90% reduction in times for online transactions (i.e., 2-10 times faster) 67
EMC IT: 11i performance improvements - Batch (charts: Sales job timings (sec), CS job timings (sec)) 85%-95% reduction in transaction times for the above jobs (i.e., up to 20 times faster) ACT stats replay duration will be about 4 times faster Total ACT transactions: 392,806 Used to require 1 hour of replay for every 3 hours of downtime; now requires 1 hour of replay for every 12 hours of downtime Reduces future 11i maintenance windows by 20% 68
EMC IT updates: Underway today Cisco Unified Computing System, UCS B440 blades, Intel Nehalem EX processors, 192 CPU cores, 10% CPU utilization, Red Hat Linux / vSphere, Symmetrix VMAX storage vSphere 5.0: 32 cores per VM Moved to 128 cores across 4 x B440 blades 69
New EMC Community Network Everything Oracle at EMC (EO@EMC) site Provides a focal point for all of EMC's Oracle-related activities EMC's Oracle-related Proven Solutions content now publicly available and searchable on Google Go to: http://community.emc.com/community/connect/everything_oracle 70
Q&A 71
Thank you! 72