How Oracle Exadata Delivers Extreme Performance




How Oracle Exadata Delivers Extreme Performance. John Clarke, Sr. Oracle Architect. Dallas | Detroit | Los Angeles | Singapore | India. www.centroid.com Page 1

Agenda
- Oracle Exadata: Hardware and Software Optimized
- Exadata Hardware Components
- How Oracle Works on Exadata
- Exadata Software Components
- Smart Scan and Cell Offload Processing
- Storage Indexes
- Smart Flash Cache
- Hybrid Columnar Compression
- IO Resource Management
- How to Get There from Here
- A New Way to Look at Optimization
Page 2

Exadata: Hardware and Software Optimized
The Exadata Database Machine is designed for extreme performance, manageability, consolidation, and high availability. Oracle achieves this by:
- Utilizing an optimally balanced hardware infrastructure
- Using fast hardware components designed to eliminate IO, CPU, and network bottlenecks
- Delivering significant Oracle software enhancements that leverage the balanced hardware configuration
The Exadata Database Machine hardware is just part of the solution; the combination of a balanced hardware configuration with Exadata's storage software is what delivers extreme performance.
Page 3

Exadata Hardware
- Database servers, or compute nodes
- InfiniBand interconnect
- Exadata Storage Servers, or cells
Page 4

Exadata Storage Servers
- Exadata storage servers are self-contained storage platforms that house disk storage and run the Exadata Storage Server software provided by Oracle
- A single Exadata storage server is also known as a cell
- A cell is the building block for the Exadata storage grid
- More cells provide greater capacity and IO bandwidth
- Databases are typically deployed across multiple cells
- The storage cells communicate with the Oracle database and ASM instances over InfiniBand
- Each cell is wholly dedicated to Oracle database files
Page 5

Exadata Storage Servers
- Exadata storage servers provide high-performance storage for Oracle databases
  - Up to 1.8 GB/sec raw data bandwidth per cell
  - Up to 75,000 IOPS using Flash per cell
- Based on 64-bit Sun Fire servers
- Come installed with Exadata Storage Server software, Oracle Linux x86_64, drivers, and utilities
- Storage servers are only available with the Exadata Database Machine
Page 6

Exadata Storage Servers (Sun Fire X4270 M2)
Processors: 2 six-core Intel Xeon L5640 processors (2.26 GHz)
Memory: 24 GB (6 x 4 GB)
Local Disks: 12 x 600 GB 15K RPM High Performance SAS, or 12 x 2 TB 7.2K RPM High Capacity SAS
Flash: 4 x 96 GB Sun Flash Accelerator F20 PCIe cards
Disk Controller: disk controller HBA with 512 MB battery-backed cache
Network: two InfiniBand 4X QDR (40 Gb/s) ports (1 dual-port PCIe HCA); four embedded GbE ports
Remote Management: 1 Ethernet port for ILOM
Power Supplies: 2 redundant hot-swappable power supplies
Page 7

Exadata Database Servers
- The compute node grid consists of multiple database servers
- Oracle 11gR2 and ASM run on the database servers
- Companies typically run RAC databases on the database servers to achieve high availability and maximize the aggregate CPU and memory horsepower in the compute node grid
Page 8

Exadata Database Servers
Processors: 2 six-core Intel Xeon X5670 processors (2.93 GHz)
Memory: 96 GB (12 x 8 GB)
Local Disks: 4 x 300 GB 10K RPM SAS disks
Disk Controller: disk controller HBA with 512 MB battery-backed cache
Network: two InfiniBand 4X QDR (40 Gb/s) ports; four 1 GbE Ethernet ports; two 10 GbE Ethernet ports
Remote Management: 1 Ethernet port for ILOM
Power Supplies: 2 redundant hot-swappable power supplies
Operating System: 64-bit Oracle Enterprise Linux 5.5 (Solaris available on the X2-8)
Page 9

InfiniBand Network
- InfiniBand is the Exadata storage network
- Looks like normal Ethernet to hosts, with the efficiency of a SAN
- Used for both storage and the RAC interconnect
- Uses the high-performance ZDP InfiniBand protocol (RDS v3)
Page 10

InfiniBand Network
- Oracle uses InfiniBand because of its proven track record with high-performance computing; it provides 40 Gb/sec in each direction
- Looks like Ethernet but is much faster
  - Uses zero copy, which means data is transferred across the network without intermediate buffer copies in the various network layers
  - Uses buffer reservation, which means the hardware knows exactly where to place buffers ahead of time
- A unified network fabric for both Exadata storage and the RAC interconnect simplifies cabling and networking
- On top of InfiniBand, Exadata uses the Zero-loss Zero-copy Datagram Protocol (ZDP). ZDP has very low CPU overhead, with tests showing only 2 percent CPU utilization while transferring 1 GB/sec of data.
- Each Exadata server is configured with one dual-port InfiniBand card connected to two separate InfiniBand switches for high availability. Each InfiniBand link can carry the full data bandwidth of the entire cell, so you can lose an entire switch without losing any performance.
Page 11

Other Hardware Components
- Embedded Cisco switch for data center network uplink
- KVM switch for management of the storage servers and compute nodes
- Multiple PDUs with management interfaces
- ILOM (Integrated Lights Out Management) capability for each component in the rack
Page 12

Exadata Configuration Options
Oracle has four (4) configuration options:
- Exadata X2-2 Quarter Rack
- Exadata X2-2 Half Rack
- Exadata X2-2 Full Rack
- Exadata X2-8 Full Rack
Each configuration is available with either High Performance or High Capacity disks. The difference between the configurations is the number of storage servers and compute nodes. Oracle keeps the list of hardware configuration options short because a balanced hardware configuration is required. A half rack can be upgraded to a full rack, and multiple racks can be interconnected. Quarter racks have 2 InfiniBand switches; half and full racks have 3.
Page 13

Exadata Configuration Options - Compute Nodes
                          X2-2 Quarter Rack   X2-2 Half Rack   X2-2 Full Rack   X2-8 Full Rack
# Compute nodes           2                   4                8                2
Processor cores per node  12                  12               12               64
Total processor cores     24                  48               96               128
Memory per node           96 GB               96 GB            96 GB            1 TB
Total memory              192 GB              384 GB           768 GB           2 TB
Page 14

Exadata Configuration Options - Exadata Storage Servers
                     X2-2 Quarter Rack   X2-2 Half Rack   X2-2 Full Rack   X2-8 Full Rack
# Storage cells      3                   7                14               14
Disks per cell       12                  12               12               12
Total disks          36                  84               168              168
Flash                1.15 TB             2.7 TB           5.4 TB           5.4 TB
Raw storage (HP)     21.6 TB             50.4 TB          100.8 TB         100.8 TB
Raw storage (HC)     72 TB               168 TB           336 TB           336 TB
Page 15

Exadata Configuration Options - Exadata Storage Servers
                                   X2-2 Quarter Rack   X2-2 Half Rack      X2-2 Full Rack      X2-8 Full Rack
Raw disk throughput, HP disks      Up to 5,400 MBPS    Up to 12,600 MBPS   Up to 25,200 MBPS   Up to 25,200 MBPS
Raw disk throughput, HC disks      Up to 3,000 MBPS    Up to 7,000 MBPS    Up to 14,000 MBPS   Up to 14,000 MBPS
Disk IOPS, HP disks                Up to 10,800        Up to 25,200        Up to 50,400        Up to 50,400
Disk IOPS, HC disks                Up to 4,320         Up to 10,080        Up to 20,160        Up to 20,160
Flash IOPS                         Up to 225,000       Up to 525,000       Up to 1,050,000     Up to 1,050,000
Page 16

How Oracle Works on Exadata
Oracle 11gR2 and Oracle RAC; database storage uses ASM.
Page 17

How Oracle Works on Exadata
Each storage cell contains:
- Exadata Storage Server software
- Disks (LUNs, Cell Disks, and Grid Disks)
ASM disk groups are built on Exadata Grid Disks. Exadata features are implemented on the storage servers and exploited by databases on the compute nodes.
Page 18

How Oracle Works on Exadata
Oracle communicates with the Exadata cells using iDB. iDB is implemented using LIBCELL, and the Oracle binaries are linked with LIBCELL to facilitate cell communication.
Page 19

How Oracle Works on Exadata
CELLSRV is the primary component and provides the majority of the cell's services. Oracle database and ASM processes use LIBCELL to communicate with CELLSRV processes. CELLSRV is what delivers the unique Exadata features.
Page 20

Oracle on Exadata
Each Exadata cell has 12 physical disks. Oracle reserves 29 GB on each of the first two disks in each cell for the system area.
Page 21

Oracle on Exadata
One LUN is created on each physical disk. A LUN is the unit on which a Cell Disk can be created.
Page 22

Oracle on Exadata
Cell Disks are created on LUNs. Cell Disks represent the storage area for each LUN.
Page 23

Oracle on Exadata
Grid Disks are built on Cell Disks. A Grid Disk can consume all space on a Cell Disk or an administrator-specified chunk of the Cell Disk. The outermost tracks of a Grid Disk have the highest performance characteristics. Building multiple Grid Disks per Cell Disk allows the administrator to segregate storage by usage and performance demands. Interleaved Grid Disks can increase the probability of important extents residing on higher-performing disk tracks.
Page 24

Oracle on Exadata
Grid Disks are directly exposed to ASM. ASM disk groups are built on Grid Disks.
Page 25
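A hedged sketch of how this storage hierarchy is typically carved out: the Cell Disk and Grid Disk steps run in CellCLI on each storage cell (the DATA/RECO prefixes and the 300G size are illustrative assumptions), and the disk group is then created from an ASM instance using the o/<cell>/<griddisk> paths:

CellCLI> create celldisk all harddisk
CellCLI> create griddisk all harddisk prefix=DATA, size=300G
CellCLI> create griddisk all harddisk prefix=RECO

SQL> create diskgroup DATA normal redundancy
       disk 'o/*/DATA*'
       attribute 'compatible.asm'   = '11.2.0.0.0',
                 'compatible.rdbms' = '11.2.0.0.0',
                 'cell.smart_scan_capable' = 'TRUE';
Because the DATA grid disks are created first, they sit on the outer, faster tracks of each Cell Disk.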

Exadata Software Features at a Glance
Exadata software goals:
- Fully and evenly utilize all computing resources in the Exadata Database Machine to eliminate bottlenecks and deliver consistently high performance
- Reduce demand for resources by eliminating any IO that can be discarded without impacting the result
Page 26

Exadata Smart Scan
- One of the most important Exadata software features, and typically where discussions of Exadata software features start
- One of several cell offload features in Exadata; cell offload is defined as Exadata's shifting of database work from the database servers to the Exadata storage servers
- The primary goal of Smart Scan is to perform the majority of IO processing on the storage servers and return smaller amounts of data from the storage infrastructure to the database servers
- Provides dramatic performance improvements for eligible SQL operations (full table scans, fast full index scans, joins, etc.)
- Smart Scan is invoked automatically on Exadata for eligible SQL operations
- Smart Scan only works on Exadata
Page 27

Traditional SQL Processing
1. The user submits a query: select state, count(*) from census group by state
2. The query is parsed and an execution path is determined; extents are identified and IO is issued.
3. Oracle retrieves blocks based on the extents identified. A block is the smallest unit of transfer from the storage system and contains rows and columns.
4. The database instance processes the blocks: all required blocks are read from the storage system into the buffer cache.
5. Once the blocks are loaded into the buffer cache, Oracle filters rows and columns and returns the result to the user.
Page 28

Smart Scan Processing
1. The user submits a query: select state, count(*) from census group by state
2. An iDB command is constructed and sent to the Exadata cells.
3. Each Exadata cell scans data blocks and extracts the relevant rows and columns that satisfy the SQL query.
4. Each Exadata cell returns to the database instance an iDB message containing the requested rows and columns. These are not block images, and they are returned to the PGA.
5. The database consolidates the results from across the Exadata cells and returns the rows to the client.
Page 29

Smart Scan and Cell Offload Processing
Filtering operations are offloaded to the Exadata storage cells:
- Column filtering
- Predicate filtering (i.e., row filtering)
- Join filtering
Only the requested rows and columns are returned to the database server: significantly less IO transferred over the storage network, and less memory and CPU required on the database tier nodes. Rows and columns are retrieved into the user's PGA via the direct path read mechanism, not through the buffer cache, so large IO requests don't saturate the buffer cache.
Page 30

Smart Scan and Cell Offload Processing
What's eligible for Smart Scan on Exadata?
- Single-table full table scans (not IOTs, not BLOBs, not hash clusters)
  - Oracle is a C program: kcfis_read is the function that performs smart scan; kcfis_read is called by the direct path read function, kcbldrget; kcbldrget is called from the full scan functions
- Serial direct reads must be enabled
- Join filtering using Bloom filters
  - Bloom filtering determines which rows are required to satisfy a join
  - A Bloom filter is created from the values of the join column for the smaller table of a join, and this filter is used by the Exadata storage server to eliminate row candidates from the larger table
Page 31

Smart Scan in Action
Our table is 4.53 GB in size and has almost 180 million rows. All of its indexes are invisible. A full scan returned in 5.27 seconds!
Page 32

Measuring Smart Scan: cell_offload_plan_display
cell_offload_plan_display settings:
- AUTO: shows offload-related information in the plan display if an Exadata cell is present and the objects are on the cell
- ALWAYS: shows offload-related information if the SQL statement is offloadable, whether or not it is running on an Exadata cell; useful for 11gR2 databases not running on Exadata to produce simulated plans
- NEVER: never shows offload-related information in plans
TABLE ACCESS STORAGE FULL in a plan indicates that the query is cell-offloadable, i.e., eligible for Smart Scan.
Page 33
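A minimal sketch of using this parameter to check offload eligibility, for example from a non-Exadata 11gR2 database; the census query is the one used earlier in this presentation:

SQL> alter session set cell_offload_plan_display = ALWAYS;
SQL> explain plan for select state, count(*) from census group by state;
SQL> select * from table(dbms_xplan.display);
Look for TABLE ACCESS STORAGE FULL and a storage() predicate section in the plan output to confirm the statement is offloadable.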

Measuring Smart Scan with Statistics
- cell physical IO bytes eligible for predicate offload: number of bytes eligible for cell offload
- cell physical IO interconnect bytes: bytes returned from the storage cells to the database server
- cell physical IO interconnect bytes returned by smart scan: bytes returned from the storage cells via Smart Scan operations
Page 34

Measuring Smart Scan with Statistics
4.8 GB eligible for Smart Scan and 45 MB returned from the storage grid: a Smart Scan efficiency of 99.05%.
Page 35
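A small sketch (assuming the standard v$mystat/v$statname views) of pulling these statistics for the current session:

SQL> select n.name, s.value
       from v$mystat s, v$statname n
      where s.statistic# = n.statistic#
        and n.name in ('cell physical IO bytes eligible for predicate offload',
                       'cell physical IO interconnect bytes returned by smart scan');
Smart Scan efficiency is roughly 1 - (bytes returned by smart scan / bytes eligible for predicate offload).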

Measuring Smart Scan with Statistics
PGA max before Smart Scan is about 4 MB; PGA max after Smart Scan is over 10 MB.
Page 36

Controlling Smart Scan Behavior
SQL> alter system set cell_offload_processing = <TRUE|FALSE>;   controls cell offload system-wide
SQL> alter session set cell_offload_processing = <TRUE|FALSE>;  controls cell offload for the current session
SQL> select /*+ opt_param('cell_offload_processing', '<true|false>') */ ...   controls cell offload for a single SQL statement
Page 37

Controlling Smart Scan Behavior
Full scan in 31+ seconds; all 4.8 GB returned from the cells.
Page 38

Controlling Smart Scan Behavior Page 39

Smart Scan Predicate Filtering Ran in 6.64 seconds 2.1GB returned via Smart Scan, savings of 56.11% Page 40

Smart Scan Predicate Filtering 410K rows returned in 5.48 seconds 45MB returned via Smart Scan, savings of 99.03% Page 41

Smart Scan Predicate Filtering 2 rows returned in 5.39 seconds 8MB returned via Smart Scan, savings of 99.82% Page 42

Smart Scan Column Projection 1.17 MB of data through interconnect Page 43

Smart Scan Column Projection 651,744 bytes through interconnect Page 44

Smart Scan Column Projection
1,176,032 bytes when selecting BIGCOL; 651,744 bytes when not selecting BIGCOL; 2,048 rows returned. (1,176,032 - 651,744) = 524,288 bytes measured difference. The predicted delta is 522,240 bytes, which is pretty close.
Page 45

Smart Scan Column Projection Another Example Page 46

Smart Scan Column Projection Another Example Query ran in 5.47 seconds Query ran in 13.04 seconds Page 47

Smart Scan: Join Filtering
In some cases, Oracle will use Bloom filters to optimize join filtering. On Exadata, Bloom filters are applied on the storage servers. Bloom filters work by:
- Examining the join columns: for the smaller table in a join, Oracle determines the space required to store its join-column values and, if that is small compared to the size of the larger joined table, may decide to use a Bloom filter
- Building an in-memory set of the smaller table's join-column values to compare against the larger table's rows, identifying which rows are required to satisfy the join
- They are typically built to reduce data communication between slaves for parallel query joins, and work well with PQO and partitioned tables
Page 48

Smart Scan: Join Filtering with Offloaded Bloom Filters
Result returned in 7.45 seconds. SYS_OP_BLOOM_FILTER indicates a Bloom filter; when it appears on the storage() line, the filter is offloaded.
Page 49

Smart Scan: Join Filtering with Offloaded Bloom Filters
With _bloom_filter_predicate_pushdown_to_storage = false, SYS_OP_BLOOM_FILTER indicates a Bloom filter, but here it does NOT appear on the storage() line, hence it is not offloaded.
Page 50

Smart Scan: Table Scans or Index Scans?
Do we just drop all indexes and let Smart Scan do its magic? This run completed in a little over 3 minutes, with over 6.7 GB returned from the cells; note the CPU and LIO values.
Page 51

Smart Scan: Table Scans or Index Scans?
Do we just drop all indexes and let Smart Scan do its magic? With indexes, the query completed in less than a second, with far less IO and less CPU/LIO. TEST, TEST, TEST: indexes still work better for this type of query!
Page 52

Smart Scan: Disabling Serial Direct Read Mechanism
Query completed in 3.81 seconds. 15 GB eligible for cell offload; 100 MB transferred from the storage cells; a Smart Scan efficiency of 99.34%.
Page 53

Smart Scan: Disabling Serial Direct Read Mechanism
With serial direct reads disabled, query time increased from 3.8 seconds to 101 seconds; nothing was eligible for cell offload, so there was no Smart Scan.
Page 54

Querying V$SQL to Determine Cell Off-loadable Queries How can you determine which SQL statements are Exadata cell offload-able? Page 55

Querying V$SQL to Determine Cell Off-loadable Queries How can you determine which SQL statements are Exadata cell offload-able? Page 56

Querying V$SQL to Determine Cell Off-loadable Queries How can you determine which SQL statements are Exadata cell offload-able? Page 57
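The transcript does not reproduce the queries behind these screenshots; a hedged sketch of the kind of V$SQL query that identifies offload-eligible statements, using the offload columns 11gR2 exposes in V$SQL:

SQL> select sql_id,
            io_cell_offload_eligible_bytes,
            io_cell_offload_returned_bytes,
            io_interconnect_bytes
       from v$sql
      where io_cell_offload_eligible_bytes > 0
      order by io_cell_offload_eligible_bytes desc;
Statements with a non-zero io_cell_offload_eligible_bytes were candidates for Smart Scan; comparing eligible bytes to returned bytes gives a per-statement offload efficiency.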

Smart Scan Summary
- Smart Scan is software engineered for Exadata that can provide significant IO savings and dramatically improved performance
- Smart Scan works by offloading IO operations to the Exadata storage cells and returning only the rows and columns requested by the SQL statement to the database instance
- Row filtering provides significant cell scan efficiency gains
- Column filtering provides IO savings
- Join filtering using Bloom filters, performed on the Exadata cells, can also offer huge performance gains
- Don't drop all your OLTP indexes without testing: single-row index access will often still outpace a full scan with cell offload
- One of the great things about Exadata and Smart Scan is that nothing needs to be done explicitly by a developer or DBA to take advantage of it; it just works on Exadata
Page 58

Exadata Storage Indexes
- The goal of storage indexes is to eliminate IO requests to Exadata disks
- Storage indexes are NOT at all like traditional Oracle indexes
- Storage indexes are like anti-indexes: they help Oracle determine which blocks the requested data is NOT in
- Knowing which storage areas will NOT contain the requested data allows Exadata to skip IO requests to those disks
- Bypassing IO to disks reduces the work required on the storage grid and improves performance
Page 59

Exadata Storage Indexes: Architecture
- Each disk in an Exadata storage cell is divided into 1 MB pieces called storage regions
- For each 1 MB storage region, data distribution statistics are held in a memory structure called a region index
- Region indexes maintain min and max values for the storage region, for up to 8 columns per table, as data is selected from the disk storage region
- Region indexes are maintained over time as data is accessed from the disk
- Storage indexes are maintained automatically by Exadata; there is no way to influence their behavior
- Storage server reboots will erase storage index data, since region indexes are stored in memory
- The performance impact can be dramatic, but it is not deterministic, as it is subject to Exadata's automatic management of region indexes
Page 60

Exadata Storage Indexes: Architecture
[Diagram: for the query select * from soe.orders where order_total > 60, each 1 MB storage region of the ORDERS table on an ASM disk has a region index recording the min and max ORDER_TOTAL values seen in that region (e.g., min 1 / max 22, min 62 / max 100, min 1 / max 60, min 1 / max 101); regions whose min/max range cannot satisfy the predicate are skipped.]
Page 61

Exadata Storage Indexes: Scenarios
- Data needs to be well-ordered with respect to the query predicates: if all regions contain randomly ordered column values, the region index min and max values will be too widespread to eliminate any region from an IO request
- Since Exadata only maintains region indexes on 8 columns per table, your range of distinct predicate columns should be relatively small and static
- Storage indexes can return false positives, but they will never skip needed regions and jeopardize query integrity
[Diagram: for select * from soe.orders where order_total between 10 and 15, region min/max values of (1, 22), (62, 100), (1, 60), and (1, 101) determine which regions can be skipped.]
Page 62

Exadata Storage Indexes in Action This is the key statistic: cell physical IO bytes saved by storage index Page 63
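A minimal sketch (standard v$ views assumed) of checking this statistic for your own session before and after running a query:

SQL> select n.name, s.value
       from v$mystat s, v$statname n
      where s.statistic# = n.statistic#
        and n.name = 'cell physical IO bytes saved by storage index';
A non-zero delta means the region indexes allowed the cells to skip that many bytes of disk IO.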

Exadata Storage Indexes in Action
0 rows returned in 4.85 seconds; ~17 MB of IO saved by the storage index; a smart scan efficiency of 99.98%.
Page 64

Exadata Storage Indexes in Action
0 rows returned in 2.26 seconds; ~9 GB of IO saved by the storage index; a smart scan efficiency of 99.98%.
Page 65

Exadata Storage Indexes in Action
0 rows returned in 0.09 seconds!! ~15 GB of IO saved by the storage index.
Page 66

Exadata Storage Indexes in Action 7.8 million rows returned in 4.23 seconds No storage index savings Page 67

Exadata Storage Indexes in Action
7.8 million rows returned in 1.37 seconds; over 12 GB of IO saved by storage indexes.
Page 68

Exadata Storage Indexes in Action No storage index savings Page 69

Exadata Storage Indexes in Action No storage index savings Page 70

Exadata Storage Indexes and Bind Variables Storage indexes used Page 71

Exadata Storage Indexes and NULL values Storage index used Page 72

Exadata Storage Indexes with LIKE, BETWEEN, Wildcards No storage index savings Page 73

Exadata Storage Indexes with LIKE, BETWEEN, Wildcards 14GB of IO saved by storage index Page 74

Exadata Storage Indexes with LIKE, BETWEEN, Wildcards
No storage index savings: LIKE is fine, but wildcards prevent storage index IO pruning.
Page 75

Exadata Storage Indexes with LIKE, BETWEEN, Wildcards Storage index used Page 76

Exadata Storage Indexes with LIKE, BETWEEN, Wildcards Storage index used but less IO saved with wildcards Page 77

Exadata Storage Indexes with LIKE, BETWEEN, Wildcards No storage index used Page 78

Exadata Storage Indexes with OLTP Tables Page 79

Exadata Storage Indexes with OLTP Tables
Eliminated over 4 GB of IO via the storage index.
Page 80

Exadata Storage Indexes with OLTP Tables
Ran in 0.26 seconds; eliminated over 4 GB of IO via the storage index.
Page 81

Exadata Storage Indexes with OLTP Tables Page 82

Exadata Storage Indexes with OLTP Tables
No storage index savings when the upper bound of the range is higher than the maximum value.
Page 83

Exadata Storage Indexes with Normal Index Access Page 84

Exadata Storage Indexes with Normal Index Access No storage index Page 85

Exadata Storage Indexes with Normal Index Access Storage indexes are only used for cell offload functions Page 86

Disabling Storage Indexes
Storage indexes can be disabled by setting _kcfis_storageidx_disabled = TRUE. With storage indexes enabled, the query ran in under 1 second, with ~12 GB saved via the storage index.
Page 87

Disabling Storage Indexes
Storage indexes can be disabled by setting _kcfis_storageidx_disabled = TRUE. With storage indexes disabled, the query ran in 4.71 seconds, with 0 bytes saved by the storage index.
Page 88
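A sketch of how this hidden parameter would be set for testing at the session level (underscore parameters are undocumented, so this is for controlled experiments only):

SQL> alter session set "_kcfis_storageidx_disabled" = true;
SQL> alter session set "_kcfis_storageidx_disabled" = false;
The second statement restores the default behavior.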

Tracing Storage Indexes
Storage index operations can be traced by setting _kcfis_storageidx_diag_mode = 2. The screenshot shows tracing being enabled and trace output indicating that storage indexes were in use.
Page 89
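A sketch of turning this tracing on and off for a test session; note that the diagnostic output is produced on the storage cells rather than by the database instance, so the exact trace location is an assumption to verify on your system:

SQL> alter session set "_kcfis_storageidx_diag_mode" = 2;
SQL> alter session set "_kcfis_storageidx_diag_mode" = 0;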

Tracing Storage Indexes Page 90

Tracing Storage Indexes SQL_ID for given transaction DATA_OBJECT_ID Page 91

Tracing Storage Indexes
Trace contents: strt = 0, end = 2048, Memory = 2K; storage region size = 1 MB; column #11; low and high values of the actual data.
Page 92

Tracing Storage Indexes No storage index savings Page 93

Summary
- Storage indexes are designed to eliminate IO requests on the storage servers
- Storage indexes are automatically maintained in memory areas on the storage cells
- Exadata tracks the minimum and maximum values in each storage region while processing IO requests and stores these region-level boundaries in region indexes
- Exadata examines region indexes to determine whether a storage region can be excluded from access based on a query predicate
- Storage indexes are automatically created and maintained
- Storage indexes work only for cell offload functions (Smart Scan)
- Storage indexes can help performance dramatically and, at worst, will never hurt performance
- Storage indexes work on well-ordered tables, relative to the query predicates issued against those tables
Page 94

Exadata Smart Flash Cache
- The Smart Flash Cache goal: intelligently cache frequently used data
- Exadata caches in PCI flash cards on the storage cells
- Data is cached to improve IO response time and deliver better database performance
- Flash cards can deliver tens of thousands of I/Os per second; SAS or SATA drives can deliver a couple hundred I/Os per second
- On Exadata, Oracle only caches data likely to be requested again
Page 95

Exadata Smart Flash Cache: Hardware
- Four 96 GB PCI flash cards per cell: 384 GB of PCI flash per cell
- 5.4 TB of flash on a full rack, 2.7 TB for a half rack, 1.15 TB for a quarter rack
- One cell can support 75,000 IOPS; a full rack, over 1,000,000 IOPS
Page 96

Exadata Smart Flash Cache: Software
Smart Flash Cache provides a storage cell caching algorithm to cache appropriate data in the storage cell PCI flash cards. Each database IO is tagged with metadata:
- The CELL_FLASH_CACHE setting for the object(s): DEFAULT means Smart Flash Cache is used normally, KEEP means Smart Flash Cache is used more aggressively, and NONE means Smart Flash Cache is disabled for the object
- A cache hint: CACHE means the IO should be cached, NOCACHE means it should not be, and EVICT indicates that the cached data should be removed from Smart Flash Cache
Smart Flash Cache takes the following into consideration when caching data:
- IO size: large objects with CELL_FLASH_CACHE=DEFAULT are not cached
- Current cache load: smart table scans are usually directed to disk, but if CELL_FLASH_CACHE=KEEP and the cache load is low, they may be satisfied from Smart Flash Cache
- Specific operations (backups, Data Pump export/import, etc.) are not cached
Page 97
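A hedged example of tagging segments with this attribute (the table names are illustrative):

SQL> alter table soe.orders storage (cell_flash_cache keep);
SQL> create table sales_hist (id number, amt number) storage (cell_flash_cache none);
Objects without an explicit setting behave as CELL_FLASH_CACHE DEFAULT.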

Exadata Smart Flash Cache: Software
- Smart Flash Cache is a write-through cache: after a write is acknowledged, it is written to Smart Flash Cache if it is suitable for caching
- Write performance is neither improved nor diminished by Smart Flash Cache. Because it is not a write-back cache, writes are not cushioned by the PCI flash cards; the writes must happen to disk. A small battery-backed cache on each cell performs write-back caching.
- Technology comparisons:
  - EMC FAST Cache: EMC EFDs as an extension of the storage processor cache, with write-back caching to cushion write IO; very good write performance
  - FusionIO: data placed directly on PCI flash cards; very high write performance
  - Exadata can match the write IO performance of EMC FAST Cache or FusionIO by using Smart Flash Cache PCI flash for Flash Grid Disk storage, but be wary of depleting capacity for normal Smart Flash Cache!
Page 98

Smart Flash Cache: Write Operations
1. The database issues a write operation.
2. CELLSRV inspects the IO metadata.
3. The data is written to disk.
4. The IO is acknowledged and the database process continues.
5. If the IO is suitable for Smart Flash Cache, it is written to the flash cache.
6. CELLSRV uses an LRU algorithm to determine which data to replace.
Page 99

Smart Flash Cache: Read from Previously Cached Data
1. The database issues a read request.
2. CELLSRV inspects the IO metadata.
3. If the data exists in the flash cache, it is read from cache with no disk access.
4. The read is satisfied from cache.
Page 100

Smart Flash Cache: Read from Un-cached Data
1. The database issues a read operation.
2. CELLSRV inspects the IO metadata.
3. The data is read from disk.
4. The read is acknowledged and the database process continues.
5. If the IO is suitable for Smart Flash Cache, it is written to the flash cache.
6. CELLSRV uses an LRU algorithm to determine which data to replace.
Page 101

Smart Flash Cache in Action: Writes exa_fc_mystat.sql Page 102

Smart Flash Cache in Action: Writes Page 103

Smart Flash Cache in Action: Writes
Before updating: 97 flash cache hits. 1,000 rows were updated via an index range scan.
Page 104

Smart Flash Cache in Action: Writes Query Statistics
The update yielded (368 - 97) = 271 flash cache read hits.
Page 105

Smart Flash Cache in Action: Writes No physical reads No flash cache reads Page 106

Smart Flash Cache in Action: Writes
180 physical reads; 52 - 1 = 51 cell flash cache read hits.
Page 107

Smart Flash Cache in Action: Reads with Un-cached Data
1,072 flash cache hits for dbm1 before the run; ran in 1:21.34; 4,452 flash cache hits for dbm1 afterwards, a delta of 3,380 flash cache hits.
Page 108

Smart Flash Cache in Action: Reads with Cached Data
Ran in 16.50 seconds; 27,762 flash cache read hits.
Page 109

Smart Flash Cache in Action: Dropping Flash Cache Page 110

Smart Flash Cache in Action: Dropping Flash Cache Ran in 1:24.36 No flash cache read hits Page 111

Smart Flash Cache in Action: Dropping Flash Cache Page 112

Smart Flash Cache in Action: Smart Scan Data
~12 MB cached per cell for the index; ~18 MB cached for one of the table partitions.
Page 113

Smart Flash Cache in Action: Smart Scan Data
We're currently using about 350 MB per cell.
Page 114

Smart Flash Cache in Action: Smart Scan Data Page 115

Smart Flash Cache in Action: Smart Scan Data Very small delta in flash cache read hits Page 116

Smart Flash Cache in Action: Smart Scan Data Page 117

CELL_FLASH_CACHE KEEP Page 118

CELL_FLASH_CACHE KEEP
Query time dropped to 28 seconds; cell flash cache read hits jumped from 136,709 to 252,095.
Page 119

CELL_FLASH_CACHE KEEP
About 26 GB per cell per partition.
Page 120

CELL_FLASH_CACHE KEEP Page 121

Summary
- Exadata Smart Flash Cache is a caching mechanism delivered by Exadata
- Smart Flash Cache uses PCI flash cards on each storage cell to cache appropriate data
- Each storage cell has 384 GB of PCI flash available for Smart Flash Cache; a full rack delivers 5.3 TB of flash storage
- Smart Flash Cache intelligently caches appropriate (OLTP) data
- Smart Flash Cache is a write-through cache: writes must be performed on disk before being acknowledged to the foreground process. After writes and initial reads, Exadata determines whether the data is eligible for caching and writes it to the flash cache; subsequent IO benefits from Smart Flash Cache
- Segments can be tagged to be kept in Smart Flash Cache, similar to the buffer cache KEEP pool
- Grid Disks (and ASM disk groups) can be created on Flash Disks, improving performance in some cases
- We recommend using all flash storage for Smart Flash Cache
Page 122

Oracle 11g OLTP Compression
Sample data:
VENDOR_ID  VEND_NAME   STATE  VNDR_RATING  VENDOR_TYPE
100        ACME ONE    MI     100          DIRECT
101        ACME ONE    CA     90           DIRECT
102        NORTON      IA     95           INDIRECT
103        WINGDINGS   MI     96           INDIRECT
104        WINGDINGS   GA     96           INDIRECT
[Diagram: in an uncompressed block, rows are inserted as-is with no compression. In a compressed block, duplicate values (e.g., ACME ONE, DIRECT, WINGDINGS, 96, INDIRECT) are moved into a symbol table in the block header and the rows reference the symbol table, so only unique values are stored and more free space remains in the block.]
Page 123

Oracle 11g OLTP Compression
When the first row is inserted into a block, it is not compressed. Subsequent row inserts check for duplicates and move duplicate values to the symbol table. After free space in the block is exhausted, the block is fully compressed. There is a slight CPU overhead in compressing data, since Oracle has to build and maintain the symbol table. The unit of compression for OLTP Advanced Compression is the block.
Page 124

Exadata Hybrid Columnar Compression
- Data is stored by columns in a compression unit; a compression unit is a set of blocks, not a single block
- Basic compression and Advanced Compression load data into blocks row by row; as rows are added, compression algorithms are invoked to reference or update symbol tables, and duplicate values are not inserted
- HCC examines data before it is placed in a block and divides the data into arrays of columns
- HCC takes a stream of input values and places all values for column 1 in array 1, all values for column 2 in array 2, and so on
- HCC takes four 8k blocks of data in the stream, by default
- HCC then performs its de-duplication process on the entire 4-block stream, so the odds of finding duplicate data are greatly increased over standard or Advanced Compression because a larger subset of data is considered
- HCC then performs leading-edge compression on the array values and inserts the columns into a compression unit (a set of blocks), not row by row
Page 125

Exadata Hybrid Columnar Compression
[Diagram: the same VENDOR sample rows stored two ways. Uncompressed, the rows are stored row by row within a block. With Hybrid Columnar Compression, the values are organized column by column (VENDOR_ID, VEND_NAME, VNDR_RATING, STATE, VENDOR_TYPE, ...) inside a logical compression unit spanning multiple blocks, with a CU header describing the column layout.]
Page 126

Exadata Hybrid Columnar Compression
[Diagram: Compression Unit #1 - 4 blocks holding the first 50,000 rows, stored column by column (COL1..COL10); Compression Unit #2 - a new set of 4 blocks holding the next 48,500 rows; Compression Unit #3 - a new set of 4 blocks holding the next 51,000 rows.]
Page 127

Exadata Hybrid Columnar Compression
Recall that data is stored by columns inside a compression unit. The greater the degree of duplication in column values, the less space required per compression unit and the fewer compression units required to store the data. If queries select a single column or a subset of columns, Oracle only needs to read the blocks within the compression units on which those columns exist. This is different from other types of compression and from uncompressed tables: not only are we saving space, we are saving IO, and saving IO means better performance!
Page 128

Exadata Hybrid Columnar Compression and DML
- EHCC compresses data for bulk direct path loads only
- Subsequent bulk-load inserts append new compression units
- Single-row inserts store rows as OLTP-compressed
- DELETEs against HCC tables lock the entire CU
- When updating EHCC tables: the updated row is moved (i.e., deleted and re-inserted, i.e., migrated), the new row is OLTP-compressed, and locks impact the entire CU, not just the row!
- DML on EHCC tables is very expensive!
Page 129

Exadata Hybrid Columnar Compression Types
- QUERY LOW compression is recommended for data warehouse tables where data load times are important
- QUERY HIGH is recommended for data warehouse data where space savings are important; it offers a better compression ratio than QUERY LOW but is more expensive
- ARCHIVE LOW is intended for archive data where load times are important; better compression than the QUERY levels in most cases, but more overhead for DML
- ARCHIVE HIGH is designed for archival data where space savings are the most important factor; the best compression ratio and the most expensive algorithm
Page 130
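A hedged sketch of declaring each HCC level (the table names and columns are illustrative):

SQL> create table sales_q1  (id number, amt number) compress for query low;
SQL> create table sales_q2  (id number, amt number) compress for query high;
SQL> create table sales_old (id number, amt number) compress for archive low;
SQL> create table sales_ice (id number, amt number) compress for archive high;
Remember that rows must arrive via direct path loads (e.g., INSERT /*+ APPEND */ or CTAS) to be HCC-compressed.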

Using the Compression Advisor
DBMS_COMPRESSION.GET_COMPRESSION_RATIO is Oracle's compression advisor. Its output indicates the compression type evaluated; in this case, EHCC archive low compression.
- comp_for_archive_low = EHCC Archive Low
- comp_for_archive_high = EHCC Archive High
- comp_for_query_low = EHCC Query Low
- comp_for_query_high = EHCC Query High
- comp_for_oltp = OLTP Advanced Compression
Page 131
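A hedged sketch of invoking the advisor for one table; SOE.CUSTOMERS comes from the examples later in this presentation, while the USERS scratch tablespace and the 11gR2 constant names are assumptions:

SQL> set serveroutput on
SQL> declare
       l_blkcnt_cmp   pls_integer;
       l_blkcnt_uncmp pls_integer;
       l_row_cmp      pls_integer;
       l_row_uncmp    pls_integer;
       l_cmp_ratio    number;
       l_comptype_str varchar2(100);
     begin
       dbms_compression.get_compression_ratio(
         scratchtbsname => 'USERS',
         ownname        => 'SOE',
         tabname        => 'CUSTOMERS',
         partname       => null,
         comptype       => dbms_compression.comp_for_query_high,
         blkcnt_cmp     => l_blkcnt_cmp,
         blkcnt_uncmp   => l_blkcnt_uncmp,
         row_cmp        => l_row_cmp,
         row_uncmp      => l_row_uncmp,
         cmp_ratio      => l_cmp_ratio,
         comptype_str   => l_comptype_str);
       dbms_output.put_line(l_comptype_str || ' estimated ratio: ' || round(l_cmp_ratio, 1));
     end;
     /
Running the block once per comp_for_* constant produces the per-level ratios shown on the following slides.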

Using the Compression Advisor DBMS_COMPRESSION.GET_COMPRESSION_RATIO is Oracle s compression advisor OLTP Compression gives a low 1.1 compression ratio Page 132

Using the Compression Advisor DBMS_COMPRESSION.GET_COMPRESSION_RATIO is Oracle s compression advisor EHCC Query Low Compression gives 2.7 compression ratio Page 133

Using the Compression Advisor DBMS_COMPRESSION.GET_COMPRESSION_RATIO is Oracle s compression advisor EHCC Query High Compression gives 5.0 compression ratio Page 134

Using the Compression Advisor DBMS_COMPRESSION.GET_COMPRESSION_RATIO is Oracle s compression advisor EHCC Archive High Compression gives 9.2 compression ratio Page 135

Hybrid Columnar Compression in Action
Uncompressed table: SOE.CUSTOMERS. Use parallel CTAS to create 5 new compressed tables: one compressed for OLTP, one EHCC-compressed for query low, one for query high, one for archive low, and one for archive high. Measure the time it takes to create each compressed table, to run a sample SELECT against it, to bulk insert rows into it, and to update rows on it.
Estimated blocks: CUSTOMERS 19,670; CUST_OLTP 17,881; CUST_QLOW 7,285; CUST_QHIGH 3,934; CUST_ALOW 3,391; CUST_AHIGH 2,138
Page 136

Storage Impact: Different Compression Scenarios Page 137

Storage Impact: Different Compression Scenarios In our example, archive compression is not as efficient as HCC query high compression Page 138

Storage Impact: Different Compression Scenarios
Good estimates from DBMS_COMPRESSION. Estimated blocks: CUSTOMERS 19,670; CUST_OLTP 17,881; CUST_QLOW 7,285; CUST_QHIGH 3,934; CUST_ALOW 3,391; CUST_AHIGH 2,138
Page 139

Performance Impact: Creating Compressed Tables Page 140

Performance Impact: Creating Compressed Tables Page 141

Performance Impact: Creating Compressed Tables
[Chart: create table time for CUSTOMERS, on a 0-50 scale, for OLTP, QUERY LOW, QUERY HIGH, ARCHIVE LOW, and ARCHIVE HIGH compression.]
Page 142

Performance Impact: Creating Compressed Tables
[Chart: create table time for MYOBJ, on a 0-2,500 scale, for OLTP, QUERY LOW, QUERY HIGH, ARCHIVE LOW, and ARCHIVE HIGH compression.]
Page 143

Performance Impact: Querying Compressed Tables We expect least IO and best full-scan time against this table Page 144

Performance Impact: Querying Compressed Tables
Query ran in 4.63 seconds; ~600 MB of IO; 96% smart scan efficiency.
Page 145

Performance Impact: Querying Compressed Tables
4.91 seconds; ~700 MB of IO; 17% less LIO compared to uncompressed; 94.46% smart scan efficiency.
Page 146

Performance Impact: Querying Compressed Tables 3.03 second runtime 97MB over interconnect Less CPU and LIO 99.07% smart scan efficiency Page 147

Performance Impact: Querying Compressed Tables Query ran in 2.07 seconds 35 MB over interconnect Less CPU and LIO 99.52% smart scan efficiency Page 148

Performance Impact: Querying Compressed Tables Ran in 2.08 seconds 19MB over interconnect Even less CPU and LIO 99.74% smart scan efficiency Page 149

Performance Impact: Querying Compressed Tables 2.06 seconds 17MB over interconnect 99.76% smart scan efficiency Page 150

Performance Impact: Querying Compressed Tables Page 151

Performance Impact: Updating Compressed Tables Page 152

Performance Impact: Updating Compressed Tables Took the longest amount of time More CPU required to update HCC compressed tables Page 153

Proving Oracle migrated HCC Updated Rows Page 154

Proving Oracle migrated HCC Updated Rows Row location: File 11 Block 1569727 Slot 165 Page 155

Proving Oracle migrated HCC Updated Rows Page 156

Proving Oracle migrated HCC Updated Rows Page 157

Proving Oracle migrated HCC Updated Rows Value=1 means fetch was done via migrated row Page 158

Examining the Compression Unit Archive high # Blocks/CU = 32 Cols/Rows CU Length Page 159

Examining the Compression Unit Page 160

Examining the Compression Unit
The archive high CU size is 7 blocks smaller than in the previous block dump; 1 deleted row, slot 165.
Page 161

HCC Decompression
HCC compresses data in-flight as it is inserted via direct path. Where and when is the data decompressed when queried? Definitions: upper half = compute node, lower half = storage server.
- HCC data is decompressed on the lower half when queried via Smart Scan (i.e., full scan, i.e., serial direct read). Uncompressed data is transmitted over the interconnect, and storage server CPUs are used for decompression.
- HCC data is decompressed on the upper half when queried without Smart Scan (i.e., single-block reads). Compressed data is transmitted over the interconnect, and compute node CPUs are used for decompression.
Plan for the CPU impact and application design!
Page 162

HCC Decompression 42 cs of CPU time on compute node Page 163

HCC Decompression Index range scan Page 164

HCC Decompression 382 cs of CPU time Page 165

Hybrid Columnar Compression Summary
- Exadata Hybrid Columnar Compression is a unique compression feature available only on Exadata
- Hybrid Columnar Compression compresses tables by column for a set of rows and stores them inside a logical compression unit
- The unit of compression for basic and OLTP Advanced Compression is a database block; the unit of compression for Hybrid Columnar Compression is a compression unit
- Oracle provides 4 different flavors of Hybrid Columnar Compression: compress for query low, compress for query high, compress for archive low, and compress for archive high
- IO against Hybrid Columnar Compressed tables is reduced compared to uncompressed tables, so SELECTs against them can run much faster - less IO!
- Rows need to be loaded into HCC tables using direct path loads
- DML against HCC tables can be expensive: single-row INSERT operations use OLTP compression, direct path inserts perform HCC compression, and UPDATEs migrate rows (and can be expensive!)
Page 166

Exadata IO Resource Management
- IO Resource Management (IORM) provides a means to govern and meter IO from different workloads in the Exadata storage servers
- Database consolidation is a key driver of customer adoption of Exadata; consolidation means that multiple databases and applications may share Exadata storage
- Different databases in a shared Exadata storage grid could have different IO performance requirements
- One of the common challenges with shared storage infrastructure is competing IO workloads: batch vs. OLTP, warehouse vs. OLTP, production vs. test and development
- You can mitigate competing priorities by over-provisioning storage, but this becomes expensive
- Exadata addresses this challenge with IO Resource Management
Page 167

Database Resource Management
- A single database may have many types of workloads with different performance requirements
- Resource consumer groups allow you to group sessions by workload
- After creating resource consumer groups, you specify how resources are used within each resource consumer group
- Once resource consumer groups are established, you map sessions to a consumer group based on distinguishing characteristics
- The combination of resource consumer groups and session mappings comprises a resource plan; one resource plan can be active in a database at a time
- A database resource plan is also called an intradatabase resource plan
- Let's walk through an example
Page 168

Database Resource Management - Example
Database DBM: OM OLTP consumer group, Other OLTP consumer group, Reporting consumer group
Database XBM: Online query consumer group, Batch query consumer group
Page 169

Database Resource Management - Example
Interactive category: OM OLTP and Other OLTP consumer groups (DBM), Online query consumer group (XBM)
Batch category: Reporting consumer group (DBM), Batch query consumer group (XBM)
Page 170

IO Resource Management Plans
- IORM provides different approaches for managing resource allocations
- If you have multiple workloads within a database whose resource usage you wish to control, you configure an intradatabase resource plan. If you only have one database in your Exadata Database Machine, an intradatabase resource plan is all you need; IO resource management is handled automatically inside the storage servers based on this intradatabase resource plan
- If you have multiple databases in your Exadata Database Machine whose IO you wish to govern, you create an interdatabase resource plan; rules in an interdatabase resource plan specify allocations to databases, not consumer groups
- Category resource management is used when you want to control resources primarily by the category of work being done; it allows allocation of resources among categories spanning multiple databases
- An IORM plan is the combination of an interdatabase plan and a category plan
Page 171

IORM Architecture
[Diagram: CELLSRV maintains multiple IO queues (queues 1 through 4) in front of the cell disks.]
Page 172

IORM Architecture Resource Plans IORM schedules IO according to resource plans onto disk queues Page 173

IORM Architecture: Rules
- IORM is only engaged when needed; it does not intervene if there is only one active consumer group on one database
- Any disk allocation that is not fully utilized is made available to other workloads in relation to the configured resource plans
- Background IO is scheduled based on its priority relative to user IO: redo and control file writes always take precedence, and DBWR writes are scheduled at the same priority as user IO
- For each cell disk, each database accessing the cell has one IO queue per consumer group and three background queues
- The background IO queues are mapped to high-, medium-, and low-priority requests, with different IO types mapped to each queue
- If no intradatabase plan is set, all non-background IO requests are grouped into a single consumer group called OTHER_GROUPS
Page 174

IORM in Action: Planning
DBM has three consumer groups: OM OLTP, OTHER OLTP, and REPORTING. XBM will have two consumer groups: ONLINE QUERY and BATCH QUERY.
DBM intradatabase resource plan: 50% of resources allocated to OM OLTP, 30% to OTHER OLTP, 20% to REPORTING.
XBM intradatabase resource plan: 70% of resources allocated to ONLINE QUERY, 30% to BATCH QUERY.
Page 175
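A hedged sketch of what the DBM intradatabase plan could look like with DBMS_RESOURCE_MANAGER (group and plan names follow the slides; the mgmt_p1 percentages match the allocations above, and a mandatory OTHER_GROUPS directive is added):

SQL> begin
       dbms_resource_manager.create_pending_area;
       dbms_resource_manager.create_consumer_group('OM_OLTP',    'order management OLTP');
       dbms_resource_manager.create_consumer_group('OTHER_OLTP', 'other OLTP work');
       dbms_resource_manager.create_consumer_group('REPORTING',  'reporting');
       dbms_resource_manager.create_plan('DBM_PLAN', 'DBM intradatabase plan');
       dbms_resource_manager.create_plan_directive('DBM_PLAN', 'OM_OLTP',      'OM OLTP',    mgmt_p1 => 50);
       dbms_resource_manager.create_plan_directive('DBM_PLAN', 'OTHER_OLTP',   'other OLTP', mgmt_p1 => 30);
       dbms_resource_manager.create_plan_directive('DBM_PLAN', 'REPORTING',    'reporting',  mgmt_p1 => 20);
       dbms_resource_manager.create_plan_directive('DBM_PLAN', 'OTHER_GROUPS', 'everything else', mgmt_p2 => 100);
       dbms_resource_manager.validate_pending_area;
       dbms_resource_manager.submit_pending_area;
     end;
     /
SQL> alter system set resource_manager_plan = 'DBM_PLAN';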

IORM in Action: Planning
Category plan:
- 70% of resources allocated to the INTERACTIVE category (OM OLTP and OTHER OLTP for DBM; ONLINE QUERY for XBM)
- 30% of resources allocated to the BATCH category (REPORTING for DBM; BATCH QUERY for XBM)
Interdatabase plan:
- 60% of resources allocated to database DBM
- 40% of resources allocated to database XBM
Page 176
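A hedged sketch of how the interdatabase and category portions would be defined on the cells with CellCLI (the level-2 "other" directives are assumptions to catch unlisted databases and categories):

CellCLI> alter iormplan                                              -
           catPlan=((name=INTERACTIVE, level=1, allocation=70),     -
                    (name=BATCH,       level=1, allocation=30),     -
                    (name=OTHER,       level=2, allocation=100)),   -
           dbPlan=((name=DBM,   level=1, allocation=60),            -
                   (name=XBM,   level=1, allocation=40),            -
                   (name=other, level=2, allocation=100))
CellCLI> alter iormplan active
CellCLI> list iormplan detail
The same command would be run on every cell (for example, via dcli) so that all storage servers enforce the same plan.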

IORM in Action: Understanding the Math
All user IO = 100%. Category plan: 70% Interactive, 30% Batch. Interdatabase plan: 60% DBM, 40% XBM. Intradatabase plans: DBM 50/30/20, XBM 70/30.
Resulting IORM allocations:
DBM: OM OLTP      (50/80) x 60% x 70% = 26.25%
DBM: OTHER OLTP   (30/80) x 60% x 70% = 15.75%
XBM: ONLINE QUERY (70/70) x 40% x 70% = 28.00%
DBM: REPORTING    (20/20) x 60% x 30% = 18.00%
XBM: BATCH QUERY  (30/30) x 40% x 30% = 12.00%
Page 177

IORM in Action: Understanding the Math CG% = (Intra CG% / sum (X)) * db% * cat% CG% = IORM determined resource allocation for consumer group sessions Intra CG% = resource allocation for consumer group within an intradatabase plan X = sum of intradatabase consumer group allocations for all consumer groups in the same category Db% = percentage of database allocation in the interdatabase plan cat% = percentage of resource allocation for the category in which the consumer group belongs Page 178

IORM in Action Page 179

IORM in Action: Configuration on DBM Database Page 180

IORM in Action: Configuration on XBM Database Page 181

IORM in Action: On the Exadata Cells Page 182

IORM in Action: On the Exadata Cells Page 183

Mapping Sessions to Consumer Groups in DBM Page 184

Validate Consumer Group Mapping in DBM Page 185

Mapping Sessions to Consumer Groups in XBM Page 186

Validate Consumer Group Mapping in XBM Page 187

Using Cell Database IO Metrics
DB_IO_RQ_SM / DB_IO_RQ_LG: total number of small/large IO requests issued by the database since any resource plan was set - use to measure DB load
DB_IO_RQ_SM_SEC / DB_IO_RQ_LG_SEC: small/large IO requests per second issued by the database in the last minute - use to measure DB load
DB_IO_WT_SM / DB_IO_WT_LG: total number of seconds that small/large IO requests issued by the database waited to be scheduled - use to monitor which databases waited for IO to be queued by IORM
Page 188
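A sketch of pulling these metrics with CellCLI on a storage cell (or across all cells with dcli); the IORM_DATABASE object type filter is an assumption worth verifying on your cell software version:

CellCLI> list metriccurrent where objectType = 'IORM_DATABASE'
CellCLI> list metriccurrent where name like 'DB_IO_WT_.*' detail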

Using Cell database IO metrics Small IO load per database Large IO load per database Page 189

Using Cell Database IO Metrics
Small IO load per database per second, last minute; large IO load per database per second, last minute.
Page 190

Using Cell Database IO Metrics
Small IO waits per database per minute; large IO waits per database per minute.
Page 191

Testing our IORM Plan Page 192

Imposing Database Limits with IORM Page 193
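The commands behind these screenshots are not in the transcript; a hedged sketch of adding a hard utilization limit to the interdatabase directives, assuming a cell software version that supports the limit attribute:

CellCLI> alter iormplan dbPlan=((name=DBM,   level=1, allocation=60),             -
                                (name=XBM,   level=1, allocation=40, limit=50),   -
                                (name=other, level=2, allocation=100))
With limit=50, XBM is capped at 50% of disk utilization even when the cells are otherwise idle, rather than being allowed to consume unused allocation.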

Imposing Database Limits with IORM Page 194

Imposing Database Limits with IORM Page 195

Imposing Database Limits with IORM Page 196

Summary
- IO Resource Management is an Exadata feature designed to govern IO to the Exadata storage cells; IORM is used in conjunction with DBRM
- Consumer groups, resource plans, resource group mappings, and directives are part of an intradatabase resource plan, i.e., resource allocation management within a database
- Interdatabase plans are used to control IO resource allocation between multiple databases sharing an Exadata storage server
- Category plans are a way to group resource consumer groups from intradatabase plans into an IORM plan and further control IO resource allocation in the Exadata storage servers
- If you're consolidating on Exadata, you should be using IORM
- Instance caging is a way to control CPU utilization for a database in an Exadata compute grid
Page 197

Top Reasons to Migrate to Exadata
- Your database requires the extreme performance that Exadata can deliver
- You have business goals that can only be met by Exadata's extreme performance and consolidation capabilities
- You need to consolidate your database tier platform onto a single high-performance, high-capacity, fault-tolerant platform
- You want to reduce the number of people and the amount of time required to build and maintain Oracle database tier infrastructure
- Your capacity or performance requirements have put you in a position to evaluate re-architecting your enterprise database platform
- You are looking to simplify your database tier infrastructure into a common technology stack with a single point of contact for support and management
Exadata is a business enabler.
Page 198