Oracle InMemory Database Calgary Oracle Users Group December 11, 2014
Outline Introductions Who is here? Purpose of this presentation Background Why In-Memory What it is How it works Technical mechanics How it is used Demos Considerations Conclusion
Introductions Me Started with Oracle in 1989 (Version 5.1B 22 5 ¼ floppy (ior i)) Oracle DBCORP Cognicase CGI First4 Database/Application performance/design Data warehousing Data Modeling You? What is your Role? DBA DEVELOPER MANAGER STUDENT
Purpose of the Presentation Review the motivation for in-memory Discuss the Oracle Option Explain how it works Show use cases Demonstrate functionality
In memory? Hasn t the database always been in memory? Toggle into memory
What about Times-Ten Nope for Oracle.com Oracle TimesTen In-Memory Database (TimesTen) is a fullfeatured, memory-optimized, relational database with persistence and recoverability. It provides applications with the instant responsiveness and very high throughput required by databaseintensive applications. Deployed in the application tier, TimesTen operates on databases that fit entirely in physical memory (RAM). Applications access the TimesTen database using standard SQL interfaces.
In-memory Hype? BIG Data Data Lake Data Reservoir Data Dump Data Scientist Memory is now cheap, difficult to licence on.
In-memory Hype?
In-memory Hype?
What is the Motivation? ANALYTICS Variable structure Velocity Volume Trend Can t wait for Extract, Transform and Load (ETL) Query real-time data
Approach? Replication Active Data Guard / Golden Gate Limited unless restructure to support analytics ETL to restructure and present in OLAP In-memory Dual format (tabular/columnar) Everybody doing it
Why is an IM scan faster than the buffer cache? Buffer Cache SELECT COL4 FROM MYTABLE; X X X X X RESULT Row Format 15
Oracle Database In-memory Goals Real-Time Analytics Accelerate Mixed workload OLTP No Changes to Applications Trivial to implement
Real-Time Enterprise DATA- DRIVEN EFFICIENT AGILE Data-Driven Get immediate answers to any question with real-time analytics Agile Eliminate latency with analytics directly on OLTP data Efficient Easily and Nondisruptive deployment accelerates all applications 17
Breakthrough: Dual Format Database Normal Buffer Cache SALES Row Format SALE S New In-Memory Format SALES Column Format BOTH row and column formats for same table Simultaneously active and transactionally consistent Analytics & reporting use new in-memory Column format OLTP uses proven row format 18
In-Memory Use Cases OLTP Real-time reporting directly on OLTP source data Removes need for separate ODS Speeds data extraction Data Warehouse Staging/ETL/Temp not a candidate Write once, read once All or a subset of Foundation Layer For time sensitive analytics Potential to replace Access Layer 19
Complex OLTP is Slowed by Analytic Indexes Table 1 3 OLTP Indexes 10 20 Analytic Indexes Most Indexes in complex OLTP (e.g. ERP) databases are only used for analytic queries Inserting one row into a table requires updating 10-20 analytic indexes: Slow! Indexes only speed up predictable queries & reports 20
Oracle In-Memory Column Store Storage Index Example: Find all sales from stores with a store_id of 8 Memory IMCU Min 1 Max 3 Each column is the made up of multiple column units IMCU IMCU IMCU SALES Column Format Min 4 Max 7 Min 8 Max 12 Min 7 Max 15 Min / max value is recorded for each column unit in a storage index Storage index provides partition pruning like performance for ALL queries 21
Vector Register REGION Orders of Magnitude Faster Analytic Data Scans Memory Example: Find sales in region of CA Each CPU core scans local in-memory columns Scans use super fast SIMD vector instructions CPU Load multiple region values CA CA CA CA Vector Compare all values an 1 cycle Originally designed for graphics & science Billions of rows/sec scan rate per CPU core Row format is millions/sec > 100x Faster 22
Pre In-Memory SGA
In-Memory SGA
In-Memory Area: Composition In Memory Area IMCU IMCU SMU SMU IMCU IMCU SMU SMU IMCU IMCU SMU SMU SMU SMU IMCU IMCU Column Format Data Metadata Contains two subpools for: In Memory Compression Units (IMCUs) Snapshot Metadata Units (SMUs) IMCUs contain column formatted data SMUs contain metadata and transactional information Space required for SMU pool is a small percentage (typically <10%) of In-Memory Area Current pool sizes and status visible in V$INMEMORY_AREA view 2
Compression Unit (IMCU) ROWID Extent #13 Blocks 20-120 EMPID IMCU header Column CUs NAME Extent #14 Blocks 82-182 DEPT SALARY Extent #15 Blocks 201-301 Unit of column store allocation Columnar representation of a large number of rows from an object Rows from one or more table extents Actual size depends on size of rows, compression factor, etc. Each column stored as a separate contiguous Column Compression Unit (column CU) Rowids also stored as a Column CU 26
In Memory Column Unit
In-Memory A Store Not A Cache Pure In-Memory Columnar SALES What is a store? A static pool of memory You decide what objects are populated in memory Alter table SALES INMEMORY Objects don t age out Objects automatically kept transactionally consistent 28
User controls memory population
Demo Enable (by default! Kevin Clossen Blog) Set inmemory_size > 100M Population DDL (create or alter) Default tablespace level settings Table level attributes Segment level attributes By Column set (vs. indexes) IMCO (inmemory coordinator) -> 2 minute cycle SMCU (worker processes) Priority None default Needs select to get into memory Low, medium, high, critical After start-up, workers allocated in priority order to load
DML
RAC Capabilities Distribute inmemory objects Aggregate total inmemory size Replicate Segments loaded inmemory on each node
In-Memory Aware Cost Model Cost of In-Memory Scan IO cost: Includes the cost of reading: Invalid rows from disk Extent map CPU cost: Includes Traversing IMCUs IMCU pruning using storage indexes Decompressing IMCUs Predicate evaluation Stitching rows Scanning transaction journal rows 34
In-Memory & The Optimizer Plan Display SELECT count(*) FROM sales WHERE quantity_sold > 30 or (quantity_sold < 10 and prod_id = 50); ------------------------------------------------------------------- Id Operation Name Pstart Pstop ------------------------------------------------------------------- 0 SELECT STATEMENT 1 SORT AGGREGATE 2 PARTITION RANGE ALL 1 16 * 3 TABLE ACCESS INMEMORY FULL SALES 1 16 ------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - inmemory(("quantity_sold">30 OR "QUANTITY_SOLD"<10) AND ("QUANTITY_SOLD">30 OR "PROD_ID"=50 AND "QUANTITY_SOLD"<10)) filter("quantity_sold">30 OR "PROD_ID"=50 AND "QUANTITY_SOLD"<10) 35
Tracing: SALES is partially inmemory *************************************** BASE STATISTICAL INFORMATION *********************** Table Stats:: Table: SALES Alias: SALES (Using composite stats) #Rows: 960 SSZ: 0 LGR: 0 #Blks: 12 AvgRowLen: 30.00 NEB: 0 ChainCnt: 0.00 SPC: 0 RFL: 0 RNF: 0 CBK: 0 CHR: 0 KQDFLG: 1 #IMCUs: 7 IMCRowCnt: 560 IMCJournalRowCnt: 14 #IMCBlocks: 7 IMCQuotient: 0.583333 Index Stats:: Index: SALES_CHANNEL_BIX Col#: 4 USING COMPOSITE STATS LVLS: 0 #LB: 12 #DK: 5 LB/K: 2.00 DB/K: 12.00 CLUF: 60.00 NRW: 60.00 SSZ: 0.00 LGR: 0.00 CBK: 0.00 GQL: 0.00 CHR: 0.00 KQDFLG: 1 BSZ: 1 KKEISFLG: 1 ALL PARTITIONS USABLE Index: SALES_CUST_BIX Col#: 2 USING COMPOSITE STATS LVLS: 0 #LB: 12 #DK: 630 LB/K: 1.00 DB/K: 1.00 CLUF: 927.00 NRW: 927.00 SSZ: 0.00 LGR: 0.00 CBK: 0.00 GQL: 0.00 CHR: 0.00 KQDFLG: 1 BSZ: 1 KKEISFLG: 1 ALL PARTITIONS USABLE Index: SALES_PROD_BIX Col#: 1 USING COMPOSITE STATS LVLS: 0 #LB: 12 #DK: 766 LB/K: 1.00 DB/K: 1.00 CLUF: 938.00 NRW: 938.00 SSZ: 0.00 LGR: 0.00 CBK: 0.00 GQL: 0.00 CHR: 0.00 KQDFLG: 1 BSZ: 1 KKEISFLG: 1 ALL PARTITIONS USABLE Index: SALES_PROMO_BIX Col#: 5 USING COMPOSITE STATS LVLS: 0 #LB: 12 #DK: 116 LB/K: 1.00 DB/K: 1.00 CLUF: 133.00 NRW: 133.00 SSZ: 0.00 LGR: 0.00 CBK: 0.00 GQL: 0.00 CHR: 0.00 KQDFLG: 1 BSZ: 1 KKEISFLG: 1 ALL PARTITIONS USABLE Index: SALES_TIME_BIX Col#: 3 USING COMPOSITE STATS LVLS: 0 #LB: 12 #DK: 620 LB/K: 1.00 DB/K: 1.00 CLUF: 620.00 NRW: 620.00 SSZ: 0.00 LGR: 0.00 CBK: 0.00 GQL: 0.00 CHR: 0.00 1 BSZ: 1 KKEISFLG: 10 KQDFLG: 1 ALL PARTITIONS USABLE IMC Implied predicates generated: Table SALES [SALES]: (null)"sales"."quantity_sold">30 OR "SALES"."QUANTITY_SOLD"<10 select count(*) from sales where quantity_sold > 30 or (quantity_sold < 10 and prod_id = 5000); Table: SALES Alias: SALES (Using composite stats) #Rows: 960 SSZ: 0 LGR: 0 #Blks: 12 AvgRowLen: 30.00 NEB: 0 ChainCnt: 0.00 SPC: 0 RFL: 0 RNF: 0 CBK: 0 CHR: #IMCUs: 7 IMCRowCnt: 560 #IMCBlocks: 7 IMCQuotient: 0.583333 36
In-Memory & The Optimizer Best Practices with or without In-Memory Use DBMS_STATS.GATHER_*_STATS procedures to gather statistics Use histograms to make the Optimizer aware of any data skews Use extended statistics to make the Optimizer aware of correlation Column group statistics used for both single table cardinality estimates, joins & aggregation
Other Considerations Load time Database restarts Availability dependencies Datapump aware Pluggable database aware Inmemory at the container level Shared amongst pluggables
Conclusion InMemory Database Verify your use-case Plan your implementation (instrument and verify) Resources ORACLE LEARNING LIBRARY (YouTube) Inmemory Oracle blog (Maria Cargan) https://blogs.oracle.com/in-memory/ Oracle Open World Presentation Oracle documentation TechNet White Paper
Questions?