Oracle TimesTen In-Memory Database for Analytics - Best Practices and Use Cases
  Susan Cheung, Vice President - Product Management
Agenda
  TimesTen Overview
  Best Practices, Tips, and Tricks
  Customer Use Cases
  Summary
TimesTen In-Memory Database: (Very Brief) Overview
Oracle TimesTen In-Memory Database: 16 Years of Innovation
Oracle TimesTen In-Memory Database: Deployment Options
  1. Standalone in-memory database for OLTP applications
  2. In-memory cache database for the Oracle Database (in-memory OLTP caching)
  3. In-memory analytics in Oracle Exalytics
Oracle TimesTen In-Memory Database: Powerful In-Memory Transactional and Analytics RDBMS
  In-memory relational database
  Compatible with Oracle Database
  Extremely fast
  Persistent and recoverable
TimesTen In-Memory Database: Persistent, Recoverable, and Highly Available
  Application connectivity
    Client/server applications use the TimesTen client library
    Direct-linked applications use the TimesTen libraries
    APIs: JDBC / ODBC / ADO.NET / OCI / PL/SQL
  Memory-resident database
    Fast data access
  Persistent and recoverable
    Commits and rollbacks
    Transaction logs persisted to disk storage
    Dual checkpoint files for restart and recoverability
  HA and DR via replication
TimesTen Persistence and Recovery: Checkpoint and Transaction Log Files
  A checkpoint is a snapshot of the database persisted on disk storage
  Dual checkpoint files for redundancy
  All transactions are logged to the in-memory transaction log buffer
  Parallel transaction log manager
  Background log flushers automatically persist log data to disk storage
  [Diagram: data, inserts, and updates; checkpoint files and transaction log files]
Analytics Use Cases for Oracle TimesTen
In-Memory Analytics: Summary Aggregates with Sub-Second Response Time
  Aggregations suit analysis at higher-level grains of fact data
  Knowledge of query patterns is required
  Aggregate tables and indexes are typically much smaller than detail tables
    They easily fit entirely in memory on modern hardware
  Reports using summary aggregates typically return in under a second
In-Memory Analytics: Summary Aggregates Scenarios
  [Diagram: two sources, (1) OLTP with operational data store and (2) OLTP with data warehouse; summary aggregate tables and indexes stored in TimesTen; reports using aggregate set 1 and aggregate set 2]
Data Mart Use Cases: Subset of Data from a Data Warehouse
  Data marts and operational data stores
  For analysis where summary aggregates are not sufficient
    Requires access to detail source tables (fact and dimension tables)
  Consider the hot set of data, not the entire warehouse
    Data volume is constrained by the RAM available in the system
    Compression may be used to include more data
  Common use cases have both aggregations and detail tables
In-Memory Analytics: ODS and Data Mart Detail Tables Scenarios
  [Diagram: OLTP with operational data store and OLTP with data warehouse; aggregates, detail tables, and indexes; operational reports and reports using aggregates and detail tables]
Best Practices, Tips, and Tricks
TimesTen Shared Memory Segment: TimesTen Databases
  At run time a TimesTen database resides in a shared memory segment (size = database size)
  Permanent region (PermSize)
    Tables, indexes, etc.
    Persisted to checkpoint files
  Temporary region (TempSize)
    Query processing (e.g. sorting, GROUP BY, ORDER BY)
    Usage depends on the number of connections
    Not persisted
  Transaction log buffer (LogBufMB)
    Persisted to transaction log files
  [Diagram: DB header, permanent region, temporary region, transaction log buffer]
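As a rough sizing aid, the regions listed on this slide can be added up to approximate the shared memory segment. The sketch below is a minimal illustration, assuming all three settings are expressed in MB. The `header_overhead_mb` parameter is a hypothetical placeholder, because the slide does not state how large the DB header is.

```python
def estimate_segment_mb(perm_size_mb, temp_size_mb, log_buf_mb, header_overhead_mb=0):
    """Approximate the shared memory segment size in MB.

    perm_size_mb, temp_size_mb and log_buf_mb mirror the PermSize, TempSize
    and LogBufMB settings named on the slide. header_overhead_mb is a
    hypothetical allowance for the DB header, not a documented value.
    """
    for value in (perm_size_mb, temp_size_mb, log_buf_mb, header_overhead_mb):
        if value < 0:
            raise ValueError("sizes must be non-negative")
    return perm_size_mb + temp_size_mb + log_buf_mb + header_overhead_mb


if __name__ == "__main__":
    # Illustrative figures only: 64 GB permanent, 8 GB temporary, 1 GB log buffer.
    print(estimate_segment_mb(perm_size_mb=65536, temp_size_mb=8192, log_buf_mb=1024))
```

The result is only a planning estimate; the operating system must still be configured to allow a segment of that size.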
Disk Still Matters for Persistence and Recovery
  High-performance transaction log manager
    Background transaction log flusher automatically persists transaction data to disk
    Synchronous and buffered disk logging options
    Application developers have fine-grained control down to the transaction level
  Dual checkpoint files for recovery
  [Diagram: applications linked with the TimesTen libraries update the memory-resident database; log records go to the in-memory transaction log buffer; the background log flusher persists transaction data to the transaction log files; a periodic snapshot of the database is written to the checkpoint files]
Best Practices for Transaction Logs
  Place transaction log files and checkpoint files on separate disks
    Prevents checkpoint I/O from affecting transaction log I/O
    Configure the checkpoint rate to reduce impact
  Faster disks yield
    Higher transaction throughput for write-intensive applications
    Faster checkpoints and faster recovery
    SSD / Flash can benefit logging and checkpoints
  Properly size the transaction log buffer
    The transaction log buffer may be sized up to 64 GB
    Monitor LOG_BUFFER_WAITS and LOG_FS_READS
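One way to act on the two counters named on this slide is to sample them twice and look at the change. The sketch below is illustrative only: how the samples are collected is left out, and the interpretation in the messages (rising waits suggest the log buffer is too small, rising file system reads suggest log records are being read back from disk) is an assumption rather than something the slide states.

```python
def log_buffer_advice(before, after):
    """Compare two samples of the counters named on the slide.

    before and after are dicts with LOG_BUFFER_WAITS and LOG_FS_READS keys,
    captured at two points in time. Returns a list of advisory strings.
    """
    advice = []
    if after["LOG_BUFFER_WAITS"] > before["LOG_BUFFER_WAITS"]:
        advice.append("LOG_BUFFER_WAITS is rising: consider a larger log buffer or faster log disks")
    if after["LOG_FS_READS"] > before["LOG_FS_READS"]:
        advice.append("LOG_FS_READS is rising: log records are being read back from disk")
    return advice


if __name__ == "__main__":
    earlier = {"LOG_BUFFER_WAITS": 10, "LOG_FS_READS": 0}
    later = {"LOG_BUFFER_WAITS": 25, "LOG_FS_READS": 0}
    for line in log_buffer_advice(earlier, later):
        print(line)
```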
TimesTen Checkpointing Process: Purging of Transaction Log Files
  Transaction log files are purged via checkpointing, once
    The changes reflected by the files' records have been flushed to both checkpoint files
    No other TimesTen component (incremental backups, replication, XLA, etc.) requires the data
  Frequent (regular) checkpoints prevent transaction log accumulation
  [Chart: log volume versus checkpoint frequency interval]
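The relationship between checkpoint interval and log volume can be approximated with simple arithmetic. The sketch below is a back-of-the-envelope estimate, not a TimesTen formula: it assumes a steady log generation rate, and it assumes that a log file becomes purgeable only after two checkpoints (one per checkpoint file), which is an inference from the "both checkpoint files" condition on the slide.

```python
def retained_log_mb(log_rate_mb_per_sec, checkpoint_interval_sec, checkpoints_before_purge=2):
    """Estimate the log volume held on disk between purges, in MB.

    checkpoints_before_purge defaults to 2 on the assumption that changes
    must reach both checkpoint files before the log can be purged.
    """
    if log_rate_mb_per_sec < 0 or checkpoint_interval_sec < 0:
        raise ValueError("rate and interval must be non-negative")
    return log_rate_mb_per_sec * checkpoint_interval_sec * checkpoints_before_purge


if __name__ == "__main__":
    # Illustrative: 10 MB/s of log, checkpoint every 10 minutes versus every minute.
    print(retained_log_mb(10, 600))
    print(retained_log_mb(10, 60))
```

Shortening the interval by a factor of ten shrinks the estimate by the same factor, which is the point the chart on the slide makes.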
Graceful Shutdown is a Necessity
  Stopping the TimesTen main daemon without a graceful shutdown
    ttdaemonadmin -stop
    Forces the TimesTen daemon to stop and invalidates all active databases
  Upon restart the database goes through the recovery process
    Read checkpoint file, check transaction logs for redo operations, rebuild indexes
    Much longer database restart time
  Best practices
    Stop all applications connecting to the database
    Ensure there are no active connections to the database
    Stop the daemon process
Update Statistics!
  Stale or missing statistics are the root cause of many performance issues
  The cost-based query optimizer uses statistics and indexes to generate query plans
  Update statistics
    After the initial load of TimesTen tables
    When table size changes dramatically (e.g. 2X)
    When indexes are added or the schema is changed
  TimesTen provides a broad range of methods for updating statistics
    ttisql command statsupdate or built-in procedures
    Statistics can be estimated or set for particular tables
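The refresh rules on this slide can be written down as a small decision function. This is a hypothetical helper for a monitoring script, not a TimesTen API; the 2X threshold comes from the slide, while treating a shrink by the same factor as "dramatic" is an added assumption.

```python
def needs_stats_update(rows_at_last_update, current_rows, growth_factor=2.0, schema_changed=False):
    """Decide whether optimizer statistics should be refreshed.

    rows_at_last_update is None when statistics were never gathered,
    for example right after the initial load.
    """
    if schema_changed or rows_at_last_update is None:
        return True
    if rows_at_last_update == 0:
        # Any rows at all are a dramatic change from an empty table.
        return current_rows > 0
    ratio = current_rows / rows_at_last_update
    # Assumption: shrinking by the same factor also counts as dramatic.
    return ratio >= growth_factor or ratio <= 1 / growth_factor


if __name__ == "__main__":
    print(needs_stats_update(None, 1_000_000))       # initial load
    print(needs_stats_update(1_000_000, 1_500_000))  # 1.5X growth
    print(needs_stats_update(1_000_000, 2_000_000))  # 2X growth
```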
Index Choices: Hash, Range, Bitmap
  Hash indexes
    Best performance for equality matches
    Cannot be used for range searches
    Can be used on any column
    Must be properly sized for good performance
      Undersized hash indexes can result in severe performance penalties
    CREATE TABLE t (pk NUMBER(6) NOT NULL PRIMARY KEY) UNIQUE HASH ON (pk) PAGES = expected_rows/256;
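Since PAGES must be a literal number in the DDL, the expected_rows/256 rule from this slide has to be evaluated beforehand. A minimal sketch of that arithmetic follows; rounding up and the minimum of one page are assumptions made so that the index is never undersized.

```python
import math


def hash_index_pages(expected_rows, rows_per_page=256):
    """Return a PAGES value for a hash index using the slide's rule.

    Rounds up so the page count never falls below expected_rows / 256.
    """
    if expected_rows < 0:
        raise ValueError("expected_rows must be non-negative")
    return max(1, math.ceil(expected_rows / rows_per_page))


if __name__ == "__main__":
    # Illustrative: one million expected rows.
    print(hash_index_pages(1_000_000))
```

The value should be revisited when the table grows well beyond the row count it was sized for.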
Index Choices: Range and Bitmap Indexes
  Range indexes
    Best for range searches
    Generally use less space than hash indexes
    Default index type for primary key indexes
    Unique primary key range indexes can be ALTERed to hash indexes
  Bitmap indexes
    Use when the qualifying index reduces the number of rows to be scanned
    Use only on columns with low cardinality
Index Optimization: Very Important
  Optimizing indexes is essential to get good performance
    The TimesTen optimizer is index-centric
  Three index types
    Hash: best for full-key equality lookups and equijoins
    Range (default): more flexible than hash but not as fast for equality lookups / equijoins
    Bitmap: NOT RECOMMENDED FOR BI WORKLOADS
  Primary key and foreign key constraints
    Automatically create indexes
    For primary keys, you have the option of specifying a hash index
TimesTen Index Advisor: Recommend Indexes for Better Query Execution Plans
  Analyzes the SQL workload
    Begin / end capture
    Cost/benefit analysis
    Output: index creation DDL
  Recommends optimal indexes for
    Table scans
    Joins
    Sorts and grouping operations
  [Diagram: cycle of begin capture, run workload, end capture, analyze and recommend, observe]
Extract Physical SQL from the NQQuery Log: ttindexadvisor.pl Sample Script
  ttindexadvisor.pl
    Extracts physical SQL from the OBIEE NQQuery log
    Runs Index Advisor using the SQL from the NQQuery log
    Recommends indexes for optimal query performance
    Creates the indexes if requested
  Download from the TimesTen download page on OTN, under Sample Utilities
    http://oracle.com/technetwork/database/databasetechnologies/timesten/downloads/index.html
Data Types and Storage: Variable-Length Types
  E.g. VARCHAR2(n)
    n <= 128 is stored inline
    n > 128 is stored out-of-line
  Inline columns are accessed faster
    Space/time trade-off
  Column storage type can be specified with the INLINE or NOT INLINE qualifier
  Declare columns NOT NULL
  [Diagram: INLINE storage of the row (6, Barney, Dinosaur, purple), where "..." denotes unused space, versus out-of-line storage of the same row]
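The space/time trade-off can be made concrete with a small calculation. The sketch below encodes the 128 threshold from this slide and estimates the unused space an inline value leaves behind. It assumes inline storage reserves the full declared length and ignores any per-value length overhead, so the numbers are indicative only.

```python
INLINE_THRESHOLD = 128  # from the slide: n <= 128 is stored inline


def default_storage(declared_length):
    """Return the default storage for VARCHAR2(declared_length)."""
    return "INLINE" if declared_length <= INLINE_THRESHOLD else "NOT INLINE"


def unused_inline_bytes(declared_length, value):
    """Bytes left unused when value is stored inline.

    Assumes the column reserves declared_length bytes for every row.
    """
    used = len(value.encode("utf-8"))
    if used > declared_length:
        raise ValueError("value does not fit in the declared length")
    return declared_length - used


if __name__ == "__main__":
    print(default_storage(20), unused_inline_bytes(20, "Barney"))
    print(default_storage(200))
```

A hypothetical VARCHAR2(20) column holding "Barney" leaves most of its reserved space unused, which is the cost paid for faster inline access.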
TimesTen Native Integer Types: Space Efficient and High Performance
  Type         Bytes  Range of values
  TT_TINYINT   1      0 .. 255
  TT_SMALLINT  2      -32768 .. 32767
  TT_INTEGER   4      -2147483648 .. 2147483647
  TT_BIGINT    8      -9223372036854775808 .. 9223372036854775807
  By comparison
    NUMBER (no precision or scale) requires 22 bytes of storage
    NUMBER(18) requires 13 bytes of storage (1.6x TT_BIGINT)
    NUMBER(9) requires 8 bytes of storage (2x TT_INTEGER)
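The table above can be turned into a lookup that picks the smallest native type covering a column's value range. This is a hypothetical helper for schema review, using only the sizes and ranges listed on the slide.

```python
# (type name, bytes, minimum, maximum) as listed on the slide
TT_INTEGER_TYPES = [
    ("TT_TINYINT", 1, 0, 255),
    ("TT_SMALLINT", 2, -32768, 32767),
    ("TT_INTEGER", 4, -2147483648, 2147483647),
    ("TT_BIGINT", 8, -9223372036854775808, 9223372036854775807),
]


def smallest_native_type(min_value, max_value):
    """Return (type name, bytes) for the smallest type covering the range.

    Returns None when no native integer type is wide enough.
    """
    if min_value > max_value:
        raise ValueError("min_value must not exceed max_value")
    for name, size, low, high in TT_INTEGER_TYPES:
        if low <= min_value and max_value <= high:
            return name, size
    return None


if __name__ == "__main__":
    # A column holding up to nine decimal digits, as NUMBER(9) would.
    print(smallest_native_type(0, 999_999_999))
```

For the nine-digit case the helper selects the 4-byte type, against the 8 bytes the slide quotes for NUMBER(9).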
The ttimportfromoracle Utility: What Does It Do?
  Copies tables from an Oracle database into a TimesTen database
    Table definitions
    Indexes and referential constraints
    Parallel data load option
  Optimizes data types
    Minimizes space usage
    Maximizes performance
  Recommends optimal compression clauses
    Minimizes space usage
  Accurately estimates TimesTen memory requirements
ttimportfromoracle Outputs
  ttimportfromoracle generates a series of output files
  File               Description
  TableList.txt      List of all tables with associated load parameters
  CreateUsers.sql    SQL script to create all necessary database users
  DropTables.sql     SQL script to drop all tables
  CreateTables.sql   SQL script to create all tables
  DropIndexes.sql    SQL script to drop all indexes and constraints
  CreateIndexes.sql  SQL script to create all indexes and constraints
ttimportfromoracle Outputs (cont.)
  File             Description
  UpdateStats.sql  SQL script to update optimizer statistics for all tables
  ttsizing.sh      Shell script to run ttsize on all tables
  LoadData.sql     SQL script to load all tables sequentially
  ttpdl.sh         Shell script to load all tables in parallel
  Memory sizing estimates can be found in the CreateTables.sql file
Automate Data Loading from Oracle Database: Initial Data Population with ttloadfromoracle
  ttimportfromoracle generates DDL and loading scripts using ttloadfromoracle, a TimesTen built-in procedure
  Supports initial data population from the Oracle Database
    Executes a query on the Oracle Database
    Loads the result set into the specified TimesTen table
    The query on the Oracle Database may contain joins and expressions
  call ttloadfromoracle(tblowner, tblname, query, numthreads)
    where numthreads is the number of parallel threads for loading
  Example
    Command> call ttloadfromoracle('user1', 'custdim', 'SELECT * from CUSTOMER', 8);
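When many tables need loading, the call statements can be generated rather than typed. The sketch below only builds the statement text in the four-argument form shown on this slide; it does not connect to any database. `build_load_call` is a hypothetical helper, and doubling embedded single quotes is the standard SQL string-literal convention, assumed here to apply.

```python
def build_load_call(owner, table, query, num_threads):
    """Build a ttloadfromoracle call statement as text.

    The argument order follows the slide: owner, table, query, threads.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")

    def quote(text):
        # Double any embedded single quotes so the literal stays valid.
        return "'" + text.replace("'", "''") + "'"

    return "call ttloadfromoracle({}, {}, {}, {});".format(
        quote(owner), quote(table), quote(query), num_threads
    )


if __name__ == "__main__":
    print(build_load_call("user1", "custdim", "SELECT * from CUSTOMER", 8))
```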
Loading Data from Flat Files to TimesTen: Using the TimesTen ttbulkcp Utility
  ttbulkcp TimesTen utility
    Imports data from flat files to TimesTen
    Flat files must have a ttbulkcp-compatible format
    Refer to the TimesTen documentation for more details
  Step 1: export data from the source database to ttbulkcp-compatible files
  Step 2: import the flat files to TimesTen
  Example
    [oracle@ttbi]$ ttbulkcp -i DSN=TT_AGGR_STORE user1.sales_fact sales_fact.dump -xp 1000
    Sales_fact.dump:
    50000000 rows inserted
    50000000 rows total
    [oracle@ttbi]$
  * -xp = commit every <n> rows
Hugepages for Efficient Memory Management
  Best practice: use Hugepages for large databases
    More efficient memory management
  Hugepages are required by the OS
    When the shared memory segment is > 256 GB
  On Linux, Hugepages are automatically locked by the OS
    TimesTen DSN MemoryLock setting is not required
  TimesTen daemon option file setting is required
    Add -linuxlargepagealignment 2
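Before enabling Hugepages, the number of pages the operating system must reserve has to be worked out. The sketch below assumes a 2 MB huge page, which is the common default on x86-64 Linux and matches the alignment value of 2 on this slide; other page sizes can be passed in.

```python
import math


def hugepages_needed(segment_mb, hugepage_mb=2):
    """Return how many huge pages cover a shared memory segment.

    segment_mb is the total segment size in MB; hugepage_mb is the
    huge page size in MB (2 is assumed, not read from the system).
    """
    if segment_mb < 0 or hugepage_mb <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(segment_mb / hugepage_mb)


if __name__ == "__main__":
    # Illustrative: the 256 GB threshold mentioned on the slide.
    print(hugepages_needed(256 * 1024))
```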
Best Practices Summary
  Disk speeds matter: use Flash or SSD for checkpoint files and transaction logs
  Proper indexes yield better query performance: use Index Advisor when in doubt
  Statistics matter: update statistics when the row count changes significantly
  Data types for optimal storage: use native numeric types if possible
  Graceful/clean shutdown reduces startup time
  Appropriate optimizer hints are helpful
Some New Features in TimesTen 11.2.2.7
  Faster prepares for star joins
    An optimizer hint
    Useful for OBIEE reporting (as every request re-prepares)
  Additional star-join optimizations
  Parallel checkpoint reads for faster restart
    Take advantage of Flash and SSD drives
    2-4 GB/sec (your mileage may vary)
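The quoted read rates translate directly into a lower bound on restart time. The sketch below covers only the time to read the checkpoint file; log replay and index rebuilds are not modeled, and the 1 TB database size is an illustrative assumption.

```python
def checkpoint_read_seconds(database_gb, read_gb_per_sec):
    """Time in seconds to read a checkpoint of database_gb at the given rate."""
    if read_gb_per_sec <= 0:
        raise ValueError("read rate must be positive")
    return database_gb / read_gb_per_sec


if __name__ == "__main__":
    # Illustrative 1 TB database at the two ends of the quoted range.
    for rate in (2, 4):
        print(rate, "GB/sec:", checkpoint_read_seconds(1024, rate), "seconds")
```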
Customer Examples
Savvis Inc.
  Company overview
    Global leader in cloud infrastructure / hosted IT enterprise solutions
    Industry: High Technology
    Employees: 2,440
    Revenue: US$186.3 million
  Challenges
    Many shadow copies of data marts in various desktop formats
    Many unmanaged metrics across the company
    Provide front-line employees rapid access to customer data across various dimensions
    Provide rapid access to strategic planning information
    Implement a single BI platform
    Help operations staff respond more quickly to customer service issues
  Solution
    OBIEE 11g + TimesTen on Exalytics In-Memory Machine
    Data replication to TimesTen using Oracle GoldenGate
  Why Exalytics and TimesTen?
    Faster replication (reduced from 2 hours to seconds) so BI data is much more current
    Enables near real-time reporting with consistent enterprise metrics across all teams
    Provides advanced visualization of complex data sets
    Allows users to easily manipulate data to their needs
  Customer perspective
    "Overall, we expect Oracle Exalytics In-Memory Machine will help us foster stronger relationships with our customers. It will deliver, to our fingertips, the real-time data we need to continue to deliver world-class service."
    Michael Mahoney, Sr. Manager of Business Operations
CIMA
  Application overview
    CIMA: world's largest professional body of management accountants
      203,000 members/students across 173 countries and 28 sites globally
    Reporting application: OBIEE, Oracle Database with source data from Oracle Siebel
    Complex business models
  Challenges
    Executives require instant access to information and insight; existing system too slow
    Intelligence needed for agile business decision making
    Users require direct access to information; phone calls and emails to BI department staff unacceptable
    Users outside of the UK experienced poor responsiveness
  Solution
    OBIEE 11g + TimesTen on Exalytics In-Memory Machine
    Oracle DAC loaded all Siebel source tables to TimesTen
  Why Exalytics and TimesTen?
    Speed!!!
      Achieved over 100x query response time improvement with zero application code change
      Reports that previously did not finish in hours now return in seconds
    Instantaneous knowledge of customers across the company
    DAC provides quick incremental refresh to TimesTen
    User feedback: "amazing", "excellent", and "incredible"
    iPad and mobile device support to be added soon
  [Diagram: OLTP data input, DAC]
Real-Time Fraud Detection: USPS Total Revenue Protection (TRP)
  Challenges
    4 billion mail scans per day at peak (74,000/sec)
    275 processing and distribution centers
    33,000 postal facilities
    Find, track, and reject mail due to duplicate postage, short pay, or ineligible discounts
    509 row inserts/sec (RIPS)
    275M txs per 15 hr processing window
    Sorting and capture time exceeded the processing window
  Solution
    Real-time data scans ingested into TimesTen
    1.7 TB TimesTen in-memory database
    Real-time TRP algorithms executed on TimesTen
    Results retained in TimesTen and propagated to Oracle Database for long-term storage and analysis
    SGI Altix: 2.25 TB RAM, 72 Itanium CPUs, 17 TB disks
  TimesTen values
    190,222 RIPS (3 threads)
    1,091,018 RIPS (18 threads)
    Processed 4 billion txs in less than 6 hours
    Revenue protection performed in real time upon first scan
    Sorting and capture now easily fits within the processing window
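The throughput figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below assumes the 74,000/sec peak corresponds to 4 billion scans spread over the 15-hour processing window; the slide gives both numbers but does not state that this is how the rate was derived.

```python
SECONDS_PER_HOUR = 3600


def required_rate_per_sec(total_rows, window_hours):
    """Rows per second needed to finish total_rows within window_hours."""
    return total_rows / (window_hours * SECONDS_PER_HOUR)


def hours_to_process(total_rows, rows_per_sec):
    """Hours needed to process total_rows at a steady rows_per_sec."""
    return total_rows / rows_per_sec / SECONDS_PER_HOUR


if __name__ == "__main__":
    scans = 4_000_000_000
    print("required rate:", round(required_rate_per_sec(scans, 15)), "rows/sec")
    print("3 threads:", round(hours_to_process(scans, 190_222), 2), "hours")
    print("18 threads:", round(hours_to_process(scans, 1_091_018), 2), "hours")
```

At the 3-thread rate the 4 billion rows take a little under 6 hours, which is consistent with the "less than 6 hours" result quoted on the slide.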
Real-Time Fraud Detection System
  Application overview
    Industry: Communications
    Business: Telecom
    Application: Real-Time Fraud Detection
      Analyzes voice (phone) and data (Internet) traffic for fraudulent activity in real time
      >2.5 billion phone records daily
      Alerts based on threshold rule violation
  Challenges
    Home-grown solution could not keep up with the performance requirements due to the increase in traffic volume
    Requires HA and DR support
  Why TimesTen?
    Better throughput and scalability than the existing in-house solution
    HA via active standby in-memory databases
    DR support with a remote subscriber
    Simple to deploy with standard SQL interfaces
  Solution
    Oracle TimesTen In-Memory Database Cache
    Oracle Database
  [Diagram: active and standby databases, DR site subscriber]
Summary
Summary
  TimesTen offers very high performance
  Correct setup and usage is vital to achieve good performance and easy management
    Several dimensions to consider: OS, configuration, operation, tuning
  Use the tools available to make your life easier and get the best results
  We have covered a lot, but there is more
    Don't be afraid to ask for help if you need it