OLAP Is Different From What You Think Rittman Mead BI Forum Spring 2012 Dan Vlamis Vlamis Software Solutions 816-781-2880 http://www.vlamis.com
Dan Vlamis and Vlamis Software Solutions Vlamis Software founded in 1992 in Kansas City, Missouri Developed more than 200 Oracle BI systems Specializes in ORACLE-based: Data Warehousing Business Intelligence Design and integrated BI and DW solutions Training and mentoring Expert presenter at major Oracle conferences www.vlamis.com (blog, papers, newsletters, services) Developer for IRI (former owners of Oracle OLAP) Co-author of book Oracle Essbase & Oracle OLAP Beta tester for OBIEE 11g Reseller for Simba and NAVTEQ map data for OBIEE HOL Coordinator for 2012 Collaborate Conference Copyright 2012, Vlamis Software Solutions, Inc.
Perceptions of OLAP Perception Reality Summary management takes planning Easy, just different Should be driven by user community Works with SQL, MDX, etc. Always important to design properly Relational can do anything Hard to do Should be driven by IT Takes specialized software Not necessary anymore
OLAP Back End Products ROLAP (tables) Discoverer Summary tables OBIEE (with no MOLAP) Microstrategy Business Objects MOLAP (cubes) Express Essbase OLAP Option to Oracle DB TM/1 Cognos Powerplay MS Analysis Services Palo
OLAP Is Fast For Dimensional Queries Dimensions are natural indexes to data Dimensions are natural way to look at data By, across, over, down prepositions are often dimensions Handles multiple levels easily embedded total hierarchies Inter-row calcs are easy Share, index Yr/yr or prior period comparison Movingtotal
OLAP Properties Cube structure Metadata often integrated into solution Sparsity must be addressed Arrays (cubes) do not handle automatically Often handled by combining dimensions <PROD GEOG> Products have been around a long time mature Good at unpredictable query patterns if fits in dimensional model
Dimensions Are Key to OLAP Model OLAP good at unpredictable query pattern if query fits dimensions of data Don t confuse limitations of pre-calculated data with limitations of OLAP If filter invalidates OLAP, likely invalidates summary table logic Example: Sales by Region (easy) Hard: Sales by Region for stores open > 1 yr If demand ultimate flexibility, must calc on the fly and performance will be a problem if accessing lots of data
Aggregate Data Problem Relational good at homogenous key data Dimensions handle data at multiple levels E.g. Day -> Month -> Quarter -> Year Single key column has values at each level In Relational, aggregate data via views/agg tables In Cubes, aggregate data all in one table Queries return data at many levels in SQL statement Aggregation logic in OLAP cubes simplifies queries Cost-based aggregation will calculate what levels to agg on the fly vs. store aggregates
Materialized Views Challenges in Ad Hoc Query Environments Month, City, Category SALES_MCC month_id category_id city_id Year, City, Category SALES_YCC year_id category_id city_id Year, Continent, Category SALES_YCC year_id category_id continent_id Qtr, State, Item SALES_QSI qtr_id item_id state_id Month, State SALES_MS month state Year, Continent SALES day_id prod_id cust_id chan_id Year, District SALES_YCT year_id type_id continent_id SALES_YC year_id continent_id Cust, Time, Prod, Chan Lvls SALES_XXX SALES_XXX SALES_XXX expense_amount SALES_XXX potential_fraud_cost expense_amount expense_amount potential_fraud_cost potential_fraud_cost Creating MVs to support ad hoc query patterns is challenging Users expect excellent query response time across any summary Potentially many MVs to manage Practical limitations on size and manageability constrain the number of materialized views
Cube-based Materialized Views Much Better Manageability & Performance PRODUCT item_id subcategory category type CUSTOMER cust_id city state country TIME day_id month quarter year SALES day_id prod_id cust_id chan_id price rewrite refresh SALES CUBE CHANNEL chan_id class A single cube provides the equivalent of thousands of summary combinations The 11g SQL Query Optimizer treats OLAP cubes as MV s and rewrites queries to access cubes transparently Cube refreshed using standard MV procedures
Cost Based Aggregation Pinpoint Summary Management NY 25,000 customers Los Angeles 35 customers Precomputed Improves aggregation speed and storage consumption by precomputing cells that are most expensive to calculate Easy to administer Simplifies SQL queries by presenting data as fully calculated Computed when queried
Empowering Any SQL-Based Tool Leveraging the OLAP Calculation Engine SELECT cu.long_description customer, f.profit_rank_cust_sh_parent, f.profit_share_cust_sh_parent, f.profit_rank_cust_sh_level, f.profit, f.gross_margin FROM time_calendar_view t, product_primary_view p, customer_shipments_view cu, channel_primary_view ch, units_cube_view f WHERE t.level_name = 'CALENDAR_YEAR' AND t.calendar_year = 'CY2006' AND p.dim_key = 'TOTAL' AND cu.parent = 'TOTAL' AND ch.dim_key = 'TOTAL' AND t.dim_key = f.time AND p.dim_key = f.product AND cu.dim_key = f.customer AND ch.dim_key = f.channel;
QUESTIONS?
Essbase vs. Oracle OLAP Essbase Oracle OLAP Separate server Built into Oracle DB List price* $184K/CPU Separate admin Administer by LoB Must build cubes Part of middle tier Excellent writeback Query via MDX, XML/A List price* DB + $23K/CPU Admin same as Oracle DB Administer by IT Must build cubes Part of server tier Limited writeback Query via SQL (now MDX) * http://www.oracle.com/us/corporate/pricing/index.html