The Unified Dimensional Model

OLAP
  Online Analytical Processing
  1993: E.F. Codd proposed a system structure specially designed to support data analysts
  The term persists even though Codd's proposed structure does not

Online Analytical Processing
  Definition: Online analytical processing (OLAP) systems enable users to quickly and easily retrieve information from data, usually in a data mart, for analysis. OLAP systems present data using measures, dimensions, hierarchies, and cubes.
  Larson, B. (2008). Delivering Business Intelligence with Microsoft SQL Server 2008. New York: McGraw-Hill Osborne.
OLAP Systems
  Data is in a data mart
  Data is structured in terms of measures, dimensions, hierarchies, and cubes
  OLAP system accesses the data mart
  Provides tools for viewing and analyzing the data quickly

[Data] Cube
  Definition: A cube is a structure that contains a value for one or more measures for each unique combination of the members of all its dimensions. These are detail, or leaf-level values. The cube also contains aggregated values formed by the dimension hierarchies or when one or more of the dimensions is left out of the hierarchy.
  Larson, B. (2008). Delivering Business Intelligence with Microsoft SQL Server 2008. New York: McGraw-Hill Osborne.

A Point Within a Cube is a Value of the Measure (the Fact)
  The intersection of all dimensions is a point
  That point represents a value of the measure for the particular unique combination of dimension values
  The point is called a detail or leaf-level value
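The cube definition above can be illustrated with a minimal sketch. It is not how any OLAP product stores a cube. The dimension names, members, and sales figures are hypothetical, and the `aggregate` helper is an assumption made for illustration. Each key is one unique combination of dimension members (a point), each value is the measure at that point, and leaving a dimension out of the key yields an aggregated value.

```python
from collections import defaultdict

# Hypothetical dimensions and leaf-level values (not from the slides).
DIMENSIONS = ("product", "region", "quarter")

# Each key is one unique combination of dimension members;
# each value is the measure (sales) at that point of the cube.
leaf_values = {
    ("Widget", "East", "Q1"): 100,
    ("Widget", "East", "Q2"): 120,
    ("Widget", "West", "Q1"): 80,
    ("Gadget", "East", "Q1"): 50,
    ("Gadget", "West", "Q2"): 70,
}


def aggregate(cube, keep):
    """Sum the measure over every dimension that is not named in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for members, value in cube.items():
        key = tuple(members[i] for i in positions)
        totals[key] += value
    return dict(totals)


# Leaf-level lookup: the intersection of all three dimensions.
point = leaf_values[("Widget", "East", "Q1")]

# Aggregated values: region and quarter are left out, then everything is.
sales_by_product = aggregate(leaf_values, keep=("product",))
grand_total = aggregate(leaf_values, keep=())
```

Here `sales_by_product` holds one total per product and `grand_total` holds a single value under the empty key, which corresponds to leaving every dimension out.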
Aggregate
  Definition: An aggregate is a number that is calculated from amounts in many detail records. An aggregate may be a sum of many numbers, but it can also be derived using other arithmetic operations or even from a count of the number of items in a group.
  Preprocessed aggregate
    Sometimes aggregates are stored in the cube instead of being calculated as needed by the user who is browsing through the data
    This saves processing time

OLAP Systems
  Built around data structured as measures, dimensions, hierarchies, and cubes
  A multidimensional approach to organizing data

Multidimensional Database
  Definition: A multidimensional database is structured around measures, dimensions, hierarchies, and cubes rather than tables, rows, columns, and relations.
  Larson, B. (2008). Delivering Business Intelligence with Microsoft SQL Server 2008. New York: McGraw-Hill Osborne.
  http://gerardnico.com/wiki/database/database_multidimensional
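The idea of a preprocessed aggregate can be sketched as follows, under stated assumptions: the detail records and the `preprocess` helper are hypothetical, and a real OLAP engine decides for itself which aggregates to store. The sketch computes a sum, a count, and a derived average once, at load time, so that a later lookup does not rescan the detail records.

```python
from collections import defaultdict

# Hypothetical detail records: (region, amount).
detail_records = [
    ("East", 100.0),
    ("East", 120.0),
    ("West", 80.0),
    ("West", 70.0),
    ("West", 30.0),
]


def preprocess(records):
    """Scan the detail records once and store sum, count, and average per region."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for region, amount in records:
        sums[region] += amount
        counts[region] += 1
    return {
        region: {
            "sum": sums[region],
            "count": counts[region],
            "avg": sums[region] / counts[region],
        }
        for region in sums
    }


# Built once when the data is loaded.
stored_aggregates = preprocess(detail_records)

# Answered later by a dictionary lookup, with no pass over the detail records.
west = stored_aggregates["West"]
```

The trade-off is the one the slide names: the stored values save processing time at query time, at the cost of having to be rebuilt when the detail records change.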
Purpose of Relational Database Design
  To increase data integrity by reducing data redundancy

OLAP Cubes Should Be Easily Understood by Users
  The users of OLAP cubes are decision-makers, often executive level
  Naming conventions appropriate for relational database structures are not appropriate for OLAP cubes
  Measures, dimensions, and hierarchies should be given names that are easily understood by the decision-maker
  Many OLAP systems also provide for metadata such as a description

OLAP Cube Architectures
  Relational OLAP
  Multidimensional OLAP
  Hybrid OLAP
Relational OLAP
  Cube structure is stored in a multidimensional database
  Leaf-level measures are in a relational data mart that is the source of the cube
  Preprocessed aggregates are stored in a relational table
  [Diagram]
    Cube Structure (Multidimensional Storage)
    Preprocessed Aggregates (Relational Storage)
    Detail-Level Values (Relational Data Warehouse)

ROLAP
  Advantages
    Able to store more data
    Leaf-level values are as up-to-date as the data mart
    Cube retrieves the values from the data mart
  Disadvantages
    Slower than other OLAP architectures

Multidimensional OLAP
  Cube structure is stored in a multidimensional database
  Preprocessed aggregates are stored in a multidimensional database
  A copy of the leaf-level values is stored in a multidimensional database as well
  [Diagram]
    Cube Structure (Multidimensional Storage)
    Preprocessed Aggregates (Multidimensional Storage)
    Detail-Level Values (Multidimensional Storage)
MOLAP
  Advantages
    Very fast
  Disadvantages
    Leaf-level data must be loaded from the data mart, thus some latency

Hybrid OLAP
  Cube structure is in a multidimensional database
  Aggregates are in a multidimensional database
  Leaf-level data is in the data mart

HOLAP
  Advantages
    Fast retrieval of aggregates
    No latency due to copying leaf-level values from the data mart
  Disadvantages
    Not as fast when retrieving leaf-level data
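The three architectures differ only in where each of the three layers is stored, which can be summarized in a small lookup table. This is a sketch for comparison, not a description of any product's configuration. The `STORAGE` table restates the slides, and the two helper functions are hypothetical names introduced here.

```python
# Where each architecture keeps its three layers, as stated on the slides.
STORAGE = {
    "ROLAP": {
        "structure": "multidimensional",
        "aggregates": "relational",
        "leaf": "relational",
    },
    "MOLAP": {
        "structure": "multidimensional",
        "aggregates": "multidimensional",
        "leaf": "multidimensional",
    },
    "HOLAP": {
        "structure": "multidimensional",
        "aggregates": "multidimensional",
        "leaf": "relational",
    },
}


def served_from_relational(architecture, layer):
    """True when queries against `layer` are answered from relational storage."""
    return STORAGE[architecture][layer] == "relational"


def has_copy_latency(architecture):
    """True when leaf-level values are copied out of the data mart,
    so the cube can lag behind the data mart until the next load."""
    return STORAGE[architecture]["leaf"] == "multidimensional"


# Architectures whose leaf-level values stay in the data mart.
no_copy = [name for name in STORAGE if not has_copy_latency(name)]
```

Read this way, HOLAP takes its aggregate layer from MOLAP (fast aggregates) and its leaf layer from ROLAP (no copy of the detail data), which matches the advantages and disadvantages listed above.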
OLAP Disadvantages
  Complexity: requires more database concepts, skills, and knowledge to design and implement
  Also requires extensive organizational understanding to create/identify measures and dimensions
  Can be expensive projects
  Generally requires a data mart
  Data mart must be maintained with data updates and a procedure for making this happen
  Some latency because data must be copied into the data mart

Unified Dimensional Model
  Microsoft
  Provides the advantages of OLAP while avoiding some of its drawbacks
  UDM does not require a data mart; it can be built over an OLTP system
  A single UDM can use data from:
    Data mart
    OLTP
    XML data
    Other vendor data stores
  UDM can identify measures and dimensions directly from a relational database

Data Source
  File that contains the database connection information
  Data views combine tables and fields from various data stores
  Data views filter out unnecessary database information
  Virtual additions to tables can be done in the data view
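The data view idea (keep only the needed fields, add virtual fields, leave the source untouched) can be sketched in a few lines. This is an analogy in plain Python, not the UDM tooling itself. The `orders` table, its field names, and the `data_view` helper are all hypothetical.

```python
# Hypothetical source table in some data store; the view never modifies it.
orders = [
    {"order_id": 1, "quantity": 2, "unit_price": 9.5, "internal_flag": "x"},
    {"order_id": 2, "quantity": 1, "unit_price": 20.0, "internal_flag": "y"},
]


def data_view(rows, keep, virtual):
    """Yield rows limited to the `keep` fields plus computed `virtual` fields."""
    for row in rows:
        out = {name: row[name] for name in keep}
        for name, compute in virtual.items():
            out[name] = compute(row)
        yield out


view_rows = list(
    data_view(
        orders,
        keep=("order_id", "quantity", "unit_price"),
        # A virtual addition: computed in the view, absent from the source.
        virtual={"line_total": lambda r: r["quantity"] * r["unit_price"]},
    )
)
```

The view drops `internal_flag` and exposes `line_total`, while `orders` itself still has its original fields.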
UDM
  Utilizes proactive caching to speed up data retrieval

UDM Advantages
  Built on transactional data: extremely low latency
  Do not have to load data from OLTP to a data mart
  Easier to create and maintain: complexity is handled by the tools
  Design versioning with source control