Learning Objectives: Definition of OLAP, Data cubes, OLAP operations, MDX, OLAP servers



Transcription:

OLAP

Learning Objectives
- Definition of OLAP
- Data cubes
- OLAP operations
- MDX
- OLAP servers

What is OLAP?
The name OLAP (Online Analytical Processing) has two immediate consequences:
- The online part requires that queries be answered fast.
- The analytical part hints that the queries themselves are complex.
In short: complex questions with fast answers!

Why OLAP?
- Empowers end-users to do their own analysis.
- Increases the productivity of business end-users and, consequently, of the entire organization.
- Frees IT from report requests.
- Reduces the application development backlog for IT staff by making end-users self-sufficient enough to build their own models.
- Requires no knowledge of tables or SQL.

OLAP Applications
- Marketing: market research analysis, sales forecasting, promotions analysis, customer analysis, and market/customer segmentation.
- Sales: sales analysis and sales forecasting.
- Finance: budgeting, activity-based costing, financial performance analysis, and financial modeling.
- Manufacturing: production planning and defect analysis.

OLAP Clients
- Visualization
- OLAP capabilities
- Interactive manipulation

Excel as OLAP Client

From Tables and Spreadsheets to Data Cubes
- OLAP is based on a multidimensional data model, which views data in the form of a data cube.
- A data cube, such as sales, allows data to be modeled and viewed in multiple dimensions.
- Dimension tables describe the dimensions, such as item (item_name, brand, type) or time (day, week, month, quarter, year).
- The fact table contains measures (such as dollars_sold) and keys to each of the related dimension tables.

Representing Multi-Dimensional Data
- Example of a two-dimensional query: What is the total revenue generated by property sales in each city, in each quarter?
- Compare representations: three-field relational table versus two-dimensional matrix.

Multi-Dimensional Data as Three-Field Table versus Two-Dimensional Matrix

Representing Multi-Dimensional Data
- Example of a three-dimensional query: What is the total revenue generated by sales for each type of property (Flat or House) in each city, in each quarter?
- Compare representations: four-field relational table versus three-dimensional cube.

Multi-Dimensional Data as Four-Field Table versus Three-Dimensional Cube

Example: 3-D data cube

Definition of Cube
The data cube summarizes the measure with respect to a set of n dimensions and provides summarizations for all subsets of them.

Data cube (rows: product, columns: year):

product   1999  2000  2001  2002   ALL
chairs      25    37    89    21   172
tables      10    30     0    45    85
desks       56    84     9    35   184
shelves     19    20     0    71   110
boards       5    16    11    15    47
ALL        115   187   109   187   598
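The "summarizations for all subsets" can be reproduced from the detailed cells alone. The following is a minimal Python sketch, not the slides' own method: the names rows, years, ALL and build_cube are illustrative assumptions, while the numbers are the product-by-year figures from the slide.

```python
# Detailed (product, year) values taken from the slide's data cube.
years = [1999, 2000, 2001, 2002]
rows = {
    "chairs":  [25, 37, 89, 21],
    "tables":  [10, 30, 0, 45],
    "desks":   [56, 84, 9, 35],
    "shelves": [19, 20, 0, 71],
    "boards":  [5, 16, 11, 15],
}
ALL = "ALL"  # marker for "summed over this dimension" (illustrative name)


def build_cube(rows, years):
    """Return every cell of the 2-D cube, including the ALL totals."""
    cube = {}
    for product, values in rows.items():
        for year, value in zip(years, values):
            # Each detailed cell contributes to itself and to three summaries.
            for key in ((product, year), (product, ALL), (ALL, year), (ALL, ALL)):
                cube[key] = cube.get(key, 0) + value
    return cube


cube = build_cube(rows, years)
print(cube[("chairs", ALL)], cube[(ALL, 2000)], cube[(ALL, ALL)])
```

With two dimensions the cube holds 20 detailed cells plus 10 summary cells (5 product totals, 4 year totals, 1 grand total).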

Cube as set of cuboids
- The most detailed part of the cube is called the base cuboid.
- The topmost 0-D cuboid, which holds the highest level of summarization, is called the apex cuboid.
- The lattice of cuboids forms a data cube.
In the product-by-year example, the detailed product × year cells form the base cuboid, and the ALL/ALL total (598) is the apex cuboid.

Cube as set of cuboids
- 0-D (apex) cuboid: all
- 1-D cuboids: product; date; country
- 2-D cuboids: (product, date); (product, country); (date, country)
- 3-D (base) cuboid: (product, date, country)
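The lattice above is simply the set of all subsets of the dimension list. As a small illustration (the function name cuboids is an assumption, not something from the slides), it can be enumerated with Python's itertools.combinations:

```python
from itertools import combinations


def cuboids(dimensions):
    """Group every subset of the dimensions by its size (0-D, 1-D, ...)."""
    return {
        k: list(combinations(dimensions, k))
        for k in range(len(dimensions) + 1)
    }


lattice = cuboids(["product", "date", "country"])
for k, group in lattice.items():
    print(f"{k}-D cuboids:", group)
```

The empty tuple at level 0 plays the role of the apex cuboid ("all"), and the single 3-tuple is the base cuboid.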

Example: Cube and cuboids
Dimensions: color, size. Measure: count.
Cuboids: φ (apex), color, size, and (color, size).

color   S    M    L   TOT
Red    20    3    5    28
Blue    3    3    8    14
Gray    0    0    5     5
TOT    23    6   18    47

Typical OLAP Operations
- Roll up (drill-up): summarize data by climbing up a hierarchy or by dimension reduction.
- Drill down (roll down): the reverse of roll-up, going from a higher-level summary to a lower-level summary or detailed data, or introducing new dimensions.
- Slice and dice: project and select.
- Pivot (rotate): reorient the cube for visualization, for example from 3-D to a series of 2-D planes.
Other operations:
- Drill across: involves (goes across) more than one fact table.
- Drill through: goes through the bottom level of the cube to its back-end relational tables.

Example of operations on a data cube (dimensions: color, size; measure: count)

color   S    M    L   TOT
Red    20    3    5    28
Blue    3    3    8    14
Gray    0    0    5     5
TOT    23    6   18    47

Roll-up
- In this example we reduce one dimension.
- It is also possible to climb up one hierarchy. Example: (product, city) → (product, country).
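Both flavors of roll-up amount to mapping each cell key to a coarser key and summing. The sketch below assumes a hypothetical helper roll_up; the (color, size) counts come from the slides, whereas the city-to-country mapping and the sales figures are invented purely for illustration.

```python
from collections import defaultdict


def roll_up(cells, regroup):
    """Sum cell values after mapping every key to a coarser key."""
    totals = defaultdict(int)
    for key, value in cells.items():
        totals[regroup(key)] += value
    return dict(totals)


# Counts per (color, size), as in the slides' example cube.
counts = {
    ("Red", "S"): 20, ("Red", "M"): 3, ("Red", "L"): 5,
    ("Blue", "S"): 3, ("Blue", "M"): 3, ("Blue", "L"): 8,
    ("Gray", "S"): 0, ("Gray", "M"): 0, ("Gray", "L"): 5,
}

# Roll-up by dimension reduction: drop size, keep color.
by_color = roll_up(counts, lambda key: key[0])

# Roll-up by climbing a hierarchy: city -> country (hypothetical data).
country_of = {"Lyon": "France", "Paris": "France", "Rome": "Italy"}
sales = {("chairs", "Lyon"): 4, ("chairs", "Paris"): 6, ("chairs", "Rome"): 3}
by_country = roll_up(sales, lambda key: (key[0], country_of[key[1]]))

print(by_color)
print(by_country)
```

The by_color result matches the TOT column of the example table; drill-down is the reverse direction and needs the more detailed cells to be available.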

Drill-down
- In this example we add one dimension.
- It is also possible to climb down one hierarchy. Example: (product, year) → (product, month).

Slice
Perform a selection on one dimension.

Dice
Perform a selection on two or more dimensions.
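Slice and dice can be sketched as simple filters over the cube's cells. This is an illustrative Python sketch under assumptions: slice_cube and dice_cube are hypothetical helper names, and the slice here also drops the fixed dimension from the key, which is one common convention rather than the only one.

```python
# Counts per (color, size), as in the slides' example cube.
dimensions = ["color", "size"]
counts = {
    ("Red", "S"): 20, ("Red", "M"): 3, ("Red", "L"): 5,
    ("Blue", "S"): 3, ("Blue", "M"): 3, ("Blue", "L"): 8,
    ("Gray", "S"): 0, ("Gray", "M"): 0, ("Gray", "L"): 5,
}


def slice_cube(cells, dimensions, dim, member):
    """Select one member on one dimension and drop that dimension."""
    i = dimensions.index(dim)
    return {
        key[:i] + key[i + 1:]: value
        for key, value in cells.items()
        if key[i] == member
    }


def dice_cube(cells, dimensions, **allowed):
    """Keep cells whose coordinates fall inside the allowed member sets."""
    position = {name: i for i, name in enumerate(dimensions)}
    return {
        key: value
        for key, value in cells.items()
        if all(key[position[d]] in members for d, members in allowed.items())
    }


red_slice = slice_cube(counts, dimensions, "color", "Red")
sub_cube = dice_cube(counts, dimensions, color={"Red", "Blue"}, size={"M", "L"})
print(red_slice)
print(sub_cube)
```

The slice leaves a one-dimensional result over size, while the dice keeps a smaller two-dimensional sub-cube.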

Slice/Dice
Slice and dice are easy terms compared with SELECT ... WHERE in SQL.
(Figure: cube with axes labeled customers and store.)

Multidimensional Expressions (MDX)
- Microsoft SQL Server OLAP Services provides an architecture for access to multidimensional data.
- For expressing queries on this data, OLAP Services employs a full-fledged, highly functional expression syntax: Multidimensional Expressions (MDX).
- OLAP Services supports MDX functions as a full language implementation for creating and querying cube data.

MDX Introductory Tutorial
Dimensions used in the examples (dimension name: hierarchy levels; description):
- Product: Product Family, Product Department, Product Category, Product Subcategory, Brand Name, Product Name. The products that are on sale in the FoodMart stores.
- Promotions: Promotion Name. Identifies the promotion that triggered the sale.
- Store: Store Country, Store State, Store City, Store Name. Geographical hierarchy for the different stores in the chain (country, state, city).
- Customer: Country, State or Province, City, Name. Geographical hierarchy for registered customers.
- Time: Years, Quarters, Months. Time period when the sale was made.

MDX Introductory Tutorial
Outline of an expression returning two cube dimensions:

    SELECT axis_specification ON COLUMNS,
           axis_specification ON ROWS
    FROM cube_name
    WHERE slicer_specification

- axis_specification: members of a dimension (from any level of its hierarchy).
- If only a single dimension is returned, it must be placed on COLUMNS.
- For more dimensions, the further named axes are PAGES, CHAPTERS and, finally, SECTIONS.
- The WHERE clause is optional and acts as the slicer specification.

MDX Introductory Tutorial
Two dimensions (or levels of a dimension) and a slice:

    SELECT NON EMPTY {[Store Type].MEMBERS} ON COLUMNS,
           NON EMPTY {[Store].[Store State].MEMBERS} ON ROWS
    FROM [Sales]
    WHERE (Measures.[Sales Average], [Time].[1997])

MDX Introductory Tutorial
Query the top n members of a list:

    SELECT Measures.[Profit] ON COLUMNS,
           TOPCOUNT([Store].[Store City].MEMBERS, 5, Measures.[Profit]) ON ROWS
    FROM [Sales]

Syntax: TOPCOUNT(set, count, numeric_expression)
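For readers unfamiliar with TOPCOUNT, its effect can be mimicked outside MDX: order the set by the numeric expression in descending order and keep the first count members. The Python sketch below is only an analogy; the function topcount, the city names and the profit figures are all hypothetical and do not come from the FoodMart cube.

```python
def topcount(members, count, numeric_expression):
    """Return the `count` members with the largest measure value."""
    return sorted(members, key=numeric_expression, reverse=True)[:count]


# Hypothetical profit per store city.
profit = {
    "City A": 120.0,
    "City B": 310.5,
    "City C": 95.0,
    "City D": 240.0,
}

top_two = topcount(profit.keys(), 2, lambda city: profit[city])
print(top_two)  # ['City B', 'City D']
```

Because Python's sort is stable, members with equal values keep their original relative order in this sketch.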

Conceptual vs. Actual
- The cube is a logical way of visualizing the data in an OLAP setting.
- It is not how the data is actually represented.
- There are two opposite ways of storing the data:
  - ROLAP: Relational OLAP
  - MOLAP: Multidimensional OLAP

OLAP Server Architectures
- Relational OLAP (ROLAP)
  - Uses a relational or extended-relational DBMS to store and manage warehouse data, and OLAP middleware to support the missing pieces.
  - Includes optimization of the DBMS back end, implementation of aggregation navigation logic, and additional tools and services.
  - Offers greater scalability.
- Multidimensional OLAP (MOLAP)
  - Uses an array-based multidimensional storage engine (sparse matrix techniques).
  - Pre-calculates cuboids (space overhead).
  - Provides fast indexing to pre-computed summarized data.
- Hybrid OLAP (HOLAP)
  - Offers user flexibility, e.g., low level: relational; high level: array.

ROLAP
- ROLAP supports RDBMS products through the use of an application logic layer.
- To improve performance, some ROLAP products have enhanced SQL engines to support the complexity of multi-dimensional analysis.
- The development issues associated with ROLAP technology:
  - Performance problems associated with the processing of complex queries that require multiple passes through the relational data.
  - Development of middleware to facilitate the development of multi-dimensional applications.

MOLAP
- MOLAP tools use specialized data structures and multi-dimensional database management systems (MDDBMS) to organize, navigate, and analyze data.
- MOLAP data structures use array technology and efficient storage techniques that minimize the disk space requirements through sparse data management.
- The development issues associated with MOLAP:
  - Only a limited amount of data can be efficiently stored and analyzed.
  - MOLAP products require a different set of tools to build and maintain the database.
OLAP, by Dr. Khalil
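The array technology and sparse data management mentioned for MOLAP can be illustrated with a toy example. This is a simplified Python sketch and not how any particular MOLAP product stores data: the helper offset, the row-major layout and the dictionary used as a sparse store are all assumptions, while the counts reuse the color-by-size cube from the earlier slides.

```python
def offset(index, shape):
    """Row-major position of a cell inside a flat array."""
    position = 0
    for i, extent in zip(index, shape):
        position = position * extent + i
    return position


colors = ["Red", "Blue", "Gray"]
sizes = ["S", "M", "L"]
shape = (len(colors), len(sizes))

# Dense storage: one slot per (color, size) cell, empty cells included.
dense = [20, 3, 5,
         3, 3, 8,
         0, 0, 5]

# Sparse storage: keep only the non-empty cells, keyed by array offset.
sparse = {pos: value for pos, value in enumerate(dense) if value != 0}


def lookup(color, size):
    """Read a cell from the sparse store; missing cells count as 0."""
    index = (colors.index(color), sizes.index(size))
    return sparse.get(offset(index, shape), 0)


print(lookup("Blue", "L"), lookup("Gray", "S"), len(dense), len(sparse))
```

A cell's address is computed directly from its coordinates, which is what makes array-based lookups fast; the sparse variant trades a little lookup work for not storing the empty cells.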

ROLAP vs. MOLAP
- Performance: How fast will the system appear to the end-user? MOLAP vendors believe this is a key point in their favor.
- Data volume and scalability: While MOLAP servers can handle up to about 100 GB of storage, ROLAP servers can handle hundreds of gigabytes and even terabytes.

ROLAP vs. MOLAP

HOLAP
- HOLAP tools deliver selected data directly from the DBMS or via a MOLAP server.
- HOLAP is the fastest-growing type of OLAP tool.

HOLAP
- Partial materialization
- 2^n views for n dimensions (no hierarchies)
- Storage/update-time explosion
- More precomputation doesn't mean better performance!
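The 2^n figure explains why full materialization quickly becomes impractical. The Python sketch below counts the views; the generalization to hierarchies (multiplying levels + 1 per dimension) is not on the slide but is the standard textbook formula, and the function name cuboid_count and the example level counts are assumptions for illustration.

```python
from math import prod


def cuboid_count(levels_per_dimension):
    """Number of cuboids when dimension i has levels_per_dimension[i] levels.

    Each dimension contributes (levels + 1) choices: one per level plus
    the virtual top level "all". With one level per dimension this is 2**n.
    """
    return prod(levels + 1 for levels in levels_per_dimension)


# No hierarchies: one level per dimension, so 2**n views.
for n in (3, 10, 20):
    print(n, "dimensions:", cuboid_count([1] * n), "views")

# Hypothetical case: 10 dimensions with 4 hierarchy levels each.
print(cuboid_count([4] * 10))
```

This growth is the storage and update-time explosion the slide refers to, and it is the reason only a subset of views is usually materialized.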