Data Warehousing and OLAP II. Toon Calders


1 Data Warehousing and OLAP II Toon Calders

2 What have we seen last time?
Data warehousing
  Alternative data storage for analysis
  Geared towards aggregation queries
Online Analytical Processing
  Multidimensional view on data: Data Cubes

3 What have we seen last time?
Technologies
  Simple extensions to SQL insufficient

Date          Country  Sales
1st semester  Ireland     20
1st semester  France     126
1st semester  Germany     56
1st semester  null       202
2nd semester  Ireland     23
2nd semester  France     138
2nd semester  Germany     48
2nd semester  null       209
null          Ireland     43
null          France     264
null          Germany    104
null          null       411

4 What have we seen last time? Completely materializing cube is impossible Reason: sparse dimensions Size of cube w.r.t. number of dimensions (500 data points)

5 What have we seen last time? Therefore: partially materialize the cube Discussed the approach of Harinarayan et al. (part, supplier, customer) (part, supplier) (part, customer) (supplier, customer) (part) (supplier) (customer) ()


7 Today's Menu
Extensions to the View Selection Problem
  probabilities on queries
  memory bound
II.2 Data storage and indexing
  star and snowflake schema, multi-cube
  bitmap, join, and bitmap-join index

8 Extensions (1)
Problem: The views in a lattice are unlikely to have the same probability of being requested in a query.
Solution: We can weight each benefit by its probability.

9 Example
Suppose the following query distribution:
  a: 0%, b: 7%, c: 3%, d: 10%, e: 5%, f: 20%, g: 40%, h: 15%
View sizes: a 100, b 50, c 75, d 20, e 30, f 40, g 1, h 10
[Figure: lattice of the views a through h]
Expected cost given S = {a, b, f}: 48

Query  Uses  Cost  Exp. Cost
a      a     100    0
b      b      50    3.5
c      a     100    3
d      b      50    5
e      b      50    2.5
f      f      40    8
g      b      50   20
h      f      40    6
Total              48

10 Example
Solution found by greedy:
  a: 0%, b: 7%, c: 3%, d: 10%, e: 5%, f: 20%, g: 40%, h: 15%
View sizes: a 100, b 50, c 75, d 20, e 30, f 40, g 1, h 10
[Figure: lattice of the views a through h]
Expected cost given S = {a, d, e}: 46

Query  Uses  Cost  Exp. Cost
a      a     100    0
b      a     100    7
c      a     100    3
d      d      20    2
e      e      30    1.5
f      a     100   20
g      d      20    8
h      e      30    4.5
Total              46

11 Example
Optimal solution:
  a: 0%, b: 7%, c: 3%, d: 10%, e: 5%, f: 20%, g: 40%, h: 15%
View sizes: a 100, b 50, c 75, d 20, e 30, f 40, g 1, h 10
[Figure: lattice of the views a through h]
Expected cost given S = {a, d, f}: 39

Query  Uses  Cost  Exp. Cost
a      a     100    0
b      a     100    7
c      a     100    3
d      d      20    2
e      a     100    5
f      f      40    8
g      d      20    8
h      f      40    6
Total              39
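The expected costs in the three examples can be recomputed with a short sketch. The probabilities and view sizes are the ones on the slides; the lattice edges in CHILDREN are an assumption (the lattice figure is not part of the text), chosen because they reproduce the "Uses" columns of all three tables. The function names are hypothetical.

```python
from itertools import combinations

# Query probabilities (in percent) and view sizes, as given on the slides.
PROB = {"a": 0, "b": 7, "c": 3, "d": 10, "e": 5, "f": 20, "g": 40, "h": 15}
SIZE = {"a": 100, "b": 50, "c": 75, "d": 20, "e": 30, "f": 40, "g": 1, "h": 10}

# ASSUMPTION: lattice edges (view -> views directly computable from it).
CHILDREN = {"a": "bc", "b": "de", "c": "ef", "d": "g",
            "e": "gh", "f": "h", "g": "", "h": ""}


def answers(view):
    """Queries answerable from `view`: the view itself and all its descendants."""
    result = {view}
    for child in CHILDREN[view]:
        result |= answers(child)
    return result


def expected_cost(materialized):
    """Probability-weighted cost when each query uses its cheapest materialized ancestor."""
    total = 0
    for query, percent in PROB.items():
        cost = min(SIZE[v] for v in materialized if query in answers(v))
        total += percent * cost
    return total / 100


# Exhaustive search over all pairs of views added to the top view a.
best_pair = min(combinations("bcdefgh", 2),
                key=lambda pair: expected_cost({"a", *pair}))

print(expected_cost({"a", "b", "f"}), expected_cost({"a", "d", "e"}), best_pair)
```

Under the assumed lattice, the exhaustive search over pairs confirms that adding d and f to a is the best choice of two views, while the set {a, d, e} found by greedy is more expensive.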

12 Extensions (2)
Problem: Instead of asking for some fixed number (k) of views to materialize, we might instead allocate a fixed amount of space to views.
Solution: We can consider the benefit of each view per unit of space, and always select the view with the best trade-off that still fits.

13 Example
Available space: 85 (a is already materialized); materializing all: 226
View sizes: a 100, b 50, c 75, d 20, e 30, f 40, g 1, h 10
[Table: benefit of each candidate view b, c, d, e, f, g, h for each query a through h, with Total and Avg rows]

14 Example
Available space: 84 (g:1)
[Table: benefit of each candidate view b, c, d, e, f, h for each query a through h, with Total and Avg rows]

15 Example
Available space: 74 (g:1, h:10)
[Table: benefit of each candidate view b, c, d, e, f for each query a through h, with Total and Avg rows]

16 Example
Available space: 54 (g:1, h:10, d:20)
[Table: benefit of each candidate view b, e, f for each query a through h, with Total and Avg rows]

17 Example
Available space: 24 (g:1, h:10, d:20, e:30)

Query  Materialized view
       b    f
a      -    -
b      50   -
c      -    -
d      -    -
e      -    -
f      -    60
g      -    -
h      -    -

Neither b (size 50) nor f (size 40) fits in the remaining 24 units of space.
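The space-bounded selection in this example can be sketched as follows. This is a minimal illustration, not the lecture's implementation: the lattice edges in CHILDREN are an assumption (the figure is not in the text), all queries are assumed to be equally likely, and the name greedy_by_space is hypothetical.

```python
# View sizes as given on the slides.
SIZE = {"a": 100, "b": 50, "c": 75, "d": 20, "e": 30, "f": 40, "g": 1, "h": 10}

# ASSUMPTION: lattice edges (view -> views directly computable from it).
CHILDREN = {"a": "bc", "b": "de", "c": "ef", "d": "g",
            "e": "gh", "f": "h", "g": "", "h": ""}


def answers(view):
    """Queries answerable from `view`: the view itself and all its descendants."""
    result = {view}
    for child in CHILDREN[view]:
        result |= answers(child)
    return result


def greedy_by_space(space, materialized=("a",)):
    """Repeatedly add the view with the highest benefit per unit of space that still fits."""
    chosen = list(materialized)
    picked = []
    while True:
        # Current cost of each query: the smallest materialized view that answers it.
        cost = {q: min(SIZE[v] for v in chosen if q in answers(v)) for q in SIZE}
        best, best_ratio = None, 0.0
        for view in SIZE:
            if view in chosen or SIZE[view] > space:
                continue  # already materialized, or does not fit
            benefit = sum(max(cost[q] - SIZE[view], 0) for q in answers(view))
            ratio = benefit / SIZE[view]
            if ratio > best_ratio:
                best, best_ratio = view, ratio
        if best is None:
            return picked, space
        chosen.append(best)
        picked.append(best)
        space -= SIZE[best]


picked, remaining = greedy_by_space(85)
print(picked, remaining)
```

With 85 units of space, the sketch selects g, h, d, and e in that order and stops with 24 units left, which matches the sequence of slides above.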

18 Conclusions View Materialization
Materialization of views is an essential query optimization strategy for decision-support applications.
Finding the optimal solution is NP-hard.
Introduction of the greedy algorithm
  There exist cases in which the greedy algorithm fails to produce the optimal solution
  Yet, the greedy algorithm has a good guarantee
Extensions of the greedy algorithm.

19 II.2 Data storage and indexing How is the data stored? relational database (ROLAP) Specialized structures (MOLAP) How can we speed up computation? Indexing structures bitmap index join index bitmap-join index

20 How Does it Fit In?
We already know what we want to materialize, but HOW are we going to store it?
We made the problem smaller but did not solve it
  Before partial materialization: answer (supplier) from (part, supplier, customer)
  After partial materialization: answer (supplier) from (supplier, customer)
Not all queries are of the type
  SELECT D1, ..., Dk, sum(m) FROM R GROUP BY D1, ..., Dk

21 How Does it Fit In?
Example of another type of query:
  SELECT supplier, year, min(price)
  FROM cube
  WHERE product = 'toilet paper' and (year = 2009 or year = 2010)
  GROUP BY supplier, year

22 How Does it Fit In? Two ways for storing the data In a relational database ROLAP Using specialized storage structures MOLAP Two popular OLAP indexing structures and their combination: Bitmap index Join index Bitmap-Join index

23 Implementation Nowadays systems are typically divided into three categories: ROLAP (Relational OLAP) OLAP on top of a relational database MOLAP (Multi-Dimensional OLAP) Use of special multi-dimensional data structures HOLAP: (Hybrid) combination of previous two

24 ROLAP
Typical database scheme: star schema
  fact table is central
  links to dimensional tables
Extensions:
  snowflake schema: dimensions have hierarchy/extra information attached
  star constellation: multiple star schemas sharing dimensions

25 Example of a Star Schema
Fact Table(OrderNO, SalespersonID, CustomerNO, ProdNo, DateKey, CityName, Quantity, Total Price)
Order(Order No, Order Date)
Customer(Customer No, Customer Name, Customer Address, City)
Salesperson(SalespersonID, SalespersonName, City, Quota)
Product(ProductNO, ProdName, ProdDescr, CategoryName, CategoryDescr, UnitPrice)
Date(DateKey, Date, Month, Year)
City(CityName, State, Country)

26 Example of a Snowflake Schema
Fact Table(OrderNO, SalespersonID, CustomerNO, ProdNo, DateKey, CityName, Quantity, Total Price)
Order(Order No, Order Date)
Customer(Customer No, Customer Name, Customer Address, City)
Salesperson(SalespersonID, SalespersonName, City, Quota)
Product(ProductNO, ProdName, ProdDescr, CategoryName, UnitPrice)
  Category(CategoryName, CategoryDescr)
Date(DateKey, Date, Month)
  Month(Month, Year)
City(CityName, StateName)
  State(StateName, Country)

27 Fact Constellation
Multiple fact tables share the same dimensions
E.g., (part, customer) shares Customer with (supplier, customer)
  Customer(ID, Name, Address)
  SC_fact(C_ID, S_ID, amount)
  Supplier(ID, Name, Address)
  SP_fact(P_ID, S_ID, amount)
  Part(ID, Name, Color)

28 Summary How is the data stored? Relational database (ROLAP) Specialized structures (MOLAP) How can we speed up computation? Indexing structures bitmap index join index Bitmap join index

29 MOLAP
Not on top of relational database
most popular design
specialized data structures
Multicubes vs Hypercubes
  Not all subcubes are materialized
User identifies a set of sparse attributes S and a set of dense attributes D.
Index tree is constructed on sparse dimensions.
Each leaf points to a multidimensional array indexed by D.

30 Example
product, store are sparse dimensions; date and customer-type are dense
[Figure: an index (e.g., B-tree, R-tree) on the sparse combinations (prod. p, store s1) and (prod. p, store s2); each entry points to a 2D array with direct access, indexed by date (rows, including 2/1/07 and Total) and customer type (columns 1time, ret, reg, Total); the arrays are chained in a linked list]

34 Queries
Efficiency depends on:
  Does the index on sparse dimensions fit into memory?
  Type of queries:
    Restrictions on all dimensions
    Restrictions only on dense
    Restrictions only on some sparse and dense

35 Queries
Selection on all attributes: (p, s1, ret, all)
[Figure: sub-cube arrays per (product, store) combination, indexed by date and customer type]

36 Queries
Only on dense attributes: (all, all, ret, 2/1/07)
[Figure: sub-cube arrays per (product, store) combination, indexed by date and customer type]

37 Queries
Only some sparse and dense attributes: (all, s1, ret, 2/1/07)
[Figure: sub-cube arrays per (product, store) combination, indexed by date and customer type]

38 Queries
Only some sparse and dense attributes: (p, all, all, 2/1/07)
[Figure: sub-cube arrays per (product, store) combination, indexed by date and customer type]

39 Storing the Cube (summary)
Dense combinations of dimensions can be stored in multi-dimensional arrays
For every combination of sparse dimensions one sub-cube
Sub-cubes indexed by sparse dimensions
  E.g., B-tree
Order of the dimensions plays a role
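The storage layout summarized here can be sketched in a few lines. This is only an illustration under stated assumptions: a Python dict stands in for the index tree on the sparse dimensions, nested lists stand in for the dense 2D arrays, and all sales figures, the product and store keys, and the date label 1/1/07 are hypothetical (the values on the slides were not preserved). The function name query is also hypothetical.

```python
# Dense dimensions: positions in these lists are the array coordinates.
DATES = ["1/1/07", "2/1/07"]        # hypothetical labels for the date dimension
CTYPES = ["1time", "ret", "reg"]    # customer types

# Sparse index: one entry per (product, store) combination that occurs.
# A dict stands in for the B-tree; each value is a dense array [date][ctype].
# All numbers below are HYPOTHETICAL.
cube = {
    ("p1", "s1"): [[5, 10, 20], [7, 12, 30]],
    ("p1", "s2"): [[1, 2, 3], [4, 5, 6]],
    ("p2", "s1"): [[0, 8, 9], [2, 6, 1]],
}


def query(product, store, ctype, date):
    """Sum of sales; passing None for an argument means 'all'."""
    total = 0
    for (p, s), array in cube.items():
        # Restrictions on sparse dimensions select which sub-cubes are visited.
        if product not in (None, p) or store not in (None, s):
            continue
        # Restrictions on dense dimensions select cells inside each array.
        for i, d in enumerate(DATES):
            if date not in (None, d):
                continue
            for j, c in enumerate(CTYPES):
                if ctype in (None, c):
                    total += array[i][j]
    return total


print(query("p1", "s1", "ret", None))      # restrictions on all dimensions
print(query(None, None, "ret", "2/1/07"))  # restrictions only on dense dimensions
```

For brevity the sketch scans all index entries; when both sparse dimensions are restricted, a real index would locate the single matching sub-cube directly, whereas a query that restricts only dense dimensions has to visit every sub-cube.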

40 Summary How is the data stored? relational database (ROLAP) Multi-dimensional structure (MOLAP) How can we speed up computation? Indexing structures bitmap index join index Bitmap-join index

41 Bitmap Index: Example

Product  Country  Sales
TV       Ireland     20
TV       France     126
HiFi     Germany     56
PC       Ireland     23
TV       France     138
PC       Germany     48

Index for Country:      Index for Product:
Ireland  100100         TV    110010
France   010010         HiFi  001000
Germany  001001         PC    000101

42 Bitmap Index: Example
Index for Country: Ireland 100100, France 010010, Germany 001001
Index for Product: TV 110010, HiFi 001000, PC 000101
  SELECT sum(sales)
  FROM PCS
  WHERE (Country = 'Ireland' or Country = 'France') and not (Product = 'TV')
Access only tuples corresponding to a 1 in the bitmap:
  (100100 | 010010) & !110010 = 000100
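The query above can be evaluated with plain integer bit operations. The following is a minimal sketch over the six example rows; the helper names build_bitmaps and as_string are hypothetical, and bit i of each integer corresponds to row i of the table.

```python
# The example table PCS: (Product, Country, Sales), in row order.
ROWS = [
    ("TV", "Ireland", 20),
    ("TV", "France", 126),
    ("HiFi", "Germany", 56),
    ("PC", "Ireland", 23),
    ("TV", "France", 138),
    ("PC", "Germany", 48),
]


def build_bitmaps(column):
    """Map each distinct value in `column` to an int whose bit i is set iff row i has that value."""
    bitmaps = {}
    for i, row in enumerate(ROWS):
        bitmaps[row[column]] = bitmaps.get(row[column], 0) | (1 << i)
    return bitmaps


def as_string(bitmap):
    """Render a bitmap with row 0 as the leftmost character."""
    return "".join("1" if (bitmap >> i) & 1 else "0" for i in range(len(ROWS)))


product = build_bitmaps(0)
country = build_bitmaps(1)

ALL_ROWS = (1 << len(ROWS)) - 1  # mask used to keep NOT within the table length
mask = (country["Ireland"] | country["France"]) & (ALL_ROWS & ~product["TV"])

# Only rows whose bit is set in the result are read from the table.
total = sum(sales for i, (_, _, sales) in enumerate(ROWS) if (mask >> i) & 1)

print(as_string(mask), total)
```

Only the fourth row (PC, Ireland) survives the combined mask, so the table is accessed once.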

43 Bitmap Index
Size of bitmaps can be reduced
  Use, e.g., run-length encoding:
    1111000111100000001111000 is encoded as 4x1;3x0;4x1;7x0;4x1;3x0
  Or, store a list of 1-positions instead of a full bitmap:
    a bitmap with 1s only at positions 1, 2, 3, 7, and 17 becomes 1;2;3;7;17
Can reduce the storage space significantly
Logical operations can work directly on the encoding
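Both encodings are easy to sketch. The function names below are hypothetical, bitmaps are represented as strings of "0" and "1" for readability, and the 17-bit input used for the position list is an assumed example, since the slide only preserves its encoding.

```python
from itertools import groupby


def run_length_encode(bits):
    """Encode a bitmap string as runs, e.g. '1100' -> '2x1;2x0'."""
    return ";".join(f"{len(list(run))}x{bit}" for bit, run in groupby(bits))


def one_positions(bits):
    """List the 1-based positions of the set bits."""
    return [i for i, bit in enumerate(bits, start=1) if bit == "1"]


print(run_length_encode("1111000111100000001111000"))
# ASSUMPTION: a 17-bit bitmap; the original bitmap's length is not given.
print(one_positions("11100010000000001"))
```

Run-length encoding pays off when the bitmap has long runs, while the position list suits bitmaps with very few 1s.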

44 Bitmap Index
Works poorly for high cardinality domains
  For every value a bitmap
Difficult to maintain
  Insert tuple = add entry in all bitmaps
  Therefore often not maintained but completely rebuilt after mass insertions
Many commercial products have implemented bitmap indices

45 Summary How is the data stored? relational database (ROLAP) Specialized structures (MOLAP) How can we speed up computation? Indexing structures bitmap index join index Bitmap-join index

46 Join Index
Traditional indexes: value in table -> rids in same table
Join indices: index on the join of two tables
  value in one table -> rids in other table
Data warehouse: values of dimensions of star schema -> rows in fact table
Join indexes can span multiple dimensions

47 Join Index: Example
Store(sID, city, country), Product(pID, brand), Inventory(sID, pID, cost)

Store
sID  City       Country
s1   Antwerp    B
s2   Brussels   B
s3   Amsterdam  NL

Product
pID  Brand
p1   C
p2   C
p3   D

Inventory
sID  pID  Cost
s1   p1    125
s1   p2     30
s2   p1    150
s2   p2     40
s2   p3     80
s3   p2     35
s3   p1     75

Index table Inventory on Country, Brand

48 Join Index: Example
Index table Inventory on Country, Brand

Inventory
Row ID  sID  pID  Cost
r1      s1   p1    125
r2      s1   p2     30
r3      s2   p1    150
r4      s2   p2     40
r5      s2   p3     80
r6      s3   p2     35
r7      s3   p1     75

Join index
Country  Brand  Row IDs
B        C      r1, r2, r3, r4
B        D      r5
NL       C      r6, r7

Can be used to answer queries involving the same join condition more efficiently
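The join index in this example can be built in one pass over the fact table. The sketch below uses the Store, Product, and Inventory rows of the example (with row r7 as listed on this slide); the function name build_join_index and the dict-based layout are hypothetical choices for illustration.

```python
# Dimension tables, keyed by their IDs.
STORE = {"s1": ("Antwerp", "B"), "s2": ("Brussels", "B"), "s3": ("Amsterdam", "NL")}
PRODUCT = {"p1": "C", "p2": "C", "p3": "D"}

# Fact table Inventory: (row ID, sID, pID, cost).
INVENTORY = [
    ("r1", "s1", "p1", 125),
    ("r2", "s1", "p2", 30),
    ("r3", "s2", "p1", 150),
    ("r4", "s2", "p2", 40),
    ("r5", "s2", "p3", 80),
    ("r6", "s3", "p2", 35),
    ("r7", "s3", "p1", 75),
]


def build_join_index():
    """Map (country, brand) to the row IDs of the matching Inventory rows."""
    index = {}
    for rid, sid, pid, _cost in INVENTORY:
        key = (STORE[sid][1], PRODUCT[pid])  # country of the store, brand of the product
        index.setdefault(key, []).append(rid)
    return index


index = build_join_index()

# A query on Country = 'B' and Brand = 'C' reads only the indexed rows, without joining.
cost_by_rid = {rid: cost for rid, _sid, _pid, cost in INVENTORY}
total_cost = sum(cost_by_rid[rid] for rid in index[("B", "C")])

print(index, total_cost)
```

The join with Store and Product is paid once when the index is built; later queries on country and brand go straight to the fact rows.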

49 Join Index
A join index can index tuples in the fact table based on an attribute in one of the dimensions
  Fact Table(OrderNO, SalespersonID, CustomerNO, ProdNo, DateKey, CityName, Quantity, Total Price)
  City(CityName, State, Country)
E.g., index tuples in the fact table on the attribute Country

50 Summary How is the data stored? relational database (ROLAP) Specialized structures (MOLAP) How can we speed up computation? Indexing structures bitmap index join index Bitmap-join index

51 Bitmap-join index
Logical combination of bitmap index and join index
Different bitmap-join indices with bitmaps on the same table can be combined
EXAMPLE
  Customer(Customer No, Customer Name, Customer Address, City)
  Fact Table(CustomerNO, ProdNo, Quantity, Total Price)
  Product(ProductNO, ProdName, ProdDescr, Category, CategoryDescr, UnitPrice)
  bitmap-join index on FactTable(Customer.City)
  bitmap-join index on FactTable(Product.Category)
Slice on customers in Antwerp buying VCRs
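The slice on customers in Antwerp buying VCRs can be sketched by combining two bitmap-join indices with a bitwise AND. The schema follows the example, but every row, key, and value below is hypothetical, as is the function name bitmap_join_index.

```python
# HYPOTHETICAL dimension data: CustomerNO -> City, ProdNo -> Category.
CUSTOMER_CITY = {1: "Antwerp", 2: "Brussels", 3: "Antwerp"}
PRODUCT_CATEGORY = {10: "VCR", 11: "TV"}

# HYPOTHETICAL fact rows: (CustomerNO, ProdNo, Quantity, TotalPrice).
FACT = [
    (1, 10, 2, 400),
    (2, 10, 1, 200),
    (3, 11, 1, 900),
    (3, 10, 3, 600),
    (1, 11, 1, 900),
]


def bitmap_join_index(dimension, key_column):
    """One bitmap over the fact rows per dimension attribute value (bit i = fact row i)."""
    index = {}
    for i, row in enumerate(FACT):
        value = dimension[row[key_column]]  # follow the foreign key into the dimension
        index[value] = index.get(value, 0) | (1 << i)
    return index


by_city = bitmap_join_index(CUSTOMER_CITY, 0)          # FactTable(Customer.City)
by_category = bitmap_join_index(PRODUCT_CATEGORY, 1)   # FactTable(Product.Category)

# Both indices have bitmaps over the same fact table, so they combine with AND.
mask = by_city["Antwerp"] & by_category["VCR"]
slice_rows = [row for i, row in enumerate(FACT) if (mask >> i) & 1]

print(slice_rows)
```

Neither dimension table is touched at query time; the joins are encoded in the bitmaps.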

52 Summary: indices
Bitmap index, join index, bitmap-join index:
  Can speed up selection queries with arbitrary Boolean combinations of indexed attributes
  Very interesting for ad-hoc analytical queries
  Are not easy to update
    Hence, not very suitable for operational databases with lots of inserts and deletes
    Typically these indices are completely rebuilt after bulk inserts
Therefore, very typical for data warehouses and less suitable for OLTP systems

53 Summary Data warehouse is a specialized database to support analytical queries = OLAP queries Data cube as conceptual model Implementation of Data Cube View selection problem Explosion problem ROLAP vs. MOLAP Indexing structures


More information

The DC-Tree: A Fully Dynamic Index Structure for Data Warehouses

The DC-Tree: A Fully Dynamic Index Structure for Data Warehouses Published in the Proceedings of 16th International Conference on Data Engineering (ICDE 2) The DC-Tree: A Fully Dynamic Index Structure for Data Warehouses Martin Ester, Jörn Kohlhammer, Hans-Peter Kriegel

More information

Database Design Patterns. Winter 2006-2007 Lecture 24

Database Design Patterns. Winter 2006-2007 Lecture 24 Database Design Patterns Winter 2006-2007 Lecture 24 Trees and Hierarchies Many schemas need to represent trees or hierarchies of some sort Common way of representing trees: An adjacency list model Each

More information

Data Warehousing: Data Models and OLAP operations. By Kishore Jaladi kishorejaladi@yahoo.com

Data Warehousing: Data Models and OLAP operations. By Kishore Jaladi kishorejaladi@yahoo.com Data Warehousing: Data Models and OLAP operations By Kishore Jaladi kishorejaladi@yahoo.com Topics Covered 1. Understanding the term Data Warehousing 2. Three-tier Decision Support Systems 3. Approaches

More information

Week 13: Data Warehousing. Warehousing

Week 13: Data Warehousing. Warehousing 1 Week 13: Data Warehousing Warehousing Growing industry: $8 billion in 1998 Range from desktop to huge: Walmart: 900-CPU, 2,700 disk, 23TB Teradata system Lots of buzzwords, hype slice & dice, rollup,

More information

BUSINESS ANALYTICS AND DATA VISUALIZATION. ITM-761 Business Intelligence ดร. สล ล บ ญพราหมณ

BUSINESS ANALYTICS AND DATA VISUALIZATION. ITM-761 Business Intelligence ดร. สล ล บ ญพราหมณ 1 BUSINESS ANALYTICS AND DATA VISUALIZATION ITM-761 Business Intelligence ดร. สล ล บ ญพราหมณ 2 การท าความด น น ยากและเห นผลช า แต ก จ าเป นต องท า เพราะหาไม ความช วซ งท าได ง ายจะเข ามาแทนท และจะพอกพ นข

More information

Hybrid OLAP, An Introduction

Hybrid OLAP, An Introduction Hybrid OLAP, An Introduction Richard Doherty SAS Institute European HQ Agenda Hybrid OLAP overview Building your data model Architectural decisions Metadata creation Report definition Hybrid OLAP overview

More information

14. Data Warehousing & Data Mining

14. Data Warehousing & Data Mining 14. Data Warehousing & Data Mining Data Warehousing Concepts Decision support is key for companies wanting to turn their organizational data into an information asset Data Warehouse "A subject-oriented,

More information

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2009 Lecture 15 - Data Warehousing: Cubes

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2009 Lecture 15 - Data Warehousing: Cubes CSE 544 Principles of Database Management Systems Magdalena Balazinska Winter 2009 Lecture 15 - Data Warehousing: Cubes Final Exam Overview Open books and open notes No laptops and no other mobile devices

More information

The Design and the Implementation of an HEALTH CARE STATISTICS DATA WAREHOUSE Dr. Sreèko Natek, assistant professor, Nova Vizija, srecko@vizija.

The Design and the Implementation of an HEALTH CARE STATISTICS DATA WAREHOUSE Dr. Sreèko Natek, assistant professor, Nova Vizija, srecko@vizija. The Design and the Implementation of an HEALTH CARE STATISTICS DATA WAREHOUSE Dr. Sreèko Natek, assistant professor, Nova Vizija, srecko@vizija.si ABSTRACT Health Care Statistics on a state level is a

More information

The Art of Designing HOLAP Databases Mark Moorman, SAS Institute Inc., Cary NC

The Art of Designing HOLAP Databases Mark Moorman, SAS Institute Inc., Cary NC Paper 139 The Art of Designing HOLAP Databases Mark Moorman, SAS Institute Inc., Cary NC ABSTRACT While OLAP applications offer users fast access to information across business dimensions, it can also

More information

Apache Kylin Introduction Dec 8, 2014 @ApacheKylin

Apache Kylin Introduction Dec 8, 2014 @ApacheKylin Apache Kylin Introduction Dec 8, 2014 @ApacheKylin Luke Han Sr. Product Manager lukhan@ebay.com @lukehq Yang Li Architect & Tech Leader yangli9@ebay.com Agenda What s Apache Kylin? Tech Highlights Performance

More information

CS2032 Data warehousing and Data Mining Unit II Page 1

CS2032 Data warehousing and Data Mining Unit II Page 1 UNIT II BUSINESS ANALYSIS Reporting Query tools and Applications The data warehouse is accessed using an end-user query and reporting tool from Business Objects. Business Objects provides several tools

More information

Data Warehousing. Paper 133-25

Data Warehousing. Paper 133-25 Paper 133-25 The Power of Hybrid OLAP in a Multidimensional World Ann Weinberger, SAS Institute Inc., Cary, NC Matthias Ender, SAS Institute Inc., Cary, NC ABSTRACT Version 8 of the SAS System brings powerful

More information

Business Intelligence: Multidimensional Data Analysis

Business Intelligence: Multidimensional Data Analysis Business Intelligence: Multidimensional Data Analysis Per Westerlund August 20, 2008 Master Thesis in Computing Science 30 ECTS Credits Abstract The relational database model is probably the most frequently

More information

M2074 - Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000 5 Day Course

M2074 - Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000 5 Day Course Module 1: Introduction to Data Warehousing and OLAP Introducing Data Warehousing Defining OLAP Solutions Understanding Data Warehouse Design Understanding OLAP Models Applying OLAP Cubes At the end of

More information

Index Selection Techniques in Data Warehouse Systems

Index Selection Techniques in Data Warehouse Systems Index Selection Techniques in Data Warehouse Systems Aliaksei Holubeu as a part of a Seminar Databases and Data Warehouses. Implementation and usage. Konstanz, June 3, 2005 2 Contents 1 DATA WAREHOUSES

More information

Investigating the Effects of Spatial Data Redundancy in Query Performance over Geographical Data Warehouses

Investigating the Effects of Spatial Data Redundancy in Query Performance over Geographical Data Warehouses Investigating the Effects of Spatial Data Redundancy in Query Performance over Geographical Data Warehouses Thiago Luís Lopes Siqueira Ricardo Rodrigues Ciferri Valéria Cesário Times Cristina Dutra de

More information

Andreas Rauber and Philipp Tomsich Institute of Software Technology Vienna University of Technology, Austria {andi,phil}@ifs.tuwien.ac.

Andreas Rauber and Philipp Tomsich Institute of Software Technology Vienna University of Technology, Austria {andi,phil}@ifs.tuwien.ac. An Architecture for Modular On-Line Analytical Processing Systems: Supporting Distributed and Parallel Query Processing Using Co-operating CORBA Objects Andreas Rauber and Philipp Tomsich Institute of

More information

Turkish Journal of Engineering, Science and Technology

Turkish Journal of Engineering, Science and Technology Turkish Journal of Engineering, Science and Technology 03 (2014) 106-110 Turkish Journal of Engineering, Science and Technology journal homepage: www.tujest.com Integrating Data Warehouse with OLAP Server

More information

SAS BI Course Content; Introduction to DWH / BI Concepts

SAS BI Course Content; Introduction to DWH / BI Concepts SAS BI Course Content; Introduction to DWH / BI Concepts SAS Web Report Studio 4.2 SAS EG 4.2 SAS Information Delivery Portal 4.2 SAS Data Integration Studio 4.2 SAS BI Dashboard 4.2 SAS Management Console

More information

Indexing Techniques for Data Warehouses Queries. Abstract

Indexing Techniques for Data Warehouses Queries. Abstract Indexing Techniques for Data Warehouses Queries Sirirut Vanichayobon Le Gruenwald The University of Oklahoma School of Computer Science Norman, OK, 739 sirirut@cs.ou.edu gruenwal@cs.ou.edu Abstract Recently,

More information

Bussiness Intelligence and Data Warehouse. Tomas Bartos CIS 764, Kansas State University

Bussiness Intelligence and Data Warehouse. Tomas Bartos CIS 764, Kansas State University Bussiness Intelligence and Data Warehouse Schedule Bussiness Intelligence (BI) BI tools Oracle vs. Microsoft Data warehouse History Tools Oracle vs. Others Discussion Business Intelligence (BI) Products

More information

While people are often a corporation s true intellectual property, data is what

While people are often a corporation s true intellectual property, data is what While people are often a corporation s true intellectual property, data is what feeds the people, enabling employees to see where the company stands and where it will go. Quick access to quality data helps

More information

Data Warehousing. Outline. From OLTP to the Data Warehouse. Overview of data warehousing Dimensional Modeling Online Analytical Processing

Data Warehousing. Outline. From OLTP to the Data Warehouse. Overview of data warehousing Dimensional Modeling Online Analytical Processing Data Warehousing Outline Overview of data warehousing Dimensional Modeling Online Analytical Processing From OLTP to the Data Warehouse Traditionally, database systems stored data relevant to current business

More information

Data Warehousing Systems: Foundations and Architectures

Data Warehousing Systems: Foundations and Architectures Data Warehousing Systems: Foundations and Architectures Il-Yeol Song Drexel University, http://www.ischool.drexel.edu/faculty/song/ SYNONYMS None DEFINITION A data warehouse (DW) is an integrated repository

More information

Columnstore Indexes for Fast Data Warehouse Query Processing in SQL Server 11.0

Columnstore Indexes for Fast Data Warehouse Query Processing in SQL Server 11.0 SQL Server Technical Article Columnstore Indexes for Fast Data Warehouse Query Processing in SQL Server 11.0 Writer: Eric N. Hanson Technical Reviewer: Susan Price Published: November 2010 Applies to:

More information

(Week 10) A04. Information System for CRM. Electronic Commerce Marketing

(Week 10) A04. Information System for CRM. Electronic Commerce Marketing (Week 10) A04. Information System for CRM Electronic Commerce Marketing Course Code: 166186-01 Course Name: Electronic Commerce Marketing Period: Autumn 2015 Lecturer: Prof. Dr. Sync Sangwon Lee Department:

More information

New Approach of Computing Data Cubes in Data Warehousing

New Approach of Computing Data Cubes in Data Warehousing International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 14 (2014), pp. 1411-1417 International Research Publications House http://www. irphouse.com New Approach of

More information

Data W a Ware r house house and and OLAP Week 5 1

Data W a Ware r house house and and OLAP Week 5 1 Data Warehouse and OLAP Week 5 1 Midterm I Friday, March 4 Scope Homework assignments 1 4 Open book Team Homework Assignment #7 Read pp. 121 139, 146 150 of the text book. Do Examples 3.8, 3.10 and Exercise

More information

Data Warehousing Concepts

Data Warehousing Concepts Data Warehousing Concepts JB Software and Consulting Inc 1333 McDermott Drive, Suite 200 Allen, TX 75013. [[[[[ DATA WAREHOUSING What is a Data Warehouse? Decision Support Systems (DSS), provides an analysis

More information

Databases and Data Warehouses

Databases and Data Warehouses Databases and Data Warehouses Lecture BigData Analytics Julian M. Kunkel julian.kunkel@googlemail.com University of Hamburg / German Climate Computing Center (DKRZ) 23-10-2015 Outline 1 Relational Model

More information

Database Management System Dr. S. Srinath Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No.

Database Management System Dr. S. Srinath Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No. Database Management System Dr. S. Srinath Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No. # 31 Introduction to Data Warehousing and OLAP Part 2 Hello and

More information

Introduction to Data Warehousing. Ms Swapnil Shrivastava swapnil@konark.ncst.ernet.in

Introduction to Data Warehousing. Ms Swapnil Shrivastava swapnil@konark.ncst.ernet.in Introduction to Data Warehousing Ms Swapnil Shrivastava swapnil@konark.ncst.ernet.in Necessity is the mother of invention Why Data Warehouse? Scenario 1 ABC Pvt Ltd is a company with branches at Mumbai,

More information

Unit -3. Learning Objective. Demand for Online analytical processing Major features and functions OLAP models and implementation considerations

Unit -3. Learning Objective. Demand for Online analytical processing Major features and functions OLAP models and implementation considerations Unit -3 Learning Objective Demand for Online analytical processing Major features and functions OLAP models and implementation considerations Demand of On Line Analytical Processing Need for multidimensional

More information

Data Warehouse. MIT-652 Data Mining Applications. Thimaporn Phetkaew. School of Informatics, Walailak University. MIT-652: DM 2: Data Warehouse 1

Data Warehouse. MIT-652 Data Mining Applications. Thimaporn Phetkaew. School of Informatics, Walailak University. MIT-652: DM 2: Data Warehouse 1 Data Warehouse MIT-652 Data Mining Applications Thimaporn Phetkaew School of Informatics, Walailak University MIT-652: DM 2: Data Warehouse 1 Chapter 2: Data Warehousing and OLAP Technology for Data Mining

More information

WWW.VIDYARTHIPLUS.COM

WWW.VIDYARTHIPLUS.COM 4.1 Data Warehousing Components What is Data Warehouse? - Defined in many different ways but mainly it is: o A decision support database that is maintained separately from the organization s operational

More information

Overview. Data Warehousing and Decision Support. Introduction. Three Complementary Trends. Data Warehousing. An Example: The Store (e.g.

Overview. Data Warehousing and Decision Support. Introduction. Three Complementary Trends. Data Warehousing. An Example: The Store (e.g. Overview Data Warehousing and Decision Support Chapter 25 Why data warehousing and decision support Data warehousing and the so called star schema MOLAP versus ROLAP OLAP, ROLLUP AND CUBE queries Design

More information

PowerDesigner WarehouseArchitect The Model for Data Warehousing Solutions. A Technical Whitepaper from Sybase, Inc.

PowerDesigner WarehouseArchitect The Model for Data Warehousing Solutions. A Technical Whitepaper from Sybase, Inc. PowerDesigner WarehouseArchitect The Model for Data Warehousing Solutions A Technical Whitepaper from Sybase, Inc. Table of Contents Section I: The Need for Data Warehouse Modeling.....................................4

More information

An Overview of Business Intelligence Technology

An Overview of Business Intelligence Technology BI technologies are essential to running today s businesses and this technology is going through sea changes. An Overview of Business Intelligence Technology By Surajit Chaudhuri, Umeshwar Dayal, and Vivek

More information

low-level storage structures e.g. partitions underpinning the warehouse logical table structures

low-level storage structures e.g. partitions underpinning the warehouse logical table structures DATA WAREHOUSE PHYSICAL DESIGN The physical design of a data warehouse specifies the: low-level storage structures e.g. partitions underpinning the warehouse logical table structures low-level structures

More information