Overview. Data Warehousing and Decision Support. Introduction. Three Complementary Trends. Data Warehousing. An Example: The Store (e.g.

Size: px
Start display at page:

Download "Overview. Data Warehousing and Decision Support. Introduction. Three Complementary Trends. Data Warehousing. An Example: The Store (e.g."

Transcription

1 Overview Data Warehousing and Decision Support Chapter 25 Why data warehousing and decision support Data warehousing and the so called star schema MOLAP versus ROLAP OLAP, ROLLUP AND CUBE queries Design issues: Finding answers quickly Denormalization Indexing Aggregation Top N Queries Using Views Typical architecture CSI3130- Database II 1 CSI3130- Database II 2 Introduction Increasingly, organizations are analyzing current and historical data to identify useful patterns and support business strategies. Emphasis is on complex, interactive, exploratory analysis of very large datasets created by integrating data from across all parts of an enterprise; data is fairly static. Contrast such On Line Analytic Processing (OLAP) with traditional On line Transaction Processing (OLTP): mostly long queries, instead of short update Xacts. CSI3130- Database II 3 Three Complementary Trends Data Warehousing: Consolidate data from many sources in one large repository. Loading, periodic synchronization of replicas. Semantic integration. OLAP: Complex SQL queries and views. Queries based on spreadsheet style operations and multidimensional view of data. Interactive and online queries. Data Mining: Exploratory search for interesting trends and anomalies. (Another lecture!) CSI3130- Database II 4 Data Warehousing EXTERNAL DATA SOURCES An Example: The Store (e.g. Staples) Integrated data spanning long time periods, often augmented with summary information. Several gigabytes to terabytes common. Interactive response times expected for complex queries; ad hoc updates uncommon. Metadata Repository EXTRACT TRANSFORM LOAD REFRESH SUPPORTS DATA WAREHOUSE DATA MINING OLAP CSI3130- Database II 5 Sells office equipment, including stationary (pens, erasers, pencils), folders, paper, notice boards, etc. Business has 50 large stores spread over Ontario and Quebec. Stores are divided into 4 regions, namely North, South, West and East. Each store has about 20,000 different products on the shelves, each with an unique bar code. Bar codes are scanned at the point of sales (POS) Products are distributed to stores by an Agent who is identified by Name. GOAL: Determine how well business is doing. CSI3130- Database II 6 1

2 Successful data warehousing: Summarised High Quality Information Dimensions of Sales CSI3130- Database II 7 CSI3130- Database II 8 Dimensions of Sales Dimensions of Sales Product: Location: Time: Agent: Product-key, BC-description, BC-number, package-size, brand, sub-category, category, department, weight, package-type, weight-unit, units-per-case, units-pershipping-case, shelf-width, shelf-height, shelf-dept, Store-key, Store-name, Store-number, Store-city, Store-province, Store-region, Store-phone, Store-fax, floor-plan-type, first-opened-date, last-remodel-date, store_m 2, time-key, day-of-week, day-number-in-month, day-number-year, week-number, month, month-number, quarter, fiscal-period, holiday-flag, weekend-flag, last-day-of-month-flag, season, event Agent-key, first-name, last-name, age, gender, duration-of-employment, CSI3130- Database II 9 Location Agent Store-key Agent-key Sales FACT Store-key Agent-key Time-key Product-key Time dollar-sales Product Time-key units-sales Product-key dollar-cost customer-count CSI3130- Database II 10 Database sizing for the data warehouse Time dimension: last 2 years x 365 days = 730 days Location dimension: 50 stores, reporting sales every day Product dimension: 20,000 products in each store, of which 3,000 sell each day in a given store Agent dimension: a minimum of 5 agents on each day Number of base Sales fact records = 730x50x3,000x5 = 45,007,500 records! Number of key fields = 4; Number of fact fields = 4; Total fields = 8 Base Sales fact table = 45 million x 8 fields x 4 bytes = 14GB Database size: Sales fact + other tables??? 20GB++ CSI3130- Database II 11 Warehousing Issues Semantic Integration: When getting data from multiple sources, must eliminate mismatches, e.g., different currencies, schemas. Heterogeneous Sources: Must access data from a variety of source formats and repositories. Replication capabilities can be exploited here. Load, Refresh, Purge: Must load data, periodically refresh it, and purge too old data. Metadata Management: Must keep track of source, loading time, and other information for all data in the warehouse. CSI3130- Database II 12 2

3 Multidimensional Data Model Collection of numeric measures, which depend on a set of dimensions. E.g., measure Sales, dimensions Product (key: ), Location (), and Time (timeid). Slice =1 is shown: timeid timeid sales CSI3130- Database II 13 MOLAP vs ROLAP Multidimensional data can be stored physically in a (disk resident, persistent) array; called MOLAP systems. Alternatively, can store as a relation; called ROLAP systems. The main relation, which relates dimensions to a measure, is called the fact table. Each dimension can have additional attributes and an associated dimension table. E.g., Products(, pname, category, price) Fact tables are much larger than dimensional tables. CSI3130- Database II 14 Dimension Hierarchies For each dimension, the set of values can be organized in a hierarchy: PRODUCT TIME LOCATION year quarter country category week month state pname date city CSI3130- Database II 15 OLAP Queries Influenced by SQL and by spreadsheets. A common operation is to aggregate a measure over one or more dimensions. Find total sales. Find total sales for each city, or for each state. Find top five products ranked by total sales. Roll up: Aggregating at different levels of a dimension hierarchy. E.g., Given total sales by city, we can roll up to get sales by state. CSI3130- Database II 16 OLAP Queries Drill down: The inverse of roll up. E.g., Given total sales by state, can drill down to get total sales by city. E.g., Can also drill down on different dimension to get total sales by product for each state. Pivoting: Aggregation on selected dimensions. E.g., Pivoting on Location and Time WI yields CA this cross tabulation: Slicing and Dicing: Equality and range selections on one or more dimensions. Total Total CSI3130- Database II 17 Comparison with SQL Queries The cross tabulation obtained by pivoting can also be computed using a collection of SQLqueries: SELECT SUM(S.sales) FROM Sales S, Times T, Locations L WHERE S.timeid=T.timeid AND S.timeid=L.timeid GROUP BY T.year, L.state SELECT SUM(S.sales) SELECT SUM(S.sales) FROM Sales S, Times T FROM Sales S, Location L WHERE S.timeid=T.timeid GROUP BY T.year WHERE S.timeid=L.timeid GROUP BY L.state CSI3130- Database II 18 3

4 The CUBE Operator Generalizing the previous example, if there are k dimensions, we have 2^k possible SQL GROUP BY queries that can be generated through pivoting on a subset of dimensions. CUBE,, timeid BY SUM Sales Equivalent to rolling up Sales on all eight subsets of the set {,, timeid}; each roll up corresponds to an SQL query of the form: SELECT SUM(S.sales) Lots of work on optimizing FROM Sales S the CUBE operator! GROUP BY grouping-list CSI3130- Database II 19 PRODUCTS Design Issues timeid date week month quarter year holiday_flag pname timeid sales SALES (Fact table) category price LOCATIONS state country Fact table in BCNF; dimension tables un normalized. Dimension tables are small; updates/inserts/deletes are rare. So, anomalies less important than query performance. This kind of schema is very common in OLAP applications, and is called a star schema; computing the join of all these relations is called a star join. CSI3130- Database II 20 city TIMES Implementation Issues: Indexes New indexing techniques: Bitmap indexes, Join indexes, array representations, compression, precomputation of aggregations, etc. E.g., Bitmap index: Bit-vector: F M 1 bit for each possible value. Many queries can be answered using bit-vector ops! sex custid name sex rating rating Joe M Ram M Sue F Woo M CSI3130- Database II 21 Join Indexes Consider the join of Sales, Products, Times, and Locations, possibly with additional selection conditions (e.g., country= USA ). A join index can be constructed to speed up such joins. The index contains [s,p,t,l] if there are tuples (with sid) s in Sales, p in Products, t in Times and l in Locations that satisfy the join (and selection) conditions. Problem: Number of join indexes can grow raly. A variation addresses this problem: For each column with an additional selection (e.g., country), build an index with [c,s] in it if a dimension table tuple with value c in the selection column joins with a Sales tuple with sid s; if indexes are bitmaps, called bitmapped join index. CSI3130- Database II 22 PRODUCTS Bitmapped Join Index timei d pname dat e week mont h quarte r year holiday_fla g timeid sales SALES (Fact table) category price city TIMES LOCATIONS state country Consider a query with conditions price=10 and country= USA. Suppose tuple (with sid) s in Sales joins with a tuple p with price=10 and a tuple l with country = USA. There are two join indexes; one containing [10,s] and the other [USA,s]. Intersecting these indexes tells us which tuples in Sales are in the join and satisfy the given selection. Data Warehousing and Decision Support Chapter 25 CSI3130- Database II 23 CSI3130- Database II 24 4

5 Overview Querying Sequences in SQL Why data warehousing and decision support Data warehousing and the so called star schema MOLAP versus ROLAP OLAP, ROLLUP AND CUBE queries Design issues: Finding answers quickly Denormalization Indexing Aggregation Top N Queries Using Views Typical architecture CSI3130- Database II 25 Trend analysis is difficult to do in SQL 92 or earlier: Find the % change in monthly sales Find the top 5 product by total sales Find the trailing n day moving average of sales The first two queries can be expressed with difficulty, but the third cannot even be expressed in SQL 92 if n is a parameter of the query. The WINDOW clause in SQL:1999+ allows us to write such queries over a table viewed as a sequence (implicitly, based on user specified sort keys) Note: SQL 2008 defines more flexible windowing functions (to be implemented ) CSI3130- Database II 26 The WINDOW Clause SELECT L.state, T.month, AVG(S.sales) OVER W AS movavg FROM Sales S, Times T, Locations L WHERE S.timeid=T.timeid AND S.=L. WINDOW W AS (PARTITION BY L.state ORDER BY T.month RANGE BETWEEN INTERVAL `1 MONTH PRECEDING AND INTERVAL `1 MONTH FOLLOWING) Let the result of the FROM and WHERE clauses be Temp. (Conceptually) Temp is partitioned according to the PARTITION BY clause. Similar to GROUP BY, but the answer has one row for each row in a partition, not one row per partition! Each partition is sorted according to the ORDER BY clause. For each row in a partition, the WINDOW clause creates a window of nearby (preceding or succeeding) tuples. Can be value based, as in example, using RANGE Can be based on number of rows to include in the window, using ROWS clause The aggregate function is evaluated for each row in the partition using the corresponding window. New aggregate functions that are useful with windowing include RANK (position of a row within its partition) and its variants DENSE_RANK, PERCENT_RANK, CUME_DIST. CSI3130- Database II 27 Top N Queries If you want to find the 10 (or so) cheapest cars, it would be nice if the DB could avoid computing the costs of all cars before sorting to determine the 10 cheapest. Idea: Guess at a cost c such that the 10 cheapest all cost less than c, and that not too many more cost less. Then add the selection cost<c and evaluate the query. If the guess is right, great, we avoid computation for cars that cost more than c. If the guess is wrong, need to reset the selection and recompute the original query. CSI3130- Database II 28 Top N Queries SELECT P., P.pname, S.sales FROM Sales S, Products P WHERE S.=P. AND S.=1 AND S.timeid=3 ORDER BY S.sales DESC OPTIMIZE FOR 10 ROWS SELECT P., P.pname, S.sales FROM Sales S, Products P WHERE S.=P. AND S.=1 AND S.timeid=3 AND S.sales > c ORDER BY S.sales DESC OPTIMIZE FOR construct is not in SQL:1999! Cut off value c is chosen by optimizer. CSI3130- Database II 29 Online Aggregation Consider an aggregate query, e.g., finding the average sales by province. Can we provide the user with some information before the exact average is computed for all states? Can show the current running average for each province as the computation proceeds. Even better, if we use statistical techniques and sample tuples to aggregate instead of simply scanning the aggregated table, we can provide bounds such as the average for Ontario is with 95% probability. Should also use nonblocking algorithms! CSI3130- Database II 30 5

6 Views and Decision Support How do you implement a data warehouse? Materialized views pre computed Issues Synchronization Incremental maintenance policy needed Lazy Periodic Forced Multi nationals Project Planning Business Requirement Definition Technical Product Architecture Selection & Design Installation Dimensional Modeling End-User Application Specification Physical Design Data Staging Design & Development End-User Application Development Deployment Maintenance and Growth Project Management CSI3130- Database II 31 CSI3130- Database II The Business Dimensional Lifecycle (Kimball et. Al., 2008) 32 Summary Decision support is an emerging, raly growing subarea of databases. Involves the creation of large, consolidated data repositories called data warehouses. Warehouses exploited using sophisticated analysis techniques: complex SQL queries and OLAP multidimensional queries (influenced by both SQL and spreadsheets). New techniques for database design, indexing, view maintenance, and interactive querying need to be supported. Interested? CSI5115 in the Fall Data warehousing CSI3130- Database II 33 CSI3130- Database II 34 6

Data Warehousing and Decision Support. Introduction. Three Complementary Trends. Chapter 23, Part A

Data Warehousing and Decision Support. Introduction. Three Complementary Trends. Chapter 23, Part A Data Warehousing and Decision Support Chapter 23, Part A Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke 1 Introduction Increasingly, organizations are analyzing current and historical

More information

Decision Support. Chapter 23. Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke 1

Decision Support. Chapter 23. Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke 1 Decision Support Chapter 23 Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke 1 Introduction Increasingly, organizations are analyzing current and historical data to identify useful

More information

CS54100: Database Systems

CS54100: Database Systems CS54100: Database Systems Date Warehousing: Current, Future? 20 April 2012 Prof. Chris Clifton Data Warehousing: Goals OLAP vs OLTP On Line Analytical Processing (vs. Transaction) Optimize for read, not

More information

OLAP and Data Warehousing! Introduction!

OLAP and Data Warehousing! Introduction! The image cannot be displayed. Your computer may not have enough memory to open the image, or the image may have been corrupted. Restart your computer, and then open the file again. If the red x still

More information

DATA CUBES E0 261. Jayant Haritsa Computer Science and Automation Indian Institute of Science. JAN 2014 Slide 1 DATA CUBES

DATA CUBES E0 261. Jayant Haritsa Computer Science and Automation Indian Institute of Science. JAN 2014 Slide 1 DATA CUBES E0 261 Jayant Haritsa Computer Science and Automation Indian Institute of Science JAN 2014 Slide 1 Introduction Increasingly, organizations are analyzing historical data to identify useful patterns and

More information

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2007 Lecture 16 - Data Warehousing

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2007 Lecture 16 - Data Warehousing CSE 544 Principles of Database Management Systems Magdalena Balazinska Fall 2007 Lecture 16 - Data Warehousing Class Projects Class projects are going very well! Project presentations: 15 minutes On Wednesday

More information

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2009 Lecture 15 - Data Warehousing: Cubes

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2009 Lecture 15 - Data Warehousing: Cubes CSE 544 Principles of Database Management Systems Magdalena Balazinska Winter 2009 Lecture 15 - Data Warehousing: Cubes Final Exam Overview Open books and open notes No laptops and no other mobile devices

More information

Data Warehousing. Outline. From OLTP to the Data Warehouse. Overview of data warehousing Dimensional Modeling Online Analytical Processing

Data Warehousing. Outline. From OLTP to the Data Warehouse. Overview of data warehousing Dimensional Modeling Online Analytical Processing Data Warehousing Outline Overview of data warehousing Dimensional Modeling Online Analytical Processing From OLTP to the Data Warehouse Traditionally, database systems stored data relevant to current business

More information

LITERATURE SURVEY ON DATA WAREHOUSE AND ITS TECHNIQUES

LITERATURE SURVEY ON DATA WAREHOUSE AND ITS TECHNIQUES LITERATURE SURVEY ON DATA WAREHOUSE AND ITS TECHNIQUES MUHAMMAD KHALEEL (0912125) SZABIST KARACHI CAMPUS Abstract. Data warehouse and online analytical processing (OLAP) both are core component for decision

More information

DATA WAREHOUSING AND OLAP TECHNOLOGY

DATA WAREHOUSING AND OLAP TECHNOLOGY DATA WAREHOUSING AND OLAP TECHNOLOGY Manya Sethi MCA Final Year Amity University, Uttar Pradesh Under Guidance of Ms. Shruti Nagpal Abstract DATA WAREHOUSING and Online Analytical Processing (OLAP) are

More information

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 29-1

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 29-1 Slide 29-1 Chapter 29 Overview of Data Warehousing and OLAP Chapter 29 Outline Purpose of Data Warehousing Introduction, Definitions, and Terminology Comparison with Traditional Databases Characteristics

More information

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP Data Warehousing and End-User Access Tools OLAP and Data Mining Accompanying growth in data warehouses is increasing demands for more powerful access tools providing advanced analytical capabilities. Key

More information

Week 13: Data Warehousing. Warehousing

Week 13: Data Warehousing. Warehousing 1 Week 13: Data Warehousing Warehousing Growing industry: $8 billion in 1998 Range from desktop to huge: Walmart: 900-CPU, 2,700 disk, 23TB Teradata system Lots of buzzwords, hype slice & dice, rollup,

More information

Week 3 lecture slides

Week 3 lecture slides Week 3 lecture slides Topics Data Warehouses Online Analytical Processing Introduction to Data Cubes Textbook reference: Chapter 3 Data Warehouses A data warehouse is a collection of data specifically

More information

Data Warehousing. Read chapter 13 of Riguzzi et al Sistemi Informativi. Slides derived from those by Hector Garcia-Molina

Data Warehousing. Read chapter 13 of Riguzzi et al Sistemi Informativi. Slides derived from those by Hector Garcia-Molina Data Warehousing Read chapter 13 of Riguzzi et al Sistemi Informativi Slides derived from those by Hector Garcia-Molina What is a Warehouse? Collection of diverse data subject oriented aimed at executive,

More information

Outline. Data Warehousing. What is a Warehouse? What is a Warehouse?

Outline. Data Warehousing. What is a Warehouse? What is a Warehouse? Outline Data Warehousing What is a data warehouse? Why a warehouse? Models & operations Implementing a warehouse 2 What is a Warehouse? Collection of diverse data subject oriented aimed at executive, decision

More information

1. OLAP is an acronym for a. Online Analytical Processing b. Online Analysis Process c. Online Arithmetic Processing d. Object Linking and Processing

1. OLAP is an acronym for a. Online Analytical Processing b. Online Analysis Process c. Online Arithmetic Processing d. Object Linking and Processing 1. OLAP is an acronym for a. Online Analytical Processing b. Online Analysis Process c. Online Arithmetic Processing d. Object Linking and Processing 2. What is a Data warehouse a. A database application

More information

New Approach of Computing Data Cubes in Data Warehousing

New Approach of Computing Data Cubes in Data Warehousing International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 14 (2014), pp. 1411-1417 International Research Publications House http://www. irphouse.com New Approach of

More information

Part 22. Data Warehousing

Part 22. Data Warehousing Part 22 Data Warehousing The Decision Support System (DSS) Tools to assist decision-making Used at all levels in the organization Sometimes focused on a single area Sometimes focused on a single problem

More information

Data Warehousing: Data Models and OLAP operations. By Kishore Jaladi kishorejaladi@yahoo.com

Data Warehousing: Data Models and OLAP operations. By Kishore Jaladi kishorejaladi@yahoo.com Data Warehousing: Data Models and OLAP operations By Kishore Jaladi kishorejaladi@yahoo.com Topics Covered 1. Understanding the term Data Warehousing 2. Three-tier Decision Support Systems 3. Approaches

More information

Review. Data Warehousing. Today. Star schema. Star join indexes. Dimension hierarchies

Review. Data Warehousing. Today. Star schema. Star join indexes. Dimension hierarchies Review Data Warehousing CPS 216 Advanced Database Systems Data warehousing: integrating data for OLAP OLAP versus OLTP Warehousing versus mediation Warehouse maintenance Warehouse data as materialized

More information

DATA WAREHOUSING - OLAP

DATA WAREHOUSING - OLAP http://www.tutorialspoint.com/dwh/dwh_olap.htm DATA WAREHOUSING - OLAP Copyright tutorialspoint.com Online Analytical Processing Server OLAP is based on the multidimensional data model. It allows managers,

More information

When to consider OLAP?

When to consider OLAP? When to consider OLAP? Author: Prakash Kewalramani Organization: Evaltech, Inc. Evaltech Research Group, Data Warehousing Practice. Date: 03/10/08 Email: erg@evaltech.com Abstract: Do you need an OLAP

More information

Mario Guarracino. Data warehousing

Mario Guarracino. Data warehousing Data warehousing Introduction Since the mid-nineties, it became clear that the databases for analysis and business intelligence need to be separate from operational. In this lecture we will review the

More information

14. Data Warehousing & Data Mining

14. Data Warehousing & Data Mining 14. Data Warehousing & Data Mining Data Warehousing Concepts Decision support is key for companies wanting to turn their organizational data into an information asset Data Warehouse "A subject-oriented,

More information

Data Warehousing and OLAP

Data Warehousing and OLAP 1 Data Warehousing and OLAP Hector Garcia-Molina Stanford University Warehousing Growing industry: $8 billion in 1998 Range from desktop to huge: Walmart: 900-CPU, 2,700 disk, 23TB Teradata system Lots

More information

Learning Objectives. Definition of OLAP Data cubes OLAP operations MDX OLAP servers

Learning Objectives. Definition of OLAP Data cubes OLAP operations MDX OLAP servers OLAP Learning Objectives Definition of OLAP Data cubes OLAP operations MDX OLAP servers 2 What is OLAP? OLAP has two immediate consequences: online part requires the answers of queries to be fast, the

More information

CS2032 Data warehousing and Data Mining Unit II Page 1

CS2032 Data warehousing and Data Mining Unit II Page 1 UNIT II BUSINESS ANALYSIS Reporting Query tools and Applications The data warehouse is accessed using an end-user query and reporting tool from Business Objects. Business Objects provides several tools

More information

OLAP Systems and Multidimensional Expressions I

OLAP Systems and Multidimensional Expressions I OLAP Systems and Multidimensional Expressions I Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Software Development Technologies Master

More information

BUILDING BLOCKS OF DATAWAREHOUSE. G.Lakshmi Priya & Razia Sultana.A Assistant Professor/IT

BUILDING BLOCKS OF DATAWAREHOUSE. G.Lakshmi Priya & Razia Sultana.A Assistant Professor/IT BUILDING BLOCKS OF DATAWAREHOUSE G.Lakshmi Priya & Razia Sultana.A Assistant Professor/IT 1 Data Warehouse Subject Oriented Organized around major subjects, such as customer, product, sales. Focusing on

More information

Database Applications. Advanced Querying. Transaction Processing. Transaction Processing. Data Warehouse. Decision Support. Transaction processing

Database Applications. Advanced Querying. Transaction Processing. Transaction Processing. Data Warehouse. Decision Support. Transaction processing Database Applications Advanced Querying Transaction processing Online setting Supports day-to-day operation of business OLAP Data Warehousing Decision support Offline setting Strategic planning (statistics)

More information

Data W a Ware r house house and and OLAP II Week 6 1

Data W a Ware r house house and and OLAP II Week 6 1 Data Warehouse and OLAP II Week 6 1 Team Homework Assignment #8 Using a data warehousing tool and a data set, play four OLAP operations (Roll up (drill up), Drill down (roll down), Slice and dice, Pivot

More information

Main Memory & Near Main Memory OLAP Databases. Wo Shun Luk Professor of Computing Science Simon Fraser University

Main Memory & Near Main Memory OLAP Databases. Wo Shun Luk Professor of Computing Science Simon Fraser University Main Memory & Near Main Memory OLAP Databases Wo Shun Luk Professor of Computing Science Simon Fraser University 1 Outline What is OLAP DB? How does it work? MOLAP, ROLAP Near Main Memory DB Partial Pre

More information

Multi-dimensional index structures Part I: motivation

Multi-dimensional index structures Part I: motivation Multi-dimensional index structures Part I: motivation 144 Motivation: Data Warehouse A definition A data warehouse is a repository of integrated enterprise data. A data warehouse is used specifically for

More information

OLAP and OLTP. AMIT KUMAR BINDAL Associate Professor M M U MULLANA

OLAP and OLTP. AMIT KUMAR BINDAL Associate Professor M M U MULLANA OLAP and OLTP AMIT KUMAR BINDAL Associate Professor Databases Databases are developed on the IDEA that DATA is one of the critical materials of the Information Age Information, which is created by data,

More information

Data Warehouse: Introduction

Data Warehouse: Introduction Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of base and data mining group,

More information

OLAP OLAP. Data Warehouse. OLAP Data Model: the Data Cube S e s s io n

OLAP OLAP. Data Warehouse. OLAP Data Model: the Data Cube S e s s io n OLAP OLAP On-Line Analytical Processing In contrast to on-line transaction processing (OLTP) Mostly ad hoc queries involving aggregation Response time rather than throughput is the main performance measure.

More information

Module 1: Introduction to Data Warehousing and OLAP

Module 1: Introduction to Data Warehousing and OLAP Raw Data vs. Business Information Module 1: Introduction to Data Warehousing and OLAP Capturing Raw Data Gathering data recorded in everyday operations Deriving Business Information Deriving meaningful

More information

Introduction to Data Warehousing. Ms Swapnil Shrivastava swapnil@konark.ncst.ernet.in

Introduction to Data Warehousing. Ms Swapnil Shrivastava swapnil@konark.ncst.ernet.in Introduction to Data Warehousing Ms Swapnil Shrivastava swapnil@konark.ncst.ernet.in Necessity is the mother of invention Why Data Warehouse? Scenario 1 ABC Pvt Ltd is a company with branches at Mumbai,

More information

Data Warehousing & OLAP

Data Warehousing & OLAP Data Warehousing & OLAP What is Data Warehouse? A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management s decisionmaking process. W.

More information

Lecture Data Warehouse Systems

Lecture Data Warehouse Systems Lecture Data Warehouse Systems Eva Zangerle SS 2013 PART A: Architecture Chapter 1: Motivation and Definitions Motivation Goal: to build an operational general view on a company to support decisions in

More information

WWW.VIDYARTHIPLUS.COM

WWW.VIDYARTHIPLUS.COM 4.1 Data Warehousing Components What is Data Warehouse? - Defined in many different ways but mainly it is: o A decision support database that is maintained separately from the organization s operational

More information

low-level storage structures e.g. partitions underpinning the warehouse logical table structures

low-level storage structures e.g. partitions underpinning the warehouse logical table structures DATA WAREHOUSE PHYSICAL DESIGN The physical design of a data warehouse specifies the: low-level storage structures e.g. partitions underpinning the warehouse logical table structures low-level structures

More information

Chapter 3, Data Warehouse and OLAP Operations

Chapter 3, Data Warehouse and OLAP Operations CSI 4352, Introduction to Data Mining Chapter 3, Data Warehouse and OLAP Operations Young-Rae Cho Associate Professor Department of Computer Science Baylor University CSI 4352, Introduction to Data Mining

More information

1960s 1970s 1980s 1990s. Slow access to

1960s 1970s 1980s 1990s. Slow access to Principles of Knowledge Discovery in Fall 2002 Chapter 2: Warehousing and Dr. Osmar R. Zaïane University of Alberta Dr. Osmar R. Zaïane, 1999-2002 Principles of Knowledge Discovery in University of Alberta

More information

2074 : Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000

2074 : Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000 2074 : Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000 Introduction This course provides students with the knowledge and skills necessary to design, implement, and deploy OLAP

More information

Overview of Data Warehousing and OLAP

Overview of Data Warehousing and OLAP Overview of Data Warehousing and OLAP Chapter 28 March 24, 2008 ADBS: DW 1 Chapter Outline What is a data warehouse (DW) Conceptual structure of DW Why separate DW Data modeling for DW Online Analytical

More information

Data Warehousing Systems: Foundations and Architectures

Data Warehousing Systems: Foundations and Architectures Data Warehousing Systems: Foundations and Architectures Il-Yeol Song Drexel University, http://www.ischool.drexel.edu/faculty/song/ SYNONYMS None DEFINITION A data warehouse (DW) is an integrated repository

More information

Data Warehouse design

Data Warehouse design Data Warehouse design Design of Enterprise Systems University of Pavia 21/11/2013-1- Data Warehouse design DATA PRESENTATION - 2- BI Reporting Success Factors BI platform success factors include: Performance

More information

OLAP. Business Intelligence OLAP definition & application Multidimensional data representation

OLAP. Business Intelligence OLAP definition & application Multidimensional data representation OLAP Business Intelligence OLAP definition & application Multidimensional data representation 1 Business Intelligence Accompanying the growth in data warehousing is an ever-increasing demand by users for

More information

www.ijreat.org Published by: PIONEER RESEARCH & DEVELOPMENT GROUP (www.prdg.org) 28

www.ijreat.org Published by: PIONEER RESEARCH & DEVELOPMENT GROUP (www.prdg.org) 28 Data Warehousing - Essential Element To Support Decision- Making Process In Industries Ashima Bhasin 1, Mr Manoj Kumar 2 1 Computer Science Engineering Department, 2 Associate Professor, CSE Abstract SGT

More information

OLAP & DATA MINING CS561-SPRING 2012 WPI, MOHAMED ELTABAKH

OLAP & DATA MINING CS561-SPRING 2012 WPI, MOHAMED ELTABAKH OLAP & DATA MINING CS561-SPRING 2012 WPI, MOHAMED ELTABAKH 1 Online Analytic Processing OLAP 2 OLAP OLAP: Online Analytic Processing OLAP queries are complex queries that Touch large amounts of data Discover

More information

This tutorial will help computer science graduates to understand the basic-toadvanced concepts related to data warehousing.

This tutorial will help computer science graduates to understand the basic-toadvanced concepts related to data warehousing. About the Tutorial A data warehouse is constructed by integrating data from multiple heterogeneous sources. It supports analytical reporting, structured and/or ad hoc queries and decision making. This

More information

SAS BI Course Content; Introduction to DWH / BI Concepts

SAS BI Course Content; Introduction to DWH / BI Concepts SAS BI Course Content; Introduction to DWH / BI Concepts SAS Web Report Studio 4.2 SAS EG 4.2 SAS Information Delivery Portal 4.2 SAS Data Integration Studio 4.2 SAS BI Dashboard 4.2 SAS Management Console

More information

Data Warehousing and OLAP Technology for Knowledge Discovery

Data Warehousing and OLAP Technology for Knowledge Discovery 542 Data Warehousing and OLAP Technology for Knowledge Discovery Aparajita Suman Abstract Since time immemorial, libraries have been generating services using the knowledge stored in various repositories

More information

Turning your Warehouse Data into Business Intelligence: Reporting Trends and Visibility Michael Armanious; Vice President Sales and Marketing Datex,

Turning your Warehouse Data into Business Intelligence: Reporting Trends and Visibility Michael Armanious; Vice President Sales and Marketing Datex, Turning your Warehouse Data into Business Intelligence: Reporting Trends and Visibility Michael Armanious; Vice President Sales and Marketing Datex, Inc. Overview Introduction What is Business Intelligence?

More information

Data W a Ware r house house and and OLAP Week 5 1

Data W a Ware r house house and and OLAP Week 5 1 Data Warehouse and OLAP Week 5 1 Midterm I Friday, March 4 Scope Homework assignments 1 4 Open book Team Homework Assignment #7 Read pp. 121 139, 146 150 of the text book. Do Examples 3.8, 3.10 and Exercise

More information

Fluency With Information Technology CSE100/IMT100

Fluency With Information Technology CSE100/IMT100 Fluency With Information Technology CSE100/IMT100 ),7 Larry Snyder & Mel Oyler, Instructors Ariel Kemp, Isaac Kunen, Gerome Miklau & Sean Squires, Teaching Assistants University of Washington, Autumn 1999

More information

On-Line Application Processing. Warehousing Data Cubes Data Mining

On-Line Application Processing. Warehousing Data Cubes Data Mining On-Line Application Processing Warehousing Data Cubes Data Mining 1 Overview Traditional database systems are tuned to many, small, simple queries. Some new applications use fewer, more time-consuming,

More information

Search and Data Mining Techniques. OLAP Anna Yarygina Boris Novikov

Search and Data Mining Techniques. OLAP Anna Yarygina Boris Novikov Search and Data Mining Techniques OLAP Anna Yarygina Boris Novikov The Database: Shared Data Store? A dream from database textbooks: Sharing data between applications This NEVER happened. Applications

More information

INFO 321, Database Systems, Semester 2 2012

INFO 321, Database Systems, Semester 2 2012 References References INFO 321 Chapter 3: Decision Support Systems Department of Information Science Semester 2, 2012 General Kifer Chapter 17 Silberschatz (5th ed.) Chapter 18 Data Warehousing for Cavemen

More information

M2074 - Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000 5 Day Course

M2074 - Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000 5 Day Course Module 1: Introduction to Data Warehousing and OLAP Introducing Data Warehousing Defining OLAP Solutions Understanding Data Warehouse Design Understanding OLAP Models Applying OLAP Cubes At the end of

More information

To increase performaces SQL has been extended in order to have some new operations avaiable. They are:

To increase performaces SQL has been extended in order to have some new operations avaiable. They are: OLAP To increase performaces SQL has been extended in order to have some new operations avaiable. They are:. roll up: aggregates different events to reduce details used to describe them (looking at higher

More information

Basics of Dimensional Modeling

Basics of Dimensional Modeling Basics of Dimensional Modeling Data warehouse and OLAP tools are based on a dimensional data model. A dimensional model is based on dimensions, facts, cubes, and schemas such as star and snowflake. Dimensional

More information

Business Intelligence, Data Warehousing and Multidimensional Databases. Torben Bach Pedersen

Business Intelligence, Data Warehousing and Multidimensional Databases. Torben Bach Pedersen Business Intelligence, Data Warehousing and Multidimensional Databases Torben Bach Pedersen Business Intelligence Overview Why Business Intelligence? Data analysis problems Data Warehouse (DW) introduction

More information

Business Intelligence, Data warehousing Concept and artifacts

Business Intelligence, Data warehousing Concept and artifacts Business Intelligence, Data warehousing Concept and artifacts Data Warehousing is the process of constructing and using the data warehouse. The data warehouse is constructed by integrating the data from

More information

Alejandro Vaisman Esteban Zimanyi. Data. Warehouse. Systems. Design and Implementation. ^ Springer

Alejandro Vaisman Esteban Zimanyi. Data. Warehouse. Systems. Design and Implementation. ^ Springer Alejandro Vaisman Esteban Zimanyi Data Warehouse Systems Design and Implementation ^ Springer Contents Part I Fundamental Concepts 1 Introduction 3 1.1 A Historical Overview of Data Warehousing 4 1.2 Spatial

More information

Indexing Techniques for Data Warehouses Queries. Abstract

Indexing Techniques for Data Warehouses Queries. Abstract Indexing Techniques for Data Warehouses Queries Sirirut Vanichayobon Le Gruenwald The University of Oklahoma School of Computer Science Norman, OK, 739 sirirut@cs.ou.edu gruenwal@cs.ou.edu Abstract Recently,

More information

Lecture 2 Data warehousing

Lecture 2 Data warehousing King Saud University College of Computer & Information Sciences IS 466 Decision Support Systems Lecture 2 Data warehousing Dr. Mourad YKHLEF The slides content is derived and adopted from many references

More information

Oracle9i Data Warehouse Review. Robert F. Edwards Dulcian, Inc.

Oracle9i Data Warehouse Review. Robert F. Edwards Dulcian, Inc. Oracle9i Data Warehouse Review Robert F. Edwards Dulcian, Inc. Agenda Oracle9i Server OLAP Server Analytical SQL Data Mining ETL Warehouse Builder 3i Oracle 9i Server Overview 9i Server = Data Warehouse

More information

What is OLAP - On-line analytical processing

What is OLAP - On-line analytical processing What is OLAP - On-line analytical processing Vladimir Estivill-Castro School of Computing and Information Technology With contributions for J. Han 1 Introduction When a company has received/accumulated

More information

CHAPTER 4 Data Warehouse Architecture

CHAPTER 4 Data Warehouse Architecture CHAPTER 4 Data Warehouse Architecture 4.1 Data Warehouse Architecture 4.2 Three-tier data warehouse architecture 4.3 Types of OLAP servers: ROLAP versus MOLAP versus HOLAP 4.4 Further development of Data

More information

CASE PROJECTS IN DATA WAREHOUSING AND DATA MINING

CASE PROJECTS IN DATA WAREHOUSING AND DATA MINING CASE PROJECTS IN DATA WAREHOUSING AND DATA MINING Mohammad A. Rob, University of Houston-Clear Lake, rob@uhcl.edu Michael E. Ellis, University of Houston-Clear Lake, ellisme@uhcl.edu ABSTRACT This paper

More information

Building Data Cubes and Mining Them. Jelena Jovanovic Email: jeljov@fon.bg.ac.yu

Building Data Cubes and Mining Them. Jelena Jovanovic Email: jeljov@fon.bg.ac.yu Building Data Cubes and Mining Them Jelena Jovanovic Email: jeljov@fon.bg.ac.yu KDD Process KDD is an overall process of discovering useful knowledge from data. Data mining is a particular step in the

More information

Data Warehousing. Yeow Wei Choong Anne Laurent

Data Warehousing. Yeow Wei Choong Anne Laurent Data Warehousing Yeow Wei Choong Anne Laurent Databases Databases are developed on the IDEA that DATA is one of the cri>cal materials of the Informa>on Age Informa>on, which is created by data, becomes

More information

COURSE OUTLINE. Track 1 Advanced Data Modeling, Analysis and Design

COURSE OUTLINE. Track 1 Advanced Data Modeling, Analysis and Design COURSE OUTLINE Track 1 Advanced Data Modeling, Analysis and Design TDWI Advanced Data Modeling Techniques Module One Data Modeling Concepts Data Models in Context Zachman Framework Overview Levels of Data

More information

(Week 10) A04. Information System for CRM. Electronic Commerce Marketing

(Week 10) A04. Information System for CRM. Electronic Commerce Marketing (Week 10) A04. Information System for CRM Electronic Commerce Marketing Course Code: 166186-01 Course Name: Electronic Commerce Marketing Period: Autumn 2015 Lecturer: Prof. Dr. Sync Sangwon Lee Department:

More information

DATA WAREHOUSE E KNOWLEDGE DISCOVERY

DATA WAREHOUSE E KNOWLEDGE DISCOVERY DATA WAREHOUSE E KNOWLEDGE DISCOVERY Prof. Fabio A. Schreiber Dipartimento di Elettronica e Informazione Politecnico di Milano DATA WAREHOUSE (DW) A TECHNIQUE FOR CORRECTLY ASSEMBLING AND MANAGING DATA

More information

Enterprise Performance Tuning: Best Practices with SQL Server 2008 Analysis Services. By Ajay Goyal Consultant Scalability Experts, Inc.

Enterprise Performance Tuning: Best Practices with SQL Server 2008 Analysis Services. By Ajay Goyal Consultant Scalability Experts, Inc. Enterprise Performance Tuning: Best Practices with SQL Server 2008 Analysis Services By Ajay Goyal Consultant Scalability Experts, Inc. June 2009 Recommendations presented in this document should be thoroughly

More information

A Design and implementation of a data warehouse for research administration universities

A Design and implementation of a data warehouse for research administration universities A Design and implementation of a data warehouse for research administration universities André Flory 1, Pierre Soupirot 2, and Anne Tchounikine 3 1 CRI : Centre de Ressources Informatiques INSA de Lyon

More information

Why Business Intelligence

Why Business Intelligence Why Business Intelligence Ferruccio Ferrando z IT Specialist Techline Italy March 2011 page 1 di 11 1.1 The origins In the '50s economic boom, when demand and production were very high, the only concern

More information

PowerDesigner WarehouseArchitect The Model for Data Warehousing Solutions. A Technical Whitepaper from Sybase, Inc.

PowerDesigner WarehouseArchitect The Model for Data Warehousing Solutions. A Technical Whitepaper from Sybase, Inc. PowerDesigner WarehouseArchitect The Model for Data Warehousing Solutions A Technical Whitepaper from Sybase, Inc. Table of Contents Section I: The Need for Data Warehouse Modeling.....................................4

More information

Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization Oman College of Management and Technology Course 803401 DSS Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization CS/MIS Department Information Sharing

More information

Data warehousing in Oracle. SQL extensions for data warehouse analysis. Available OLAP functions. Physical aggregation example

Data warehousing in Oracle. SQL extensions for data warehouse analysis. Available OLAP functions. Physical aggregation example Data warehousing in Oracle Materialized views and SQL extensions to analyze data in Oracle data warehouses SQL extensions for data warehouse analysis Available OLAP functions Computation windows window

More information

Anwendersoftware Anwendungssoftwares a. Data-Warehouse-, Data-Mining- and OLAP-Technologies. Online Analytic Processing

Anwendersoftware Anwendungssoftwares a. Data-Warehouse-, Data-Mining- and OLAP-Technologies. Online Analytic Processing Anwendungssoftwares a Data-Warehouse-, Data-Mining- and OLAP-Technologies Online Analytic Processing Online Analytic Processing OLAP Online Analytic Processing Technologies and tools that support (ad-hoc)

More information

Data Warehousing and Online Analytical Processing

Data Warehousing and Online Analytical Processing Contents 4 Data Warehousing and Online Analytical Processing 3 4.1 Data Warehouse: Basic Concepts.................. 4 4.1.1 What is a Data Warehouse?................. 4 4.1.2 Differences between Operational

More information

Turkish Journal of Engineering, Science and Technology

Turkish Journal of Engineering, Science and Technology Turkish Journal of Engineering, Science and Technology 03 (2014) 106-110 Turkish Journal of Engineering, Science and Technology journal homepage: www.tujest.com Integrating Data Warehouse with OLAP Server

More information

European Archival Records and Knowledge Preservation Database Archiving in the E-ARK Project

European Archival Records and Knowledge Preservation Database Archiving in the E-ARK Project European Archival Records and Knowledge Preservation Database Archiving in the E-ARK Project Janet Delve, University of Portsmouth Kuldar Aas, National Archives of Estonia Rainer Schmidt, Austrian Institute

More information

DATA WAREHOUSING APPLICATIONS: AN ANALYTICAL TOOL FOR DECISION SUPPORT SYSTEM

DATA WAREHOUSING APPLICATIONS: AN ANALYTICAL TOOL FOR DECISION SUPPORT SYSTEM DATA WAREHOUSING APPLICATIONS: AN ANALYTICAL TOOL FOR DECISION SUPPORT SYSTEM MOHAMMED SHAFEEQ AHMED Guest Lecturer, Department of Computer Science, Gulbarga University, Gulbarga, Karnataka, India (e-mail:

More information

Data Warehouses & OLAP

Data Warehouses & OLAP Riadh Ben Messaoud 1. The Big Picture 2. Data Warehouse Philosophy 3. Data Warehouse Concepts 4. Warehousing Applications 5. Warehouse Schema Design 6. Business Intelligence Reporting 7. On-Line Analytical

More information

Data Warehousing. Overview, Terminology, and Research Issues. Joachim Hammer. Joachim Hammer

Data Warehousing. Overview, Terminology, and Research Issues. Joachim Hammer. Joachim Hammer Data Warehousing Overview, Terminology, and Research Issues 1 Heterogeneous Database Integration Integration System World Wide Web Digital Libraries Scientific Databases Personal Databases Collects and

More information

Business Intelligence, Analytics & Reporting: Glossary of Terms

Business Intelligence, Analytics & Reporting: Glossary of Terms Business Intelligence, Analytics & Reporting: Glossary of Terms A B C D E F G H I J K L M N O P Q R S T U V W X Y Z Ad-hoc analytics Ad-hoc analytics is the process by which a user can create a new report

More information

Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole

Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole Paper BB-01 Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole ABSTRACT Stephen Overton, Overton Technologies, LLC, Raleigh, NC Business information can be consumed many

More information

Optimizing Your Data Warehouse Design for Superior Performance

Optimizing Your Data Warehouse Design for Superior Performance Optimizing Your Data Warehouse Design for Superior Performance Lester Knutsen, President and Principal Database Consultant Advanced DataTools Corporation Session 2100A The Problem The database is too complex

More information

Data Warehousing & OLAP

Data Warehousing & OLAP Data Warehousing & OLAP Motivation: Business Intelligence Customer information (customer-id, gender, age, homeaddress, occupation, income, family-size, ) Product information (Product-id, category, manufacturer,

More information

Optimizing the Performance of the Oracle BI Applications using Oracle Datawarehousing Features and Oracle DAC 10.1.3.4.1

Optimizing the Performance of the Oracle BI Applications using Oracle Datawarehousing Features and Oracle DAC 10.1.3.4.1 Optimizing the Performance of the Oracle BI Applications using Oracle Datawarehousing Features and Oracle DAC 10.1.3.4.1 Mark Rittman, Director, Rittman Mead Consulting for Collaborate 09, Florida, USA,

More information

Data Warehousing and Data Mining

Data Warehousing and Data Mining Data Warehousing and Data Mining Part I: Data Warehousing Gao Cong gaocong@cs.aau.dk Slides adapted from Man Lung Yiu and Torben Bach Pedersen Course Structure Business intelligence: Extract knowledge

More information

A Technical Review on On-Line Analytical Processing (OLAP)

A Technical Review on On-Line Analytical Processing (OLAP) A Technical Review on On-Line Analytical Processing (OLAP) K. Jayapriya 1., E. Girija 2,III-M.C.A., R.Uma. 3,M.C.A.,M.Phil., Department of computer applications, Assit.Prof,Dept of M.C.A, Dhanalakshmi

More information

IST722 Data Warehousing

IST722 Data Warehousing IST722 Data Warehousing Components of the Data Warehouse Michael A. Fudge, Jr. Recall: Inmon s CIF The CIF is a reference architecture Understanding the Diagram The CIF is a reference architecture CIF

More information

Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization Turban, Aronson, and Liang Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

More information