14. Data Warehousing & Data Mining




Data Warehousing Concepts
- Decision support is key for companies wanting to turn their organizational data into an information asset
- Data warehouse: "A subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management's decision-making process"
  - Subject-oriented
  - Integrated
  - Time-variant
  - Non-volatile
- OLAP (online analytical processing), DSS, EIS, and data mining applications

Benefits of Data Warehousing
- Potential high returns on investment
- Competitive advantage
- Increased productivity of corporate decision-makers

Comparison of OLTP and Data Warehousing

OLTP systems                                          | Data warehousing systems
------------------------------------------------------+----------------------------------------------------------------
Holds current data                                    | Holds historic data
Stores detailed data                                  | Stores detailed, lightly summarized, and highly summarized data
Data is dynamic                                       | Data is largely static
Repetitive processing                                 | Ad hoc, unstructured, and heuristic processing
High level of transaction throughput                  | Medium to low transaction throughput
Predictable pattern of usage                          | Unpredictable pattern of usage
Transaction driven                                    | Analysis driven
Application oriented                                  | Subject oriented
Supports day-to-day decisions                         | Supports strategic decisions
Serves a large number of clerical / operational users | Serves a relatively low number of managerial users

Data Warehouse Architecture
- Operational data
- Load manager
- Warehouse manager
- Query manager
- Detailed data
- Lightly and highly summarized data
- Archive / backup data
- Meta-data
- End-user access tools

End-user Access Tools
- Reporting and query tools
- Application development tools
- Executive Information System (EIS) tools
- Online Analytical Processing (OLAP) tools
- Data mining tools

Typical Architecture
[Figure: typical data warehouse architecture. Components shown: operational data sources 1, 2, and 3; load manager; warehouse manager; query manager; DBMS holding meta-data, detailed data, lightly summarized data, highly summarized data, and archive/backup data; end-user access tools (reporting, query, and application development tools; OLAP tools; data mining tools).]

Data Warehousing Tools and Technologies
- Extraction, cleansing, and transformation tools
- Data warehouse DBMS
  - Load performance
  - Load processing
  - Data quality management
  - Query performance
  - Terabyte scalability
  - Networked data warehouse
  - Warehouse administration
  - Integrated dimensional tools
  - Advanced query functionality

Data Marts
- A subset of a data warehouse that supports the requirements of a particular department or business function

Designing Data Warehouses
- The star schema
  - A logical structure that has a fact table (containing factual data) in the center, surrounded by dimension tables (containing reference data)
- The snowflake schema
  - A variant of the star schema where each dimension can have its own dimensions

Designing Data Warehouses (continued)
[Figure: example star schema]
- Sales (fact table): service_id, time_id, salesrep_id, customer_id, amount, commission, ...
- Service (dimension): service_id, service_name, service_type, service_group, ...
- Time (dimension): time_id, holiday, quarter, day_of_week, month, date, ...
- Customer (dimension): customer_id, name, revenue, credit_rate, new?, ...
- Salesrep (dimension): salesrep_id, name, region, manager_id, salary, hire_date, birth_date, ...
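To make the star schema concrete, here is a minimal sketch of a star-join query using Python's built-in sqlite3 module. It borrows only part of the schema (the Sales fact table with the Time and Salesrep dimensions); the table name time_dim, the sample rows, and the function name are assumptions made for illustration, not part of the original slides.

```python
import sqlite3

# Hypothetical in-memory star schema: one fact table (sales) joined to
# two dimension tables (time_dim, salesrep). All rows are made-up samples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (time_id INTEGER PRIMARY KEY, quarter TEXT, month TEXT);
CREATE TABLE salesrep (salesrep_id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE sales (
    time_id     INTEGER REFERENCES time_dim(time_id),
    salesrep_id INTEGER REFERENCES salesrep(salesrep_id),
    amount      REAL
);
INSERT INTO time_dim VALUES (1, 'Q1', 'Jan'), (2, 'Q1', 'Feb'), (3, 'Q2', 'Apr');
INSERT INTO salesrep VALUES (10, 'Ann', 'North'), (11, 'Bob', 'South');
INSERT INTO sales VALUES (1, 10, 100.0), (2, 10, 50.0), (2, 11, 70.0), (3, 11, 30.0);
""")


def sales_by_region_and_quarter(connection):
    """Join the fact table to its dimensions and total amount per group."""
    query = """
        SELECT r.region, t.quarter, SUM(s.amount)
        FROM sales AS s
        JOIN time_dim AS t ON t.time_id = s.time_id
        JOIN salesrep AS r ON r.salesrep_id = s.salesrep_id
        GROUP BY r.region, t.quarter
        ORDER BY r.region, t.quarter
    """
    return connection.execute(query).fetchall()


rows = sales_by_region_and_quarter(conn)
for region, quarter, total in rows:
    print(region, quarter, total)
```

The fact table holds only keys and measures; every descriptive attribute used for grouping (region, quarter) comes from a dimension table, which is the point of the star layout.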

Online Analytical Processing (OLAP)
- OLAP
  - The dynamic synthesis, analysis, and consolidation of large volumes of multi-dimensional data
- Multi-dimensional OLAP
  - Cubes of data
  - [Figure: data cube with dimensions City, Product type, and Time]
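As a rough illustration of what "consolidation" over a cube means, the sketch below stores a small cube as a dictionary keyed by (city, product type, time) and implements two common operations, roll-up and slice. The cell values, dimension names, and function names are assumptions chosen for the example; real OLAP servers use far more elaborate storage.

```python
from collections import defaultdict

# Hypothetical cube cells: (city, product_type, time) -> sales measure.
DIMENSIONS = ("city", "product_type", "time")
cube = {
    ("Glasgow", "Flat", "Q1"): 10,
    ("Glasgow", "House", "Q1"): 5,
    ("London", "Flat", "Q1"): 20,
    ("London", "Flat", "Q2"): 15,
}


def roll_up(cells, keep):
    """Consolidate the cube by summing over every dimension not in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for coords, value in cells.items():
        totals[tuple(coords[p] for p in positions)] += value
    return dict(totals)


def slice_cube(cells, dimension, member):
    """Fix one dimension to a single member and return the matching cells."""
    position = DIMENSIONS.index(dimension)
    return {c: v for c, v in cells.items() if c[position] == member}


by_city = roll_up(cube, ["city"])
by_city_time = roll_up(cube, ["city", "time"])
q1_only = slice_cube(cube, "time", "Q1")
print(by_city)
print(by_city_time)
print(q1_only)
```

Rolling up to fewer dimensions gives the summarized views; slicing fixes one dimension so the remaining two can be inspected as an ordinary table.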

Problems of Data Warehousing
- Underestimation of resources for data loading
- Hidden problems with source systems
- Required data not captured
- Increased end-user demands
- Data homogenization
- High demand for resources
- Data ownership
- High maintenance
- Long duration projects
- Complexity of integration

Codd's Rules for OLAP
- Multi-dimensional conceptual view
- Transparency
- Accessibility
- Consistent reporting performance
- Client-server architecture
- Generic dimensionality
- Dynamic sparse matrix handling
- Multi-user support
- Unrestricted cross-dimensional operations
- Intuitive data manipulation
- Flexible reporting
- Unlimited dimensions and aggregation levels

OLAP Tools
- Multi-dimensional OLAP (MOLAP)
  - Multi-dimensional DBMS (MDDBMS)
- Relational OLAP (ROLAP)
  - Creation of multiple multi-dimensional views of the two-dimensional relations
- Managed Query Environment (MQE)
  - Delivers selected data directly from the DBMS to the desktop in the form of a data cube, where it is stored, analyzed, and manipulated locally

SQL Extensions
Sample extensions
- Decode
- Cume
  - Example: show the monthly sales for each branch, along with the monthly year-to-date figures
- MovingAvg(n)
- MovingSum(n)
- Rank, When
  - Example: list the top 5 branches, based on last year's sales; sort by branch number
- RatioToReport
- Tertile
- Create Macro
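The slide names the extensions without defining them, so the following is only a sketch of the calculations that cumulative totals, moving averages, and top-n ranking usually denote, written in plain Python. The function names, the handling of the first n-1 rows in the moving average, and the sample figures are all assumptions; the exact semantics of the vendor extensions may differ.

```python
from itertools import accumulate


def cume(values):
    """Running (year-to-date style) total of a sequence of monthly values."""
    return list(accumulate(values))


def moving_avg(values, n):
    """Average of each value and up to n-1 preceding values.

    Assumption: early rows with fewer than n values use a shorter window.
    """
    result = []
    for i in range(len(values)):
        window = values[max(0, i - n + 1): i + 1]
        result.append(sum(window) / len(window))
    return result


def top_branches(last_year_sales, k):
    """Pick the k branches with the highest sales, then sort by branch number."""
    ranked = sorted(last_year_sales, key=last_year_sales.get, reverse=True)
    return sorted(ranked[:k])


# Hypothetical sample data.
monthly_sales = [100, 120, 80, 150]
branch_sales = {"B003": 500, "B001": 300, "B007": 900, "B005": 100}

year_to_date = cume(monthly_sales)
smoothed = moving_avg(monthly_sales, 3)
top_two = top_branches(branch_sales, 2)
print(year_to_date, smoothed, top_two)
```

Note that the ranking example involves two separate orderings: one by sales to select the top branches, and a second by branch number for presentation.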

Data Mining
- Definition
  - The process of extracting valid, previously unknown, comprehensible, and actionable information from large databases and using it to make crucial business decisions
- Knowledge discovery
  - Association rules
  - Sequential patterns
  - Classification trees
- Goals
  - Prediction
  - Identification
  - Classification
  - Optimization
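Association rules are commonly evaluated with two measures, support and confidence. The slide does not define them, so the sketch below uses the standard textbook definitions; the basket data and function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing the antecedent, the share that also
    contain the consequent (rule: antecedent -> consequent)."""
    base = support(transactions, antecedent)
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base if base else 0.0


# Hypothetical market-basket transactions.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
print(rule_support, rule_confidence)
```

A rule such as bread -> milk is usually reported only when both measures exceed thresholds chosen by the analyst.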

Data Mining Techniques
- Predictive modeling
  - Supervised training with two phases
    - Training phase: building a model using a large sample of historical data called the training set
    - Testing phase: trying the model on new data
- Database segmentation
- Link analysis
- Deviation detection
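The two phases of predictive modeling can be sketched with a deliberately tiny model: a single threshold on one numeric feature. The model, the data, and the function names are assumptions for illustration only; practical predictive models are far richer, but the train-then-test workflow is the same.

```python
def train_threshold(training_set):
    """Training phase: choose the threshold t that classifies the most
    training records correctly, predicting True when feature >= t."""
    candidates = sorted({x for x, _ in training_set})

    def correct(t):
        return sum((x >= t) == label for x, label in training_set)

    return max(candidates, key=correct)


def accuracy(threshold, records):
    """Testing phase: fraction of records the trained threshold gets right."""
    hits = sum((x >= threshold) == label for x, label in records)
    return hits / len(records)


# Hypothetical historical data: (feature value, known outcome).
training_set = [(20, False), (25, False), (40, True),
                (55, True), (30, False), (45, True)]
# Hypothetical new data, not seen during training.
test_set = [(22, False), (50, True), (35, True), (28, False)]

model = train_threshold(training_set)
test_accuracy = accuracy(model, test_set)
print(model, test_accuracy)
```

The model fits the training set perfectly but misclassifies one of the four new records, which is why the testing phase uses data the model has not seen.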