Data Warehouse: Introduction




Database and Data Mining Group

Decision support systems

- Huge operational databases are available in most companies
  - these databases may provide a large wealth of useful information
- Decision support systems provide means for
  - in-depth analysis of a company's business
  - faster and better decisions

Strategic decision support

- Demand evolution analysis and forecast
- Critical business areas identification
- Budgeting and management transparency
  - reporting, practices against fraud and money laundering
- Identification and implementation of winning strategies
  - cost reduction and profit increase

Business Intelligence

- BI supports strategic decision making in companies
- Objective: transforming company data into actionable information
  - at different detail levels
  - for analysis applications
- Users may have heterogeneous needs
- BI requires an appropriate hardware and software infrastructure

Applications

- Manufacturing companies: order management, client support
- Distribution: user profile, stock management
- Financial services: buyer behavior (credit cards)
- Insurance: claim analysis, fraud detection
- Telecommunication: call analysis, churning, fraud detection
- Public service: usage analysis
- Health: service analysis and evaluation

Data warehouse

- Database devoted to decision support, which is kept separate from company operational databases
- Data which is
  - devoted to a specific subject
  - integrated and consistent
  - time dependent, non volatile
  - used for decision support in a company

W. H. Inmon, Building the data warehouse, 1992

Why separate data?

- Performance
  - complex queries reduce performance of operational transaction management
  - different access methods at the physical level
- Data management
  - missing information (e.g., history)
  - data consolidation
  - data quality (inconsistency problems)

Structure and data analysis

Multidimensional representation

- Data are represented as a (hyper)cube with three or more dimensions
- Measures on which analysis is performed: cells at dimension intersection
- Data warehouse for tracking sales in a supermarket chain:
  - dimensions: product, shop, time
  - measures: sold quantity, sold amount, ...

[Figure: data cube with axes date, shop, and product; labels shown: 3, 2-3-2000, SupShop, MilkTTT. From Golfarelli, Rizzi, Data warehouse, teoria e pratica della progettazione, McGraw Hill 2006]

Relational representation: star model

- Numerical measures
  - stored in the fact table
  - attribute domain is numeric
- Dimensions
  - describe the context of each measure in the fact table
  - characterized by many descriptive attributes

Example: data warehouse for tracking sales in a supermarket chain

[Figure: star schema with tables Shop, Sale, Product, Date]
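The star model for the supermarket example can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the column names, the city and category attributes, the second shop and product, and all sample rows are hypothetical and do not come from the slides; only the table names (Shop, Sale, Product, Date) and the two measures do.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table (sale) and three dimension tables."""
    conn.executescript(
        """
        CREATE TABLE shop (shop_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
        CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
        CREATE TABLE date_dim (date_id INTEGER PRIMARY KEY, day TEXT, month TEXT);
        -- Fact table: foreign keys to every dimension plus numeric measures.
        CREATE TABLE sale (
            shop_id INTEGER REFERENCES shop(shop_id),
            product_id INTEGER REFERENCES product(product_id),
            date_id INTEGER REFERENCES date_dim(date_id),
            sold_quantity INTEGER,
            sold_amount REAL,
            PRIMARY KEY (shop_id, product_id, date_id)
        );
        """
    )


def load_sample(conn):
    """Insert a few hypothetical rows (illustrative values only)."""
    conn.executemany("INSERT INTO shop VALUES (?, ?, ?)",
                     [(1, "SupShop", "CityA"), (2, "OtherShop", "CityB")])
    conn.executemany("INSERT INTO product VALUES (?, ?, ?)",
                     [(1, "MilkTTT", "dairy"), (2, "BreadXYZ", "bakery")])
    conn.executemany("INSERT INTO date_dim VALUES (?, ?, ?)",
                     [(1, "2000-03-02", "2000-03"),
                      (2, "2000-03-03", "2000-03"),
                      (3, "2000-04-01", "2000-04")])
    conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?, ?)",
                     [(1, 1, 1, 3, 4.5), (2, 1, 1, 2, 3.0), (1, 1, 2, 5, 7.5),
                      (1, 2, 1, 4, 8.0), (1, 1, 3, 1, 1.5)])


def quantity_by_category_and_month(conn):
    """Aggregate the measures along two dimension attributes."""
    return conn.execute(
        """
        SELECT p.category, d.month, SUM(s.sold_quantity), SUM(s.sold_amount)
        FROM sale AS s
        JOIN product AS p ON p.product_id = s.product_id
        JOIN date_dim AS d ON d.date_id = s.date_id
        GROUP BY p.category, d.month
        ORDER BY p.category, d.month
        """
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star_schema(connection)
    load_sample(connection)
    for row in quantity_by_category_and_month(connection):
        print(row)
```

Each row of the fact table corresponds to one cell of the cube (one shop, one product, one date), and the query joins the fact table to the dimensions only to pick the descriptive attributes used for grouping.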

Data warehouse size

- Time dimension: 2 years x 365 days
- Shop dimension: 300 shops
- Product dimension: 30,000 products, of which 3,000 are sold every day in every shop
- Number of rows in the fact table: 730 x 300 x 3,000 = 657 million
- Size of the fact table: 21 GB

Data analysis

- OLAP analysis: complex aggregate function computation
  - support for different types of aggregate functions (e.g., moving average, top ten)
- Data analysis by means of data mining techniques
  - various analysis types
  - significant algorithmic contribution
- Presentation
  - separate activity: data returned by a query may be rendered by means of different presentation tools
- Motivation search
  - exploration by means of progressive, incremental refinements (e.g., drill down)

Data warehouse architectures

- Separation between transactional computing and data analysis
  - avoid one-level architectures
- Architectures characterized by two or more levels
  - separate, to a different extent, data incoming into the data warehouse from analyzed data
  - more scalable

[Figure: data warehouse architecture; recoverable labels: (External) data sources, ETL, data marts]
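The fact table sizing above can be checked with a few lines of arithmetic. The row count follows directly from the figures on the slide; the 32 bytes per row is an assumption, back-derived from the stated 21 GB, and is not given in the source.

```python
DAYS = 2 * 365                 # time dimension: 2 years x 365 days
SHOPS = 300                    # shop dimension
PRODUCTS_TOTAL = 30_000        # product dimension
PRODUCTS_SOLD_DAILY = 3_000    # products actually sold per day per shop
ASSUMED_BYTES_PER_ROW = 32     # assumption: not stated in the slides


def fact_table_rows(days, shops, products_sold_daily):
    """One fact row per (day, shop, product) combination with a sale."""
    return days * shops * products_sold_daily


def fact_table_size_bytes(rows, bytes_per_row):
    """Raw size estimate, ignoring indexes and storage overhead."""
    return rows * bytes_per_row


rows = fact_table_rows(DAYS, SHOPS, PRODUCTS_SOLD_DAILY)      # 657,000,000
full_cube_cells = DAYS * SHOPS * PRODUCTS_TOTAL               # 6,570,000,000
density = rows / full_cube_cells                              # share of non-empty cells
size_gb = fact_table_size_bytes(rows, ASSUMED_BYTES_PER_ROW) / 1e9

if __name__ == "__main__":
    print(f"rows: {rows:,}")
    print(f"cube density: {density:.0%}")
    print(f"estimated size: {size_gb:.1f} GB")
```

Only one cell in ten of the full cube is populated, which is why sparse data handling matters for both relational and multidimensional storage.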

Data warehouse and data mart

- Company data warehouse: it contains all the information on the company business
  - extensive functional modelling process
  - design and implementation require a long time
- Data mart: departmental information subset focused on a given subject
  - two architectures
    - dependent, fed by the company data warehouse
    - independent, fed directly by the sources
  - faster implementation
  - requires careful design, to avoid subsequent data mart integration problems

Servers for Data Warehouses

- ROLAP (Relational OLAP) server
  - extended relational DBMS
    - compact representation for sparse data
  - SQL extensions for aggregate computation
  - specialized access methods which implement efficient OLAP data access
- MOLAP (Multidimensional OLAP) server
  - data represented in proprietary (multidimensional) matrix format
    - sparse data require compression
  - special OLAP primitives
- HOLAP (Hybrid OLAP) server

Extraction, Transformation and Loading (ETL)

- Prepares data to be loaded into the data warehouse
  - data extraction from (OLTP and external) sources
  - data cleaning
  - data transformation
  - data loading
- Performed
  - when the DW is first loaded
  - during periodical DW refresh

ETL process

- Data extraction: data acquisition from sources
- Data cleaning: techniques for improving data quality (correctness and consistency)
- Data transformation: data conversion from operational format to data warehouse format
- Data loading: update propagation to the data warehouse

Data warehouse metadata

- metadata = data about data
- Different types of metadata:
  - for data transformation and loading: describe data sources and needed transformation operations
    - it is useful to adopt a common notation to represent data sources and data after transformation
    - CWMI (Common Warehouse Initiative): standard proposed by OMG to exchange data between DW tools and repositories of metadata in heterogeneous and distributed environments
  - for data management: describe the structure of the data in the data warehouse
    - also for materialized views
  - for query management: data on query structure, used to monitor query execution
    - SQL code for the query
    - execution plan
    - memory and CPU usage

Two-level architecture

[Figure: two-level architecture; recoverable labels: data sources (operational and external), source level, ETL, data marts, analysis. From Golfarelli, Rizzi, Data warehouse, teoria e pratica della progettazione, McGraw Hill 2006]
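The four ETL steps listed above (extraction, cleaning, transformation, loading) can be sketched in plain Python. Everything concrete here is an assumption made for illustration: the CSV layout, the sample rows, the cleaning rules (trimming whitespace, upper-casing product names, dropping incomplete and duplicate rows), and the use of a dictionary as a stand-in for the warehouse.

```python
import csv
import io

# Hypothetical source export with typical quality problems:
# stray whitespace, a duplicate row, a missing value, inconsistent casing.
RAW_SOURCE = """shop,product,date,quantity
SupShop, MilkTTT ,2000-03-02,3
SupShop,MilkTTT,2000-03-02,3
OtherShop,MilkTTT,2000-03-02,
OtherShop,milkttt,2000-03-03,2
"""


def extract(text):
    """Extraction: acquire raw records from a source."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Cleaning: improve correctness and consistency."""
    seen, cleaned = set(), []
    for row in rows:
        record = {key: (value or "").strip() for key, value in row.items()}
        if not record["quantity"]:
            continue  # drop rows with a missing measure
        record["product"] = record["product"].upper()  # normalize spelling
        key = (record["shop"], record["product"], record["date"])
        if key in seen:
            continue  # drop duplicates
        seen.add(key)
        cleaned.append(record)
    return cleaned


def transform(rows):
    """Transformation: convert to the warehouse format (typed key and measure)."""
    return [((r["shop"], r["product"], r["date"]), int(r["quantity"])) for r in rows]


def load(warehouse, facts):
    """Loading: propagate updates to the warehouse (insert or overwrite)."""
    for key, quantity in facts:
        warehouse[key] = quantity
    return len(facts)


if __name__ == "__main__":
    warehouse = {}
    loaded = load(warehouse, transform(clean(extract(RAW_SOURCE))))
    print(loaded, warehouse)
```

The same pipeline would run both for the initial load and for each periodic refresh; in the refresh case `load` overwrites existing keys rather than inserting new ones.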

Two-level architecture features

- Decoupling between source and DW data
  - management of external (not OLTP) data sources (e.g., text files)
  - data modelling suited for OLAP analysis
  - physical design tailored for OLAP load
- Easy management of different temporal granularity of operational and analytical data
- Partitioning between transactional and analytical load
- On-the-fly data transformation and cleaning (ETL)

Three-level architecture

[Figure: three-level architecture; recoverable labels: data sources (operational and external), source level, ETL, staging area, loading, data marts, analysis. From Golfarelli, Rizzi, Data warehouse, teoria e pratica della progettazione, McGraw Hill 2006]

Three-level architecture features

- Staging area: buffer area allowing the separation between ET management and data warehouse loading
  - complex transformation and cleaning operations are eased
  - provides an integrated model of business data, still close to OLTP representation
  - sometimes denoted as Operational Data Store (ODS)
- Introduces further redundancy
  - more disk space is required for data storage
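A minimal sketch of how a staging area separates the ET phase from warehouse loading, assuming in-memory structures in place of real storage. The record layout, the validation rule, and the sample transactions are hypothetical. The staged rows stay at transaction granularity (close to the OLTP representation), and the loading phase aggregates them to the daily granularity used for analysis.

```python
from collections import defaultdict
from datetime import datetime


def extract_transform(operational_rows):
    """ET phase: validate records and parse timestamps into the staging format."""
    staged = []
    for shop, product, timestamp, quantity in operational_rows:
        if quantity <= 0:
            continue  # hypothetical cleaning rule: skip empty transactions
        staged.append((shop, product, datetime.fromisoformat(timestamp), quantity))
    return staged


def load_from_staging(staging, warehouse):
    """Loading phase: aggregate staged rows to daily grain, then empty the buffer."""
    for shop, product, timestamp, quantity in staging:
        warehouse[(shop, product, timestamp.date().isoformat())] += quantity
    staging.clear()


if __name__ == "__main__":
    operational = [
        ("SupShop", "MilkTTT", "2000-03-02T09:15:00", 1),
        ("SupShop", "MilkTTT", "2000-03-02T17:40:00", 2),
        ("SupShop", "MilkTTT", "2000-03-03T10:00:00", 4),
        ("SupShop", "MilkTTT", "2000-03-03T11:00:00", 0),
    ]
    staging_area = extract_transform(operational)  # can run at any time
    warehouse = defaultdict(int)
    load_from_staging(staging_area, warehouse)     # can run later, independently
    print(dict(warehouse))
```

Because the two phases only share the staging buffer, they can be scheduled independently, at the cost of storing the data one extra time.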