Understanding Data Warehousing. [by Alex Kriegel]

Similar documents
Enterprise Solutions. Data Warehouse & Business Intelligence Chapter-8

DATA WAREHOUSE CONCEPTS DATA WAREHOUSE DEFINITIONS

IST722 Data Warehousing

SAS BI Course Content; Introduction to DWH / BI Concepts

1. OLAP is an acronym for a. Online Analytical Processing b. Online Analysis Process c. Online Arithmetic Processing d. Object Linking and Processing

Sterling Business Intelligence

By Makesh Kannaiyan 8/27/2011 1

Data Warehousing Systems: Foundations and Architectures

Establish and maintain Center of Excellence (CoE) around Data Architecture

Data Warehouse Overview. Srini Rengarajan

OLAP and OLTP. AMIT KUMAR BINDAL Associate Professor M M U MULLANA

Data Warehousing with Oracle

Data Warehousing. Jens Teubner, TU Dortmund Winter 2015/16. Jens Teubner Data Warehousing Winter 2015/16 1

Data Warehousing and Data Mining

Trends in Data Warehouse Data Modeling: Data Vault and Anchor Modeling

Part 22. Data Warehousing

Business Intelligence. A Presentation of the Current Lead Solutions and a Comparative Analysis of the Main Providers

Dimensional Modeling and E-R Modeling In. Joseph M. Firestone, Ph.D. White Paper No. Eight. June 22, 1998

MDM and Data Warehousing Complement Each Other

Bussiness Intelligence and Data Warehouse. Tomas Bartos CIS 764, Kansas State University

BUILDING BLOCKS OF DATAWAREHOUSE. G.Lakshmi Priya & Razia Sultana.A Assistant Professor/IT

Introduction to Datawarehousing

B. 3 essay questions. Samples of potential questions are available in part IV. This list is not exhaustive it is just a sample.

Open Source Business Intelligence Intro

Sizing Logical Data in a Data Warehouse A Consistent and Auditable Approach

Lection 3-4 WAREHOUSING

Data Warehouse: Introduction

Course Outline: Course: Implementing a Data Warehouse with Microsoft SQL Server 2012 Learning Method: Instructor-led Classroom Learning

Designing a Dimensional Model

<Insert Picture Here> Extending Hyperion BI with the Oracle BI Server

Migrating a Discoverer System to Oracle Business Intelligence Enterprise Edition

An Introduction to Data Warehousing. An organization manages information in two dominant forms: operational systems of

SAS Business Intelligence Online Training

KDOT s Spatially Enabled Data Warehouse. Paul Bayless KDOT Data Warehouse Manager and Bill Schuman GeoDecisions Project Manager

Data warehouse Architectures and processes

14. Data Warehousing & Data Mining

Master Data Management and Data Warehousing. Zahra Mansoori

Republic Polytechnic School of Information and Communications Technology C355 Business Intelligence. Module Curriculum

Data warehouse and Business Intelligence Collateral

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 29-1

Data Search. Searching and Finding information in Unstructured and Structured Data Sources

COURSE OUTLINE. Track 1 Advanced Data Modeling, Analysis and Design

Overview. DW Source Integration, Tools, and Architecture. End User Applications (EUA) EUA Concepts. DW Front End Tools. Source Integration

Data Warehousing and Data Mining in Business Applications

資 料 倉 儲 (Data Warehousing)

Praxis Softek Solutions Statement Of Qualification DW & BI

Introduction to Oracle Business Intelligence Standard Edition One. Mike Donohue Senior Manager, Product Management Oracle Business Intelligence

HYPERION MASTER DATA MANAGEMENT SOLUTIONS FOR IT

Presented by: Jose Chinchilla, MCITP

When to consider OLAP?

Business Intelligence for the Modern Utility

Advanced Data Management Technologies

Implementing a Data Warehouse with Microsoft SQL Server 2012 MOC 10777

Driving Peak Performance IBM Corporation

Fluency With Information Technology CSE100/IMT100

Reflections on Agile DW by a Business Analytics Practitioner. Werner Engelen Principal Business Analytics Architect

There were various questions involving the breakdown of the 525 users. We are providing below a potential breakdown, but this is an estimate only:

About the Tutorial. Audience. Prerequisites. Disclaimer & Copyright. ETL Testing

Anwendersoftware Anwendungssoftwares a. Data-Warehouse-, Data-Mining- and OLAP-Technologies. Online Analytic Processing

Enterprise Data Warehouse (EDW) UC Berkeley Peter Cava Manager Data Warehouse Services October 5, 2006

Microsoft Data Warehouse in Depth

Data Warehousing. Read chapter 13 of Riguzzi et al Sistemi Informativi. Slides derived from those by Hector Garcia-Molina

INFORMATION TECHNOLOGY STANDARD

CASE PROJECTS IN DATA WAREHOUSING AND DATA MINING

COURSE 20463C: IMPLEMENTING A DATA WAREHOUSE WITH MICROSOFT SQL SERVER

Implementing a Data Warehouse with Microsoft SQL Server

Data Warehousing Methodologies

Applied Business Intelligence. Iakovos Motakis, Ph.D. Director, DW & Decision Support Systems Intrasoft SA

Survey of use of Data Warehousing and Business Intelligence at Australasian Universities 2008

Dimensional Modeling for Data Warehouse

From Data Warehouse to Business Intelligence: The Michigan Journey

East Asia Network Sdn Bhd

Published by: PIONEER RESEARCH & DEVELOPMENT GROUP ( 28

Tiber Solutions. Understanding the Current & Future Landscape of BI and Data Storage. Jim Hadley

CS2032 Data warehousing and Data Mining Unit II Page 1

Oracle BI Applications (BI Apps) is a prebuilt business intelligence solution.

MS 20467: Designing Business Intelligence Solutions with Microsoft SQL Server 2012

LEARNING SOLUTIONS website milner.com/learning phone

Budgeting and Planning with Microsoft Excel and Oracle OLAP

CHAPTER 3. Data Warehouses and OLAP

TRANSFORMING YOUR BUSINESS

DATA WAREHOUSING AND OLAP TECHNOLOGY

Cúram Business Intelligence Reporting Developer Guide

TRENDS IN THE DEVELOPMENT OF BUSINESS INTELLIGENCE SYSTEMS

SENG 520, Experience with a high-level programming language. (304) , Jeff.Edgell@comcast.net

An Architectural Review Of Integrating MicroStrategy With SAP BW

Implementing a Data Warehouse with Microsoft SQL Server

Data W a Ware r house house and and OLAP II Week 6 1

(Week 10) A04. Information System for CRM. Electronic Commerce Marketing

Implement a Data Warehouse with Microsoft SQL Server 20463C; 5 days

B.Sc (Computer Science) Database Management Systems UNIT-V

Sterling Business Intelligence

The Benefits of Data Modeling in Data Warehousing

REENGINEERING HR APPRAISAL LEGACY SYSTEM TO BI PLATFORM

BENEFITS OF AUTOMATING DATA WAREHOUSING

OVERVIEW OF THE BUSINESS PERFORMANCE SOLUTIONS

Lecture Data Warehouse Systems

Data Warehousing Concepts

Jason Essig DBMS Consulting OCUG 2009 New Orleans 06 October 2009 CTMS Focus Group Session 18

Transcription:

Understanding Data Warehousing 2008 [by Alex Kriegel]

Things to Discuss Who Needs a Data Warehouse? OLTP vs. Data Warehouse Business Intelligence Industrial Landscape Which Data Warehouse: Bill Inmon vs. Ralph Kimball Relational vs. Dimensional Data Warehouse Life Cycle

Data Concepts Data vs. Information Data an observable and recordable fact Information integrated collection of data used for decision making/analysis Data = OLTP (On-Line Transaction Processing) Optimized for insert/update/delete speed Contains transient detailed data Often used for analysis Information = OLAP (On-Line Analytical Processing) Optimized for querying Contains aggregated data Is NOT (usually) used for transaction processing

Problems with OLTP Reporting Data Credibility No time basis Levels of extraction Different reporting criteria Productivity Locate information Analyze and custom extract data Inability to Transform Data into Information Lack of historical information Absence of relevant data

OLTP vs. Data Warehouse Thousands of users Run repetitively Normalized data (many tables, few columns per table, no redundancy) Static structure, variable content Compatible with SDLC Managed in entirety OLTP Application/Transaction oriented Current data Accurate, at the moment of access Continuous updates High availability Small amounts of data used Simple to complex queries Business/Subject oriented Serves managerial community Few users (typically under 100) Run heuristically Historical data Snapshots in time De-normalized data (few tables, many columns per table; redundancy is a fact of life) Flexible structure Data Warehouse Different Life Cycle Managed by subsets Batch updates* Relaxed availability Large amounts of data used Usually very complex queries

Data Evolution

Foundation of BI Data Warehouse and RDBMS Alternative data storage Proprietary data structures XML N-Dimensional CUBE

Tools of Trade Reporting Dashboards Ad-hoc Query Applications Integration (e.g. MS Office) Drill-down capability Advanced visualization

BI : Industry Dynamics Pure BI players (hanging there ) Actuate Information Builders Microstrategy SAS Mergers and Acquisitions Cognos (IBM) Business Objects/Crystal Decisions (SAP) ProClarity (Microsoft) Hyperion/Brio (Oracle)

OLAP Market Share (2006) Microsoft Corporation 31.6% Hyperion (Oracle) 18.9% Cognos (IBM) 12.9% Business Objects (SAP) 7.3% MicroStrategy 7.3% SAP AG 5.8% Cartesis SA (SAP) 3.7% Applix (Hyperion) 3.6% Infor 3.5% Oracle Corporation 2.8% Source: OlapReport.com

Gartner s s Magic Quadrant for Business Intelligence Platforms 2007

Two Approaches to Data Warehousing* Bill Inmon CIF (Corporate Information Factory) Relational (Normalized) DW Ralph Kimball Bus Architecture Dimensional DW * Not mutually exclusive

Relational Concepts I Normalization Granularity vs. Performance Entity/Relationship Diagrams Entity Attribute Domain Relation Key

Basic Relational Concepts II Relationships Primary and Foreign Keys One-to to-one One-to to-many Many-to to-many Data Integrity Domain Integrity Entity Integrity Referential Integrity

Example of Relational Design Entity ORDER orderid customerid orderdate ORDER_DETAIL orderid productid Quantity shipto Attributes

DW Definitions I Data Warehouse Data Mart Staging Area Operational Data Store Star Schema Snowflake Schema ETL (Extract-Transform-Load)

Data Warehouse DW Definitions II Typically contains the full range of business intelligence available from all sources Data is cross-divisional Data Mart Typically contains specialized subset of data Data is division specific

DW/DM Common Characteristics Subject oriented (the data is organized around subjects) Nonvolatile (once placed in the warehouse, is not usually subject to change) Integrated (the data is consistent) Time variant (historical data is recorded)

DW Definitions III Staging Area temporary area used for data cleansing and transformations; usually structurally compatible with target schema Operational Data Store (ODS) is a database designed to integrate data from multiple sources to facilitate operations, analysis and reporting the integration often involves cleaning, redundancy resolution and business rule enforcement Is updateable and contains highly granular information (as opposed to DW proper)

Star Schema DW Definitions IV Single FACT table surrounded by Dimension Tables There could be many star schemas within database Snowflake Schema Extension of star schema where one or more dimensions could be split into dimensions of their own

Basic Dimensional Concepts Fact table contains Dimension Keys (foreign key to Dimension tables) Facts (actual data being measured) Facts are preferably numeric Grain All facts in the FACT table must be of the same level of detail (table s s grain) Dimension table contains Actual data pertaining to the Fact table

Example of Dimensional Design Customer custid custname Attribute Measure Fact Order_Item custid prodid dateid unitprice shipdate Dimension Product prodid desc Date dateid weekday

Basic Data Warehouse Components ETL Data Warehouse BI Reporting OLAP

Data Warehouse Lifecycle Management (DWLM) I Project Planning Readiness and risk assessment Scope, Time, Resource Roles and Responsibilities Gathering and Analyzing Requirements UI Analytics/Reporting Interface (technology) Presentation Architecture: OLAP, ROLAP,MOLAP

Data Warehouse Lifecycle Management (DWLM) II DW Schema Design Dimensional/Normalized Modeling Technical Architecture Design Architectural Components and Services Storage/staging consideration Metadata Repository Integrating Data Sources Logical Mapping/Transformation Rules/Data Quality Requirements ETL Process Design (logical/technology/workflow) OLTP data profiling

Data Warehouse Lifecycle Management (DWLM) III Implementation Develop DW schemas Develop ETL/Staging Schema Develop ETL routines Build and Populate Cubes Build Dashboard/Reports (COTS/MDX Queries) Testing (data/throughput/requirements)

Data Warehouse Lifecycle Management (DWLM) IV Deployment Alpha, beta, rollout release process Set up/configuration Training and support Documentation Sizing and Adjustments Performance tuning

Data Warehouse Lifecycle Management (DWLM) V Managing and Maintenance Support/training Tuning and Optimization Planning for Growth/Enhancements

DW Life Cycle Models Waterfall Model Spiral Model COTS Solutions BusinessObjects Cognos Informatica SAS Kalido

Waterfall Model Systems Analysis Design Plan & Budget Build Test Release Miracle happens here End of Project

Spiral Model

Ralph Kimball s s Four Principles Focus on the business: : Concentrate on identifying business requirements and their associated value. Use these efforts to develop solid relationships r with the business side and sharpen your business sense and consultative skills. s Build an information infrastructure: Design a single, integrated, easy : Design a single, integrated, easy-to- use, high-performing information foundation that will meet the broad range of business requirements you ve identified across the enterprise. Deliver in meaningful increments: : Build the data warehouse in increments that can be delivered in 6 to 12 month timeframes. Use clearly identified i business value to determine the implementation order of the increments. Deliver the entire solution: : Provide all the elements necessary to deliver value to the business users. This means a solid, well-designed, quality-tested, tested, accessible data warehouse database is only the start. You must also a deliver ad hoc query tools, reporting applications and advanced analytics, training, support, web site, and documentation.