Introduction to Datawarehousing



Similar documents
Data warehouse Architectures and processes

Data Warehouse: Introduction

A Model-based Software Architecture for XML Data and Metadata Integration in Data Warehouse Systems

A Survey on Data Warehouse Architecture

ETL-EXTRACT, TRANSFORM & LOAD TESTING

Chapter 6 Basics of Data Integration. Fundamentals of Business Analytics RN Prasad and Seema Acharya

MDM and Data Warehousing Complement Each Other

Business Intelligence. A Presentation of the Current Lead Solutions and a Comparative Analysis of the Main Providers

Lection 3-4 WAREHOUSING

Why Business Intelligence

Chapter 3 - Data Replication and Materialized Integration

Enterprise Solutions. Data Warehouse & Business Intelligence Chapter-8

Understanding Data Warehousing. [by Alex Kriegel]

Data warehouse and Business Intelligence Collateral

QlikView Business Discovery Platform. Algol Consulting Srl

Advanced Data Management Technologies

Data Warehousing. Jens Teubner, TU Dortmund Winter 2015/16. Jens Teubner Data Warehousing Winter 2015/16 1

IBM WebSphere DataStage Online training from Yes-M Systems

BENEFITS OF AUTOMATING DATA WAREHOUSING

Bussiness Intelligence and Data Warehouse. Tomas Bartos CIS 764, Kansas State University

Data Warehousing Systems: Foundations and Architectures

<Insert Picture Here> Extending Hyperion BI with the Oracle BI Server

Business Intelligence In SAP Environments

BUILDING BLOCKS OF DATAWAREHOUSE. G.Lakshmi Priya & Razia Sultana.A Assistant Professor/IT

Data Warehouse Overview. Srini Rengarajan

ORACLE DATA INTEGRATOR ENTERPRISE EDITION

An Oracle White Paper March Best Practices for Real-Time Data Warehousing

Enterprise Data Warehouse (EDW) UC Berkeley Peter Cava Manager Data Warehouse Services October 5, 2006

ENTERPRISE EDITION ORACLE DATA SHEET KEY FEATURES AND BENEFITS ORACLE DATA INTEGRATOR

Extraction Transformation Loading ETL Get data out of sources and load into the DW

Data Warehousing and Data Mining in Business Applications

TRENDS IN THE DEVELOPMENT OF BUSINESS INTELLIGENCE SYSTEMS

Integrating Netezza into your existing IT landscape

When to consider OLAP?

<Insert Picture Here> Oracle BI Standard Edition One The Right BI Foundation for the Emerging Enterprise

OLAP and OLTP. AMIT KUMAR BINDAL Associate Professor M M U MULLANA

Research on Airport Data Warehouse Architecture

Business Intelligence Solution for Small and Midsize Enterprises (BI4SME)

Data Warehouse Testing

Reverse Engineering in Data Integration Software

CDCR EA Data Warehouse / Strategy Overview. February 12, 2010

IT FUSION CONFERENCE. Build a Better Foundation for Business

ORACLE BUSINESS INTELLIGENCE SUITE ENTERPRISE EDITION PLUS

Oracle Business Intelligence 11g Business Dashboard Management

Business Intelligence Applications

Databases in Organizations

Management Accountants and IT Professionals providing Better Information = BI = Business Intelligence. Peter Simons peter.simons@cimaglobal.

ORACLE BUSINESS INTELLIGENCE SUITE ENTERPRISE EDITION PLUS

"The performance driven Enterprise" Emerging trends in Enterprise BI Platforms

Published by: PIONEER RESEARCH & DEVELOPMENT GROUP ( 28

The Role of the Analyst in Business Analytics. Neil Foshay Schwartz School of Business St Francis Xavier U

Common Warehouse Metamodel (CWM): Extending UML for Data Warehousing and Business Intelligence

Getting it Right: How to Find the Right BI Package for the Right Situation Norma Waugh. RMOUG Training Days February 15-17, 2011

Methods and Technologies for Business Process Monitoring

IBM Data Warehousing and Analytics Portfolio Summary

Breadboard BI. Unlocking ERP Data Using Open Source Tools By Christopher Lavigne

Establish and maintain Center of Excellence (CoE) around Data Architecture

Master Data Management and Data Warehousing. Zahra Mansoori

Data Warehousing and Data Mining Introduction

Paper DM10 SAS & Clinical Data Repository Karthikeyan Chidambaram

Data Warehousing and Data Mining

IT0457 Data Warehousing. G.Lakshmi Priya & Razia Sultana.A Assistant Professor/IT

HYPERION MASTER DATA MANAGEMENT SOLUTIONS FOR IT

Emerging Technologies Shaping the Future of Data Warehouses & Business Intelligence

Data Integration and ETL Process

Business Intelligence: Effective Decision Making

Metadata Strategies: your guide through the data jungle Achim Granzen EMEA Technology Strategist

Analysis and Design of ETL in Hospital Performance Appraisal System

Chapter 5. Learning Objectives. DW Development and ETL

Migrating Discoverer to OBIEE Lessons Learned. Presented By Presented By Naren Thota Infosemantics, Inc.

Data-Warehouse-, Data-Mining- und OLAP-Technologien

An Architectural Review Of Integrating MicroStrategy With SAP BW

META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING

Oracle BI Applications (BI Apps) is a prebuilt business intelligence solution.

Analance Data Integration Technical Whitepaper

IBM Informix Warehouse Accelerator (IWA)

Data Warehousing. Overview, Terminology, and Research Issues. Joachim Hammer. Joachim Hammer

SAP Real-time Data Platform. April 2013

<Insert Picture Here> The Age of the Pure Play BI Vendor is Over

8. Business Intelligence Reference Architectures and Patterns

A Perspective on the Benefits of Data Virtualization Technology

INTERACTIVE DECISION SUPPORT SYSTEM BASED ON ANALYSIS AND SYNTHESIS OF DATA - DATA WAREHOUSE

Integrating SAP and non-sap data for comprehensive Business Intelligence

Safe Harbor Statement

Moving Large Data at a Blinding Speed for Critical Business Intelligence. A competitive advantage

Building a Custom Data Warehouse

Migrating a Discoverer System to Oracle Business Intelligence Enterprise Edition

BUSINESSOBJECTS DATA INTEGRATOR

Essential Elements of a Master Data Management Architecture

Meta Data Management for Business Intelligence Solutions. IBM s Strategy. Data Management Solutions White Paper

Transcription:

DIPARTIMENTO DI INGEGNERIA INFORMATICA AUTOMATICA E GESTIONALE ANTONIO RUBERTI Master of Science in Engineering in Computer Science (MSE-CS) Seminars in Software and Services for the Information Society Introduction to Datawarehousing 1

Business Intelligence Architecture management system goals results management system KPI DSS MKT CRM HR Datamart-1 Datamart-2 Datamart-3 reporting OLAP data mining Datawarehouse operational system external data sources service systems ETL systems ERP systems internet / extranet operational system 2

What is Data Warehousing Collection of methods, technologies and tools to assist the knowledge worker (manager, analyst) to conduct data analysis aimed at supporting decision-making and/or improving the management of information assets 3

What is a Data Warehouse A data warehouse is a collection of data integrated (far beyond the organization) consistent (despite the heterogeneous origin) focused (an interest area is defined) historical (over a consistent timeframe) permanent (never delete your data!) 4

Purpose of a Data Warehouse A Data Warehouse helps (allows) you: to take decisions to identify and interpret phenomena to make predictions about the future to control a complex system 5

Value and quantity of information logistics value marketing competitors BD sales prices strategic information reports selected information $$$ $ primary information sources quantity 6

OLTP & OLAP OLTP - On-Line Transaction Processing realm of (write and / or read) transactions, recovery, consistency many, fast and frequent operations high level of concurrency access to a small amount of data on-the-fly data update OLAP - On-Line Analytical Processing read only few operations low level of concurrency access to huge amounts of data historical but essentially static data 7

Separation between: Operational Database & Data Warehouse different computational load different needs: DB: dynamic data, asynchronous updates DW: static data, periodic updates integration with business activity: DB: supporting operations (focused, timely) DW: supporting decisions (descriptive, historical) data collection: DB: minimal DW: maximal 8

Two issues with different perspectives Data redundancy OLTP (DB): to avoid, bringing to inconsistency and/or inefficiency on updates OLAP (DW): redundancy avoids recomputation and shorten response time Indexing OLTP (DB): good when you search bad when you update... you need some trade-off OLAP (DW): the more, the best 9

Some Data Warehouse Systems Oracle 12 IBM InfoSphere Warehouse Microsoft SQL-Server 2012 Analysis Services Sybase IQ Hyperion (bought by Oracle) Teradata(division of NCR) Netezza Cognos(bought by IBM) Business Objects (bought by SAP) 10

A comparison by Gartner Donald Feinberg, Mark A. Beyer Magic Quadrant for Data Warehouse Database Management Systems Gartner RAS Core Research Note G00173535, 28 January 2010 11

Architectures for Datawarehousing: issues separating OLTP & OLAP scalability extensibility security administrability 12

Architecture for Datawarehousing determined by design choices determined by / determines the choice of a software system determines the cost and makes possible future integration (quantitative and / or qualitative) affects the cost of data processing 13

Data Mart Collection of data focused on particular user profile or on particular target analysis Alternatives: 1.dependent Data Mart: it is a subset and/or an aggregation of data in the primary DW DM extracted from a DW 2.independent Data Mart: it is a subset and/or an aggregation of data in the operational DB DW=U i (DM i ), that is, DW is a set of DM 3. hybrid solution, combining 1, 2 14

DW architecture: 1 Level there is only an operational DW virtual DB (no OLTP-OLAP separation) data coincident with DB operational difficult integration with other sources data - level 1 (copy of) operational DB external sources middleware sources warehouse analysis 15

DW architecture: 2 Levels dependent DMs data sources complemented with external sources running on dedicated software platform ETL: Extraction, Transformation, Loading materialization of the DW materialization of Data Marts data -level 1 data -level 2 oper BD ext BD ETL DW Data Mart Data Mart sources feeding warehouse analysis 16

DW architecture: 2 Levels independent DMs Data Mart are materialized by feeding DW = union of DMs data -level 1 data -level 2 oper BD ext BD ETL Data Mart Data Mart sources feeding warehouse analysis 17

DW architecture: 3 Levels a level of "reconciled" data (operational data store) is introduced separation into two phases of ETL activities: 1. extraction / transformation 2. loading data -level 1 data -level 2 data -level 3 oper BD ext BD ET(L) reconcilied data loading DW Data Mart Data Mart sources feeding warehouse analysis 18

ETL: Extraction, Transformation, Loading Operational Data, External Data extraction cleaning - validation - filtering transformation Reconciled Data loading Data Warehouse 19

Extraction initial extraction: targeted at the creation of the DW furter extractions: static (replaces the whole DW) incremental log timestamp 20

Cleaning changing VALUES duplicates inconsistencies domain violation functional dependency violation null values misuse of fields spelling abbreviations (not homogeneous) 21

Transformation changing FORMATS: misalignment of formats field overloading unhomogeneous coding 22

Loading Refresh: ex-novo load of the whole DW Update: differential updates 23

Metadata internal metadata concerning the administration of the DW (i.e., sources, transformations, schemas, users, etc..) external metadata interesting for users (e.g., measurement units, possible combinations) STANDARDs CWM - Common Warehouse Model (OMG), defined by: UML (Unified Modeling Language) XML (extensible Markup Language) XMI (XML Metadata Interchange) OMG = Object Management Group: CORBA(Common Object Request Broker Architecture), UML (Unified Modeling Language), MDA(Model-Driven Architecture) 24