GENERATING DIMENSIONAL DATA MARTS FROM META-DATA: A PROOF-OF-CONCEPT PROTOTYPE. Abdellatif Lghali A THESIS

Similar documents
1. OLAP is an acronym for a. Online Analytical Processing b. Online Analysis Process c. Online Arithmetic Processing d. Object Linking and Processing

Part 22. Data Warehousing

IST722 Data Warehousing

When to consider OLAP?

Monitoring Genebanks using Datamarts based in an Open Source Tool

DATA WAREHOUSING - OLAP

LEARNING SOLUTIONS website milner.com/learning phone

Deductive Data Warehouses and Aggregate (Derived) Tables

Data W a Ware r house house and and OLAP II Week 6 1

DATA WAREHOUSING AND OLAP TECHNOLOGY

2074 : Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000

PREFACE INTRODUCTION MULTI-DIMENSIONAL MODEL. Chris Claterbos, Vlamis Software Solutions, Inc.

A Design and implementation of a data warehouse for research administration universities

MS 20467: Designing Business Intelligence Solutions with Microsoft SQL Server 2012

CHAPTER 4 Data Warehouse Architecture

Week 3 lecture slides

SAS BI Course Content; Introduction to DWH / BI Concepts

Data Testing on Business Intelligence & Data Warehouse Projects

Bussiness Intelligence and Data Warehouse. Tomas Bartos CIS 764, Kansas State University

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 29-1

Basics of Dimensional Modeling

Designing a Dimensional Model

Module 1: Introduction to Data Warehousing and OLAP

Designing Business Intelligence Solutions with Microsoft SQL Server 2012 Course 20467A; 5 Days

Implementing Data Models and Reports with Microsoft SQL Server 20466C; 5 Days

Data Warehouse: Introduction

SAS Business Intelligence Online Training

Methodology Framework for Analysis and Design of Business Intelligence Systems

14. Data Warehousing & Data Mining

Migrating a Discoverer System to Oracle Business Intelligence Enterprise Edition

Data Warehouse Snowflake Design and Performance Considerations in Business Analytics

OLAP Systems and Multidimensional Expressions I

Business Intelligence, Analytics & Reporting: Glossary of Terms

DATA WAREHOUSE CONCEPTS DATA WAREHOUSE DEFINITIONS

DATA CUBES E Jayant Haritsa Computer Science and Automation Indian Institute of Science. JAN 2014 Slide 1 DATA CUBES

Microsoft Implementing Data Models and Reports with Microsoft SQL Server

BUILDING BLOCKS OF DATAWAREHOUSE. G.Lakshmi Priya & Razia Sultana.A Assistant Professor/IT

Data Warehousing Systems: Foundations and Architectures

Republic Polytechnic School of Information and Communications Technology C355 Business Intelligence. Module Curriculum

Data Warehousing and Data Mining

Introduction to Data Warehousing. Ms Swapnil Shrivastava

OLAP Theory-English version

Sizing Logical Data in a Data Warehouse A Consistent and Auditable Approach

University of Gaziantep, Department of Business Administration

Designing Business Intelligence Solutions with Microsoft SQL Server 2012

Turkish Journal of Engineering, Science and Technology

Business Intelligence, Data warehousing Concept and artifacts

Data Warehousing: Data Models and OLAP operations. By Kishore Jaladi

A Critical Review of Data Warehouse

Data Warehousing and OLAP Technology for Knowledge Discovery

East Asia Network Sdn Bhd

HYPERION MASTER DATA MANAGEMENT SOLUTIONS FOR IT

Turning your Warehouse Data into Business Intelligence: Reporting Trends and Visibility Michael Armanious; Vice President Sales and Marketing Datex,

Overview of Data Warehousing and OLAP

Mastering Data Warehouse Aggregates. Solutions for Star Schema Performance

Why Business Intelligence

Course 20463:Implementing a Data Warehouse with Microsoft SQL Server

Data Warehousing. Read chapter 13 of Riguzzi et al Sistemi Informativi. Slides derived from those by Hector Garcia-Molina

Fluency With Information Technology CSE100/IMT100

Implementing Data Models and Reports with Microsoft SQL Server

Data Warehousing Fundamentals for IT Professionals. 2nd Edition

Distance Learning and Examining Systems

Data Warehousing and OLAP

Course Outline: Course: Implementing a Data Warehouse with Microsoft SQL Server 2012 Learning Method: Instructor-led Classroom Learning

Week 13: Data Warehousing. Warehousing

Business Intelligence & Product Analytics

CHAPTER 5: BUSINESS ANALYTICS

Designing Self-Service Business Intelligence and Big Data Solutions

SQL Server 2012 Business Intelligence Boot Camp

The Design and the Implementation of an HEALTH CARE STATISTICS DATA WAREHOUSE Dr. Sreèko Natek, assistant professor, Nova Vizija,

COURSE OUTLINE. Track 1 Advanced Data Modeling, Analysis and Design

<Insert Picture Here> Enhancing the Performance and Analytic Content of the Data Warehouse Using Oracle OLAP Option

Sterling Business Intelligence

COURSE 20463C: IMPLEMENTING A DATA WAREHOUSE WITH MICROSOFT SQL SERVER

Course 6234A: Implementing and Maintaining Microsoft SQL Server 2008 Analysis Services

Cognos 8 Best Practices

LITERATURE SURVEY ON DATA WAREHOUSE AND ITS TECHNIQUES

DATA WAREHOUSING APPLICATIONS: AN ANALYTICAL TOOL FOR DECISION SUPPORT SYSTEM

70-467: Designing Business Intelligence Solutions with Microsoft SQL Server

Enterprise Data Warehouse (EDW) UC Berkeley Peter Cava Manager Data Warehouse Services October 5, 2006

OLAP. Business Intelligence OLAP definition & application Multidimensional data representation

Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole

OLAP and Data Warehousing! Introduction!

What is OLAP - On-line analytical processing

PowerDesigner WarehouseArchitect The Model for Data Warehousing Solutions. A Technical Whitepaper from Sybase, Inc.

Business Intelligence Solutions. Cognos BI 8. by Adis Terzić

Unit -3. Learning Objective. Demand for Online analytical processing Major features and functions OLAP models and implementation considerations

Learning Objectives. Definition of OLAP Data cubes OLAP operations MDX OLAP servers

Outline. Data Warehousing. What is a Warehouse? What is a Warehouse?

Published by: PIONEER RESEARCH & DEVELOPMENT GROUP ( 28

CHAPTER 4: BUSINESS ANALYTICS

Optimizing Your Data Warehouse Design for Superior Performance

Implementing a Data Warehouse with Microsoft SQL Server 2012 (70-463)

Data Warehouse Overview. Srini Rengarajan

INFO 321, Database Systems, Semester

Transcription:

GENERATING DIMENSIONAL DATA MARTS FROM META-DATA: A PROOF-OF-CONCEPT PROTOTYPE by Abdellatif Lghali A THESIS Submitted to the Faculty of Sciences of the VU University in partial fulfillment of the requirements for the degree of MASTER OF BUSINESS ANALYTICS Abdellatif Lghali, Candidate ADVISORY COMMITTEE Hans van Leijen, Supervisor Wan Fokkink, Reader Peter Boncz, Second reader VRIJE UNIVERSITEIT AMSTERDAM

De Boelelaan 1105 Amsterdam, 1081 HV 2014 ii

c 2014, Abdellatif Lghali. All rights reserved.

iv Throughout my life some people have always been there during those di cult and trying time. I lovingly dedicate this thesis and everything I do to Sidi HAMZA al Qadiri al Boutchich and My Mother. For their constant love, encouragement and support.

v Throughout my internship Hans van Leijen has been an excellent supervisor. The supervision, patience, support and immense knowledge that he gave have truly helped the progression and smoothness of the my internship at TomTom. The co-operation is much indeed appreciated. My grateful thanks also go to both Wan Fokkink and Peter Boncz for their contributions, kind support and advice. I also would like to thank Mr. Asfar Towheed for introducing me to the BI team and giving me chance to complete my internship at TomTom.

vi Preface The purpose of this report is to fulfill the internship requirement for the Master of Business Analytics degree at the VU University of Amsterdam. While I was employed at the Group IT Business Application department at TomTom as an intern within the Business Intelligence team, I focused on the requirement specification and the design of a Data Mart Code generation tool. This tool will assist end-users to reduce the e ort needed to develop SQL code to transform data from Warehouse to Data Marts.

1 Glossary: Aggregation: One way of speeding up query performance. Facts are summed up for selected dimensions from the original fact table. The resulting aggregate table will have fewer rows, thus making queries that can use them go faster. Attribute: Attributes represent a single type of information in a dimension. For example, year is an attribute in the time dimension. Conformed Dimension: Adimensionthathasexactlythesamemeaningand content when being referred to from di erent fact tables. Data Mart: Data marts have the same definition as the data warehouse (see below), but data marts have a more limited audience and/or data content. Data Warehouse: Awarehouseisasubject-oriented,integrated,time-variantand non-volatile collection of data in support of management s decision making process (as defined by Bill Inmon). Data Warehousing: The process of designing, building, and maintaining a data warehouse system. Dimension: The same category of information. For example, year, month, day, and week are all part of the Time Dimension. Dimensional Model: Atypeofdatamodelingsuitedfordatawarehousing.Ina dimensional model, there are two types of tables: dimensional tables and fact tables. Dimensional table records information on each dimension, and fact table records all the fact, or measures. Dimensional Table: Dimension tables store records related to this particular dimension. No facts are stored in a dimensional table. Drill Across: Data analysis across dimensions.

2 Drill Down: Data analysis to a child attribute. Drill Through: Data analysis that goes from an OLAP cube into the relational database. Drill Up: Data analysis to a parent attribute. ETL: Stands for Extraction, Transformation, and Loading. The movement of data from one area to another. Fact Table: Atypeoftableinthedimensionalmodel.Afacttabletypically includes two types of columns: fact columns and foreign keys to the dimensions. Grain: The grain of a fact table represents the most atomic level by which the facts may be defined. Hierarchy: Ahierarchydefinesthenavigatingpathfordrillingup and drilling down. All attributes in a hierarchy belong to the same dimension. Meta-data: Data about data. For example, the number of tables in the database is atypeofmetadata. Metric: Ameasuredvalue.Forexample, TotalSales isametric. MOLAP: Multidimensional OLAP. MOLAP systems store data in the multidimensional cubes. OLAP: On-Line Analytical Processing. OLAP should be designed to provide end users a quick way of slicing and dicing the data. ROLAP: Relational OLAP. ROLAP systems store data in the relational database. Snowflake Schema: A common form of dimensional model. In a snowflake schema, di erent hierarchies in a dimension can be extended into their own dimensional tables. Therefore, a dimension can have more than a single dimension table. Star Schema: Acommonformofdimensionalmodel.Inastarschema,each dimension is represented by a single dimension table.