DataMart (Data Warehouse) Tool: Mondrian + JRubik

Similar documents
Monitoring Genebanks using Datamarts based in an Open Source Tool

Data W a Ware r house house and and OLAP II Week 6 1

CS2032 Data warehousing and Data Mining Unit II Page 1

DATA WAREHOUSING - OLAP

CHAPTER 4 Data Warehouse Architecture

Anwendersoftware Anwendungssoftwares a. Data-Warehouse-, Data-Mining- and OLAP-Technologies. Online Analytic Processing

1. OLAP is an acronym for a. Online Analytical Processing b. Online Analysis Process c. Online Arithmetic Processing d. Object Linking and Processing

2074 : Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000

Learning Objectives. Definition of OLAP Data cubes OLAP operations MDX OLAP servers

Business Intelligence, Data warehousing Concept and artifacts

A Technical Review on On-Line Analytical Processing (OLAP)

M Designing and Implementing OLAP Solutions Using Microsoft SQL Server Day Course

Business Intelligence & Product Analytics

Data Warehouse design

OLAP & DATA MINING CS561-SPRING 2012 WPI, MOHAMED ELTABAKH

DATA WAREHOUSING AND OLAP TECHNOLOGY

Reporting trends and pain points of current and new customers IBM Corporation

SQL SERVER BUSINESS INTELLIGENCE (BI) - INTRODUCTION

Open Source Business Intelligence Tools: A Review

Introduction to Data Warehousing. Ms Swapnil Shrivastava

OLAP Theory-English version

University of Gaziantep, Department of Business Administration

Open Source Business Intelligence Intro

SAS BI Course Content; Introduction to DWH / BI Concepts

BUSINESS ANALYTICS AND DATA VISUALIZATION. ITM-761 Business Intelligence ดร. สล ล บ ญพราหมณ

Data Warehousing. Paper

Bussiness Intelligence and Data Warehouse. Tomas Bartos CIS 764, Kansas State University

Business Intelligence for SUPRA. WHITE PAPER Cincom In-depth Analysis and Review

Unit -3. Learning Objective. Demand for Online analytical processing Major features and functions OLAP models and implementation considerations

Data Warehouse: Introduction

Building Data Cubes and Mining Them. Jelena Jovanovic

A DATA WAREHOUSE SOLUTION FOR E-GOVERNMENT

OLAP Systems and Multidimensional Expressions I

Data Warehousing OLAP

Week 3 lecture slides

The Art of Designing HOLAP Databases Mark Moorman, SAS Institute Inc., Cary NC

When to consider OLAP?

Part 22. Data Warehousing

PREFACE INTRODUCTION MULTI-DIMENSIONAL MODEL. Chris Claterbos, Vlamis Software Solutions, Inc.

Data Warehousing: Data Models and OLAP operations. By Kishore Jaladi

Business Intelligence, Analytics & Reporting: Glossary of Terms

Web Log Data Sparsity Analysis and Performance Evaluation for OLAP

Business Benefits From Microsoft SQL Server Business Intelligence Solutions How Can Business Intelligence Help You? PTR Associates Limited

OLAP OLAP. Data Warehouse. OLAP Data Model: the Data Cube S e s s io n

Building Cubes and Analyzing Data using Oracle OLAP 11g

UNIT-3 OLAP in Data Warehouse

Microsoft Services Exceed your business with Microsoft SharePoint Server 2010

Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole

Breadboard BI. Unlocking ERP Data Using Open Source Tools By Christopher Lavigne

OLAP Systems and Multidimensional Queries II

While people are often a corporation s true intellectual property, data is what

Turkish Journal of Engineering, Science and Technology

Multi-dimensional index structures Part I: motivation

Business Intelligence and Healthcare

Apache Kylin Introduction Dec 8,

CHAPTER 4: BUSINESS ANALYTICS

Data Testing on Business Intelligence & Data Warehouse Projects

An Architectural Review Of Integrating MicroStrategy With SAP BW

IAF Business Intelligence Solutions Make the Most of Your Business Intelligence. White Paper November 2002

Business Intelligence for Dynamics GP. Presented By: Rob Jackson, Business Intelligence Consultant Brent Keilin, GP Consultant

Implementing Data Models and Reports with Microsoft SQL Server

Database Applications. Advanced Querying. Transaction Processing. Transaction Processing. Data Warehouse. Decision Support. Transaction processing

The strategic importance of OLAP and multidimensional analysis A COGNOS WHITE PAPER

Business Intelligence

SAS Business Intelligence Online Training

Analysis Services Step by Step

Data Warehousing and OLAP

Super-Charged Oracle Business Intelligence with Essbase and SmartView

Large Telecommunications Company Gains Full Customer View, Boosts Monthly Revenue, Cuts IT Costs by $3 Million

Need for Business Intelligence

Hybrid OLAP, An Introduction

Integrating Business Intelligence Module into Learning Management System

Oracle OLAP What's All This About?

Automated Financial Reporting (AFR) Version 4.0 Highlights

Integrating GIS and BI: a Powerful Way to Unlock Geospatial Data for Decision-Making

OLAP. Business Intelligence OLAP definition & application Multidimensional data representation

SQL SERVER TRAINING CURRICULUM

Data Warehousing. Read chapter 13 of Riguzzi et al Sistemi Informativi. Slides derived from those by Hector Garcia-Molina

Oracle Warehouse Builder 10g

Turning your Warehouse Data into Business Intelligence: Reporting Trends and Visibility Michael Armanious; Vice President Sales and Marketing Datex,

SQL Server 2012 End-to-End Business Intelligence Workshop

Building Geospatial Business Intelligence Solutions with Free and Open Source Components

70-467: Designing Business Intelligence Solutions with Microsoft SQL Server

Delivering Business Intelligence With Microsoft SQL Server 2005 or 2008 HDT922 Five Days

End to End Microsoft BI with SQL 2008 R2 and SharePoint 2010

5.5 Copyright 2011 Pearson Education, Inc. publishing as Prentice Hall. Figure 5-2

II. OLAP(ONLINE ANALYTICAL PROCESSING)

The Microsoft Business Intelligence 2010 Stack Course 50511A; 5 Days, Instructor-led

The difference between. BI and CPM. A white paper prepared by Prophix Software

ORACLE OLAP. Oracle OLAP is embedded in the Oracle Database kernel and runs in the same database process

Building Views and Charts in Requests Introduction to Answers views and charts Creating and editing charts Performing common view tasks

Data Warehousing and OLAP Technology for Knowledge Discovery

BUILDING BLOCKS OF DATAWAREHOUSE. G.Lakshmi Priya & Razia Sultana.A Assistant Professor/IT

CHAPTER 11: BUSINESS INTELLIGENCE - SUPPLEMENTAL MATERIAL

Create Mobile, Compelling Dashboards with Trusted Business Warehouse Data

SQL Server Analysis Services Complete Practical & Real-time Training

Business Intelligence Solutions. Cognos BI 8. by Adis Terzić

Designing a Dimensional Model

Jet Enterprise Frequently Asked Questions Pg. 1 03/18/2011 JEFAQ - 02/13/ Copyright Jet Reports International, Inc.

ETL TESTING TRAINING

Transcription:

DataMart (Data Warehouse) Tool: Mondrian + JRubik Edwin Rojas (CIP) ICIS workshop 2006, CIMMYT May 2006

Data Warehouse Motivation and Examples Data Warehouse Motivation Huge amounts of data need to be summarized in various forms to enable data creators and data users to get quick overviews and dig into details as needed with high performance and flexibility CIP Example Solutions

Data Warehouse Architectural Data Warehouse Client (web, standalone app) Data Warehouse Engine Data Warehouse Repository (multidimensional data base) Script Data Source (relational db, flat files) -Populate database for dimensional model -Regenerate aggregated tables

Data Warehouse Types Part I In the OLAP world, there are mainly two different types: Multidimensional OLAP (MOLAP) and Relational OLAP (ROLAP). Hybrid OLAP (HOLAP) refers to technologies that combine MOLAP and ROLAP. MOLAP, This is the more traditional way of OLAP analysis. In MOLAP, data is stored in a multidimensional cube. The storage is not in the relational database, but in proprietary formats. Advantages: Excellent performance: MOLAP cubes are built for fast data retrieval, and is optimal for slicing and dicing operations. Can perform complex calculations: All calculations have been pre-generated when the cube is created. Hence, complex calculations are not only doable, but they return quickly. Disadvantages: Limited in the amount of data it can handle: Because all calculations are performed when the cube is built, it is not possible to include a large amount of data in the cube itself. This is not to say that the data in the cube cannot be derived from a large amount of data. Indeed, this is possible. But in this case, only summary-level information will be included in the cube itself. Requires additional investment: Cube technology are often proprietary and do not already exist in the organization. Therefore, to adopt MOLAP technology, chances are additional investments in human and capital resources are needed.

Data Warehouse Types Part II ROLAP This methodology relies on manipulating the data stored in the relational database to give the appearance of traditional OLAP's slicing and dicing functionality. In essence, each action of slicing and dicing is equivalent to adding a "WHERE" clause in the SQL statement. Advantages: Can handle large amounts of data: The data size limitation of ROLAP technology is the limitation on data size of the underlying relational database. In other words, ROLAP itself places no limitation on data amount. Can leverage functionalities inherent in the relational database: Often, relational database already comes with a host of functionalities. ROLAP technologies, since they sit on top of the relational database, can therefore leverage these functionalities. Disadvantages: Performance can be slow: Because each ROLAP report is essentially a SQL query (or multiple SQL queries) in the relational database, the query time can be long if the underlying data size is large. Limited by SQL functionalities: Because ROLAP technology mainly relies on generating SQL statements to query the relational database, and SQL statements do not fit all needs (for example, it is difficult to perform complex calculations using SQL), ROLAP technologies are therefore traditionally limited by what SQL can do. ROLAP vendors have mitigated this risk by building into the tool out-of-the-box complex functions as well as the ability to allow users to define their own functions. HOLAP HOLAP technologies attempt to combine the advantages of MOLAP and ROLAP. For summary-type information, HOLAP leverages cube technology for faster performance. When detail information is needed, HOLAP can "drill through" from the cube into the underlying relational data.

Multidimensional Model Elements Part I Dimension A category of information. For example, the taxonomy dimension. Hierarchy Levels The specification of levels that represents relationship between different attributes within a hierarchy. For example, one possible hierarchy in the Taxonomy dimension is Family --> Genus --> Series --> Species. A fact table is a table that contains the measures of interest. For example, accessions count would be such a measure. This measure is stored in the fact table with the appropriate granularity. A dimensional model includes fact tables and lookup tables. Fact tables connect to one or more lookup tables, but fact tables do not have direct relationships to one another. Dimensions and hierarchies are represented by lookup tables. Attributes are the non-key columns in the lookup tables.

Data Warehouse Viewers Mondrian = Data Warehouse Web Viewer + Data Warehouse Engine http://mondrian.sourceforge.net/ JRubik = Data Warehouse Standalone Viewer (Java Swing) http://rubik.sourceforge.net/ Mondrian engine versions 1.0 version 2004 Precomputed totals created when the first user run Stored in temporary cache 2.0 version August 2005 Precomputed totals created when the model db is created Stored in tables db 2.1 version March 2006 Mondrian as a component of business intelligent framework - BI

Open Source Data Warehouse Technology Relational Databases MySQL MS-Access MS-SQL PostgreSQL Relational Model Job Script Multidimensional Databases MySQL MS-Access MS-SQL PostgreSQL HTML Pivot table and chart Multidimensional Model Java/Swing application http://sourceforge.net/projects/rubik/ DBDesigner Eclipse Plug-in Mondrian

Case Study for ICIS Inventory (IMS) Database

Case Study for ICIS Genealogy (GMS) Database 2 millions of germplasm

Demos and Tutorial For Standalone: Rubik viewer View Video: http://research.cip.cgiar.org/docs/ mondrian/videos/general_rubik_summary/general_rubik_summary.html For Web: Mondrian viewer View Video: http://research.cip.cgiar.org/docs/ mondrian/videos/general_mondrian_summary/general_mondrian _summary.html PFD Tutorial for Mondrian: http://research.cip.cgiar.org/docs/mondrian /Tutorial_Mondrian.pdf