A Technical Review on On-Line Analytical Processing (OLAP)



Similar documents
DATA WAREHOUSING AND OLAP TECHNOLOGY

DATA WAREHOUSING - OLAP

Learning Objectives. Definition of OLAP Data cubes OLAP operations MDX OLAP servers

CHAPTER 4 Data Warehouse Architecture

Data W a Ware r house house and and OLAP II Week 6 1

Unit -3. Learning Objective. Demand for Online analytical processing Major features and functions OLAP models and implementation considerations

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP

UNIT-3 OLAP in Data Warehouse

2074 : Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000

OLAP Theory-English version

Anwendersoftware Anwendungssoftwares a. Data-Warehouse-, Data-Mining- and OLAP-Technologies. Online Analytic Processing

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 29-1

CS2032 Data warehousing and Data Mining Unit II Page 1


When to consider OLAP?

OLAP and OLTP. AMIT KUMAR BINDAL Associate Professor M M U MULLANA

Week 3 lecture slides

Data Warehousing and OLAP Technology for Knowledge Discovery

Data Warehouse design

M Designing and Implementing OLAP Solutions Using Microsoft SQL Server Day Course

1. OLAP is an acronym for a. Online Analytical Processing b. Online Analysis Process c. Online Arithmetic Processing d. Object Linking and Processing

OLAP & DATA MINING CS561-SPRING 2012 WPI, MOHAMED ELTABAKH

Monitoring Genebanks using Datamarts based in an Open Source Tool

II. OLAP(ONLINE ANALYTICAL PROCESSING)

The Art of Designing HOLAP Databases Mark Moorman, SAS Institute Inc., Cary NC

Business Intelligence, Analytics & Reporting: Glossary of Terms

Course DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

LITERATURE SURVEY ON DATA WAREHOUSE AND ITS TECHNIQUES

Foundations of Business Intelligence: Databases and Information Management

14. Data Warehousing & Data Mining

Chapter 3, Data Warehouse and OLAP Operations

OLAP Systems and Multidimensional Expressions I

Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Data Warehousing OLAP

Business Intelligence Solutions. Cognos BI 8. by Adis Terzić

Fluency With Information Technology CSE100/IMT100

Week 13: Data Warehousing. Warehousing

John R. Vacca INSIDE

OLAP. Business Intelligence OLAP definition & application Multidimensional data representation

BUSINESS ANALYTICS AND DATA VISUALIZATION. ITM-761 Business Intelligence ดร. สล ล บ ญพราหมณ

A Critical Review of Data Warehouse

Vendor briefing Business Intelligence and Analytics Platforms Gartner 15 capabilities

Data Warehousing and Data Mining in Business Applications

Chapter 6 FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT Learning Objectives

CHAPTER SIX DATA. Business Intelligence The McGraw-Hill Companies, All Rights Reserved

Data Warehouse: Introduction

Data Warehousing. Read chapter 13 of Riguzzi et al Sistemi Informativi. Slides derived from those by Hector Garcia-Molina

Part 22. Data Warehousing

Enterprise Resource Planning Analysis of Business Intelligence & Emergence of Mining Objects

Integrating SAP and non-sap data for comprehensive Business Intelligence

DATA WAREHOUSING APPLICATIONS: AN ANALYTICAL TOOL FOR DECISION SUPPORT SYSTEM

BUILDING OLAP TOOLS OVER LARGE DATABASES

Self-Service Business Intelligence: The hunt for real insights in hidden knowledge Whitepaper

Chapter 5. Warehousing, Data Acquisition, Data. Visualization

OLAP Systems and Multidimensional Queries II

Adobe Insight, powered by Omniture

Data Warehousing. Paper

Data Warehousing. Outline. From OLTP to the Data Warehouse. Overview of data warehousing Dimensional Modeling Online Analytical Processing

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2007 Lecture 16 - Data Warehousing

Turkish Journal of Engineering, Science and Technology

Data Warehousing: Data Models and OLAP operations. By Kishore Jaladi

Oracle OLAP What's All This About?

DSS based on Data Warehouse

DATA CUBES E Jayant Haritsa Computer Science and Automation Indian Institute of Science. JAN 2014 Slide 1 DATA CUBES

Business Intelligence & Product Analytics

Datawarehousing and Business Intelligence

Introduction to Data Warehousing. Ms Swapnil Shrivastava

Web Log Data Sparsity Analysis and Performance Evaluation for OLAP

B.Sc (Computer Science) Database Management Systems UNIT-V

The Role of Data Warehousing Concept for Improved Organizations Performance and Decision Making

Building Data Cubes and Mining Them. Jelena Jovanovic

Hybrid OLAP, An Introduction

Overview of Data Warehousing and OLAP

Decision Support. Chapter 23. Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke 1

BUILDING BLOCKS OF DATAWAREHOUSE. G.Lakshmi Priya & Razia Sultana.A Assistant Professor/IT

DATA WAREHOUSE CONCEPTS DATA WAREHOUSE DEFINITIONS

Building Cubes and Analyzing Data using Oracle OLAP 11g

IST722 Data Warehousing

ORACLE OLAP. Oracle OLAP is embedded in the Oracle Database kernel and runs in the same database process

Data Warehousing and OLAP

Foundations of Business Intelligence: Databases and Information Management

Course MIS. Foundations of Business Intelligence

Outline. Data Warehousing. What is a Warehouse? What is a Warehouse?

Migrating a Discoverer System to Oracle Business Intelligence Enterprise Edition

This tutorial will help computer science graduates to understand the basic-toadvanced concepts related to data warehousing.

Foundations of Business Intelligence: Databases and Information Management

Basics of Dimensional Modeling

Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole

University of Gaziantep, Department of Business Administration

Making confident decisions with the full spectrum of analysis capabilities

Foundations of Business Intelligence: Databases and Information Management

Data Warehouse. MIT-652 Data Mining Applications. Thimaporn Phetkaew. School of Informatics, Walailak University. MIT-652: DM 2: Data Warehouse 1

Financial Series EXCEL-BASED BUDGETING

Dimensional Modeling for Data Warehouse

Bussiness Intelligence and Data Warehouse. Tomas Bartos CIS 764, Kansas State University

Business Intelligence for SUPRA. WHITE PAPER Cincom In-depth Analysis and Review

Self-Service Business Intelligence

Transcription:

A Technical Review on On-Line Analytical Processing (OLAP) K. Jayapriya 1., E. Girija 2,III-M.C.A., R.Uma. 3,M.C.A.,M.Phil., Department of computer applications, Assit.Prof,Dept of M.C.A, Dhanalakshmi Srinivasan College of Arts & Science,Perambalur 621 212. Tamil Nadu. lallipriya1994@gmail.com. girijashivani94@gmail.com. Uky.uma@gmail.com 9791386538. ABSTRACT: Online Analytical Processing (OLAP) refers to the general activity of querying and presenting text and number data from data warehouses and/or data marts for analytical purposes. Database solved one part of the problem that is the data storage. To retrieve the appropriate information from the business data a variety of tools, using different techniques, exists that performs simple or complex tasks involving mathematical and statistical operation. Its main idea is providing navigation through data to nonexpert users, so that they are able to interactively generate ad-hoc queries without the intervention of IT professionals. This name was introduced in contrast to On-Line Transactional Processing (OLTP), so that it reflected the different requirements and characteristics between both kinds of uses. Keywords: Data Mining, Data Warehouse, Decision Support System(DSS), Desktop OLAP (DOLAP), Hybrid OLAP (HOLAP), Multidimensional Data, Multidimensional OLAP (MOLAP), On-Line Analytical Processing (OLAP), Relational OLAP (ROLAP). 1. INTRODUCTION: OLAP (On-Line Analytical Processing) is different from both; it is primarily a software technology concerned with fast analysis of enterprise information. Often OLAP systems are data warehouse front-end software tools to make aggregate data available efficiently, for advanced analysis, to an enterprise s managers. It is essential that an OLAP system provides facilities for a manager to pose ad hoc complex queries to obtain the information that he/she requires. Another term that is being used increasing is business intelligence. It is sometimes used to mean both data warehousing and OLAP. DATA MINING: Data mining is a collection of techniques for efficient automated discovery of previously unknown, valid, novel, useful and understandable patterns must be actionable so that they may be 315

used in an enterprise s decision making process. GROWTH IN OLTP DATA: The first database systems were implemented in the 1960s and 1970s. Many enterprises therefore have more than 30 years of experience in using database systems and they have accumulated large amounts of data during that time. Also as noted earlier, other technological developments also have resulted in growth, in some cases dramatic growth, in accumulated data. DATA MINING PROCESS: Online analytical processing (OLAP) that is designed to assist decision makers in an enterprise by providing summary data based on information that is stored in a data warehouse. 2. GUIDELINES FOR OLAP IMPLEMENTATION: The guidelines are, perhaps not surprisingly, somewhat similar to those presented for data warehouse implementation. Vision: The OLAP team must, in consultation with the users, develop a clear vision for the OLAP system. Senior management support: The OLAP project should be fully supported by the senior managers. Since a data warehouse may have been developed already, is should not be difficult. Selecting an OLAP tool: The OLAP team should familiarize themselves with the ROLAP and MOLAP tools available in the market. Corporate strategy: The OLAP strategy should fit with the enterprise strategy and business objectives. Focus on the users: The OLAP project should be focused on the users. A good GUI user interface should be provided to non-technical users. Joint management: The OLAP project must be managed by both the IT and business professionals. Review and adapt: Organizations evolve and so must the OLAP system. 3. Types of OLAP servers: ROLAP versus MOLAP versus HOLAP: OLAP servers present business users with multidimensional data from data warehouses or data marts, without concerns regarding how or where the data are stored. However, the physical architecture and implementation of OLAP servers must consider data storage issues. Relational OLAP (ROLAP) servers: These are the intermediate servers that stand in between a relational back-end server and client front-end tools. They use a relational or extendedrelational DBMS to store and manage warehouse data, and OLAP middleware to support missing pieces. ROLAP 316

technology tends to have greater scalability than MOLAP technology. The features of ROLAP are; Store and manage warehouse data in relational DBMS or specialized relational DBMS. Maps operations on multidimensional data to standard relational operator. Requires OLAP middleware to support missing pieces. Advantages: o Can handle large amounts of data: The data size limitation of ROLAP technology is the limitation on data size of the underlying relational database. Multidimensional OLAP (MOLAP) servers: These servers support multidimensional views of data through array-based multidimensional storage engines. They map multidimensional views directly to data cube array structures. The advantages of using a data cube is that allows fast indexing to precompiled summarized data. Many MOLAP servers adopt a two-level storage representation to handle dense and sparse data sets. The features of MOLAP are; Store and manage warehouse data in multi-dimensional DBMS. Array based storage structure. Direct access to array data structure. Advantages: o Excellent performance: MOLAP cubes are built for fast data retrieval, and is optimal for slicing and dicing operations. o Can perform complex calculations: All calculations have been pregenerated when the cube is created. Hence, complex calculations are not only doable, but they return quickly. Hybrid OLAP (HOLAP) servers: The hybrid OLAP approach combines ROLAP and MOLAP technology, benefiting from the greater scalability of ROLAP and the faster computation of MOLAP. The Microsoft SOL server 2000 supports a hybrid OLAP server. The features of HOLAP are; Storing detailed data in RDBMS Storing aggregated data in MDDBMS User access via MOLAP tools. 4. Indexing OLAP data: To facilitate efficient data accessing, most data warehouse systems support index structures and materialized views. General methods of select cuboids for materialization were discussed in the previous section. In this section, we examine how to index OLAP data by bitmap indexing and join indexing. 317

The bitmap indexing method is popular in OLAP products because it allows quick searching in data cubes. The bitmap indexing is an alternative representation of the record-id (RID) list. If the attribute has the value v for a given row in the data to 1 in the corresponding row of the bitmap index. All other bits for that row are set to 0. RID item city RID H C P S RID V T R1 H V R1 1 0 0 0 R1 1 0 R2 C V R2 0 1 0 0 R2 1 0 R3 P V R3 0 0 1 0 R3 1 0 R4 S V R4 0 0 0 1 R4 1 0 R5 H T R5 1 0 0 0 R5 0 1 R6 C T R6 0 1 0 0 R6 0 1 R7 P T R7 0 0 1 0 R7 0 1 R8 S T R8 0 0 0 1 R8 0 1 table, then the bit representing that value is Base Table Item Bitmap Index Table City bitmap index table Note: H-Home entertainment. S- Security. P- Phone T- Toronto. C- Computer. V- Vancouver. The join indexing method gained popularly from its use in relational database query processing. Traditional indexing maps the value in a given column to a list of rows having the value. Then the join index record contains the pair (RID, SID), Where RID and SID are record identifiers from the joinable relation. 318

Sales Item Location T57 T238 Main Street T459 Sony- T v T884 Advantages: I. Useful for low cardinality domains. II. Reduction in space and processing time. 5. Efficient processing of OLAP Queries: The purpose of materializing cuboids and constructing OLAP index structure is to speed up query processing in data cubes. Determine which operations should be performed on the available cuboids: This involves transforming any selection, projection, roll-up, and drill-down operations specified in the query into corresponding SQL and/or OLAP operations. Determine to which materialized cuboids the relevant operations should be applied: This involves identifying all of the materialized cuboids that may potentially be used to answer the query, pruning the above set using knowledge of dominance relationships among the cuboids, estimating the costs of using the remaining materialized cuboids, and selecting the cuboids with the least cost. 319

6.From On-Line Analytical Processing To On-Line Analytical Mining: Online analytical mining (OLAM) (also called OLAP mining) integrates on-line analytical processing (OLAP) with data mining and mining knowledge in multidimensional databases. Among the many different paradigms and architectures of data mining systems, OLAM is particularly important for the following reasons. High quality of data in data warehouse. Available information processing infrastructure surrounding data warehouses. OLAP based exploratory data analysis. Online selection of data mining functions. A metadata directory is used to guide the access of the data cube. The data cube can be constructed by accessing and/or integrating multiple databases via an MDDB API and/or by filtering a data warehouse via a database API that may support OLE DB or ODBC connections. Since, such as concept description, association, classification, prediction, clustering, time-series analysis, and usually consists of multiple integrated data mining modules and is more sophisticated than an OLAP server. 7. Architecture for On-Line Analytical Processing: The OLAM and OLAP server platform analytical mining in data cubes in a similar manner as an OLAP server performs on-line analytical processing. 320

Constraint-based Layer 4 interface Mining query Mining result user Graphical user interface API OLAP/OLAM OLAM Engine OLAP Engine Layer 3 Cube API Layer 2 MDDB Meta data Multidimensional Database Database API Layer1 Data filtering data Filtering Dater I integration Data cleaning Databases Databases Data Data integration warehouse 321

The capability of OLAP to provide multiple and dynamic views of summarized data in a data warehouse sets a solid foundation for successful data mining. 8. OLAP Operations in the Multidimensional Data Model: A number of OLAP data cube operations exists to materialize these different views, allowing interactive querying and analysis of the data at hand. Hence OLAP provides a user-friendly environment for interactive data analysis. Roll-up: The roll-up operation performs aggregate on data cube, either by climbing-up a concept hierarchy for a dimension reduction. Drill-down: Drill down is the reverse of roll-up. Drill-down can be realized by either stepping down a concept hierarchy for a dimension or introducing additional dimensions. Slice and dice: The slice operation performs a selection on one dimension of the given cube, resulting in a sub cube. Pivot: Pivot is a visualization operation that rotates the data axes in views in order to provide an alternative presentation of the data. Shows a pivot operation where the item and location axes in a 2-d slice are rotated. 9. Need for OLAP: OLAP is a category of software technology that enables analysts, managers and executives to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information that has been transformed from raw data to reflect the real dimensionality of the enterprise as understood by the user. Calculation and modeling applied across dimensions, through hierarchies and/or across members Trend analysis over sequential time periods Slicing subsets for on-screen viewing Drill-down to deeper levels of consolidation Reach- through to underlying detail data Rotation to new dimensional comparison in the viewing area. OLAP is implemented in a multi-user client/server mode and offers consistently rapid response to queries, regardless of database size and complexity. 10. OLAP Server: An OLAP server is a highcapacity, multi-user data manipulation engine specifically designed to support and operate on multi-dimensional data structures. The OLAP server may either physically stage the processed multidimensional information to deliver consistent and rapid response times to end users, or it may populate its data structures in real-time relational or other databases, or offer a choice of both. Given the current state of technology and the user requirement for consistent and rapid response times, staging the multi dimensional data in the 322

OLAP server is often the preferred method. 11. OLAP Applications: OLAP is widely used in several realms of data management. Some of these applications include: 1. Financial Applications Activity based costing(resource allocation) Budgeting 2. Marketing/Sales Applications Market Research Analysis Sales Forecasting Promotions Analysis Customer Analyses Market/Customer Segmentation 3. Business Modeling Simulating business behavior. Extensive, real time decision support system for managers All of the above applications need ability to provide managers with the information they need to make effective decision about an organisation s strategic directions. The key indicator of a successful OLAP applications is its ability to provide information, as needed, that is, its ability to provide just-in-time information for effective decision making. This requires more than a base level of detailed data. 12. Benefits of using OLAP: OLAP holds several benefits for business: i. OLAP helps managers in decisionmaking through the multidimensional data views that is capable of providing, thus increasing their productivity. ii. OLAP applications are selfsufficient owing to the inherent flexibility provided to the organized databases. iii. It enables simulation of business models and problems, through extensive usage of analysiscapabilities. iv. In conjunction with data warehousing, OLAP can be used to provide reduction in the application backlog, faster information retrieval and reduction in query drag. 13. Conclusion: The applications and benefits of OLAP indicate clearly that is one of the revolutionary technologies that can be used extensively in data management. All business, big or small, can benefit from using OLAP, at some scale or other. The instant access to information and the apparent knowledge gained from this information is invaluable considering against the cost involved in establishing this setup. References: [1] G. k.gupta, Data Mining, Process, OLAP data, OLAP implementation, in Data Mining Eastern Economy Edition. 323

[2] Jiawei Han and Micheline Kamber, Types of OLAP servers, Architecture of OLAP, From OLAP to OLAM, Second Edition. [3] S. Poonkuzhali and C.Saravanakumar, Indexing OLAP data, OLAP Queries, Charulatha Publications in Revised Edition. [4] Alex Berson, Stephen Smith,Kurt Thearling, OLAP operations,tata McGraw-Hill Publishing. [5]http:// Web.iiit.ac.in/, OLAP Applications. 324