Modelling Architecture for Multimedia Data Warehouse


Mital Vora 1, Jelam Vora 2, Dr. N. N. Jani 3

Assistant Professor, Department of Computer Science, T. N. Rao College of I.T., Rajkot, Gujarat, India 1
Assistant Professor, Department of Computer Science, T. N. Rao College of I.T., Rajkot, Gujarat, India 2
Director, SKPIMCS - MCA, Gandhinagar, Gujarat, India 3

DOI: 10.15680/IJIRSET.2015.0401048 www.ijirset.com

ABSTRACT: A data warehouse is an information system used mainly to support strategic decisions. In recent years, the need to manage multimedia data in business decision making has led to the development of the multimedia data warehouse. A multimedia data warehouse is a collection of large volumes of image, audio, video and text data. Storing, accessing and analysing such data efficiently requires dedicated data management, which includes the access and storage mechanisms that support the data warehouse. Storage and retrieval of multimedia data are critical to the overall system's performance and functionality. The multimedia data warehouse must therefore be studied in order to provide an environment in which data can be stored, retrieved and analysed efficiently. In this paper, we propose an architectural framework for building a multimedia data warehouse with the aim of providing better performance. To achieve better storage, access and analysis performance, several techniques are incorporated. Storage efficiency is improved by using an existing compression technique and a partitioning method. Access and analysis efficiency is improved by representing multimedia data with multilevel features and by applying an indexing technique.

KEYWORDS: Multimedia data warehouse, analysis, integration, processing, dimensional modelling

I. INTRODUCTION

A data warehouse is a subject-oriented, integrated, non-volatile and time-variant collection of data that supports management's decision-making process [7]. A data warehouse is built by integrating large amounts of data from multiple heterogeneous sources and is organized to provide vital strategic information. A data warehouse uses dimensional modelling to store large amounts of integrated data. Multidimensional modelling uses a star schema or a snowflake schema to store data in the warehouse. A data warehouse supports analytical reporting, ad hoc queries and decision making. Data warehouses have traditionally stored numeric and textual data for decision making, and most commercial applications are designed to operate with data warehouses of this nature. The majority of data warehouse systems help in analysing numeric data. Much research has been done on designing data warehouses for storing, aggregating and summarizing such data, and data warehouse technology for numeric data is considered mature [3].

The study of multimedia data warehouses is rooted in the traditional areas of multimedia analysis and data warehousing, and started in the late 1990s to early 2000s. In today's business scenario, data is no longer limited to numeric or textual values but includes a wide variety of images, audio, video, maps and so on. To date, new models, architectures and frameworks have continued to emerge in the multimedia data warehouse research and development (R&D) community to store, access and process multimedia data efficiently in a warehouse environment.

Multimedia data is widely used in science, engineering, medicine, modern biology, geography, biometrics, weather forecasting, digital libraries, manufacturing and retailing, art and entertainment, journalism, social sciences and distance learning. These data come in various formats such as image, audio, video, text and signal data. Accessing and analysing such data efficiently requires substantial data management, which includes the storage and access mechanisms that support the multimedia data warehouse. Storage and access of multimedia data are critical to the overall system's performance and functionality. Hence the deployment of new techniques to store, retrieve and process multimedia data is essential. Much remains to be done in complex, multimedia data warehousing [6].

The focus of this paper is on modelling an architectural framework for the multimedia data warehouse that provides efficient data storage and access mechanisms. This requires optimizing the storage structure and designing for lower access latency in query processing. For experimental purposes, we tested the proposed multimedia data warehouse model with biometric face image data, geographic image data and video data.

II. RELATED WORK

In the late 1990s, W. Lee et al. [13] developed a multimedia data warehouse for EoD (Education on Demand) systems. The system is implemented for video data; each relevant shot is preprocessed for its physical and semantic structure to provide appropriate indexing and retrieval characteristics. For data storage they use a star schema model. The authors of [11] proposed a multi-tier image data warehouse framework based on OOAD and component-based development, but did not describe the modelling technique in detail. Researchers have built multimedia data warehouses that can analyse data coming from heterogeneous and distributed sources [12, 5]; [12] provides materialized views for use in the analysis of multimedia data. Multimedia data can be represented by features at different levels of abstraction. Researchers have suggested a multiversion multidimensional model [2, 3] that describes and stores cardiology ECG data with content-based or description-based descriptors. They also use aggregation functions for multimedia data that are integrated into the data warehouse and into the OLAP engine. The limitations of their work are storage efficiency and process optimization. Because of the interoperability and flexibility of XML, researchers have also proposed a model for building an XML-based data warehouse [5].
The work in [5] uses XML technology to build a framework for a web-enabled multimedia data warehouse that supports content-based integration and retrieval of multimedia data and manages changes in data sources efficiently in a distributed environment. The models presented in [1, 14] use semantic-based data. [1] proposed a hierarchical way to store semantic data, using two metadata repositories that describe the data hierarchically in XML files. The authors of [4] built a data warehouse with two ontologies, one for the specific business terms and one for the technical terms specific to the aggregation and knowledge extraction tools. This requires a one-time collaboration between the business experts and the data warehouse designers to produce a mapping between the two ontologies. For multimedia data storage and representation, [14] designed a visual cube that uses two types of dimension: a meta-information dimension, which stores meta-information about the image, and a visual dimension, which stores data based on the image's visual features. They also propose the idea of dynamic aggregation selection. The concept of a content server has also been proposed [9]: the content server holds a repository of multimedia content and maintains indexes over the metadata of the stored content. The work in [8] presents content-based image retrieval using dynamic indexing and guided search combined with data mining and data warehousing techniques. The authors developed a wavelet-based scheme for multiple feature extraction and a multimedia starflake schema for an image data warehouse, which supports multiple feature integration and dynamic image indexing.

III. PROPOSED ARCHITECTURAL FRAMEWORK FOR BUILDING MULTIMEDIA DATA WAREHOUSE

The proposed architectural framework aims to further enhance existing systems for building a multimedia data warehouse.
Because efficient storage and retrieval of multimedia data are critical to the overall system's performance and functionality, the proposed work uses an existing compression technique and partitioning for storage efficiency; for efficient retrieval, it focuses on the representation of multimedia data, indexing and the partitioning mechanism. The proposed architectural framework comprises three main phases: the data extraction and integration phase, the data modelling phase, and the access and analysis phase. The following figure shows the generic architecture of the multimedia data warehouse. Multimedia data is extracted from the source system. These data are then processed to acquire features at different levels. These data are then stored in the data warehouse for data analytics and data retrieval.

Fig. 1 Multimedia Data Warehouse Architecture (components: sources, staging, ETL process, data warehouse, OLAP server, OLAP report, end user)

3.1 Data extraction and integration phase

In this phase, multimedia data is extracted from the operational sources. The relevant characteristics of the multimedia data should be extracted according to the analysis goal. Multimedia data is usually described by features at different levels of abstraction. Low-level features (color, texture, shape, etc.) are widely used to describe multimedia data because they can be extracted automatically by a program. These features seldom represent the semantic content of the multimedia object. For data access and analysis, however, the high-level semantic content is important. For this reason, this work includes low-level feature descriptors, high-level semantic feature descriptors and calculated feature descriptors of the multimedia content. Calculated features are features that can be extracted by processing the multimedia data or derived from the low-level and high-level features. For example, face nodal points and the distances between major nodal points can be extracted from a face image.

After the multimedia data is extracted, low-level feature extraction is performed automatically by a program, and high-level feature extraction is carried out manually. Basic characteristics such as file size, file name, author name, format, compression rate, duration (for video or sound) and resolution (for images) are also extracted automatically. The multimedia data is then compressed using an existing lossy technique. Once prepared in this way, the data is ready to be loaded into the data warehouse. The following figure shows the low-level and high-level feature extraction process.

Fig. 2 Feature extraction process (multimedia data source, image, low-level feature extraction, high-level feature extraction, features, staging)

3.2 Data modelling phase

A data warehouse allows data to be modelled in a multidimensional way and observed from different perspectives. The dimensional model of the warehouse allows the creation of appropriate analysis contexts and the preparation of data for analysis. This requires building multimedia data cubes on which OLAP operations are performed.
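The three descriptor levels introduced in Section 3.1 can be made concrete with a small sketch. The paper does not name the feature extraction algorithms it uses, so everything here is an assumption for illustration: a quantized RGB histogram stands in for the low-level feature, a hand-written dictionary stands in for the manually entered high-level features, and pairwise Euclidean distances between hypothetical face nodal points stand in for the calculated features. The function names and the toy pixel data are invented for this example.

```python
import math
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Low-level feature: normalized histogram over quantized RGB bins."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_: n / total for bin_, n in counts.items()}


def nodal_distances(nodal_points):
    """Calculated feature: Euclidean distance between each pair of named points."""
    names = sorted(nodal_points)
    return {
        (a, b): math.dist(nodal_points[a], nodal_points[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


def build_feature_record(pixels, annotations, nodal_points):
    """Combine the three descriptor levels into one record for the staging area."""
    hist = color_histogram(pixels)
    return {
        "low_level": {"dominant_bin": max(hist, key=hist.get), "histogram": hist},
        "high_level": dict(annotations),  # semantic labels entered by a person
        "calculated": nodal_distances(nodal_points),
    }


# Hypothetical input: a four-pixel "image", manual labels and three nodal points.
pixels = [(250, 10, 10), (240, 20, 5), (255, 0, 0), (0, 0, 255)]
annotations = {"subject": "face", "expression": "neutral"}
nodal_points = {"left_eye": (30, 40), "right_eye": (70, 40), "nose_tip": (50, 70)}

record = build_feature_record(pixels, annotations, nodal_points)
print(record["low_level"]["dominant_bin"])
print(record["calculated"][("left_eye", "right_eye")])
```

A record of this shape maps naturally onto dimension rows: the low-level and high-level entries become attributes of feature dimensions, and the calculated distances can be stored as additional attributes or measures.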

To design the multimedia data warehouse, logical and physical models have been developed. The logical dimensional model shows the main entities and their relationships in a logically sound manner and serves as the model for physical implementation. The physical dimensional model shows the actual representation of dimensions and facts in the data warehouse as they are implemented. The proposed architectural framework uses the star schema technique of multidimensional modelling. A star schema contains one fact table surrounded by a number of dimension tables. Facts are considered the dynamic part of the warehouse, and dimensions are considered static entities because they are computed once during the ETL process. The proposed schema includes a fact table containing measures for the multimedia data; dimensions are created for the low-level and high-level features of the multimedia data, for data related to analysing the multimedia data, and for other data required by the targeted application. After these data are stored, indexes are created on feature attributes to speed up query processing. For better storage, management and processing performance, partitioning is applied on a year basis.

3.3 Access and analysis phase

In the data analysis phase, an end-user tool has been created that accesses and analyses multimedia data from the multimedia data warehouse. End users can analyse multimedia data by the multiple features provided. The storage and access efficiency of multimedia data is studied.

IV. EXPERIMENTAL RESULTS

To validate the proposed architectural framework for the multimedia data warehouse, experiments were performed in a small-scale OLAP environment. Compression is applied only to the multimedia data. To show the query duration for simple, middle and complex level queries, two sample queries were evaluated for each level. The following figure shows the query performance for compressed image data without indexing or partitioning.

Fig. 3 Query duration for set of queries

To show the query duration for simple, middle and complex level queries with indexing, the same two sample queries for each level were evaluated in the small-scale OLAP environment. The following figure shows the query performance for compressed image data using the indexing approach.

Fig. 4 Query duration for set of queries by using indexing
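The effect of indexing feature attributes in a star schema can be reproduced on a very small scale. The sketch below is not the authors' implementation: the paper does not name its DBMS, table names or columns, so the schema (`fact_image`, `dim_feature`, `dim_time`), the synthetic rows and the index names are all hypothetical, and SQLite (through Python's standard `sqlite3` module) is used only because it runs anywhere. It builds one fact table with two dimensions, runs an aggregate star-join query, and compares the query plan of a feature filter before and after creating the indexes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_time    (time_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE dim_feature (feature_id INTEGER PRIMARY KEY,
                          dominant_color TEXT,   -- low-level feature
                          subject TEXT);         -- high-level feature
CREATE TABLE fact_image  (image_id INTEGER PRIMARY KEY,
                          time_id INTEGER REFERENCES dim_time(time_id),
                          feature_id INTEGER REFERENCES dim_feature(feature_id),
                          size_kb INTEGER);      -- measure
""")

# Synthetic, deterministic rows (hypothetical data, 300 images over two years).
colors, subjects = ["red", "green", "blue"], ["face", "landscape"]
conn.executemany("INSERT INTO dim_time VALUES (?, ?)", [(1, 2013), (2, 2014)])
conn.executemany("INSERT INTO dim_feature VALUES (?, ?, ?)",
                 [(i, colors[i % 3], subjects[i % 2]) for i in range(1, 301)])
conn.executemany("INSERT INTO fact_image VALUES (?, ?, ?, ?)",
                 [(i, 1 + i % 2, i, 100 + i) for i in range(1, 301)])

# Star join: image count and total size per year for one feature value.
STAR_QUERY = """
SELECT t.year, COUNT(*), SUM(f.size_kb)
FROM fact_image f
JOIN dim_feature d ON d.feature_id = f.feature_id
JOIN dim_time t    ON t.time_id = f.time_id
WHERE d.dominant_color = 'red'
GROUP BY t.year
ORDER BY t.year
"""
FILTER_QUERY = "SELECT feature_id FROM dim_feature WHERE dominant_color = 'red'"


def plan(sql):
    """Return the human-readable steps SQLite reports for a query."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


result_before = conn.execute(STAR_QUERY).fetchall()
plan_before = plan(FILTER_QUERY)

# Index the feature attribute and the fact table's foreign key.
conn.execute("CREATE INDEX idx_feature_color ON dim_feature(dominant_color)")
conn.execute("CREATE INDEX idx_fact_feature ON fact_image(feature_id)")

result_after = conn.execute(STAR_QUERY).fetchall()
plan_after = plan(FILTER_QUERY)

print(result_after)
print(plan_before)
print(plan_after)
```

The query results are identical before and after indexing; only the access path changes, from a full scan of the feature dimension to a lookup through `idx_feature_color`. On a toy data set the timing difference is negligible, so this illustrates the mechanism rather than the measured durations in the figures.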

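Year-based partitioning, as applied in the data modelling phase, helps because a query restricted to a range of years only has to touch the partitions for those years. The paper does not describe the partitioning mechanism of its DBMS, so the following is a plain-Python sketch of the idea under that assumption: the class `YearPartitionedFacts`, its row layout and the synthetic data are hypothetical and serve only to show partition pruning.

```python
from collections import defaultdict


class YearPartitionedFacts:
    """Toy fact store that keeps one partition (a list of rows) per year."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, row):
        # Route each row to the partition of its year.
        self.partitions[row["year"]].append(row)

    def query(self, year_from, year_to, predicate=lambda row: True):
        """Return matching rows and the number of rows actually scanned."""
        scanned = 0
        hits = []
        for year in sorted(self.partitions):
            if not (year_from <= year <= year_to):
                continue  # partition pruning: this year's rows are never read
            for row in self.partitions[year]:
                scanned += 1
                if predicate(row):
                    hits.append(row)
        return hits, scanned


# Hypothetical data: 100 image rows per year for 2010 to 2014.
colors = ["red", "green", "blue"]
store = YearPartitionedFacts()
for year in range(2010, 2015):
    for i in range(100):
        store.insert({"image_id": year * 1000 + i,
                      "year": year,
                      "dominant_color": colors[i % 3]})

total_rows = sum(len(rows) for rows in store.partitions.values())
hits, scanned = store.query(2013, 2014,
                            lambda row: row["dominant_color"] == "red")
print(len(hits), scanned, total_rows)
```

Here the query reads only the two partitions inside the requested range instead of all five. A production DBMS would combine this pruning with indexes inside each partition, which corresponds to the combined indexing and partitioning configuration evaluated next.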
To show the query duration for simple, middle and complex level queries with both techniques, the same two sample queries for each level were evaluated in the small-scale OLAP environment. The following figure shows the query performance for compressed image data using indexing and partitioning together.

Fig. 5 Query duration for set of queries by using indexing and partitioning

The above figure shows improved data retrieval efficiency when indexing and partitioning are used, compared with data retrieval without indexing and partitioning.

V. CONCLUSION

This paper has presented a systematic approach to building an architectural framework for a multimedia data warehouse in a generic way, so that the techniques can be applied to a wide range of multimedia data warehouse models and implementations. Implementing these techniques helps to build a better multimedia model. Storage efficiency is improved by using the provided compression and partitioning techniques. Access and analysis efficiency is improved by representing multimedia data with multilevel features, by applying an indexing technique and by using a partitioning technique. With the proposed approach, we obtain not only efficient storage and processing in the multimedia data warehouse but also better analysis of multimedia data.

REFERENCES

1. Andrei Vanea, Rodica Potolea, "A Hierarchical Semantically Enhanced Multimedia Data Warehouse", IEEE, 2006.
2. Anne-Muriel Arigon, Anne Tchounikine, Maryvonne Miquel, "Handling Multiple Points of View in a Multimedia Data Warehouse", ACM, Vol. 2, No. 3, pp. 199-218, August 2006.
3. Anne-Muriel Arigon, Maryvonne Miquel, Anne Tchounikine, "Multimedia data warehouses: a multiversion model and a medical application", Springer Science, 2007.
4. G. Xie, Y. Yang, S. Liu, Z. Qiu, Y. Pan, X. Zhou, "EIAW: Towards a Business-friendly Data Warehouse Using Semantic Web Technologies", The Semantic Web, 6th International Semantic Web Conference, 2nd Asian Semantic Web Conference, ISWC 2007 + ASWC 2007.
5. Hyon Hee Kim, Seung Soo Park, "Building a Web-Enabled Multimedia Data Warehouse", Springer, Volume 2713, pp. 594-600, 2003.
6. H. Mahboubi, J.C. Ralaivao, S. Loudcher, O. Boussaid, F. Bentayeb, J. Darmont, "X-WACoDa: An XML-based approach for Warehousing and Analyzing Complex Data", Advances in Data Warehousing and Mining, IGI Publishing, 2009.
7. Inmon W., Hackathorn R., "Using the data warehouse", Wiley-QED Publishing, Somerset, NJ, USA, 1994.
8. Jane You, Qin Li, Jinghua Wang, "Image retrieval by dynamic indexing and guided search", Proceedings of the 8th IEEE International Conference on Cognitive Informatics, 2009.
9. Meenakshi Srivastava, Dr. S. K. Singh, Dr. S. Q. Abbas, "An Architecture for Creation of Multimedia Data Warehouse", IJESIT, Volume 2, Issue 4, pp. 309-315, July 2013.
10. Mohd. Fraz, Ajay Indian, Hina Saxena, Saurabh Verma, "Improving Compression Efficiency of Warehouse", International Journal of Scientific & Engineering Research (IJSER), Volume 4, Issue 7, pp. 1575-1578, July 2013.
11. Stephen T C Wong, Kent Soo Hoo Jr, Robert C Knowlton, Kenneth D Laxer, Xinhau Cao, Randall A Hawkins, William P Dillon, Ronald L Arenson, "Design and Applications of a Multimodality Image Data Warehouse Framework", Journal of the American Medical Informatics Association, Volume 9, No. 3, pp. 239-254, May/Jun 2002.
12. Tania Cerquitelli, Genoveva Vargas Solar, Jose Luis Zechinelli Martini, "Building Multimedia Data Warehouse from Distributed Data", e-Gnosis, Vol. 2, 2004.
13. Wookey Lee, Yongkyu Kim, Yunsun Lee, Jinho Kim, "Developing multimedia data warehouse of education on-demand systems", pp. 942-945, IEEE, 1999.
14. Xin Jin, Jiawei Han, Liangliang Cao, Jiebo Luo, Bolin Ding, Cindy Xide Lin, "Visual Cube and On-Line Analytical Processing of Images", ACM, 2010.