A Framework for the Design of Distributed Databases
Fernanda Baião, Marta Mattoso
Computer Science Department, COPPE/UFRJ, Federal University of Rio de Janeiro, Brazil
Gerson Zaverucha

This work presents a framework to handle the class fragmentation problem during the design of distributed object databases. The framework works at the conceptual level, and thus uses the object data model to capture the application semantics represented by the user. The proposed framework integrates three modules. The heuristic module defines a set of heuristics to drive the fragmentation of object databases and incorporates them in a methodology that includes an analysis algorithm and horizontal and vertical class fragmentation algorithms. The theory revision module automatically improves the analysis algorithm through the use of an artificial intelligence technique named theory revision, using fragmentation schemas with previously known performance presented as examples. Finally, the branch-and-bound module uses optimization techniques to perform an intelligent search for an optimal fragmentation schema through a larger space of hypotheses than the one covered by the heuristic approach.

INTRODUCTION

Distributed and parallel processing on database management systems (DBMS) is an efficient way of improving the performance of applications that manipulate large volumes of data. This may be accomplished by removing irrelevant data accessed during the execution of queries and by reducing the data exchange among sites, which are the two main goals of the design of distributed databases [1]. Also, many recent problem domains are supported by applications that are typically more complex than traditional applications, in addition to their great volume of data.
Those applications require, at least at the conceptual level, a semantically richer data model, such as the object data model, that can directly represent complex structures and operations in a more natural and adequate manner. Therefore, to improve the performance of those applications, it is very important to design the information distribution properly and to take the application semantics into account as much as possible.

The distribution design involves making decisions on the fragmentation and placement of data across the sites of a computer network. In a top-down approach, the first phase of the distribution design is the fragmentation phase, which is the process of clustering into fragments the information accessed simultaneously by applications. It is followed by the allocation phase, which handles the physical storage of the generated fragments among the nodes of a computer network, and the replication of fragments.

This work addresses the fragmentation phase. We believe that, by outputting good fragmentation schemas with improved performance, data allocation and replication may then be carried out more efficiently, since the fragmentation schema will adequately reflect appropriate units of distribution according to the application access patterns, and thus may significantly reduce the search space of the allocation phase.

However, generating a good fragmentation schema for a database that uses the object data model is a difficult task, for four basic reasons: (i) it is not a well-defined problem; (ii) it must take many parameters into account; (iii) it has many conflicting goals; and (iv) it requires estimates and heuristics that may sometimes conflict. Nevertheless, the designer may concentrate on semantic relationships, leaving physical distribution design to the last phase.

To fragment a class, two basic techniques are available: horizontal fragmentation and vertical fragmentation. In object databases, horizontal fragmentation distributes class instances across the fragments, so a horizontal fragment of a class contains a subset of the whole class extension. Vertical fragmentation (VF), on the other hand, breaks the logical structure of the class (its attributes and methods) and distributes them across the fragments.

Horizontal fragmentation is usually subdivided into primary and derived horizontal fragmentation. Primary horizontal fragmentation (PHF) basically optimizes set operations (searches over a class extension), first by reducing the amount of irrelevant data accessed and second by permitting applications to be executed concurrently, thus achieving a high degree of parallelism. Derived horizontal fragmentation (DHF), on the other hand, can be viewed as a way of clustering objects of distinct classes on disk, thereby directly addressing the relationships between classes and improving the performance of applications with navigational access.
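The difference between the two basic techniques can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration only: the `Employee` class, its attributes, the predicates and the attribute groups do not come from this work, and methods (which vertical fragmentation also distributes) are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:  # hypothetical class, used only for illustration
    oid: int
    name: str
    dept: str
    salary: float


def horizontal_fragments(extension, predicates):
    """Split the class extension into subsets of instances, one per predicate."""
    return {
        label: [obj for obj in extension if pred(obj)]
        for label, pred in predicates.items()
    }


def vertical_fragments(extension, attribute_groups):
    """Split the class structure: each fragment keeps the object identifier
    plus one group of attributes, for every instance of the class."""
    return {
        label: [
            {"oid": obj.oid, **{attr: getattr(obj, attr) for attr in attrs}}
            for obj in extension
        ]
        for label, attrs in attribute_groups.items()
    }


extension = [
    Employee(1, "Ana", "sales", 3000.0),
    Employee(2, "Bruno", "research", 4200.0),
    Employee(3, "Carla", "sales", 3900.0),
]

# Horizontal: every instance lands in exactly one fragment.
hf = horizontal_fragments(
    extension,
    {
        "sales": lambda e: e.dept == "sales",
        "other": lambda e: e.dept != "sales",
    },
)

# Vertical: every fragment holds all instances, but only some attributes.
vf = vertical_fragments(
    extension,
    {"public": ["name", "dept"], "payroll": ["salary"]},
)
```

In this sketch a query that only scans the sales department touches one horizontal fragment, while a query that only reads salaries touches one vertical fragment; both avoid the irrelevant data mentioned above.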
It is also possible to apply both vertical and horizontal fragmentation to the same class (which we call hybrid fragmentation) or to apply different fragmentation techniques to different classes of the database schema (which we call mixed fragmentation).

There are many approaches in the literature addressing the DDODB problem [2, 3, 4, 5, 6]. However, due to its complexity, most of them rely on a specific set of estimates and heuristics. Also, some approaches require an instantiated database to work on, which may limit their application. Most importantly, the distribution design algorithms presented are limited to applying a single fragmentation technique (horizontal or vertical, but not both) to all classes of the schema. We have already pointed out in [7] the benefits of mixed fragmentation (that is, the combination of vertical and horizontal fragmentation in different classes of the schema) and hybrid fragmentation (in the same class) for increasing the performance of applications. It is also important to analyze the database schema and the application characteristics in order to propose good fragmentation schemas. However, such issues are not addressed in other works in the literature.
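As a small illustration of the two terms, the sketch below represents a fragmentation schema as a mapping from class names to the set of techniques applied to each class. The representation, the function names and the example classes are assumptions made for this illustration and are not part of the framework.

```python
HORIZONTAL = "horizontal"
VERTICAL = "vertical"


def hybrid_classes(schema):
    """Classes to which both horizontal and vertical fragmentation are applied."""
    return sorted(
        name
        for name, techniques in schema.items()
        if {HORIZONTAL, VERTICAL} <= set(techniques)
    )


def is_mixed(schema):
    """True when different fragmented classes use different technique sets."""
    distinct = {frozenset(t) for t in schema.values() if t}
    return len(distinct) > 1


# Hypothetical schema: class name -> techniques applied to that class.
schema = {
    "Employee": {HORIZONTAL},
    "Project": {VERTICAL},
    "Document": {HORIZONTAL, VERTICAL},
    "Country": set(),  # left unfragmented
}

horizontal_only = {"Employee": {HORIZONTAL}, "Project": {HORIZONTAL}}
```

Here `hybrid_classes(schema)` picks out `Document`, `is_mixed(schema)` holds because the classes do not all use the same technique, and `horizontal_only` corresponds to the single-technique schemas produced by the approaches discussed above.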
FRAMEWORK FOR THE DESIGN OF DISTRIBUTED DATABASES

This work presents a framework to handle the class fragmentation problem during the design of distributed databases, using the object model at the conceptual level. This way, the ideas presented may be applied in different environments (such as domains where data is managed by object-relational or object-oriented database management systems), as long as the application conceptual model is compatible with the object-oriented model defined in this work. The proposed framework (illustrated in Figure 1) integrates three modules: the DDODB heuristic module, the theory revision module (TREND3) and the DDODB branch-and-bound module.

[Figure 1 diagram not reproduced. Elements shown: Distribution Designer; Database Application (semantics + operations + quantitative info); DDODB Heuristic Module (AA, VF, HF); Analysis Algorithm (initial theory); Improved Analysis Algorithm (revised theory); known fragmentation schemas (examples); TREND3 Module containing the FORTE Module; DDODB Branch-and-Bound Module; Query Processing Cost Function; good fragmentation schema; optimal fragmentation schema (examples).]

Figure 1. Overall framework for the class fragmentation in the DDODB

The distribution designer provides input information about the database semantics, the operations that will be executed over the stored data and additional quantitative information, such as the estimated cardinality of each class. This information is then passed to the DDODB heuristic module. The DDODB heuristic module defines a set of heuristics to search for the best fragmentation schema for a given database application. The algorithms of the heuristic module (AA: Analysis Algorithm, VF: Vertical Fragmentation, HF: Horizontal Fragmentation) follow this set of heuristics and quickly output a good fragmentation schema, which the distribution designer can then implement on the database. Intermediate results of the heuristic module are presented in [7, 8, 9].
Performance results from these works have demonstrated the effectiveness of the DDODB heuristic module in an experimental study on top of the OO7 benchmark.

The set of heuristics implemented by the DDODB heuristic module may be further improved automatically by executing a theory revision process through the use of inductive logic programming (ILP) [10]. This process is called Theory REvisioN on the Design of Distributed Databases (TREND3), and is represented in our framework by the TREND3 module [11]. The improvement process may be carried out by providing two input parameters to the TREND3 module: the PROLOG implementation of the analysis algorithm (representing the initial theory) and a fragmentation schema with previously known performance (representing a set of examples). The analysis algorithm is then automatically modified by a theory revision system (called FORTE) so as to produce a revised theory. The revised theory represents an improved analysis algorithm that is able to output the fragmentation schema given as input, and this revised analysis algorithm then replaces the original one in the DDODB heuristic module.

Additionally, the input information from the distribution designer may be passed to our third module, the DDODB branch-and-bound module [12]. This module is an alternative to the heuristic module for searching for the best fragmentation schema for a given database application. The branch-and-bound procedure searches for an optimal solution in the space of potentially good fragmentation schemas for an application and outputs its result to the distribution designer. Although the search space covered by the branch-and-bound algorithm is much larger than the one covered by the heuristic algorithm, its execution cost is also much higher. To handle this, the branch-and-bound algorithm tries to bound its search for the best fragmentation schema by using a query processing cost function to evaluate each fragmentation schema in the hypothesis space. This cost function, defined in [13], is responsible for estimating the execution cost of queries on top of the distributed database being evaluated. The branch-and-bound algorithm then discards all the fragmentation schemas with an estimated cost higher than the cost of the fragmentation schema output by the heuristic module.
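The bounding idea can be sketched in Python as follows. This is a toy illustration under strong assumptions, not the algorithm of [12]: the cost of a schema is taken to be a sum of non-negative per-class costs read from a hypothetical `cost_table` (so the cost of a partial schema is a valid lower bound), whereas the actual cost function of [13] estimates query execution cost. The class names, technique labels and cost values are invented.

```python
TECHNIQUES = ("none", "horizontal", "vertical", "hybrid")


def schema_cost(schema, cost_table):
    """Toy cost function: sum of hypothetical per-class costs."""
    return sum(cost_table[cls][tech] for cls, tech in schema.items())


def branch_and_bound(classes, cost_table, heuristic_schema):
    """Search all technique assignments, discarding every branch whose cost
    already exceeds the best known cost. The initial bound is the cost of the
    schema produced by the heuristic module."""
    best_schema = dict(heuristic_schema)
    best_cost = schema_cost(best_schema, cost_table)

    def explore(index, partial, partial_cost):
        nonlocal best_schema, best_cost
        if partial_cost > best_cost:
            return  # bound: this branch cannot beat the best known schema
        if index == len(classes):
            if partial_cost < best_cost:
                best_schema, best_cost = dict(partial), partial_cost
            return
        cls = classes[index]
        for tech in TECHNIQUES:  # branch: try each technique for this class
            partial[cls] = tech
            explore(index + 1, partial, partial_cost + cost_table[cls][tech])
        del partial[cls]

    explore(0, {}, 0.0)
    return best_schema, best_cost


# Hypothetical inputs, for illustration only.
classes = ["Employee", "Project"]
cost_table = {
    "Employee": {"none": 10.0, "horizontal": 4.0, "vertical": 7.0, "hybrid": 5.0},
    "Project": {"none": 8.0, "horizontal": 6.0, "vertical": 3.0, "hybrid": 4.0},
}
heuristic_schema = {"Employee": "horizontal", "Project": "horizontal"}

best, cost = branch_and_bound(classes, cost_table, heuristic_schema)
```

Starting from the heuristic schema gives the search a finite bound before any branch is explored, which is what allows large parts of the hypothesis space to be discarded early.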
Finally, the result of the branch-and-bound algorithm, as well as the fragmentation schemas discarded during the search, may generate examples (positive or negative) for the TREND3 module, thus incorporating the branch-and-bound results into the DDODB heuristic module. The complete framework is detailed in [14].

CONCLUSIONS

This work presents a framework to handle the class fragmentation problem during the design of distributed object databases. The framework works at the conceptual level, and thus uses the object data model to capture the application semantics represented by the user. The proposed framework integrates three modules (heuristic, knowledge-based and branch-and-bound). The heuristic module defines a set of heuristics to drive the fragmentation of object databases and incorporates them in a methodology that includes an analysis algorithm and horizontal and vertical class fragmentation algorithms. It thereby addresses the need, mentioned by Özsu and Valduriez [1], for a distribution design methodology that encompasses the horizontal and vertical fragmentation algorithms and uses them as part of a more general strategy. Experiments using our methodology resulted in fragmentation schemas that performed better than other fragmentation schemas proposed in the literature.

The main contribution of the heuristic module is the analysis phase, which chooses the most adequate fragmentation technique to be applied to each class of the database schema, based on heuristics derived from previously obtained experimental results. With the algorithms currently proposed in the literature, the distribution designer is induced to apply one single type of fragmentation to all classes. Even when the designer decides to apply a horizontal fragmentation algorithm to one class and a vertical fragmentation algorithm to another, the designer is left with no assistance in making this decision.

REFERENCES

[1] M. Özsu and P. Valduriez, Principles of Distributed Database Systems, 2nd edition (1st edition 1991), New Jersey, Prentice-Hall,
[2] L. Bellatreche, K. Karlapalem and A. Simonet, "Algorithms and Support for Horizontal Class Partitioning in Object-Oriented Databases", International Journal of Distributed and Parallel Databases, Kluwer Academic Publishers, vol. 8(2), 2000, pp
[3] Y. Chen and S. Su, "Implementation and Evaluation of Parallel Query Processing Algorithms and Data Partitioning Heuristics in Object Oriented Databases", International Journal of Distributed and Parallel Databases, Kluwer Academic Publishers, vol. 4(2), 1996, pp
[4] C. Ezeife and K. Barker, "Distributed Object Based Design: Vertical Fragmentation of Classes", International Journal of Distributed and Parallel Databases, Kluwer Academic Publishers, vol. 6(4), 1998, pp
[5] K. Karlapalem, S. Navathe and M. Morsi, "Issues in Distribution Design of Object-Oriented Databases". In M. Özsu et al. (eds.), Distributed Object Management, Morgan Kaufmann Publishers Inc., San Francisco, USA,
[6] M. Savonnet, M. Terrasse and K. Yétongnon, "Fragtique: A Methodology for Distributing Object Oriented Databases". In Proceedings of the International Conference on Computing and Information (ICCI'98), Winnipeg, Canada, 1998, pp
[7] F. Baião and M. Mattoso, "A Mixed Fragmentation Algorithm for Distributed Object Oriented Databases". In Special Issue of the Journal of Computing and Information (JCI), vol. 3(1), ICCI 98, March 2000, ISSN , pp
[8] F. Baião, M. Mattoso and G. Zaverucha, "Towards an Inductive Design of Distributed Object Oriented Databases". In Proceedings of the Third IFCIS Conference on Cooperative Information Systems (CoopIS'98), IEEE CS Press, New York, USA, Aug 1998, pp
[9] F. Baião, M. Mattoso and G. Zaverucha, "Horizontal Fragmentation in Object DBMS: New Issues and Performance Evaluation". In Proceedings of the 19th IEEE International Performance, Computing and Communications Conference (IPCCC 2000), IEEE CS Press, Phoenix, Feb 2000, pp
[10] N. Lavrac and S. Džeroski, Inductive Logic Programming: Techniques and Applications, Ellis Horwood,
[11] F. Baião, M. Mattoso, J. Shavlik and G. Zaverucha, "Applying Theory Revision in the Design of Distributed Databases". In preparation, Feb
[12] F. Baião, M. Mattoso, J. Shavlik and G. Zaverucha, "A Branch-and-Bound Approach for the Design of Distributed Databases". In preparation, Feb
[13] G. Ruberg, F. Baião and M. Mattoso, "A Cost Model for the Evaluation of Path Expressions in Distributed Object Databases", submitted for publication, Nov
[14] F. Baião, "A Methodology and Algorithms for the Design of Distributed Databases using Theory Revision", D.Sc. Thesis, COPPE/UFRJ, Dec (
More informationChapter 10. Practical Database Design Methodology. The Role of Information Systems in Organizations. Practical Database Design Methodology
Chapter 10 Practical Database Design Methodology Practical Database Design Methodology Design methodology Target database managed by some type of database management system Various design methodologies
More informationSecure Data Transfer and Replication Mechanisms in Grid Environments p. 1
Secure Data Transfer and Replication Mechanisms in Grid Environments Konrad Karczewski, Lukasz Kuczynski and Roman Wyrzykowski Institute of Computer and Information Sciences, Czestochowa University of
More informationData Engineering for the Analysis of Semiconductor Manufacturing Data
Data Engineering for the Analysis of Semiconductor Manufacturing Data Peter Turney Knowledge Systems Laboratory Institute for Information Technology National Research Council Canada Ottawa, Ontario, Canada
More informationLightweight Service-Based Software Architecture
Lightweight Service-Based Software Architecture Mikko Polojärvi and Jukka Riekki Intelligent Systems Group and Infotech Oulu University of Oulu, Oulu, Finland {mikko.polojarvi,jukka.riekki}@ee.oulu.fi
More informationSoftware Requirements Metrics
Software Requirements Metrics Fairly primitive and predictive power limited. Function Points Count number of inputs and output, user interactions, external interfaces, files used. Assess each for complexity
More informationAn Analysis of Four Missing Data Treatment Methods for Supervised Learning
An Analysis of Four Missing Data Treatment Methods for Supervised Learning Gustavo E. A. P. A. Batista and Maria Carolina Monard University of São Paulo - USP Institute of Mathematics and Computer Science
More informationSEMANTIC WEB BASED INFERENCE MODEL FOR LARGE SCALE ONTOLOGIES FROM BIG DATA
SEMANTIC WEB BASED INFERENCE MODEL FOR LARGE SCALE ONTOLOGIES FROM BIG DATA J.RAVI RAJESH PG Scholar Rajalakshmi engineering college Thandalam, Chennai. ravirajesh.j.2013.mecse@rajalakshmi.edu.in Mrs.
More informationSoftware Life-Cycle Management
Ingo Arnold Department Computer Science University of Basel Theory Software Life-Cycle Management Architecture Styles Overview An Architecture Style expresses a fundamental structural organization schema
More informationBackground knowledge-enrichment for bottom clauses improving.
Background knowledge-enrichment for bottom clauses improving. Orlando Muñoz Texzocotetla and René MacKinney-Romero Departamento de Ingeniería Eléctrica Universidad Autónoma Metropolitana México D.F. 09340,
More informationVII. Database System Architecture
VII. Database System Lecture Topics Monolithic systems Client/Server systems Parallel database servers Multidatabase systems CS338 1 Monolithic System DBMS File System Each component presents a well-defined
More informationKEEP THIS COPY FOR REPRODUCTION PURPOSES. I ~~~~~Final Report
MASTER COPY KEEP THIS COPY FOR REPRODUCTION PURPOSES 1 Form Approved REPORT DOCUMENTATION PAGE I OMS No. 0704-0188 Public reoorting burden for this collection of information is estimated to average I hour
More informationFig. 3. PostgreSQL subsystems
Development of a Parallel DBMS on the Basis of PostgreSQL C. S. Pan kvapen@gmail.com South Ural State University Abstract. The paper describes the architecture and the design of PargreSQL parallel database
More informationLoad balancing in a heterogeneous computer system by self-organizing Kohonen network
Bull. Nov. Comp. Center, Comp. Science, 25 (2006), 69 74 c 2006 NCC Publisher Load balancing in a heterogeneous computer system by self-organizing Kohonen network Mikhail S. Tarkov, Yakov S. Bezrukov Abstract.
More informationLDIF - Linked Data Integration Framework
LDIF - Linked Data Integration Framework Andreas Schultz 1, Andrea Matteini 2, Robert Isele 1, Christian Bizer 1, and Christian Becker 2 1. Web-based Systems Group, Freie Universität Berlin, Germany a.schultz@fu-berlin.de,
More informationExplanation-Oriented Association Mining Using a Combination of Unsupervised and Supervised Learning Algorithms
Explanation-Oriented Association Mining Using a Combination of Unsupervised and Supervised Learning Algorithms Y.Y. Yao, Y. Zhao, R.B. Maguire Department of Computer Science, University of Regina Regina,
More informationDistributed Databases
Distributed Databases Chapter 1: Introduction Johann Gamper Syllabus Data Independence and Distributed Data Processing Definition of Distributed databases Promises of Distributed Databases Technical Problems
More informationData-intensive HPC: opportunities and challenges. Patrick Valduriez
Data-intensive HPC: opportunities and challenges Patrick Valduriez Big Data Landscape Multi-$billion market! Big data = Hadoop = MapReduce? No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard,
More informationSEARCHING AND KNOWLEDGE REPRESENTATION. Angel Garrido
Acta Universitatis Apulensis ISSN: 1582-5329 No. 30/2012 pp. 147-152 SEARCHING AND KNOWLEDGE REPRESENTATION Angel Garrido ABSTRACT. The procedures of searching of solutions of problems, in Artificial Intelligence
More informationObjectives. Distributed Databases and Client/Server Architecture. Distributed Database. Data Fragmentation
Objectives Distributed Databases and Client/Server Architecture IT354 @ Peter Lo 2005 1 Understand the advantages and disadvantages of distributed databases Know the design issues involved in distributed
More informationFRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS
FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS Breno C. Costa, Bruno. L. A. Alberto, André M. Portela, W. Maduro, Esdras O. Eler PDITec, Belo Horizonte,
More informationMicrosoft TMG Replacement with NetScaler
Microsoft TMG Replacement with NetScaler Replacing Microsoft Forefront TMG with NetScaler for Optimization This deployment guide focuses on replacing Microsoft Forefront Threat Management Gateway (TMG)
More informationA Lab Course on Computer Architecture
A Lab Course on Computer Architecture Pedro López José Duato Depto. de Informática de Sistemas y Computadores Facultad de Informática Universidad Politécnica de Valencia Camino de Vera s/n, 46071 - Valencia,
More informationA Fast Partial Memory Approach to Incremental Learning through an Advanced Data Storage Framework
A Fast Partial Memory Approach to Incremental Learning through an Advanced Data Storage Framework Marenglen Biba, Stefano Ferilli, Floriana Esposito, Nicola Di Mauro, Teresa M.A Basile Department of Computer
More informationA Flexible Machine Learning Environment for Steady State Security Assessment of Power Systems
A Flexible Machine Learning Environment for Steady State Security Assessment of Power Systems D. D. Semitekos, N. M. Avouris, G. B. Giannakopoulos University of Patras, ECE Department, GR-265 00 Rio Patras,
More informationPHP FRAMEWORK FOR DATABASE MANAGEMENT BASED ON MVC PATTERN
PHP FRAMEWORK FOR DATABASE MANAGEMENT BASED ON MVC PATTERN Chanchai Supaartagorn Department of Mathematics Statistics and Computer, Faculty of Science, Ubon Ratchathani University, Thailand scchansu@ubu.ac.th
More informationSPATIAL DATA CLASSIFICATION AND DATA MINING
, pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal
More informationDistributed Database Management Systems
Page 1 Distributed Database Management Systems Outline Introduction Distributed DBMS Architecture Distributed Database Design Distributed Query Processing Distributed Concurrency Control Distributed Reliability
More informationExtend Table Lens for High-Dimensional Data Visualization and Classification Mining
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining CPSC 533c, Information Visualization Course Project, Term 2 2003 Fengdong Du fdu@cs.ubc.ca University of British Columbia
More informationPhysical Database Design and Tuning
Chapter 20 Physical Database Design and Tuning Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley 1. Physical Database Design in Relational Databases (1) Factors that Influence
More informationBachelor Degree in Informatics Engineering Master courses
Bachelor Degree in Informatics Engineering Master courses Donostia School of Informatics The University of the Basque Country, UPV/EHU For more information: Universidad del País Vasco / Euskal Herriko
More informationSelective Naive Bayes Regressor with Variable Construction for Predictive Web Analytics
Selective Naive Bayes Regressor with Variable Construction for Predictive Web Analytics Boullé Orange Labs avenue Pierre Marzin 3 Lannion, France marc.boulle@orange.com ABSTRACT We describe our submission
More informationAdding Semantics to Business Intelligence
Adding Semantics to Business Intelligence Denilson Sell 1,2, Liliana Cabral 2, Enrico Motta 2, John Domingue 2 and Roberto Pacheco 1,3 1 Stela Group, Universidade Federal de Santa Catarina, Brazil 2 Knowledge
More informationUSING SCHEMA AND DATA INTEGRATION TECHNIQUE TO INTEGRATE SPATIAL AND NON-SPATIAL DATA : DEVELOPING POPULATED PLACES DB OF TURKEY (PPDB_T)
ISPRS SIPT IGU UCI CIG ACSG Table of contents Table des matières Authors index Index des auteurs Search Recherches Exit Sortir USING SCHEMA AND DATA INTEGRATION TECHNIQUE TO INTEGRATE SPATIAL AND NON-SPATIAL
More informationThe Role of Controlled Experiments in Software Engineering Research
The Role of Controlled Experiments in Software Engineering Research Victor R. Basili 1 The Experimental Discipline in Software Engineering Empirical studies play an important role in the evolution of the
More informationWireless Sensor Networks Coverage Optimization based on Improved AFSA Algorithm
, pp. 99-108 http://dx.doi.org/10.1457/ijfgcn.015.8.1.11 Wireless Sensor Networks Coverage Optimization based on Improved AFSA Algorithm Wang DaWei and Wang Changliang Zhejiang Industry Polytechnic College
More informationObject Oriented Database Management System for Decision Support System.
International Refereed Journal of Engineering and Science (IRJES) ISSN (Online) 2319-183X, (Print) 2319-1821 Volume 3, Issue 6 (June 2014), PP.55-59 Object Oriented Database Management System for Decision
More information