Present a New Middleware to Control and Management Database Distributed Environment




Mohammadjavad Hosseinpoor 1, Hamide Kazemi 2
1 Member of faculty, Dept. of Computer Engineering, Islamic Azad University, Estahban Branch, Estahban, Iran
2 Dept. of Computer Engineering, Islamic Azad University, Estahban Branch, Estahban, Iran

Abstract
Advances in database systems and computer network technology led to the development of distributed databases. This article presents CMDBMS, a flexible middleware for controlling and managing databases in a distributed environment. The data are distributed entirely among the databases, each database has its own database management system, and the middleware acts as a controller that mediates access to those database management systems. The middleware hides the complexity of clustering by combining a set of heterogeneous databases into a single logical database, and it presents a single database view to users. In the reported experiments, CMDBMS with horizontal, vertical and complete scalability performs better than the centralized database configuration.

Keywords
Management, Queries, Middleware, Distributed database.

I. INTRODUCTION
Advances in database systems and computer network technology led to the development of distributed databases. A distributed database system consists of a distributed database management system, a distributed database and an interconnecting network [3,4]. In a distributed database, the data are distributed across several databases. There are different architectures for a distributed database system. In one architecture, control is centralized while the data are distributed, so the distributed database system is built from several databases.
In this case there is no local database management system; the distributed database management system manages all of the distributed data [1,2]. There is also a multidatabase architecture, in which each local database is managed by its own local database management system and the different database management systems are connected into one distributed database system [1,4].

This paper presents a middleware called CMDBMS that clusters and manages the available databases in a distributed environment. Every database connected to this middleware is managed by its own local database management system. The middleware works with any relational database that provides an OLEDB driver. Its architecture is flexible and can support a wide range of configurations with different degrees of efficiency, fault tolerance and availability.

The rest of this paper is organized as follows. Section 2 describes previous middleware systems and their architectures. Section 3 presents the architecture of CMDBMS and the role of each of its components. Section 4 describes how replication and fragmentation are handled in CMDBMS. Section 5 describes the failure manager in CMDBMS. Section 6 discusses horizontal, vertical and complete scalability. Section 7 evaluates the performance of CMDBMS with horizontal, vertical and complete scalability in the distributed environment. The last section concludes the paper.

II. BACKGROUND
This section describes two middleware systems for accessing data in distributed databases, MOCHA and CJDBC, together with their architectures. MOCHA is a scalable database middleware designed for access to distributed databases over computer networks. It targets scalability in large environments. MOCHA is implemented in Java and acts as an interface to which databases such as Oracle, Informix and so on are connected. The aim of this middleware is to integrate a collection of distributed data sources in a computer network.
In practice, this architecture is centered on a data integration server that gives client applications a uniform view and a uniform access mechanism for every source [7]. There are two ways to build an integration server: a commercial database server or a mediator system. With a commercial database server, remote data are accessed through a database gateway. In a mediator system, a mediator server performs the distributed data processing and uses wrappers to access the information stored at the data sites. A wrapper extracts data from a data source. MOCHA is a database middleware that interconnects hundreds of data sources. It can be used to manipulate data held at remote sites. To do this, MOCHA ships Java code to those sites, which gives it efficient query processing for the functions used in queries. Figure 1 shows the MOCHA architecture [6].

Figure 1. MOCHA architecture.

CJDBC is a middleware for clustering databases. It is flexible and open. CJDBC hides the complexity of clustering and presents a single database view to client applications, which need no modification [5]. CJDBC works with any relational database management system that provides a JDBC driver. Load balancing, fault tolerance and fault coverage are all managed by the system. Its architecture is flexible and can support large groups of different databases with different degrees of efficiency and fault tolerance, which makes database clustering powerful and inexpensive [8]. CJDBC is a Java middleware for database clustering based on JDBC. Data can be replicated fully or partially, depending on what the requests need. CJDBC automatically routes queries to the different databases. It can also provide additional services such as monitoring and logging. Figure 2 shows the CJDBC architecture [5].

Figure 2. CJDBC architecture.

III. CMDBMS
CMDBMS is a middleware for controlling, managing and clustering databases in a distributed environment. Figure 3 shows the architecture of this middleware. The CMDBMS controller is an interface between the databases and the users. The controller provides a single database view for the system driver and for users.

A user first connects to the middleware through Connection CMDBMS and then sends queries. Each query is checked semantically and against the user's access rights. If it passes, it is decomposed and sent to the databases. The answers obtained from the databases are combined, and the final answer is returned to the user. The following subsections explain each part of the system.

Figure 3. CMDBMS architecture.

A. Connection CMDBMS
Connection CMDBMS acts as the interface between the system and the user. It takes the user's queries, passes them to the system and then receives the answers. A user can communicate with the system only through this component and sends queries through it for execution.
When a user requests a connection, the user must first be authenticated. The user enters a user identifier and password and waits for the connection. If the identifier and password are correct, the connection succeeds and the user can send queries; otherwise the user is not authorized to connect.

B. Authentication Manager
The authentication manager is the point of authentication and identification within the middleware. If a query is semantically valid and passes the access check, it is sent to the request manager for execution; otherwise the query is rejected as invalid.

C. Request Manager
After the request manager receives a query from the authentication manager, it decomposes the query into sub-queries. These sub-queries are then localized, scheduled and sent to the relevant databases. The answers returned by the databases come back to the request manager.
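The connection and authentication steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names, the in-memory credential and permission tables, and the plain-text password comparison are all assumptions made for illustration, and the semantic check is reduced to rejecting empty queries.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Set


@dataclass
class AuthenticationManager:
    # Hypothetical in-memory stores; a real deployment would keep hashed
    # passwords and load access rights from a catalog.
    credentials: Dict[str, str]
    permissions: Dict[str, Set[str]] = field(default_factory=dict)

    def authenticate(self, user_id: str, password: str) -> bool:
        """Check the user identifier and password at connection time."""
        return self.credentials.get(user_id) == password

    def authorize(self, user_id: str, tables: Iterable[str]) -> bool:
        """Check that the user may access every table the query touches."""
        return set(tables) <= self.permissions.get(user_id, set())


class ConnectionCMDBMS:
    """Single entry point between users and the middleware (illustrative)."""

    def __init__(self, auth: AuthenticationManager,
                 request_manager: Callable[[str], object]):
        self._auth = auth
        self._request_manager = request_manager
        self._connected: Set[str] = set()

    def connect(self, user_id: str, password: str) -> bool:
        if self._auth.authenticate(user_id, password):
            self._connected.add(user_id)
            return True
        return False

    def submit(self, user_id: str, query: str, tables: Iterable[str]):
        if user_id not in self._connected:
            raise PermissionError("user is not connected")
        # Stand-in for the semantic check: only empty queries are rejected.
        if not query.strip() or not self._auth.authorize(user_id, tables):
            raise ValueError("invalid query")
        return self._request_manager(query)
```

In this sketch the request manager is passed in as a plain callable, so the gatekeeping logic can be exercised without any database behind it.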

The answers are combined in the request manager into the final answer, which is sent to Connection CMDBMS and delivered to the user. The request manager consists of the following components, in order: request decomposition, data localization, global query optimization, the scheduler and request prevalence, plus two optional components, the recovery log and the request results cache. Each component is explained below.

Request decomposition. Queries enter the request manager once the authentication manager has accepted them. The request manager passes each query to request decomposition for analysis. This component splits the query into sub-queries using the data dictionary and hands them to the next component for localization.

Data localization. In this component the sub-queries are localized using the information about the databases held in the data dictionary. Once it is known which tables each sub-query must run on, the sub-queries pass to global request optimization so that the execution strategy can be optimized.

Global request optimization. The input to this component is the localized queries. Its aim is to find the best execution strategy for them. The chosen strategy should minimize the communication cost of reaching the tables, the disk input/output cost, the cost of sending and receiving data, and the CPU processing cost. After this step, the optimizer sends the sub-queries to the scheduler.

Scheduler. The scheduler schedules the optimized local queries for execution. Incoming sub-queries can all be scheduled at the same time, but queries that write to or update the databases are scheduled at different times.
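The scheduling rule can be sketched as follows. This is only an illustration under stated assumptions: the `SubQuery` type and the `schedule` function are hypothetical, sub-queries are grouped into time slots whose members may run together, and two sub-queries are taken to conflict only when they touch the same table and at least one of them is a write. The paper does not specify the actual algorithm.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SubQuery:
    sql: str
    table: str
    is_write: bool


def conflicts(a: SubQuery, b: SubQuery) -> bool:
    """Assumed rule: same table and at least one of the two is a write."""
    return a.table == b.table and (a.is_write or b.is_write)


def schedule(sub_queries: List[SubQuery]) -> List[List[SubQuery]]:
    """Group sub-queries into time slots.

    Everything in one slot may execute at the same time. A sub-query is
    placed in the first slot after the last slot it conflicts with, so the
    arrival order of conflicting operations is preserved.
    """
    slots: List[List[SubQuery]] = []
    for sq in sub_queries:
        last_conflict = -1
        for i, slot in enumerate(slots):
            if any(conflicts(sq, other) for other in slot):
                last_conflict = i
        target = last_conflict + 1
        if target == len(slots):
            slots.append([])
        slots[target].append(sq)
    return slots
```

With this rule, reads share a slot freely, while a write waits for earlier reads and writes on its table and delays later ones.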
To improve efficiency, transactions in CMDBMS can be executed in parallel on the databases, but only when the update and write transactions do not target the same table.

Request results cache. CMDBMS offers two kinds of results cache. The first is the request results cache, which stores the results of each query and reduces response time. When the system finds that the results of a query are already in the cache, it does not send the query to the database for execution, so the response time drops. This is correct only if no update has been performed on the tables involved in the meantime. The second kind is the sub-request results cache, which stores the results of each sub-query so that they can be reused in later steps for other requests. As with the first cache, this is valid only if no update has been performed on the databases involved in the meantime.

Recovery log. The recovery log records the exact state of the database tables. When a database becomes unavailable for any reason, the recovery log closes the connections to that database and reports it as failed. All transactions on the tables of the failed database are then cancelled and re-executed on other copies of those tables. If no other copy of the failed database's tables exists, all transactions on those tables are cancelled until the database is repaired.

Request prevalence. Using the information in the data dictionary about how queries are to be executed, request prevalence automatically sends each sub-query to the relevant tables. To do so, it first connects to the relevant database through the connection manager and then sends the sub-queries for execution. The sub-queries are executed in the databases, their answers are returned to request prevalence, and the answers are then passed on to the user.
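The request results cache described above can be sketched as a small Python class. The names `RequestResultCache` and `execute_read` are hypothetical, and invalidating every cached result that touches an updated table is one simple way to honor the condition that cached answers are valid only while the underlying tables are unchanged; the paper does not say how CMDBMS implements this.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Tuple

_MISSING = object()  # sentinel, so that a cached result of None is still a hit


class RequestResultCache:
    """Maps query text to its result, remembering which tables it read."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[FrozenSet[str], object]] = {}

    def get(self, query: str):
        entry = self._entries.get(query)
        return _MISSING if entry is None else entry[1]

    def put(self, query: str, tables: Iterable[str], result) -> None:
        self._entries[query] = (frozenset(tables), result)

    def invalidate(self, table: str) -> None:
        """Drop every cached result that depends on an updated table."""
        stale = [q for q, (tables, _) in self._entries.items() if table in tables]
        for q in stale:
            del self._entries[q]


def execute_read(cache: RequestResultCache, query: str,
                 tables: Iterable[str], run_on_database: Callable[[str], object]):
    """Answer from the cache when possible, otherwise run the query and store it."""
    cached = cache.get(query)
    if cached is not _MISSING:
        return cached
    result = run_on_database(query)
    cache.put(query, tables, result)
    return result
```

The same structure would serve for the sub-request results cache by keying on sub-query text instead of the full query.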
Data dictionary. This component holds information about all available databases, their tables, the location of those tables and how they are distributed. The data dictionary supplies this information to all other parts of the system.

IV. FRAGMENTATION AND REPLICATION IN CMDBMS
Fragmentation. In this system, database tables can be fragmented vertically, horizontally, or both. The table fragments are distributed among the databases connected to the distributed system. The information needed to rebuild the main tables from these fragments is kept in the data dictionary. The system places no restriction on table fragmentation: main tables can be fragmented with any fragmentation method. Fragmenting the main tables and distributing the fragments among the available databases increases the capacity for parallel processing.

Replication. Table fragments in the system can be replicated among the databases connected to it. With horizontal and vertical scalability the system supports partial data replication, and with complete scalability it supports complete replication. Tables held in one connected database can therefore be replicated in other connected databases. Replicating table fragments across databases increases availability and fault tolerance because several copies of each table exist, but it raises the cost of updates, because every update must be repeated on all copies of the table. The system performs updates so that the data stay consistent. When an update query that touches tables in different databases is sent to the system, it is decomposed into sub-queries, and these sub-queries are executed on the relevant tables to apply the update. When several copies of a table exist, the update must be reflected in all of them. This does not create inconsistency, because the update operations are managed by the local database management system of each database.
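The fragmentation and replication ideas above can be made concrete with a short Python sketch. It assumes that tables are lists of row dictionaries and that every vertical fragment carries the key column so the main table can be rebuilt; all function names are hypothetical and none of this is taken from the CMDBMS code.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

Row = Dict[str, object]
Table = List[Row]


def horizontal_fragments(rows: Table, predicate: Callable[[Row], bool]) -> Tuple[Table, Table]:
    """Split a table by rows: those matching the predicate and the rest."""
    return ([r for r in rows if predicate(r)],
            [r for r in rows if not predicate(r)])


def vertical_fragments(rows: Table, key: str,
                       column_groups: Sequence[Sequence[str]]) -> List[Table]:
    """Split a table by columns; every fragment keeps the key column."""
    return [[{c: r[c] for c in [key, *cols]} for r in rows]
            for cols in column_groups]


def rebuild_horizontal(*fragments: Table) -> Table:
    """Union of row fragments."""
    return [r for frag in fragments for r in frag]


def rebuild_vertical(key: str, *fragments: Table) -> Table:
    """Join column fragments on the shared key."""
    merged: Dict[Hashable, Row] = {}
    for frag in fragments:
        for r in frag:
            merged.setdefault(r[key], {}).update(r)
    return list(merged.values())


def update_replicas(replicas: Sequence[Table], key: str, key_value, changes: Row) -> None:
    """Apply the same update to every copy of a table."""
    for replica in replicas:
        for row in replica:
            if row[key] == key_value:
                row.update(changes)
```

In the middleware, the split criteria and fragment locations would come from the data dictionary; here they are passed in as arguments to keep the example self-contained.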

V. FAILURE MANAGER
Failure management in the system is mostly carried out by the recovery log. The recovery log reflects a consistent state of the databases. If any database connected to the system runs into a problem, the recovery log rejects the transactions on it and marks the database as unavailable. The rejected transactions are recorded in the recovery log until the database is restored, and are then applied again. If a server fails, all databases connected to it are handed over to another server that is near the failed server. This server serves clients until the first server is repaired. All operations recorded in the first server's recovery log are copied to the second server's recovery log. The transactions of the first server are rejected there and applied again on the second server. All queries are also directed to the second server until the first server is repaired.

VI. SCALABILITY
The controller middleware is potentially a single point of failure for the databases. This problem can be solved through scalability. Scalability here involves partitioning the database into several parts, each hosted on an independent computer. There are three strategies for scaling the CMDBMS middleware when controlling and clustering databases in a distributed environment: horizontal, vertical and complete. Each is explained below.

A. Horizontal Scalability
In horizontal scalability, the CMDBMS controllers communicate with each other horizontally, as shown in Figure 4. Horizontal scalability removes the controller as a single point of failure and makes the system more robust. If any controller becomes unavailable for any reason, all databases attached to it are redirected to another controller, chosen as the one closest to the failed controller in terms of distance.
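The failover rule, in which the databases of a failed controller move to the nearest available controller and its recovery log is copied along, can be sketched as follows. The `Controller` type, the one-dimensional `position` used as a stand-in for distance, and the `fail_over` function are all assumptions for illustration; the paper does not define how distance is measured.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Controller:
    name: str
    position: float  # hypothetical coordinate standing in for network distance
    databases: List[str] = field(default_factory=list)
    recovery_log: List[str] = field(default_factory=list)
    available: bool = True


def fail_over(failed: Controller, controllers: List[Controller]) -> Controller:
    """Hand the failed controller's databases to the nearest available one.

    The recorded operations in the failed controller's recovery log are
    copied to the target so they can be applied again there.
    """
    failed.available = False
    candidates = [c for c in controllers if c.available and c is not failed]
    if not candidates:
        raise RuntimeError("no available controller to take over")
    target = min(candidates, key=lambda c: abs(c.position - failed.position))
    target.databases.extend(failed.databases)
    target.recovery_log.extend(failed.recovery_log)
    failed.databases = []
    return target
```

When the failed controller is repaired, the reverse hand-over would follow the same pattern; that step is omitted here.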
In this structure the controllers communicate with each other horizontally, and read and write commands on the databases are routed horizontally to the controllers concerned. In horizontal scalability, tables are distributed among the controllers, and a table can be replicated in at most one other controller. This kind of scalability does not support replication of whole databases; that is, no database is replicated between controllers.

Figure 4. Horizontal Scalability.

B. Vertical Scalability
In vertical scalability, the controllers are related to each other in a binary tree structure. As Figure 5 shows, each parent controller controls and manages its child controllers as well as the databases connected to it, and the controller at the root likewise manages its child controllers. If a controller fails, the other controllers at the same level take over control of the failed controller's databases, but if the root controller fails, the whole system fails. Read and write commands are propagated from the top down until they reach the controllers concerned. Those controllers then execute them on the relevant databases and return the results to users. As in horizontal scalability, vertical scalability supports partial data replication, and a table can be replicated in at most one controller.

Figure 5. Vertical Scalability.

C. Complete Scalability
In complete scalability, all controllers are fully connected with one another, as shown in Figure 6, and together form one integrated environment. Tables can be replicated between controllers and databases can be replicated between controllers, so fault tolerance and availability are high because several copies exist, but the cost of updating these tables is higher than in the two previous forms of scalability.
In this form of scalability, each controller is connected to several databases among which the data are stored in distributed form. Each controller is responsible for access control, for managing the databases connected to it, and for cooperating with the other controllers. It also hides all the complexity of distribution and presents a single database view to users. If any of these controllers becomes unavailable for any reason, all databases attached to it are redirected to the controller closest to the failed one in terms of distance.

In complete scalability, read and write commands received by one controller are forwarded to the other controllers; the controller concerned then executes them and returns the results to users.

Figure 6. Complete Scalability.

VII. EXPERIMENT
In this section we study the performance of CMDBMS with horizontal, vertical and complete scalability in the distributed environment, and compare it with the centralized database case. Six databases were used in these experiments. They were placed on one computer in the centralized case and on 6 computers (nodes) in the distributed case. Each computer had a 40 GB hard disk, 512 MB of RAM and an AMD 2000 CPU, and the computers were connected in a star network. The experiment measures the number of requests per minute answered by the middleware. According to Figure 7, the centralized case answered 450 requests per minute. With vertical scalability the middleware answered 1190 requests per minute on 6 computers, with horizontal scalability 1550 requests per minute on 6 nodes, and with complete scalability 4600 requests per minute on 6 nodes. The response rate is therefore highest with complete scalability and lowest in the centralized case. Consequently, according to Figure 7, the best performance of the middleware corresponds to complete scalability.

Figure 7. Comparison of request response rates for the three forms of scalability and the centralized case.

VIII. CONCLUSION
In this paper we presented CMDBMS, a middleware for managing and clustering databases in a distributed environment. CMDBMS clusters databases over OLEDB, hides the complexity of clustering and creates a single database view for users.
Each database connected to this middleware has its own local database management system, and these systems can be the same or different. The middleware acts as an interface between the available databases and the client. Any database that has an OLEDB driver can be connected to it. Failure management in CMDBMS is performed by the recovery log, which reflects a consistent state of the databases connected to the middleware. The three forms of scalability presented for CMDBMS provide increased efficiency, high availability and fault tolerance in the middleware. We also studied the performance of CMDBMS with horizontal, vertical and complete scalability in the distributed environment and compared it with the centralized database case. Finally, the best performance of the middleware corresponds to complete scalability.

REFERENCES
[1] J.J. Hu, H.C. Li, H.M. Tai, S.S. Yu, 2012, Thermal Management and Load Control of Container Data Center: A Case Study of Cloud Computing in a Rack, International Symposium on Computer, Consumer and Control, 978-0-7695-4655-1, IEEE.
[2] J.D. Rio, D.M. Toma, T.C. O'Reilly, A.H. Bröring, A. Manuel, K.L. Headley, D. Edgington, 2011, Interoperable Data Management and Instrument Control Experiences at OBSEA, 78-1-61284-4577-0088-0/11, IEEE.

[3] S. Vukmirovi, A. Erdeljan, F. Kuli, S. Lukovi, 2010, A solution for CIM based integration of Meter Data Management in Control Center of a power system, 978-1-4244-6276, IEEE.
[4] W. Xu, J. Li, Y. Wu, X. Huang, G. Yang, 2008, VDM: Virtual Database Management for Distributed Databases and File Systems, Seventh International Conference on Grid and Cooperative Computing, 978-0-7695-3449, IEEE.
[5] E. Cecchet, 2004, C-JDBC Horizontal Scalability a Controller Replication User guide, From the World Wide Web: Http://www.Objectweb.org.
[6] M.R. Martinez, N. Roussopoulos, 2000, MOCHA: A Self Extensible Database Middleware System for distributed Database Sources, ACM International Conference on Management of data, Dallas, TX.
[7] M.R. Martinez, N. Roussopoulos, 1998, MOCHA: A Self-Extensible Middleware Substrate For Distributed Data Sources, Technical Report UMIACS-TR 98-67, CS-TR 3955, University of Maryland.
[8] E. Cecchet, J. Marguerite, W. Zwaenepoel, 2002, C-JDBC Flexible Database Clustering Middleware, From the World Wide Web: Http:// www.objectweb.org.