CHAPTER 1 INTRODUCTION
- David Ferguson
This chapter provides a high-level overview of distributed databases. It first describes the characteristics, general architecture, challenges and problem areas of distributed databases. It then presents the primary objective of our research. The chapter ends with a discussion of the organization of the rest of the thesis.

1.1 DISTRIBUTED DATABASE

Today's business environment has an increasing need for distributed database and client/server applications as the desire for reliable, scalable and accessible information steadily rises. Distributed database systems improve communication and data processing because their data is distributed across different network sites. Data access is faster, a single point of failure is less likely to occur, and users keep local control of their data. However, managing and controlling a distributed database system introduces some complexity.

Distributed network computing environments have become a cost-effective and popular choice for achieving high performance and solving large-scale computational problems. Unlike past supercomputers, a distributed database computing system can be used as a multi-purpose computing platform to run diverse high-performance parallel applications.

Developments in computer networking technology and database systems technology led to the development of distributed databases in the mid 1970s. It was felt that many applications would be distributed in the future and that the databases therefore had to be distributed as well. Although many definitions of a distributed database system have been given, there is no standard definition. A distributed database system includes a Distributed Database Management System (DDBMS), a distributed database and a network for interconnections. The objective of a DDBMS is to control the management of a Distributed Data Base (DDB) in such a way that it appears to the user as a centralized database.
For general purposes, a database is a collection of data that is stored and maintained at one central location. A database is controlled by a database management system. The user interacts with the database management system in order to utilize the database and transform data into information. Furthermore, a database offers many advantages over a simple file system with regard to speed, accuracy and accessibility, such as shared access, minimal redundancy, data consistency, data integrity and controlled access [CPP2003]. All of these aspects are enforced by a database management system.

Dr. Edgar F. Codd designed the relational model to solve the problems of pre-existing models while at IBM in the late 1960s. This relational model was built on mathematical principles, which he expounded in a paper entitled "A Relational Model of Data for Large Shared Data Banks" [SSO2004]. A relational database is a set of tables (also called relations) that are separated into predefined categories. Each table contains records, which are the horizontal rows that each hold one related group of data. The vertical columns are known as attributes. Data stored in two or more tables is linked through one or more field values common to those tables. A relational database [TMC2004a] uses a standard user and application program interface called Structured Query Language (SQL). This language uses statements to query and retrieve data from the database. Relational databases are the most commonly used because it is reasonably easy to create and access information and to add new data categories.

When dealing with intricate data or complex relationships, object databases are more commonly used. Object databases, in contrast to relational databases, store objects rather than data such as integers, strings or real numbers. Each object consists of attributes, which define the characteristics of the object. Objects also contain methods (also known as procedures and functions) that define the behavior of the object.
There are two main techniques for storing data in an object database. The first labels each object with a unique ID. Every unique ID is defined in a subclass of its own base class, where inheritance is used to determine attributes. The second uses virtual memory mapping for object storage and management. Compared with relational databases, object databases offer better concurrency control, reduced paging and easy navigation. However, they also have some disadvantages: they are less effective with simple data and relationships, access can be slow, and relational databases are backed by suitable standards whereas object database systems are not [AAM2004].
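The relational linkage described earlier in this section, where two tables are connected through a common field value and queried with SQL, can be illustrated with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the table names, column names and sample rows are assumptions made up for this illustration and do not come from the thesis.

```python
import sqlite3


def employees_with_departments():
    """Join two hypothetical tables on their common dept_id field."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.executescript(
        """
        CREATE TABLE department (
            dept_id   INTEGER PRIMARY KEY,
            dept_name TEXT
        );
        CREATE TABLE employee (
            emp_id   INTEGER PRIMARY KEY,
            emp_name TEXT,
            dept_id  INTEGER REFERENCES department(dept_id)
        );
        INSERT INTO department VALUES (10, 'Research'), (20, 'Sales');
        INSERT INTO employee VALUES (1, 'Asha', 10), (2, 'Ben', 20), (3, 'Chen', 10);
        """
    )
    # The shared dept_id value is what links a row in one table to the other.
    rows = conn.execute(
        """
        SELECT e.emp_name, d.dept_name
        FROM employee AS e
        JOIN department AS d ON e.dept_id = d.dept_id
        ORDER BY e.emp_id
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, dept in employees_with_departments():
        print(name, dept)
```

Each record (row) of the employee table carries the dept_id attribute, and the join resolves it against the department table, which is the sense in which the two tables are "linked" by a common field.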
Hierarchical databases are organized in a tree-like structure in which one table acts as the root of the database and other tables branch out from it. Relationships in such a system are thought of in terms of children and parents: a child may have only one parent, but a parent can have multiple children. Parents and children are connected by links called pointers, and a parent may hold a pointer to each of its children. This relationship assumes that data is accessible for the user. On the other hand, hierarchical database systems are complex to use and require application developers to program the routing through the linked records. In a hierarchical database, all possible access points must be predetermined and followed accordingly; access patterns that were not anticipated can be extremely difficult to implement [EIC1994].

Network databases alleviate some of the problems associated with hierarchical databases, such as data redundancy. The network model represents the data as a network of records and sets which are related to each other, forming a network of links [PAF1992]. The relationships are represented in terms of records, record types and sets rather than a hierarchy. Records are sets of related data values and are equivalent to rows in the relational model. A record type is a set of records, and a set type is a relationship among one or more record types. Unlike the hierarchical model, the network model allows many-to-many relationships. Unfortunately, the network model is far more difficult to implement and maintain than real end users needed in order to solve real problems [GDP1988].

Each database may involve different database management systems and different architectures that distribute the execution of transactions. A distributed database system consists of loosely coupled sites that share no physical component. The database systems that run on each site are independent of each other.
Providing the appearance of a centralized database system is one of the many objectives of a distributed database system. Such an image is accomplished by using the following transparencies: Location Transparency, Performance Transparency, Copy Transparency, Naming Transparency, Transaction Transparency, Fragment Transparency, Schema Change Transparency and Local DBMS Transparency. These transparencies are believed to incorporate the desired functions of a distributed database system. Another goal of a successful distributed database is free object naming, which allows different users to access the same object under different names, or different objects under the same internal name. This gives the user complete freedom in naming objects while sharing data without naming conflicts.

1.2 CLASSIFICATION OF DISTRIBUTED DATABASE

Distributed databases can be classified into:

Homogeneous DBMS: It has multiple data collections and integrates multiple data resources. Homogeneous systems are similar to a centralized system, but instead of keeping all data in a single place, the data is distributed among several places that communicate over the network. Local users do not exist; all users access the database through a global interface.

Heterogeneous DBMS: It is a system which interconnects already existing autonomous database systems to support global applications that access data items in more than one database. Other names proposed for such a system are federated database system, multidatabase system, decentralized system, etc. These systems are characterized by the autonomy of the individual sites as well as the cooperation among them. Three different aspects of the autonomy are:

Design Autonomy: The individual sites may differ with respect to data models, physical design, data definition and manipulation languages, query processing strategies, concurrency control, recovery mechanisms, etc. One reason for heterogeneity is that the individual systems were designed without knowledge of the intended interconnection with the other sites.

Execution Autonomy: Each site executes its own local transactions and also sub-transactions of the global transactions. All these transactions are treated in the same way. Thus, the site is entitled to decide when and how to execute a sub-transaction and to commit it as soon as its execution is complete, without waiting for the commitment of the entire global transaction.

Communication Autonomy: The sites may be willing to share only some, not all, of their data and transaction processing information with other sites. Also, each site communicates with other sites only when it finds it convenient in terms of bandwidth availability and data locality. Consequently, each site might be inaccessible to the other sites for long periods of time.
1.3 CHARACTERISTICS OF DISTRIBUTED DATABASE

Availability and Reliability: Reliability is defined as the probability that the system will be up continuously during a given time period. Availability is defined as the probability that the system will be up at a given time. These important system parameters are improved with the DDBS. In a centralized database system, if any component of the database goes down, the entire system goes down, whereas in a distributed database only the affected site is down and the rest of the system is not affected. Furthermore, if the data is replicated at different sites, the effect of a site failure is greatly reduced.

Performance Improvement: When a large database is distributed over a number of sites, the local subset of the database at each site is much smaller, which reduces the size of local transactions and their processing time. Transactions that need access to more than one site can be processed in parallel, improving response time.

Communication via Computer Network: The ability to communicate via a computer network to send and receive data and queries from/to other sites on the network.

DDBMS Catalog Maintenance: The ability to keep track of the database distribution and replication among the different sites. This information is maintained in the DDBMS catalog.

Distributed Transactions: A distributed transaction is a transaction which operates on data located at more than one site. A distributed transaction is divided (by the transaction manager of the originating site) into a number of sub-transactions which are executed by many nodes. Adopting the concept of distributed transactions makes it possible to devise a strategy for executing a transaction that involves accessing more than one site.

Replicated Data Consistency: The ability to maintain the consistency of replicated data across the network.
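The remark above that replication greatly reduces the effect of a site failure can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that every site is up with the same probability and that sites fail independently of one another; neither assumption comes from the thesis, and real sites often fail in correlated ways. The function name is hypothetical.

```python
def prob_at_least_one_copy_up(p_site_up: float, copies: int) -> float:
    """Probability that at least one of `copies` replicas is reachable.

    Assumes (for illustration only) identical, independent sites, each up
    with probability `p_site_up` at the instant of the request.
    """
    if not 0.0 <= p_site_up <= 1.0:
        raise ValueError("p_site_up must lie in [0, 1]")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # All copies are down with probability (1 - p)^n; take the complement.
    return 1.0 - (1.0 - p_site_up) ** copies


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, prob_at_least_one_copy_up(0.9, n))
```

Under these assumptions, a data item stored at a single site that is up 90% of the time is reachable 90% of the time, while three copies raise that to 1 - 0.1^3 = 99.9%. The price, as the consistency characteristic above implies, is that every copy must be kept identical when the item is updated.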
1.4 GENERAL ARCHITECTURE OF DISTRIBUTED DATABASE

A distributed database is a set of databases stored on multiple computers that typically appears to applications as a single database [ZSZ2009]. Consequently, an application can simultaneously access and modify the data in several databases in a network. The computers in a distributed system communicate with one another through various communication media, such as high-speed networks or telephone lines. They do not share main memory or disks. The computers in a distributed system may vary in size and function, ranging from workstations to mainframe systems. A database link connection allows local users to access data on a remote database. For this connection to occur, each database in the distributed system must have a unique global database name in the network domain.

[Figure 1.1 Conceptual View of Distributed Database: database technology (integration) and computer networks (distribution) combine into distributed database systems]

The global database name uniquely identifies a database server in a distributed system. As a result, users have access to the database at their location and can access the data relevant to their tasks without interfering with the work of others. As shown in Figure 1.1, distributed database systems are simply a matter of integrating database technologies over a computer network. Thus, the tradeoff will be between the integration and the centralization of the data.

[Figure 1.2 Centralized DBMS on a Network: Sites 1 to 5 connected through a communication network]

[Figure 1.3 Distributed DBMS Environment: Sites 1 to 5 connected through a communication network]

The main difference between centralized and distributed databases is that distributed databases are typically geographically separated, separately administered and have a slower interconnection. Figure 1.2 and Figure 1.3 illustrate the difference between the centralized and the distributed DBMS. In distributed databases we also differentiate between local and global transactions. A local transaction is one that accesses data only at the site where the transaction originated. A global transaction, on the other hand, is one that either accesses data at a site different from the one at which the transaction was initiated or accesses data at several different sites.

Components of Distributed Database

A DDBMS comprises the following components:

Database Manager: the software responsible for processing a segment of the distributed database, as shown in Figure 1.4.

Distributed Database Management System: the software which governs a Distributed Database System. It supplies the user with the illusion of using a centralized database.

User Request Interface: sometimes known as a customer user interface, this is usually a client program that acts as an interface to the distributed transaction manager. A customizable user interface is provided for entering requested parameters related to a database query. The customized parameter user interface provides parameter entry dialogs/windows in correlation to a data view (e.g. a form or report) that is produced according to a database query. The parameters entered may provide for modification of the data view. Also, the manager of the database may structure data views of a database to automatically include prompts for parameters before results are returned by the database. These prompts may be customized by the manager and may be presented through dialogs such as pop-ups, pull-down menus, fly-outs or a variety of other user interface components.
Distributed Transaction Manager: a program that translates requests from the user into actionable requests for the database managers, which are typically distributed. A distributed database system is made up of both the Distributed Transaction Manager (DTM) and the Data Base Manager (DBM).

1.5 CHALLENGES IN DISTRIBUTED DATABASE

Distributed Database Design (Fragmentation, Replication, and Allocation):

Data Fragmentation: It allows a single object to be broken into two or more segments or fragments. Each fragment can be stored at any site on a computer network. Information about the fragmentation is stored in the distributed data catalog, from which it is accessed by the transaction processor to process user requests. Data fragmentation strategies operate at the table level and consist of dividing a table into logical fragments. There are three types of data fragmentation:
Horizontal fragmentation refers to the division of a relation into subsets of rows.
Vertical fragmentation refers to the division of a relation into attribute subsets.
Mixed fragmentation refers to a combination of horizontal and vertical strategies.

Data Replication: Data replication refers to the storage of data copies at multiple sites served by a computer network. Fragment copies can be stored at several sites to serve specific information requirements. Because the existence of fragment copies can enhance data availability and response time, data copies can help to reduce communication and total query costs. Replicated data is subject to the mutual consistency rule, which requires that all copies of a data fragment be identical. Three replication scenarios exist:
A fully replicated database stores multiple copies of each database fragment at multiple sites. It can be impractical because of the amount of overhead it imposes.
A partially replicated database stores multiple copies of some database fragments at multiple sites. Most DBMSs handle this scenario well.
A non-replicated database stores each database fragment at a single site.

Data Allocation: Data allocation describes the process of deciding where to locate data. Data allocation strategies are as follows:
With centralized data allocation, the entire database is stored at one site.
With partitioned data allocation, the database is divided into several disjoint parts which are stored at several sites.
With replicated allocation, copies of one or more database fragments are stored at several sites.
Data distribution over a computer network is achieved through data partitioning, through data replication or through a combination of both.

Distributed Query Processing: Query processing deals with designing algorithms that analyze queries and convert them into a series of data manipulation operations. The problem is how to decide on a strategy for executing every query over the network in the most cost-effective way. The factors to be considered are the distribution of data, communication costs and the lack of sufficient locally available information.

Heterogeneous Databases: When the databases at various sites are not homogeneous, either in the way they logically structure data (data models) or in the mechanisms they provide for accessing the data (data languages), it becomes necessary to provide a translation mechanism between the database systems.

Distributed Concurrency Control: Concurrency control is an essential element for correctness in any system where two or more database transactions can access the same data concurrently. A well-established concurrency control theory exists for database systems: serializability theory, which allows concurrency control methods and mechanisms to be designed and analyzed effectively. To ensure correctness, a DBMS usually guarantees that only serializable transaction schedules are generated, unless serializability is intentionally relaxed. To maintain correctness when transactions fail (which can always happen), schedules also need to have the recoverability property. The design of concurrency control and recovery in a distributed database system has to consider aspects beyond those of centralized database systems. These aspects include:
Concurrency control has to keep the multiple copies of data consistent. Recovery, on the other hand, has to make a copy consistent with the others whenever a site recovers from a failure.
Failure of communication links.
Failure of individual sites.
Deadlocks on multiple sites.
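The horizontal and vertical fragmentation strategies defined earlier in this section, under Distributed Database Design, can be sketched on a tiny in-memory relation. The relation, its attribute names and the helper functions below are assumptions made up for this illustration; a real DDBMS would record the fragment definitions in its distributed data catalog rather than in program variables.

```python
def horizontal_fragment(rows, predicate):
    """Horizontal fragmentation: keep the subset of rows matching a predicate."""
    return [dict(r) for r in rows if predicate(r)]


def vertical_fragment(rows, key, attributes):
    """Vertical fragmentation: keep a subset of attributes plus the key.

    The key is repeated in every vertical fragment so the original
    relation can be rebuilt by joining the fragments on it.
    """
    return [{name: r[name] for name in (key, *attributes)} for r in rows]


def join_vertical(left, right, key):
    """Rebuild rows from two vertical fragments by joining on the key."""
    right_by_key = {r[key]: r for r in right}
    return [{**l, **right_by_key[l[key]]} for l in left]


# Hypothetical relation used only for this example.
CUSTOMER = [
    {"id": 1, "name": "Asha", "city": "Delhi", "balance": 120},
    {"id": 2, "name": "Ben", "city": "Pune", "balance": 75},
    {"id": 3, "name": "Chen", "city": "Delhi", "balance": 300},
]

# Horizontal fragments: rows split by city, e.g. one fragment per site.
delhi = horizontal_fragment(CUSTOMER, lambda r: r["city"] == "Delhi")
others = horizontal_fragment(CUSTOMER, lambda r: r["city"] != "Delhi")

# Vertical fragments: attribute subsets that both carry the key "id".
profile = vertical_fragment(CUSTOMER, "id", ["name", "city"])
account = vertical_fragment(CUSTOMER, "id", ["balance"])

if __name__ == "__main__":
    print(delhi)
    print(join_vertical(profile, account, "id"))
```

The union of the horizontal fragments and the join of the vertical fragments both reproduce the original relation, which is the property that lets a DDBMS hide the fragmentation from the user. Mixed fragmentation would apply one helper to the output of the other.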
If concurrent transactions are allowed to run in an uncontrolled manner, unexpected results may occur. Some typical examples are:

Lost Update Problem: A second transaction writes a new value of a data item (datum) on top of a value written by a first concurrent transaction, so that the first value is lost. Concurrently running transactions that need the first value end with incorrect results.

The Dirty Read Problem: Transactions read a value written by a transaction that is later aborted. This value disappears from the database upon abort and should not have been read by any transaction (dirty read). The reading transactions end with incorrect results.

The Incorrect Summary Problem: While one transaction computes a summary over the values of a repeated data item, a second transaction updates some instances of that data item. The resulting summary does not correspond to either order of execution of the two transactions (the first before the second, or the second before the first), which is what correctness usually requires. Instead it is an essentially arbitrary result that depends on the timing of the updates and on whether a given update has been included in the summary.

1.6 PROBLEM AREAS IN DISTRIBUTED DATABASE

Reliability of Distributed DBMS: When a failure occurs and various sites become either inoperable or inaccessible, the databases at the operational sites must remain consistent and up to date. Furthermore, when the computer system or the network recovers from the failure, the distributed database system should be able to recover and bring the databases at the failed sites up to date.

Distributed Directory Management: A directory contains information (such as descriptions and locations) about the data items in the database. A directory may be global to the entire distributed database system or local to each site. It can be centralized at one site or distributed over several sites.
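The lost update problem described earlier, in the list of anomalies at the end of Section 1.5, is easy to reproduce with a deterministic replay of read and write steps. The sketch below is only an illustration: the schedule format, the transaction names T1 and T2 and the single data item x are assumptions introduced here, and no locking or other concurrency control is modeled.

```python
def run_schedule(initial, schedule):
    """Replay (transaction, operation, argument) steps against one data item.

    Each transaction works on a private copy of x between its read and its
    write, which is what makes an uncontrolled interleaving dangerous.
    """
    db = {"x": initial}
    local = {}  # private working value of x, per transaction
    for txn, op, arg in schedule:
        if op == "read":
            local[txn] = db["x"]
        elif op == "add":
            local[txn] += arg
        elif op == "write":
            db["x"] = local[txn]
        else:
            raise ValueError(f"unknown operation: {op}")
    return db["x"]


# T1 adds 10 and T2 adds 5. Run serially, both updates survive.
SERIAL = [
    ("T1", "read", None), ("T1", "add", 10), ("T1", "write", None),
    ("T2", "read", None), ("T2", "add", 5), ("T2", "write", None),
]

# Interleaved: T2 reads x before T1 writes, then overwrites T1's result.
INTERLEAVED = [
    ("T1", "read", None), ("T2", "read", None),
    ("T1", "add", 10), ("T1", "write", None),
    ("T2", "add", 5), ("T2", "write", None),
]

if __name__ == "__main__":
    print("serial:", run_schedule(100, SERIAL))
    print("interleaved:", run_schedule(100, INTERLEAVED))
```

Starting from 100, the serial schedule ends at 115, while the interleaved schedule ends at 105 because T2's write replaces the value written by T1. The interleaved result matches neither serial order of the two transactions, which is exactly the kind of schedule a serializability guarantee rules out.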
Distributed Deadlock Management: The competition among users for access to a set of resources can result in a deadlock if the synchronization mechanism is based on locking.
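One common way to reason about lock-based deadlocks is a wait-for graph, in which an edge from T1 to T2 means that T1 is waiting for a lock held by T2, and a cycle means deadlock. The thesis text above does not prescribe this technique; the sketch below is an illustrative assumption, with hypothetical function names and a simple dictionary format for each site's local graph.

```python
def merge_graphs(*local_graphs):
    """Union the local wait-for graphs of several sites into a global one."""
    merged = {}
    for graph in local_graphs:
        for txn, waits_for in graph.items():
            merged.setdefault(txn, set()).update(waits_for)
    return merged


def find_deadlock(wait_for):
    """Return the transactions on one cycle of the graph, or None.

    Depth-first search: reaching a transaction that is still on the
    current path (GREY) closes a cycle.
    """
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}
    path = []

    def visit(txn):
        color[txn] = GREY
        path.append(txn)
        for nxt in wait_for.get(txn, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                return path[path.index(nxt):]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[txn] = BLACK
        return None

    for txn in list(wait_for):
        if color.get(txn, WHITE) == WHITE:
            cycle = visit(txn)
            if cycle:
                return cycle
    return None


# At site 1, T1 waits for T2; at site 2, T2 waits for T1.
SITE_1 = {"T1": {"T2"}}
SITE_2 = {"T2": {"T1"}}

if __name__ == "__main__":
    print(find_deadlock(SITE_1), find_deadlock(SITE_2))
    print(find_deadlock(merge_graphs(SITE_1, SITE_2)))
```

Neither site sees a cycle in its own graph, yet the merged graph contains one. This is what makes deadlock management harder in a distributed setting than in a centralized one: the evidence for a deadlock can be spread over several sites.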
Security of Distributed DBMS: The major issues in security are authentication, identification and enforcing appropriate access controls. Databases provide many layers and types of information security, typically specified in the data dictionary, including:
Access control: Access control is a system which enables an authority to control access to areas and resources in a given physical facility or computer-based information system. Within the field of physical security, an access control system is generally seen as the second layer of security.
Authentication: Authentication is the act of establishing or confirming something (or someone) as authentic, i.e. that the claims made by or about the subject are true.
Encryption: In cryptography, encryption is the process of transforming information (referred to as plaintext) using an algorithm (called a cipher) to make it unreadable to anyone except those possessing special knowledge, usually referred to as a key.
Integrity

Distributed Query Optimization: A database feature that reduces the amount of data transfer required between sites when a transaction retrieves data from remote tables referenced in a distributed SQL statement. Distributed query optimization uses cost-based optimization to find or generate SQL expressions that extract only the necessary data from remote tables, process that data at a remote site or sometimes at the local site, and send the results to the local site for final processing. This reduces the amount of data transferred compared with transferring all the table data to the local site for processing.

Load Balancing: A load balancing scheme comprises three phases: information collection, decision making based on that information, and data migration. Load balancing or load distribution refers to the general practice of evenly distributing a load. Load balancing is the process by which inbound Internet Protocol (IP) traffic can be distributed across multiple servers. Typically, two or more web servers are employed in a load balancing scheme. If one of the servers begins to get overloaded, requests are forwarded to another server. Load balancing brings down the service time by allowing multiple servers to handle the requests. The service time is reduced by using a load balancer to identify which server has the capacity available to receive the traffic.

Checkpointing and Recovery (Fault Tolerance): The probability of failure of a computing process increases greatly as the scale of the system grows. If a failure occurs in a computing process and there is no appropriate method to protect it, the work done so far is lost and further cost is incurred in restarting the program. Checkpointing and rollback recovery are techniques that allow distributed computing to progress in spite of a failure and provide fault tolerance in distributed systems.

Cache Management: Caching popular objects close to clients is a fundamental technique for improving the performance and scalability of a system. Caching enables requests to be satisfied by a nearby copy and hence reduces not only the access latency but also the burden on the network and on the server. A cache mechanism consists of two basic procedures: the cache access algorithms and the cache replacement policies. Cache access algorithms describe how clients and servers exchange messages and maintain consistency between the cached copies at clients and the original copies at servers. They are widely used in distributed systems for improving system performance, especially access latency. A replacement policy describes which data items need to be evicted from the cache when there is no cache space available for storing a copy of the newly accessed data item. Replacement policies are important to the effectiveness of cache mechanisms; a well-designed replacement policy can significantly improve system performance. Caching frequently asked queries is an effective way to improve the performance of both centralized and distributed database systems.

1.7 RESEARCH OBJECTIVES

The objective of this research is to improve the understanding of the distributed database environment and to contribute to the advancement of concurrency control, load balancing, network traffic management, checkpointing and security strategies in distributed databases. The present research contributes as follows:
Concurrency control in distributed databases is analyzed, including its characteristics, challenges, basic model and performance.
Related work based on different existing concurrency control algorithms is investigated.
A priority-based load balancing algorithm is proposed and implemented in Java. It balances the load on different nodes working in a homogeneous environment in a fragmented distributed database. A priority method based on memory and CPU utilization is used, and data locality is also taken into consideration along with process waiting time and data transmission time.
A mobile, network-efficient, cost-effective, multilayer peer-to-peer distributed model for an E-Polling System is proposed. The traditional voting system is modified by incorporating a system-generated unique ID in order to reduce the chances of duplicate or bogus voting. This system can cast and count votes with higher accuracy and efficiency, which greatly reduces the rate of mistakes made with manual methods.
The problem of dynamic page generation delays in web sites is addressed by the proposed Dynamic Content Acceleration (DCA) solution. A fragment-level caching approach is used which focuses on re-using HTML fragments of dynamic pages. The result is evaluated in terms of processing time.
A decentralized and cost-effective checkpointing algorithm suitable for cluster federation is proposed and implemented in Java. A single-message-based communication strategy for cluster federation in distributed databases is proposed, evaluated in terms of the communication cost incurred, and compared with existing algorithms.
Addressing security demands under fixed budgets and deadline constraints is becoming extremely challenging, time consuming and resource intensive. A framework is proposed that embeds security capabilities into a distributed database by replicating different predefined security policies at different sites using a multilevel secure database management system.
Furthermore, a new optimal-bandwidth checkpointing algorithm involving only active processes, suitable for applications prone to network failure in distributed systems, is presented and implemented in Java. The algorithm overhead in terms of communication cost and execution time is evaluated and compared with other existing algorithms.
1.8 THESIS ORGANIZATION

Chapter 2 starts with a brief description of the problem areas of distributed databases and then specifies and discusses the research work carried out in areas such as concurrency control, load balancing, query optimization, traffic control, checkpointing and recovery. Chapter 3 presents a dynamic load balancing algorithm for fragmented distributed databases based on memory utilization, CPU utilization and data locality. Chapter 4 proposes a mobile, network-efficient, cost-effective, multilayer, peer-to-peer distributed model for an E-Polling System. In Chapter 5, a query optimization model is proposed which works on the concept of caching entire pages of dynamically generated content. In Chapter 6, a checkpointing algorithm for Cluster Federation is developed, resulting in reduced transmission delay and communication cost, better bandwidth utilization and faster execution. Chapter 7 introduces a framework that embeds autonomic capabilities into a distributed database by replicating different predefined security policies at different sites using a multilevel secure database management system.
More information1. INTRODUCTION TO RDBMS
Oracle For Beginners Page: 1 1. INTRODUCTION TO RDBMS What is DBMS? Data Models Relational database management system (RDBMS) Relational Algebra Structured query language (SQL) What Is DBMS? Data is one
More informationConcepts of Database Management Seventh Edition. Chapter 7 DBMS Functions
Concepts of Database Management Seventh Edition Chapter 7 DBMS Functions Objectives Introduce the functions, or services, provided by a DBMS Describe how a DBMS handles updating and retrieving data Examine
More informationAN OVERVIEW OF DISTRIBUTED DATABASE MANAGEMENT
AN OVERVIEW OF DISTRIBUTED DATABASE MANAGEMENT BY AYSE YASEMIN SEYDIM CSE 8343 - DISTRIBUTED OPERATING SYSTEMS FALL 1998 TERM PROJECT TABLE OF CONTENTS INTRODUCTION...2 1. WHAT IS A DISTRIBUTED DATABASE
More informationB.Com(Computers) II Year DATABASE MANAGEMENT SYSTEM UNIT- V
B.Com(Computers) II Year DATABASE MANAGEMENT SYSTEM UNIT- V 1 1) What is Distributed Database? A) A database that is distributed among a network of geographically separated locations. A distributed database
More informationChapter 3. Database Environment - Objectives. Multi-user DBMS Architectures. Teleprocessing. File-Server
Chapter 3 Database Architectures and the Web Transparencies Database Environment - Objectives The meaning of the client server architecture and the advantages of this type of architecture for a DBMS. The
More informationTopics. Distributed Databases. Desirable Properties. Introduction. Distributed DBMS Architectures. Types of Distributed Databases
Topics Distributed Databases Chapter 21, Part B Distributed DBMS architectures Data storage in a distributed DBMS Distributed catalog management Distributed query processing Updates in a distributed DBMS
More informationObject Oriented Databases. OOAD Fall 2012 Arjun Gopalakrishna Bhavya Udayashankar
Object Oriented Databases OOAD Fall 2012 Arjun Gopalakrishna Bhavya Udayashankar Executive Summary The presentation on Object Oriented Databases gives a basic introduction to the concepts governing OODBs
More informationTier Architectures. Kathleen Durant CS 3200
Tier Architectures Kathleen Durant CS 3200 1 Supporting Architectures for DBMS Over the years there have been many different hardware configurations to support database systems Some are outdated others
More informationHighly Available Mobile Services Infrastructure Using Oracle Berkeley DB
Highly Available Mobile Services Infrastructure Using Oracle Berkeley DB Executive Summary Oracle Berkeley DB is used in a wide variety of carrier-grade mobile infrastructure systems. Berkeley DB provides
More informationTECHNIQUES FOR DATA REPLICATION ON DISTRIBUTED DATABASES
Constantin Brâncuşi University of Târgu Jiu ENGINEERING FACULTY SCIENTIFIC CONFERENCE 13 th edition with international participation November 07-08, 2008 Târgu Jiu TECHNIQUES FOR DATA REPLICATION ON DISTRIBUTED
More informationChapter 18: Database System Architectures. Centralized Systems
Chapter 18: Database System Architectures! Centralized Systems! Client--Server Systems! Parallel Systems! Distributed Systems! Network Types 18.1 Centralized Systems! Run on a single computer system and
More informationFragmentation and Data Allocation in the Distributed Environments
Annals of the University of Craiova, Mathematics and Computer Science Series Volume 38(3), 2011, Pages 76 83 ISSN: 1223-6934, Online 2246-9958 Fragmentation and Data Allocation in the Distributed Environments
More informationData Management in an International Data Grid Project. Timur Chabuk 04/09/2007
Data Management in an International Data Grid Project Timur Chabuk 04/09/2007 Intro LHC opened in 2005 several Petabytes of data per year data created at CERN distributed to Regional Centers all over the
More informationChapter 1 - Web Server Management and Cluster Topology
Objectives At the end of this chapter, participants will be able to understand: Web server management options provided by Network Deployment Clustered Application Servers Cluster creation and management
More informationVirtual machine interface. Operating system. Physical machine interface
Software Concepts User applications Operating system Hardware Virtual machine interface Physical machine interface Operating system: Interface between users and hardware Implements a virtual machine that
More informationAvailability Digest. MySQL Clusters Go Active/Active. December 2006
the Availability Digest MySQL Clusters Go Active/Active December 2006 Introduction MySQL (www.mysql.com) is without a doubt the most popular open source database in use today. Developed by MySQL AB of
More informationMS-40074: Microsoft SQL Server 2014 for Oracle DBAs
MS-40074: Microsoft SQL Server 2014 for Oracle DBAs Description This four-day instructor-led course provides students with the knowledge and skills to capitalize on their skills and experience as an Oracle
More informationHow To Understand The Concept Of A Distributed System
Distributed Operating Systems Introduction Ewa Niewiadomska-Szynkiewicz and Adam Kozakiewicz ens@ia.pw.edu.pl, akozakie@ia.pw.edu.pl Institute of Control and Computation Engineering Warsaw University of
More informationTransaction Management in Distributed Database Systems: the Case of Oracle s Two-Phase Commit
Transaction Management in Distributed Database Systems: the Case of Oracle s Two-Phase Commit Ghazi Alkhatib Senior Lecturer of MIS Qatar College of Technology Doha, Qatar Alkhatib@qu.edu.sa and Ronny
More informationChapter 10. Backup and Recovery
Chapter 10. Backup and Recovery Table of Contents Objectives... 1 Relationship to Other Units... 2 Introduction... 2 Context... 2 A Typical Recovery Problem... 3 Transaction Loggoing... 4 System Log...
More informationDATABASE MANAGEMENT SYSTEM
REVIEW ARTICLE DATABASE MANAGEMENT SYSTEM Sweta Singh Assistant Professor, Faculty of Management Studies, BHU, Varanasi, India E-mail: sweta.v.singh27@gmail.com ABSTRACT Today, more than at any previous
More informationCHAPTER 2 MODELLING FOR DISTRIBUTED NETWORK SYSTEMS: THE CLIENT- SERVER MODEL
CHAPTER 2 MODELLING FOR DISTRIBUTED NETWORK SYSTEMS: THE CLIENT- SERVER MODEL This chapter is to introduce the client-server model and its role in the development of distributed network systems. The chapter
More informationPrinciples and characteristics of distributed systems and environments
Principles and characteristics of distributed systems and environments Definition of a distributed system Distributed system is a collection of independent computers that appears to its users as a single
More informationPrinciples of Distributed Database Systems
M. Tamer Özsu Patrick Valduriez Principles of Distributed Database Systems Third Edition
More informationThe Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets
The Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets!! Large data collections appear in many scientific domains like climate studies.!! Users and
More informationData Management in the Cloud
Data Management in the Cloud Ryan Stern stern@cs.colostate.edu : Advanced Topics in Distributed Systems Department of Computer Science Colorado State University Outline Today Microsoft Cloud SQL Server
More informationChapter Outline. Chapter 2 Distributed Information Systems Architecture. Middleware for Heterogeneous and Distributed Information Systems
Prof. Dr.-Ing. Stefan Deßloch AG Heterogene Informationssysteme Geb. 36, Raum 329 Tel. 0631/205 3275 dessloch@informatik.uni-kl.de Chapter 2 Architecture Chapter Outline Distributed transactions (quick
More informationIn Memory Accelerator for MongoDB
In Memory Accelerator for MongoDB Yakov Zhdanov, Director R&D GridGain Systems GridGain: In Memory Computing Leader 5 years in production 100s of customers & users Starts every 10 secs worldwide Over 15,000,000
More informationCHAPTER 2 DATABASE MANAGEMENT SYSTEM AND SECURITY
CHAPTER 2 DATABASE MANAGEMENT SYSTEM AND SECURITY 2.1 Introduction In this chapter, I am going to introduce Database Management Systems (DBMS) and the Structured Query Language (SQL), its syntax and usage.
More informationOptimizing Performance. Training Division New Delhi
Optimizing Performance Training Division New Delhi Performance tuning : Goals Minimize the response time for each query Maximize the throughput of the entire database server by minimizing network traffic,
More informationInformix Dynamic Server May 2007. Availability Solutions with Informix Dynamic Server 11
Informix Dynamic Server May 2007 Availability Solutions with Informix Dynamic Server 11 1 Availability Solutions with IBM Informix Dynamic Server 11.10 Madison Pruet Ajay Gupta The addition of Multi-node
More information1 File Processing Systems
COMP 378 Database Systems Notes for Chapter 1 of Database System Concepts Introduction A database management system (DBMS) is a collection of data and an integrated set of programs that access that data.
More informationThe EMSX Platform. A Modular, Scalable, Efficient, Adaptable Platform to Manage Multi-technology Networks. A White Paper.
The EMSX Platform A Modular, Scalable, Efficient, Adaptable Platform to Manage Multi-technology Networks A White Paper November 2002 Abstract: The EMSX Platform is a set of components that together provide
More informationMS SQL Performance (Tuning) Best Practices:
MS SQL Performance (Tuning) Best Practices: 1. Don t share the SQL server hardware with other services If other workloads are running on the same server where SQL Server is running, memory and other hardware
More informationComparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications
Comparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications White Paper Table of Contents Overview...3 Replication Types Supported...3 Set-up &
More informationDistributed Database Management Systems for Information Management and Access
464 Distributed Database Management Systems for Information Management and Access N Geetha Abstract Libraries play an important role in the academic world by providing access to world-class information
More informationCOMP5426 Parallel and Distributed Computing. Distributed Systems: Client/Server and Clusters
COMP5426 Parallel and Distributed Computing Distributed Systems: Client/Server and Clusters Client/Server Computing Client Client machines are generally single-user workstations providing a user-friendly
More informationTivoli Storage Manager Explained
IBM Software Group Dave Cannon IBM Tivoli Storage Management Development Oxford University TSM Symposium 2003 Presentation Objectives Explain TSM behavior for selected operations Describe design goals
More information1 Organization of Operating Systems
COMP 730 (242) Class Notes Section 10: Organization of Operating Systems 1 Organization of Operating Systems We have studied in detail the organization of Xinu. Naturally, this organization is far from
More informationMicrosoft SQL Server Data Replication Techniques
Microsoft SQL Server Data Replication Techniques Reasons to Replicate Your SQL Data SQL Server replication allows database administrators to distribute data to various servers throughout an organization.
More informationChapter 6: Physical Database Design and Performance. Database Development Process. Physical Design Process. Physical Database Design
Chapter 6: Physical Database Design and Performance Modern Database Management 6 th Edition Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden Robert C. Nickerson ISYS 464 Spring 2003 Topic 23 Database
More informationMicrosoft SQL Server for Oracle DBAs Course 40045; 4 Days, Instructor-led
Microsoft SQL Server for Oracle DBAs Course 40045; 4 Days, Instructor-led Course Description This four-day instructor-led course provides students with the knowledge and skills to capitalize on their skills
More informationCOSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters
COSC 6374 Parallel Computation Parallel I/O (I) I/O basics Spring 2008 Concept of a clusters Processor 1 local disks Compute node message passing network administrative network Memory Processor 2 Network
More informationOperating system Dr. Shroouq J.
3 OPERATING SYSTEM STRUCTURES An operating system provides the environment within which programs are executed. The design of a new operating system is a major task. The goals of the system must be well
More informationDistributed Systems LEEC (2005/06 2º Sem.)
Distributed Systems LEEC (2005/06 2º Sem.) Introduction João Paulo Carvalho Universidade Técnica de Lisboa / Instituto Superior Técnico Outline Definition of a Distributed System Goals Connecting Users
More informationLinuxWorld Conference & Expo Server Farms and XML Web Services
LinuxWorld Conference & Expo Server Farms and XML Web Services Jorgen Thelin, CapeConnect Chief Architect PJ Murray, Product Manager Cape Clear Software Objectives What aspects must a developer be aware
More informationManaging Users and Identity Stores
CHAPTER 8 Overview ACS manages your network devices and other ACS clients by using the ACS network resource repositories and identity stores. When a host connects to the network through ACS requesting
More informationChapter 13 File and Database Systems
Chapter 13 File and Database Systems Outline 13.1 Introduction 13.2 Data Hierarchy 13.3 Files 13.4 File Systems 13.4.1 Directories 13.4. Metadata 13.4. Mounting 13.5 File Organization 13.6 File Allocation
More informationChapter 13 File and Database Systems
Chapter 13 File and Database Systems Outline 13.1 Introduction 13.2 Data Hierarchy 13.3 Files 13.4 File Systems 13.4.1 Directories 13.4. Metadata 13.4. Mounting 13.5 File Organization 13.6 File Allocation
More informationDeploying a distributed data storage system on the UK National Grid Service using federated SRB
Deploying a distributed data storage system on the UK National Grid Service using federated SRB Manandhar A.S., Kleese K., Berrisford P., Brown G.D. CCLRC e-science Center Abstract As Grid enabled applications
More informationAzure Scalability Prescriptive Architecture using the Enzo Multitenant Framework
Azure Scalability Prescriptive Architecture using the Enzo Multitenant Framework Many corporations and Independent Software Vendors considering cloud computing adoption face a similar challenge: how should
More informationMcAfee Agent Handler
McAfee Agent Handler COPYRIGHT Copyright 2009 McAfee, Inc. All Rights Reserved. No part of this publication may be reproduced, transmitted, transcribed, stored in a retrieval system, or translated into
More informationBuilding a Highly Available and Scalable Web Farm
Page 1 of 10 MSDN Home > MSDN Library > Deployment Rate this page: 10 users 4.9 out of 5 Building a Highly Available and Scalable Web Farm Duwamish Online Paul Johns and Aaron Ching Microsoft Developer
More informationMobile and Heterogeneous databases Database System Architecture. A.R. Hurson Computer Science Missouri Science & Technology
Mobile and Heterogeneous databases Database System Architecture A.R. Hurson Computer Science Missouri Science & Technology 1 Note, this unit will be covered in four lectures. In case you finish it earlier,
More informationDistribution transparency. Degree of transparency. Openness of distributed systems
Distributed Systems Principles and Paradigms Maarten van Steen VU Amsterdam, Dept. Computer Science steen@cs.vu.nl Chapter 01: Version: August 27, 2012 1 / 28 Distributed System: Definition A distributed
More informationIBM Tivoli Storage Manager Version 7.1.4. Introduction to Data Protection Solutions IBM
IBM Tivoli Storage Manager Version 7.1.4 Introduction to Data Protection Solutions IBM IBM Tivoli Storage Manager Version 7.1.4 Introduction to Data Protection Solutions IBM Note: Before you use this
More informationCisco and EMC Solutions for Application Acceleration and Branch Office Infrastructure Consolidation
Solution Overview Cisco and EMC Solutions for Application Acceleration and Branch Office Infrastructure Consolidation IT organizations face challenges in consolidating costly and difficult-to-manage branch-office
More informationCluster Computing. ! Fault tolerance. ! Stateless. ! Throughput. ! Stateful. ! Response time. Architectures. Stateless vs. Stateful.
Architectures Cluster Computing Job Parallelism Request Parallelism 2 2010 VMware Inc. All rights reserved Replication Stateless vs. Stateful! Fault tolerance High availability despite failures If one
More informationClient/Server Computing Distributed Processing, Client/Server, and Clusters
Client/Server Computing Distributed Processing, Client/Server, and Clusters Chapter 13 Client machines are generally single-user PCs or workstations that provide a highly userfriendly interface to the
More informationBBM467 Data Intensive ApplicaAons
Hace7epe Üniversitesi Bilgisayar Mühendisliği Bölümü BBM467 Data Intensive ApplicaAons Dr. Fuat Akal akal@hace7epe.edu.tr FoundaAons of Data[base] Clusters Database Clusters Hardware Architectures Data
More informationLoad Balancing in Distributed Data Base and Distributed Computing System
Load Balancing in Distributed Data Base and Distributed Computing System Lovely Arya Research Scholar Dravidian University KUPPAM, ANDHRA PRADESH Abstract With a distributed system, data can be located
More informationOnline Transaction Processing in SQL Server 2008
Online Transaction Processing in SQL Server 2008 White Paper Published: August 2007 Updated: July 2008 Summary: Microsoft SQL Server 2008 provides a database platform that is optimized for today s applications,
More informationTOP-DOWN APPROACH PROCESS BUILT ON CONCEPTUAL DESIGN TO PHYSICAL DESIGN USING LIS, GCS SCHEMA
TOP-DOWN APPROACH PROCESS BUILT ON CONCEPTUAL DESIGN TO PHYSICAL DESIGN USING LIS, GCS SCHEMA Ajay B. Gadicha 1, A. S. Alvi 2, Vijay B. Gadicha 3, S. M. Zaki 4 1&4 Deptt. of Information Technology, P.
More informationWebsense Support Webinar: Questions and Answers
Websense Support Webinar: Questions and Answers Configuring Websense Web Security v7 with Your Directory Service Can updating to Native Mode from Active Directory (AD) Mixed Mode affect transparent user
More informationDeploying Exchange Server 2007 SP1 on Windows Server 2008
Deploying Exchange Server 2007 SP1 on Windows Server 2008 Product Group - Enterprise Dell White Paper By Ananda Sankaran Andrew Bachler April 2008 Contents Introduction... 3 Deployment Considerations...
More informationObject Oriented Database Management System for Decision Support System.
International Refereed Journal of Engineering and Science (IRJES) ISSN (Online) 2319-183X, (Print) 2319-1821 Volume 3, Issue 6 (June 2014), PP.55-59 Object Oriented Database Management System for Decision
More informationCHAPTER 8 CONCLUSION AND FUTURE ENHANCEMENTS
137 CHAPTER 8 CONCLUSION AND FUTURE ENHANCEMENTS 8.1 CONCLUSION In this thesis, efficient schemes have been designed and analyzed to control congestion and distribute the load in the routing process of
More informationVII. Database System Architecture
VII. Database System Lecture Topics Monolithic systems Client/Server systems Parallel database servers Multidatabase systems CS338 1 Monolithic System DBMS File System Each component presents a well-defined
More informationlow-level storage structures e.g. partitions underpinning the warehouse logical table structures
DATA WAREHOUSE PHYSICAL DESIGN The physical design of a data warehouse specifies the: low-level storage structures e.g. partitions underpinning the warehouse logical table structures low-level structures
More informationWeb Email DNS Peer-to-peer systems (file sharing, CDNs, cycle sharing)
1 1 Distributed Systems What are distributed systems? How would you characterize them? Components of the system are located at networked computers Cooperate to provide some service No shared memory Communication
More informationCloud Based Application Architectures using Smart Computing
Cloud Based Application Architectures using Smart Computing How to Use this Guide Joyent Smart Technology represents a sophisticated evolution in cloud computing infrastructure. Most cloud computing products
More informationAdvantages of DBMS. Copyright @ www.bcanotes.com
Advantages of DBMS One of the main advantages of using a database system is that the organization can exert, via the DBA, centralized management and control over the data. The database administrator is
More informationBuilding Scalable Applications Using Microsoft Technologies
Building Scalable Applications Using Microsoft Technologies Padma Krishnan Senior Manager Introduction CIOs lay great emphasis on application scalability and performance and rightly so. As business grows,
More informationPARALLELS CLOUD STORAGE
PARALLELS CLOUD STORAGE Performance Benchmark Results 1 Table of Contents Executive Summary... Error! Bookmark not defined. Architecture Overview... 3 Key Features... 5 No Special Hardware Requirements...
More informationTIBCO ActiveSpaces Use Cases How in-memory computing supercharges your infrastructure
TIBCO Use Cases How in-memory computing supercharges your infrastructure is a great solution for lifting the burden of big data, reducing reliance on costly transactional systems, and building highly scalable,
More informationSoftware Life-Cycle Management
Ingo Arnold Department Computer Science University of Basel Theory Software Life-Cycle Management Architecture Styles Overview An Architecture Style expresses a fundamental structural organization schema
More informationHigh Availability Essentials
High Availability Essentials Introduction Ascent Capture s High Availability Support feature consists of a number of independent components that, when deployed in a highly available computer system, result
More informationCOSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters
COSC 6374 Parallel I/O (I) I/O basics Fall 2012 Concept of a clusters Processor 1 local disks Compute node message passing network administrative network Memory Processor 2 Network card 1 Network card
More information1.1.1 Introduction to Cloud Computing
1 CHAPTER 1 INTRODUCTION 1.1 CLOUD COMPUTING 1.1.1 Introduction to Cloud Computing Computing as a service has seen a phenomenal growth in recent years. The primary motivation for this growth has been the
More informationNaming vs. Locating Entities
Naming vs. Locating Entities Till now: resources with fixed locations (hierarchical, caching,...) Problem: some entity may change its location frequently Simple solution: record aliases for the new address
More informationAdapting Distributed Hash Tables for Mobile Ad Hoc Networks
University of Tübingen Chair for Computer Networks and Internet Adapting Distributed Hash Tables for Mobile Ad Hoc Networks Tobias Heer, Stefan Götz, Simon Rieche, Klaus Wehrle Protocol Engineering and
More informationOpenMosix Presented by Dr. Moshe Bar and MAASK [01]
OpenMosix Presented by Dr. Moshe Bar and MAASK [01] openmosix is a kernel extension for single-system image clustering. openmosix [24] is a tool for a Unix-like kernel, such as Linux, consisting of adaptive
More informationEvolution of Distributed Database Management System
Evolution of Distributed Database Management System During the 1970s, corporations implemented centralized database management systems to meet their structured information needs. Structured information
More information