Evolution of Distributed Database Management System



Similar documents
chapater 7 : Distributed Database Management Systems

Distributed Databases. Concepts. Why distributed databases? Distributed Databases Basic Concepts

Distributed Databases

Chapter 3. Database Environment - Objectives. Multi-user DBMS Architectures. Teleprocessing. File-Server

Distributed Database Management Systems for Information Management and Access

Module 5 Introduction to Processes and Controls

Distributed Database Management Systems

B.Com(Computers) II Year DATABASE MANAGEMENT SYSTEM UNIT- V

Objectives. Distributed Databases and Client/Server Architecture. Distributed Database. Data Fragmentation

Assistant Information Technology Specialist. X X X software related to database development and administration Computer platforms and

ICS 434 Advanced Database Systems

Introduction. Chapter 1. Introducing the Database. Data vs. Information

Introduction. Introduction: Database management system. Introduction: DBS concepts & architecture. Introduction: DBS versus File system

Introduction: Database management system

B.Sc (Computer Science) Database Management Systems UNIT-V

Chapter 1 - Database Systems

Database Management Systems

Chapter 18: Database System Architectures. Centralized Systems

CHAPTER 15: Operating Systems: An Overview

Distributed Systems LEEC (2005/06 2º Sem.)

An Overview of Distributed Databases

Database Management. Chapter Objectives

Centralized Systems. A Centralized Computer System. Chapter 18: Database System Architectures

AS/400 System Overview

COMPONENTS in a database environment

Introduction to Enterprise Data Recovery. Rick Weaver Product Manager Recovery & Storage Management BMC Software

Chapter 2 Database System Concepts and Architecture

Distributed Architectures. Distributed Databases. Distributed Databases. Distributed Databases

Deploying a distributed data storage system on the UK National Grid Service using federated SRB

B.Com(Computers) II Year RELATIONAL DATABASE MANAGEMENT SYSTEM Unit- I

AHAIWE Josiah Information Management Technology Department, Federal University of Technology, Owerri - Nigeria jahaiwe@yahoo.

Chapter 1. Database Systems. Database Systems: Design, Implementation, and Management, Sixth Edition, Rob and Coronel

Distributed Data Management

ISM 318: Database Systems. Objectives. Database. Dr. Hamid R. Nemati

Tier Architectures. Kathleen Durant CS 3200

Concepts of Database Management Seventh Edition. Chapter 7 DBMS Functions

IT Components of Interest to Accountants. Importance of IT and Computer Networks to Accountants

BIT Course Description

C/S Basic Concepts. The Gartner Model. Gartner Group Model. GM: distributed presentation. GM: distributed logic. GM: remote presentation

Evolution of the Data Center

Scalability and BMC Remedy Action Request System TECHNICAL WHITE PAPER

3-Tier Architecture. 3-Tier Architecture. Prepared By. Channu Kambalyal. Page 1 of 19

CS 3530 Operating Systems. L02 OS Intro Part 1 Dr. Ken Hoganson

TECHNIQUES FOR DATA REPLICATION ON DISTRIBUTED DATABASES

IOS110. Virtualization 5/27/2014 1

VERITAS Business Solutions. for DB2

Outline. Mariposa: A wide-area distributed database. Outline. Motivation. Outline. (wrong) Assumptions in Distributed DBMS

Operating Systems OBJECTIVES 7.1 DEFINITION. Chapter 7. Note:

In-memory Tables Technology overview and solutions

DAS, NAS or SAN: Choosing the Right Storage Technology for Your Organization

Case Studies Using EMC Legato NetWorker for OpenVMS Backups

Veritas Cluster Server from Symantec

Dell Microsoft Business Intelligence and Data Warehousing Reference Configuration Performance Results Phase III

Informix Dynamic Server May Availability Solutions with Informix Dynamic Server 11

2) What is the structure of an organization? Explain how IT support at different organizational levels.

Virtual Operational Data Store (VODS) A Syncordant White Paper

Chapter 16 Distributed Processing, Client/Server, and Clusters

Introductory Concepts

Version 5.0. MIMIX ha1 and MIMIX ha Lite for IBM i5/os. Using MIMIX. Published: May 2008 level Copyrights, Trademarks, and Notices

Introduction to Databases

Data Management, Analysis Tools, and Analysis Mechanics

Installation Guide for Workstations

Active/Active DB2 Clusters for HA and Scalability

Architecture of Transaction Processing Systems

Database System Architecture & System Catalog Instructor: Mourad Benchikh Text Books: Elmasri & Navathe Chap. 17 Silberschatz & Korth Chap.

Concepts of Database Management Seventh Edition. Chapter 9 Database Management Approaches

Chapter 14: Distributed Operating Systems

Chapter 9 Understanding Complex Networks

HP-UX File System Replication

System Development and Life-Cycle Management (SDLCM) Methodology

IBM Global Technology Services March Virtualization for disaster recovery: areas of focus and consideration.

Module 15: Network Structures

KASPERSKY LAB. Kaspersky Administration Kit version 6.0. Administrator s manual

ORACLE DATABASE 10G ENTERPRISE EDITION

Implementing efficient system i data integration within your SOA. The Right Time for Real-Time

Distributed Databases in a Nutshell

Distributed Operating Systems

Clustered Database Reporting Solution utilizing Tivoli

Basic Concepts of Database Systems

An Intelligent Approach for Integrity of Heterogeneous and Distributed Databases Systems based on Mobile Agents

An Overview of Data Warehousing, Data mining, OLAP and OLTP Technologies

Chapter 16: Distributed Operating Systems

Distributed Databases

ICONICS Choosing the Correct Edition of MS SQL Server

Operating System Concepts. Operating System 資 訊 工 程 學 系 袁 賢 銘 老 師

Star System Deitel & Associates, Inc. All rights reserved.

PATROL From a Database Administrator s Perspective

Downsizing : Client/Server Computing Joe Wang, The Upjohn Company, Kalamazoo, MI (616)

2. Highlights and Updates: ITSM for Databases

IBM Virtualization Engine TS7700 GRID Solutions for Business Continuity

VERITAS Backup Exec 9.0 for Windows Servers

Symantec NetBackup 7 Clients and Agents

Transcription:

Evolution of Distributed Database Management System During the 1970s, corporations implemented centralized database management systems to meet their structured information needs. Structured information is usually presented as regularly issued formal reports in a standard format.

The use of centralized database required that corporate data be stored in a single central site, usually a mainframe computer. Data access was provided through serially connected dumb terminals. The centralized approach worked well to fill the structured information needs of corporations, but it fell short when quickly moving events required faster reponse times and equally quick access to information.

The 1980s gave birth to a series of crucial social and technological changes that affected database development and design, such as: Business operations became more decentralized geographically. Competition increased at the global level. Customer demands and market needs favored a decentralized management style.

Rapid technological change low cost mainframe. The large number of applications based on DBMS and the need to protect investments in centralized DBMS software made the notion of data sharing attractive.

During 1990s the factors we have just described became even more firmly pressured. However, how these factors were addressed was strongly influenced by: The growing acceptance of internet and, particularly, the WWW, as the platform for data access and distribution. The WWW is, in effect, the repository for distributed data. The increased focus on data analysis that led to data mining and warehousing. Although a data warehouse is not usually a distributed database, it does rely on techniques-such as data replication and distributed queries-that facilitate its data extraction and integration.

The decentralized database is especially desirable because centralized database management is subject to problems such as: Performance degradation due to a growing number of remote locations over greater distance. High Cost associated with maintaining and operating large central (mainframe) database systems. Reliability problems created by dependence on central site.

Advantages of DDBMS Data are located near the greatest demand site: The data in a distributed database system are dispersed to match business requirements.

Faster Data Processing: A distributed database system makes it possible to process data at several sites, thereby spreading out the system s workload.

Faster Data Access: End users often work with only a subset of the company s data. If such data subsets are locally stored and accessed, the database system will deliver faster data access than is possible with remotely located centralized data.

Growth Facilitation: New sites can be added to the network without affecting the operations of other sites. Such flexibility enables the company to expand relatively easily and rapidly.

Improved Communications: Because local sites are smaller and located closer to customers, local sites foster better communications among departments and between customers and company staff. Quicker and better communication often help to improve information systems.

Reduced operating costs: It is much more cost-effective to add workstations to a network than to update a mainframe system. The cost of dedicated data communication lines and mainframe software is reduced proportionally. Development work is done more cheaply and more quickly on low-cost PCs than on mainframe.

User Friendly Interface: PCs and workstations are usually endowed with an easy-to-use GUI. The GUI simplifies use and training for end users.

Less danger of a single-point failure: In a centralized system, the mainframe s failure brings down all the system s operations. In contrast, a distributed system is able to shift operations when one of the computer fails.the system workload picked up by other workstations, because one of the distributed system s features is that data exist at multiple sites.

Processor Independence: The end user is able to access any available copy of the data and an end user s request is processed by any processor at the data location. In other words, requests do not depend on a specific processor; any available processor can handle the user s request.

Disadvantages of DDBMS Complexity of Management and control: Management of distributed data is a more complex task than management of centralized data. Application must recognize data location, and they must be able to stitch together data from different sites. DBA must have the ability to coordinate database activities to prevent database degradation due to data anomalies. Transaction management, concurrency control, security, backup recovery, query optimization, access path selection, and so on, must be addressed and resolved.

Security: The probability of security lapses increases when data are located at multiple sites. The responsibility of data management will be shared by different people at several sites, and LANs do not yet have the sophisticated security of centralized mainframe installation.

Lack of Standards: Although distributed databases depend on effective communication, there are no standard communication protocols. In fact, few official standards exist in any of the distributed database protocols, whether they deal with communication or data access control. Consequently, distributed database users must wait for the definitive emergence of standard protocols before distributed databases can deliver all their potential goods.

Increased Storage Requirements: Data replication requires additional disk storage space. This disadvantage is minor one, because disk storage space is relatively cheap and it is becoming cheaper. However, disk access and storage management is a widely dispersed data storage environment are more complex than they would be in a centralized database.

Distributed Processing and Distributed Databases Distributed processing shares the database s logical processing among two or more physically independent sites that are connected through a network. For example, distributed processing might perform the data input/output, the data selection, and the data validation in one computer, and then create a report based on such data in another computer.

A distributed database, on the other hand, stores a logically related database over two or more physically sites. The sites are connected via a computer network. In contrast, the distributed processing system uses only a single-site database but shares the processing chores among the several sites. In a distributed database system, a database is composed of several parts known as database fragments. The database fragments are located at different sites.

Points to remember: Distributed processing does not require a distributed database, but a distributed database requires distributed processing. Distributed Processing may be based on a single database located on a single computer. In order to manage distributed data, copies or parts of the database processing functions must be distributed to all data storage sites. Both distributed processing and distributed databses require a network to connect all components.

DDBMS A distributed database management system governs the storage and processing of logically related data over interconnected computer systems in which both data and processing functions are distributed among several sites. A DBMS must have at least the following functions to be classified as distributed: Application interface to interact with end user or application programs and with other DBMSs within the distributed databases. Validation to analyze data requests.

Transformation to determine which data request components are distributed and which ones are local. Query optimization to find the best access strategy. Mapping to determine the data location of local and remote fragments. I/O interface to read or write data from or to permanent local storage. Formatting to prepare the data for presentation to the end user or an application program.

Security to provide data privacy at both local and remote databases. Backup and recovery to ensure the availability and recoverability of the database in case of a failure. DB administration for the database administrator. Concurrency control to manage simultaneous data access and ensure data consistency across database fragments in the DDBMS. Transaction management to ensure that the data move from one consistent state to another.

DDBMS Components The DDBMS must include the following components: Computer Workstations that form the network system. The distributed database system must be independent of the computer system hardware. Network hardware and software components that reside in each workstation. The network components allow all sites to interact and exchange data. Network system independence is a desirable distributed database system attribute.

Communication media that carry the data from one workstation to another. The DDBMS must be communications-media-independent; that is, it must be able to support several types of communications media. The Transaction Processor(TP) which is the software component found each computer that requests data. The transaction processor receives and processes the application s data requests (remote and local). The TP is also known as the application processor(ap) or the transaction manager(tm).

The data processor(dp), which is the software component residing on each computer that stores and retrieves data located at the site. The DP is also known as the data manager (DM). A data processor may even be a centralized DBMS.

Communication among TPs and DPs Dedicated data processor TP TP TP DP DP C o m m u n i c a t i o n s N e t w o r k TP DP TP DP DP Dedicated data processor Note: Each TP can access data on any DP, and each DP handles all requests for local data from any TP

The Protocols determine how the distributed database system will: Interface with the network to transport data and commands between DPs and TPs. Synchronize all data received from DPs and route retreived data to the appropriate TPs. Ensure common database functions in a distributed system. Such functions include security, concurrency control, backup and recovery.

Levels of Data and Process Distribution We can classify current database systems on the basis of how process distribution and data distribution are supported. For example, a DBMS may store data in a single site(centralized DB) or in multiple sites(distributed DB) and may support data processing at a single site and multiple site.

Single-site Processing, Single-site Data In the single-site processing, single-site data(spsd) scenario, all processing is done on a single CPU or host computer and all the data are stored on the host computer s local disk. Processing can be done on the end user s side of the system. Such a scenario is typical of most mainframe and minicomputer DBMSs. The DBMS is located on the host computer, which is accessed by dumb terminals connected of single-user microcomputer databases.

NONDISTRIBUTED (CENTRALIZED) DBMS T1 DBMS Front End Processor T2 Dumb terminals Database Communication via telephone lines T3 Remote dumb terminal

Multiple-site Processing, Single-site Data Under the multiple-site processing, single-site data (MPSD) scenario processes run on different sharing a single data repository. Typically, the MPSD scenario requires a network file server on which conventional applications are access through a LAN. Many multi user accounting applications, running under a personal computer network, fit such a description.

MULTIPLE-SITE PROCESSING, SINGLE-SITE DATA DP Site A Site B Site C TP TP TP Database Communication Network File Server

MULTIPLE-SITE PROCESSING, MULTIPLE-SITE DATA (MPMD) MPMD scenario describe a fully distributed database management system with support for multiple data processors and transaction processors at multiple sites. Depending on the level of support for different types of centralized DBMSs, distributed database management systems(dbmss) are classified as either homogeneous or heterogeneous.

Homogeneous DDBMS It integrate only one type of centralized DBMS over a network. Thus, the same DBMS will be running on different mainframes, minis, and microcomputers.

Heterogeneous DDBMS It integrate different types of central different type of centralized DBMSs over a network. A fully heterogeneous DDBMS will that may even support different data models(relational, hierarchical, or network) running under different computer systems, such as mainframes, minis, and microcomputers.

Heterogeneous Distributed Database Scenario PlatForm DBMS Op. system N/w Comm. Prot. IBM 3090 DB2 MVS APPC LU 6.2 DEC VAX VAX RDB MVS DECnet IBM AS/400 SQL/400 OS/400 3270 RISC INFORMIX UNIX TCP/IP PENTIUM CPU ORACLE OS/2 NetBIOS

No DDBMS currently provides full support for the scenario depicted in the previous figure, nor for the fully heterogeneous environment. Some DDBMS implementations support several platforms, operating systems, and networks and allow remote data access to another DBMS. However, such DDBMSs still are subject to certain restrictions:

Remote access is provided on a read-only basis and does not support write privileges. Restrictions are placed on the number of remote tables that may be accessed in a single transaction. Restrictions are placed on the number of distinct databases that may be accessed. Restrictions are placed on database model that may be accessed. Thus, access may be provided to relational databases but not to network or hierarchical databases.