Outline. EEC-681/781 Distributed Computing Systems. Review of Lecture 1. Lecture 2

Similar documents
Distributed System: Definition

Distribution transparency. Degree of transparency. Openness of distributed systems

Distributed Systems. REK s adaptation of Prof. Claypool s adaptation of Tanenbaum s Distributed Systems Chapter 1

2.1 What are distributed systems? What are systems? Different kind of systems How to distribute systems? 2.2 Communication concepts

Distributed Systems LEEC (2005/06 2º Sem.)

How To Make A Distributed System Transparent

Virtual machine interface. Operating system. Physical machine interface

Principles and characteristics of distributed systems and environments

How To Understand The Concept Of A Distributed System

Distributed Operating Systems

Basics. Topics to be covered. Definition of a Distributed System. Definition of a Distributed System

CS550. Distributed Operating Systems (Advanced Operating Systems) Instructor: Xian-He Sun

Distributed Systems. Examples. Advantages and disadvantages. CIS 505: Software Systems. Introduction to Distributed Systems

CORBA and object oriented middleware. Introduction

Chapter 1: Distributed Systems: What is a distributed system? Fall 2008 Jussi Kangasharju

Distributed System Principles

Chapter 17: Distributed Systems

COMP5426 Parallel and Distributed Computing. Distributed Systems: Client/Server and Clusters

Centralized Systems. A Centralized Computer System. Chapter 18: Database System Architectures

Client/Server Computing Distributed Processing, Client/Server, and Clusters

Web DNS Peer-to-peer systems (file sharing, CDNs, cycle sharing)

CHAPTER 2 MODELLING FOR DISTRIBUTED NETWORK SYSTEMS: THE CLIENT- SERVER MODEL

Introduction to CORBA. 1. Introduction 2. Distributed Systems: Notions 3. Middleware 4. CORBA Architecture

DISTRIBUTED SYSTEMS AND CLOUD COMPUTING. A Comparative Study

Deploying a distributed data storage system on the UK National Grid Service using federated SRB

CHAPTER 1: OPERATING SYSTEM FUNDAMENTALS

Software Concepts. Uniprocessor Operating Systems. System software structures. CIS 505: Software Systems Architectures of Distributed Systems

Chapter 6. CORBA-based Architecture. 6.1 Introduction to CORBA 6.2 CORBA-IDL 6.3 Designing CORBA Systems 6.4 Implementing CORBA Applications

BBM467 Data Intensive ApplicaAons

COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters

Chapter 18: Database System Architectures. Centralized Systems

Distributed Systems Lecture 1 1

Agenda. Distributed System Structures. Why Distributed Systems? Motivation

Middleware and Distributed Systems. Introduction. Dr. Martin v. Löwis

Write a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical

1 Organization of Operating Systems

Transparency in Distributed Systems

Client/Server and Distributed Computing

CS 5523 Operating Systems: Intro to Distributed Systems

Highly Available Mobile Services Infrastructure Using Oracle Berkeley DB

Mobile and Heterogeneous databases Database System Architecture. A.R. Hurson Computer Science Missouri Science & Technology

Chapter 14: Distributed Operating Systems

DISTRIBUTED AND PARALLELL DATABASE

Network File System (NFS) Pradipta De

Symmetric Multiprocessing

Distributed Computing

Distributed Systems. Outline. What is a Distributed System?

Distributed Systems. Security concepts; Cryptographic algorithms; Digital signatures; Authentication; Secure Sockets

Chapter 16: Distributed Operating Systems

Proactive, Resource-Aware, Tunable Real-time Fault-tolerant Middleware

Distributed Data Management

Ingegneria del Software II academic year: Course Web-site: [

How To Virtualize A Storage Area Network (San) With Virtualization

Event-based middleware services

Data Center Fabrics and Their Role in Managing the Big Data Trend

Network Attached Storage. Jinfeng Yang Oct/19/2015

Diagram 1: Islands of storage across a digital broadcast workflow

Cloud Based Distributed Databases: The Future Ahead

Operating System Concepts. Operating System 資 訊 工 程 學 系 袁 賢 銘 老 師

CommuniGate Pro White Paper. Dynamic Clustering Solution. For Reliable and Scalable. Messaging

IV Distributed Databases - Motivation & Introduction -

Overview of Routing between Virtual LANs

1. Comments on reviews a. Need to avoid just summarizing web page asks you for:

System Models for Distributed and Cloud Computing

Tier Architectures. Kathleen Durant CS 3200

Distributed Operating Systems. Cluster Systems

Chapter 7: Distributed Systems: Warehouse-Scale Computing. Fall 2011 Jussi Kangasharju

Chapter Outline. Chapter 2 Distributed Information Systems Architecture. Middleware for Heterogeneous and Distributed Information Systems

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing WHAT IS CLOUD COMPUTING? 2

Multi-Site Software Development It s Not Just Replication Anymore

COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters

Distributed Systems and Recent Innovations: Challenges and Benefits

CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level. -ORACLE TIMESTEN 11gR1

IOS110. Virtualization 5/27/2014 1

Cisco Active Network Abstraction Gateway High Availability Solution

Distributed Systems. Distributed Systems

Session 11 Under the hood of a commercial website

Modular Communication Infrastructure Design with Quality of Service

Design and Implementation of the Heterogeneous Multikernel Operating System

Wide Area Networks. Learning Objectives. LAN and WAN. School of Business Eastern Illinois University. (Week 11, Thursday 3/22/2007)

Making Multicore Work and Measuring its Benefits. Markus Levy, president EEMBC and Multicore Association

Introduction CORBA Distributed COM. Sections 9.1 & 9.2. Corba & DCOM. John P. Daigle. Department of Computer Science Georgia State University

1.4 SOFTWARE CONCEPTS

Avid ISIS

Ethernet. Ethernet. Network Devices

EMC VPLEX FAMILY. Continuous Availability and data Mobility Within and Across Data Centers

EII - ETL - EAI What, Why, and How!

IBM Tivoli Storage Manager Version Introduction to Data Protection Solutions IBM

Local Area Networks: Software and Support Systems

Cisco Virtual SAN Advantages and Use Cases

High Availability Solutions for the MariaDB and MySQL Database

EMC VPLEX FAMILY. Transparent information mobility within, across, and between data centers ESSENTIALS A STORAGE PLATFORM FOR THE PRIVATE CLOUD

The EMSX Platform. A Modular, Scalable, Efficient, Adaptable Platform to Manage Multi-technology Networks. A White Paper.

HRG Assessment: Stratus everrun Enterprise

Middleware Lou Somers

Solaris For The Modern Data Center. Taking Advantage of Solaris 11 Features

Virtualization Technologies and Blackboard: The Future of Blackboard Software on Multi-Core Technologies

Module 15: Network Structures

Transcription:

EEC-681/781 Distributed Computing Systems Lecture 2 Outline Overview of distributed systems Design Goals (part 2) Hardware Concepts Software Concepts 2 Department of Electrical and Computer Engineering Cleveland State University wenbing@ieee.org Review of Lecture 1 3 Definition of a Distributed System 4 Definition of distributed systems Design goals (part 1) Original: A collection of independent computers that appear to the users as a single coherent system Modified: A piece of software that ensures a collection of autonomous computers to appear as a single coherent system Autonomous computers connected by a network Software specifically designed to provide an integrated computing facility 1

Design Goals 5 Distribution Transparency 6 Connecting users and resources Transparency: Users feel like they are using a single-user system Openness: Services provided are based on standards Flexibility: Separation of policy and mechanisms Access transparency Location transparency Migration transparency Relocation transparency Replication transparency Concurrency transparency Scalability, availability, security Persistency transparency Distribution Transparency Mini-quiz 7 Distribution Transparency Mini-quiz 8 The transparency that hides where a resource is located is called: a) Access transparency b) Location transparency c) Relocation transparency d) Migration transparency The transparency that hides differences in data representation and how a resource is accessed a) Access transparency b) Location transparency c) Relocation transparency d) Migration transparency The transparency that hides the fact that a resource may move to another location is called a) Access transparency b) Location transparency c) Relocation transparency d) Migration transparency The transparency that hides the fact that a resource may be moved to another location while in use is called a) Access transparency b) Location transparency c) Relocation transparency d) Migration transparency 2

9 10 Replication Transparency Concurrency Transparency Replication transparency - Hide that a resource is replicated More than one copy is available All replica should have the same visible name Concurrency transparency - Hide that a resource may be shared by several competitive users This feature is really nothing new. Operating systems have been offering concurrency transparency for a number of decades Easy to guarantee if accesses to the same resource are all read-only Care must be taken to maintain consistence if some accesses are updates 11 12 Failure Transparency Persistency Transparency Failure Transparency - Hide the failure and recovery of a resource Can be achieved through replication But, very challenging and costly in general Persistency Transparency - Hide whether a (software) resource is in memory or on disk 3

Degree of Transparency 13 Degree of Transparency 14 Observation: Aiming at full distribution transparency may be too much Sometime distribution is apparent and not something you want to hide, e.g., users may be located in different continents Completely hiding failures of networks and nodes is (theoretically and practically) impossible You cannot distinguish a slow computer from a failing one You can never be sure that a server actually performed an operation before a crash Full transparency will cost performance, exposing distribution of the system Keeping Web caches exactly up-to-date with the master copy Immediately flushing write operations to disk for fault tolerance Openness of Distributed Systems Open distributed system: Be able to interact with services from other open systems, irrespective of the underlying environment Systems should conform to well-defined interfaces Systems should support portability of applications Systems should easily interoperate 15 Openness of Distributed Systems Achieving openness: At least make the distributed system independent from heterogeneity of the underlying environment Hardware Platforms Languages 16 4

Implementation Openness 17 Implementation Openness 18 Openness requires flexibility Implementing openness: Requires support for different policies specified by applications and users What level of consistency do we require for client cached data? Which operations do we allow downloaded code to perform? Which QoS requirements do we adjust in the face of varying bandwidth? What level of secrecy do we require for communication? Implementing openness: Ideally, a distributed system provides only mechanisms: Allow (dynamic) setting of caching policies, preferably per cacheable item Support different levels of trust for mobile code Provide adjustable QoS parameters per data stream Offer different encryption algorithms 19 20 Mechanisms and Policies Example: Managing a Queue Mechanisms determine how to do something while policies decide what should be done The separation of policy from mechanism allows maximum flexibility in choosing policies and if policy decisions are to be changed later Let s use an abstract priority queue as example We need to support mechanisms for: Insert/Delete items at start Insert/Delete items at end Know length of queue The queue can be implemented in different ways Policies can be for example FIFO, LIFO should be decided by queue user 5

Scale in Distributed Systems 21 Size Scalability 22 Scalability can be measured at three dimensions: Size scalability We can easily add more users and resources to the system Geographical scalability users and resources may lie far apart geographically Administrative scalability The system can still be easy to manage even if it spans many independent administrative organizations Thomas J. Watson, Chairman of IBM, 1943: I think there is a world market for maybe five computers Internet: July 1993: 1,776,000 computers July 1999: 56,218,000 computers Scalability problems in distributed systems appear as performance problems caused by limited capacity of servers and network January 2002: 168,000,000 computers and > 23,000,000 DNS domains 23 24 Size Scalability Problems Size Scalability Problems Concept Centralized services Centralized data Centralized algorithms Example A single server for all users A single on-line telephone book Doing routing based on complete information Problem running centralized algorithms in distributed systems Would result in enormous number of messages have to be routed over many lines Any algorithm that operates by collecting information from all sites, sends it to a single machine for processing, and then distributes the results must be avoided 6

Decentralized Algorithm Characteristics 25 Geographical Scalability Problems 26 No machine has complete information about the system state Machines make decisions based only on local information Failure of one machine does not ruin the algorithm There is no implicit assumption that a global clock exists Interprocess communication in WANs has much longer latency than that in LANs Communication in WANs is inherently unreliable, and virtually always point-to-point Centralized components would reduce geographical scalability, just as does to size scalability Administration Scalability Problems 27 Techniques for Scaling 28 Different administrative domain usually impose different policies, e.g., with respect to resource usage, management, and security Hiding communication latencies Distribution Replication 7

Hiding Communication Latencies 29 Hiding Communication Latencies Move Computation to Clients 30 Hiding communication latencies is applicable to in the case of geographical scalability Technique #1: Try to avid waiting for responses to remote service requests as much as possible Technique #2: Reduce the overall communication by moving part of the computation that is normally done at the server to the client process requesting the service Technique for Scaling - Distribution 31 Decentralized Naming Service 32 Distribution: Partition data and computations across multiple machines Examples: Domain name services (DNS) DNS name space is hierarchically organized into a tree of domains, which are divided into nonoverlapping zones 8

Techniques for Scaling Replication 33 Techniques for Scaling Replication 34 Replication: Make copies of data available at different machines across the distributed system Examples: Replicated file servers (mainly for fault tolerance) Replicated databases Mirrored Web sites Large-scale distributed shared memory systems Replication not only increases availability, but also helps to balance the load between components leading to better performance Replication also help increase the geographical scalability by placing a copy nearby different users Problem with Scaling by Replication 35 Problem with Scaling by Replication 36 Applying scaling techniques through replication sounds straightforward, but be aware that having multiple copies might leads to inconsistencies: modifying one copy makes that copy different from the rest Always keeping copies consistent and in a general way requires global synchronization on each modification Global synchronization precludes large-scale solutions If we can tolerate inconsistencies, we may reduce the need for global synchronization Tolerating inconsistencies is application dependent 9

Techniques for Scaling Caching Caching: A special form of replication. It allows client processes to access local copies Web caches (browser/web proxy) File caching (at server and client) Similarity to replication: making a copy of a resource, generally in the proximity of the client accessing that resource Difference from replication: caching is a decision made by the client of a resource, not by the owner of the resource 37 Distributed Systems: Hardware Concepts Multiprocessors Multicomputers Networks of Computers 38 Multiprocessors and Multicomputers 39 Networks of Computers 40 Distinguishing features: Private versus shared memory Bus versus switched interconnection High degree of node heterogeneity: High-performance parallel systems (multiprocessors as well as multicomputers) High-end PCs and workstations (servers) Simple network computers (offer users only network access) Mobile computers (palmtops, laptops) Multimedia workstations High degree of network heterogeneity: Local-area gigabit networks Wireless connections Wide-area switched megabit connections 10

Distributed Systems: Software Concepts 41 Distributed Operating Systems 42 An overview between DOS (Distributed Operating Systems) NOS (Network Operating Systems) Middleware Some characteristics OS on each computer knows about the other computers OS on different computers generally the same Services are generally (transparently) distributed across computers System Description Main Goal DOS NOS Middleware Tightly-coupled operating system for multiprocessors and homogeneous multicomputers Loosely-coupled operating system for heterogeneous multicomputers (LAN and WAN) Additional layer atop of NOS implementing general-purpose services Hide and manage hardware resources Offer local services to remote clients Provide distribution transparency Multicomputer Operating Systems Harder than traditional (multiprocessor) OS, because memory is not shared, emphasis shifts to processor communication by message passing: Often no simple global communication No simple system-wide synchronization mechanisms Virtual (distributed) shared memory requires OS to maintain global memory map in software Inherent distributed resource management: no central point where allocation decisions can be made 43 Network Operating System Some characteristics: Each computer has its own operating system with networking facilities Computers work independently (i.e., they may even have different operating systems) Services are tied to individual nodes (ftp, ssh) Highly file oriented (basically, processors share only files) 44 11

45 46 Network Operating System Network Operating System Two clients and a server in a network operating system Different clients may mount the servers in different places. 47 Distributed System (Middleware-based) Characteristics: OS on each computer need not know about the other computers OS on different computers need not generally be the same Services are generally (transparently) distributed across computers 12