CS 5523 Operating Systems: Intro to Distributed Systems


Instructor: Dr. Tongping Liu
Thanks to Dr. Dakai Zhu and Dr. Palden Lama for providing their slides.

Outline
- Different distributed systems
  - Distributed computing systems
  - Distributed information systems
  - Distributed pervasive systems
- OS in distributed systems
  - Distributed OS vs. network OS vs. middleware
- Design objectives of distributed systems
  - Transparency, openness, and scalability
- Architecture of distributed systems
  - Software vs. system architectures

Computer System Revolution
- Computers: large/expensive -> small/cheap
- Networks: LAN -> WAN, bps -> Kbps -> Gbps
- Now it is easy to put together many computers. Why?
  - Solve problems
  - Share resources
  - Increase collaboration
- Centralized systems -> distributed systems

What are Distributed Systems?
[Figure: networked computers connected through ISPs, intranets, and a backbone; legend: desktop computer, server, network link, satellite link]
A collection of networked independent computers that appears to its users as a single coherent system

Different Types of Distributed Systems
- Distributed computing systems: high-performance computing tasks
  - Cluster computing: similar components
  - Grid computing / cloud computing: different components
- Distributed information systems
  - Web servers
  - Distributed database applications
- Distributed pervasive systems: unstable
  - Smart home systems
  - Electronic health systems: monitoring
  - Sensor networks: surveillance systems

Cluster Computing Systems
- High-performance computing
  - A group of high-end systems connected through a LAN
  - Homogeneous: same OS, near-identical hardware
  - Single managing node

Grid Computing Systems
- Lots of nodes from everywhere share resources and collaborate
  - Heterogeneous
  - Dispersed across several organizations
  - Can easily span a wide-area network
- To allow for collaboration, grids generally use virtual organizations
  - Within the same virtual organization, a group of users (or better: their IDs) has the same access rights
  - The key questions are:
    - How to authorize users from different administrative domains
    - How to provide authorized users with access to specific resources

Grid Computing Systems (cont.)
- Application layer: uses the grid computing environment
- Collective layer: handles access to multiple resources (resource discovery, allocation and scheduling, data replication)
- Connectivity layer: communication protocols (transfer data across resources, access a remote resource, security)
- Resource layer: manages a single resource (create a process, access control)
- Fabric layer: interface to local resources (query, locking)

Cloud Computing Systems
- Cloud computing has become another buzzword after Web 2.0.
- We won't compute on local computers, but on centralized facilities operated by third-party compute and storage utilities.
- There are dozens of different definitions of cloud computing, and there seems to be no consensus on what a cloud is. Here is one definition:
  "A large-scale distributed computing paradigm that is driven by economies of scale, in which a pool of abstracted, virtualized, dynamically-scalable, managed computing power, storage, platforms, and services are delivered on demand to external customers over the Internet."
- Cloud computing is not a completely new concept; it has an intricate connection to the relatively new but thirteen-year established grid computing paradigm, and to other relevant technologies such as utility computing, cluster computing, and distributed systems in general.
Source: I. Foster et al., "Cloud Computing and Grid Computing 360-Degree Compared," Grid Computing Environments Workshop (GCE '08), 2008.

DIS: Distributed Information Systems
- Organizations have legacy networked applications, but it is hard to make them interoperate
- Middleware can help
- Integration can take place at several levels
  - A client wraps a number of requests into one and has it executed as a distributed transaction (either all or none of the requests are executed)
  - Applications can be detached from their databases and may need to communicate directly with each other: Enterprise Application Integration (EAI)

DIS: Transaction Processing Systems
- Database applications; transaction properties (ACID)
- Atomic: a transaction happens indivisibly
  - Either all operations succeed, or all of them fail
- Consistent: a transaction does not violate system invariants
  - Internal transfer: the total amount of money should stay the same
  - Intermediate states may violate invariants as long as they are not visible outside the transaction
- Isolated (serializable): concurrent transactions do not interfere with each other
- Durable: once a transaction commits, the changes are permanent
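
The atomicity and consistency properties above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The `accounts` table, the account names, and the `transfer` helper are assumptions made up for this illustration; they are not part of the original slides.

```python
import sqlite3

def balance(conn, name):
    """Return the current balance of one (hypothetical) account."""
    return conn.execute(
        "SELECT balance FROM accounts WHERE name = ?", (name,)
    ).fetchone()[0]

def total(conn):
    """System invariant: the sum of all balances must not change."""
    return conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src))
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst))
        # Intermediate state may break the invariant; it is never committed.
        if balance(conn, src) < 0:
            raise ValueError("insufficient funds")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()

transfer(conn, "alice", "bob", 30)       # succeeds: both updates applied
try:
    transfer(conn, "alice", "bob", 500)  # fails: both updates rolled back
    failed = False
except ValueError:
    failed = True
```

After the failed transfer, neither update is visible: the two balances are the same as after the first transfer, and the total is unchanged.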

DIS: Distributed Database Transactions
- Transaction processing monitor (middleware): coordinates the execution of a transaction and its subtransactions when data is distributed across servers

DIS: Enterprise Information Systems
- Inter-application communication
  - RPC (Remote Procedure Calls) or RMI (Remote Method Invocations)
    - Both applications must be up and running
    - They must know exactly how to refer to each other
  - Message-Oriented Middleware (MOM)
    - Send data to a logical contact
    - Publish/subscribe
- Stability: nodes are fixed and have a high-quality connection to the system

DPS: Distributed Pervasive Systems
- DCS and DIS are stable distributed systems (fixed nodes, good connections)
- Systems with mobile and embedded devices are unstable
- Distributed pervasive systems:
  - Computing anywhere and anytime
  - Contextual change: the system should react immediately to changes in its environment
  - Ad hoc composition: each node may be used in very different ways by different users, which requires ease of configuration
  - Sharing is the default: nodes come and go, providing sharable services and information, which again calls for simplicity
- Expose distribution instead of hiding it

DPS: Electronic Health Care Systems
- Devices are physically close to a person
  - Where and how should monitored data be stored?
  - How can we prevent loss of crucial data?
  - How can security be enforced?
  - How can physicians provide online feedback?

DPS: Sensor Networks
The nodes to which sensors are attached are:
- Many (10s to 1000s)
- Simple (small memory, compute, and communication capacity)
- Often battery-powered (or even battery-less)

Distributed Systems: Definition
A distributed system (DS) is a piece of software that ensures that a collection of independent computers, which communicate over a network by passing messages, appears to its users as a single coherent system.
But how can we
- hide the differences between independent computers, and
- provide a single system view?

OS Structures in Distributed Systems
- Distributed operating systems: the OS essentially tries to maintain a single, global view of the resources it manages (tightly coupled OS)
- Network operating systems: a collection of independent OSes augmented by network services (loosely coupled OS)
- Middleware: provides a level of transparency between applications and local OSes

Distributed Operating Systems
- Full transparency: users perceive one big system and are not aware of the multiple machines behind it
- Access to remote services is similar to access to local resources

Distributed Operating Systems (Cont.)
- Each node has its own kernel for managing local resources (memory, local CPU, disk, etc.)
- A common software layer implements the OS and supports parallel and concurrent execution of various tasks
- Shared memory may be implemented entirely in software (distributed shared memory)
- Additional facilities: task assignment, handling of hardware failures, transparent storage, inter-process communication, and data/computation/process migration

Network Operating Systems (NOS)
- No single view of the distributed system
- Users are aware of the multiplicity of machines
- Applications use NOS services to access resources

Network Operating Systems (Cont.)
- A NOS provides facilities that allow users to make use of services on other machines
  - Remote login (telnet, rlogin)
  - File transfer (ftp)
- The only means of communication is message passing
- Users need to explicitly log in to remote machines, or copy files from one machine to another
- Users need multiple passwords and multiple access permissions
- On the other hand, adding or removing a machine is relatively simple

Middleware-Based Systems
Middleware: a higher level of abstraction on top of network operating systems for efficient transparency

Middleware-Based Systems (cont.)
- Each local system is part of the underlying NOS
- Target of middleware:
  - Resolve the integration problems of various networked applications
- Examples: Remote Procedure Calls (RPC), Remote Method Invocations (RMI), distributed file systems, distributed object systems

Design Objectives of Distributed Systems
- Goals of distributed systems
  - Make resources (hardware/software) available
  - Make it easy for users to access remote resources
  - Share resources in a controlled and efficient way
- Distribution transparency: hiding distribution
- Openness
- Scalability

Distribution Transparency
Transparency   Description
Access         Hides differences in data representation and invocation mechanisms
Location       Hides where a resource resides
Migration      Hides that a resource may move
Relocation     Hides that a resource may be moved while in use
Replication    Hides that a resource is replicated
Concurrency    Hides that other users may access the same resource
Failure        Hides failure and possible recovery of resources

Degree of Transparency
- Full transparency can be costly
- Sometimes the distribution of the system should be exposed
  - A mobile phone moves around, so we may need location and context awareness
- Completely hiding failures of networks and nodes is (theoretically and practically) impossible
  - You cannot distinguish a slow computer from a failing one
  - You can never be sure that a server actually performed an operation before a crash
- Users may be located on different continents -> distribution is apparent and not something you want to hide
- Observation: aiming at full distribution transparency may be too much
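
The point that a slow computer cannot be distinguished from a failed one can be sketched with a timeout-based failure detector. This is a hypothetical illustration, not taken from the slides: the class name, node names, and the injected fake clock are all assumptions chosen to make the demo deterministic.

```python
import time

class TimeoutFailureDetector:
    """Suspects a node if no heartbeat arrived within `timeout` seconds."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.last_seen = {}  # node name -> time of last heartbeat

    def heartbeat(self, node):
        self.last_seen[node] = self.clock()

    def suspected(self, node):
        last = self.last_seen.get(node)
        return last is None or self.clock() - last > self.timeout

# A fake clock keeps the demo deterministic (assumption for illustration).
now = [0.0]
detector = TimeoutFailureDetector(timeout=2.0, clock=lambda: now[0])
detector.heartbeat("slow-node")
detector.heartbeat("crashed-node")

now[0] = 3.0  # both nodes have been silent for 3 seconds
# From the observer's side the two nodes look identical: both are suspected.
both_suspected = (detector.suspected("slow-node")
                  and detector.suspected("crashed-node"))

detector.heartbeat("slow-node")  # the slow node's heartbeat finally arrives
slow_recovered = not detector.suspected("slow-node")
```

The detector can only ever report suspicion. Any choice of timeout trades false suspicions of slow nodes against late detection of crashed ones.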

Openness of Distributed Systems
- Offer services according to standard rules that describe the syntax and semantics of those services
- How to achieve openness
  - Well-defined interfaces (often described using an IDL)
    - Syntax is easy to define, but semantics are hard to specify formally, so they are usually described in natural language
  - Portability
    - The same implementation should work on different machines
  - Interoperability
    - Two different implementations should work together
- A distributed system should be independent of the heterogeneity of the underlying environment: hardware, software platforms, and languages

Scalability in Distributed Systems
- Three aspects of scalability
  - Size: number of users and/or processes
  - Geographical: maximum distance between nodes
  - Administrative: number of administrative domains
- Most systems account only, to a certain extent, for size scalability, typically by using more powerful servers (e.g., a supercomputer)
- Challenge nowadays: geographical and administrative scalability

Problems with Size Scalability
- What happens when more users/resources are added?
- Limitations of centralized systems
  - Service (e.g., single server): overloaded servers
  - Data (e.g., single phone book): saturated communication links
  - Algorithm (e.g., routing based on global info): too much traffic
- Use distributed services, databases, and algorithms
  - No machine has complete information
  - Decisions are made based on local information
  - Failure of one node does not affect the others
  - No global clock

Problems with Geographical Scalability
- Suppose we have an interactive application working on a LAN. Can we use it over a WAN?
- Delay
  - Blocking read/write might be OK on a LAN but not on a WAN
- Reliability
  - The longer the distance, the higher the chance of losing messages
- Bandwidth
- Locating a service by broadcasting is OK on a LAN (e.g., ARP) but not on a WAN

Problems with Administrative Scalability
- In a single domain:
  - We can try to optimize resource usage because each entity belongs to the same domain and can be trusted
- In the case of multiple, independent administrative domains:
  - We do not own all resources and cannot trust others
  - Several problems:
    - Conflicting policies (who uses what and pays how much)
    - Management
    - Security (access rights and trust management)

Techniques for Scalability
- Hiding communication latency (geographical)
  - Use asynchronous communication
    - +: a separate handler processes the incoming response, so the caller can do something else while waiting
    - -: what if there is nothing else to do?
- Distribution: splitting a component into small parts
  - Domain Name System (DNS)
  - Decentralized data and information systems (WWW)
  - Decentralized algorithms (distance vector)
- Replication
  - Increases availability
  - Balances load
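
A minimal sketch of hiding latency with asynchronous communication, using Python's asyncio. The `fetch` coroutine and the 0.2-second delay are assumptions standing in for a real WAN round trip; no actual network traffic is involved.

```python
import asyncio
import time

async def fetch(name, latency):
    """Simulated remote request; the sleep stands in for a WAN round trip."""
    await asyncio.sleep(latency)
    return f"reply from {name}"

async def main():
    start = time.monotonic()
    # Issue all three requests at once instead of blocking on each in turn.
    replies = await asyncio.gather(
        fetch("server-a", 0.2),
        fetch("server-b", 0.2),
        fetch("server-c", 0.2),
    )
    return replies, time.monotonic() - start

replies, elapsed = asyncio.run(main())
```

With blocking calls the three round trips would add up to about 0.6 seconds. Here they overlap, so the total is close to a single round trip.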

Techniques for Scalability (cont'd)
- Use replication/caching to make multiple copies of the same services or data available at different machines
  - Mirrored Web sites
  - Replicated file servers and databases
  - Web caches (in browsers and proxies)
  - File caching (at server and client)
- Advantages
  - Increases availability
  - Improves load balance and performance
  - Hides communication latency
- Disadvantages
  - Inconsistencies arise when one copy is modified
  - Global synchronization is needed to keep copies consistent, but it precludes large-scale solutions
  - Tolerance to inconsistencies depends on the application
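
The trade-off between caching and consistency can be sketched with a small time-to-live (TTL) cache in front of an origin server. The `Origin` and `TTLCache` classes and the fake clock are hypothetical names introduced for this illustration only.

```python
class Origin:
    """Stands in for the authoritative copy of the data."""
    def __init__(self):
        self.data = {}
        self.reads = 0

    def write(self, key, value):
        self.data[key] = value

    def read(self, key):
        self.reads += 1
        return self.data[key]

class TTLCache:
    """Serves a cached copy until it is older than `ttl` time units."""
    def __init__(self, origin, ttl, clock):
        self.origin = origin
        self.ttl = ttl
        self.clock = clock
        self.entries = {}  # key -> (value, time fetched)

    def read(self, key):
        entry = self.entries.get(key)
        if entry is not None and self.clock() - entry[1] < self.ttl:
            return entry[0]          # cache hit: no trip to the origin
        value = self.origin.read(key)
        self.entries[key] = (value, self.clock())
        return value

now = [0]
origin = Origin()
cache = TTLCache(origin, ttl=10, clock=lambda: now[0])

origin.write("page", "v1")
first = cache.read("page")   # miss: fetched from the origin
origin.write("page", "v2")   # the origin copy changes
stale = cache.read("page")   # hit: the cache still returns the old copy
now[0] = 11                  # the cached entry expires
fresh = cache.read("page")   # miss: the new copy is fetched
```

The cache saves one read at the origin, at the price of serving an outdated value until the entry expires. Whether that is acceptable depends on the application.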

Techniques for Scalability (cont'd)
- All the techniques discussed so far deal with performance problems due to size and geographical scalability
- How about administrative scalability?
  - The most challenging one (why?)
  - The problems are often non-technical (politics!)

Developing Distributed Systems: Pitfalls
Mistakes are often due to false assumptions:
- The same global time
- Perfect network/communication
  - Latency is zero
  - Bandwidth is infinite
  - The network is reliable
  - The network is secure
  - The network is homogeneous
- The topology does not change
- There is one administrator

Software Architectures for DS
- Logical organization of different components
  - How to divide the system into components
  - How the components connect and communicate with each other
  - How to configure the different elements into a system
- Divide responsibilities among components, distribute them over multiple machines, and allow them to communicate through connectors
  - Component: a modular unit with well-defined required and provided interfaces
  - Connector: a mechanism that mediates communication, coordination, and cooperation (e.g., RPC, message passing)

Software Architecture Style
- Layered style
  - A component at a higher layer can call components at a lower layer
  - Widely adopted by the networking community
- Object-based style
  - Components are connected through procedure calls
  - Used for client-server systems
- These are the two most important styles

Software Architecture Style
- Event-based (publish/subscribe)
  - Components communicate through the propagation of events
  - Loosely coupled components
    - Decoupled in space (referentially decoupled): processes do not refer to each other
- Data-centered
  - Components communicate through a common repository (e.g., a shared distributed file system)
  - Can be combined with the event-based style, yielding shared data spaces
    - Processes are now decoupled in space and time: they do not refer to each other and do not need to be active at the same time
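
The decoupling in space and time described above can be sketched with a tiny in-process shared data space. The `SharedDataSpace` class, the topic name, and the `replay` flag are assumptions introduced for this illustration; real systems add persistence, distribution, and matching rules.

```python
from collections import defaultdict

class SharedDataSpace:
    """Publish/subscribe broker that also stores events per topic."""

    def __init__(self):
        self._events = defaultdict(list)       # topic -> stored events
        self._subscribers = defaultdict(list)  # topic -> callbacks

    def publish(self, topic, event):
        # The publisher names a topic, never a receiver (decoupled in space).
        self._events[topic].append(event)
        for callback in self._subscribers[topic]:
            callback(event)

    def subscribe(self, topic, callback, replay=True):
        # Stored events are replayed to late subscribers (decoupled in time).
        if replay:
            for event in self._events[topic]:
                callback(event)
        self._subscribers[topic].append(callback)

space = SharedDataSpace()
space.publish("temperature", 21.5)   # no subscriber exists yet

received = []
space.subscribe("temperature", received.append)  # joins later, gets 21.5
space.publish("temperature", 22.0)               # delivered immediately
```

The subscriber receives both readings even though it was not active when the first one was published.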

System Architectures for DS
- Consider how and where to place software components and realize their interactions
- Concerned with the physical realization: placement of software components and their interactions
  - Centralized: client-server
  - Decentralized: P2P (structured vs. unstructured)
  - Hybrid: combination of centralized and P2P

Basic Client-Server Model
- Clients and servers may run on different machines
- Clients follow a request/reply model to use services
  - +: efficient
  - -: reliability; the client can't detect whether the request was lost or the reply failed
- Connection-oriented protocol: TCP/IP
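
A minimal sketch of the request/reply model over a connection-oriented protocol (TCP), using Python's standard socket and threading modules on the loopback interface. The function names and message contents are assumptions for illustration; a production server would loop over connections and frame its messages, since a single `recv` call is not guaranteed to return a whole message.

```python
import socket
import threading

def serve_once(listener):
    """Server side: accept one client, read one request, send one reply."""
    conn, _ = listener.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"reply:" + request)

def send_request(address, payload, timeout=2.0):
    """Client side: send one request and block until the reply arrives.

    If no reply arrives within `timeout` seconds, an OSError subclass is
    raised; the client then cannot tell a lost request from a lost reply.
    """
    with socket.create_connection(address, timeout=timeout) as sock:
        sock.sendall(payload)
        return sock.recv(1024)

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(1)

server = threading.Thread(target=serve_once, args=(listener,))
server.start()
reply = send_request(listener.getsockname(), b"time?")
server.join()
listener.close()
```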

Application Layering (logical)
- How to draw a clear line between client and server?
- User-interface layer
- Processing layer
  - The functions of an application, i.e., without specific data
- Data layer
  - The data that a client wants to manipulate
- Observation: layering is found in many distributed information systems, using traditional database technology and accompanying applications

Traditional Two-Tiered Configurations
How to place the three layers on client and server?
[Figure: range of two-tiered configurations, from thin client to fat client]

Multitiered Architectures
- The server part could be distributed over multiple machines
- Three-tiered: each layer on a separate machine
- This is called vertical distribution

Servers and States: Stateful Servers
- Vertical distribution:
  - Functions are logically and physically split across multiple machines
  - Processes are not equal and interactions are asymmetric: one acts as client while the other acts as server
- A stateful server keeps track of the status of its clients
  - Records that a file has been opened, so that prefetching can be done
  - Knows which data a client has cached, and allows clients to keep local copies of shared data
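
A small sketch of what "keeping track of client status" means for a file server. Both classes are hypothetical illustrations; the stateless variant is added here only as a contrast and is not described on this slide.

```python
class StatefulFileServer:
    """Remembers, per open handle, which file is open and how far it was read."""

    def __init__(self, files):
        self.files = files
        self.open_files = {}  # handle -> [file name, current offset]
        self.next_handle = 0

    def open(self, name):
        handle = self.next_handle
        self.next_handle += 1
        self.open_files[handle] = [name, 0]
        return handle

    def read(self, handle, count):
        name, offset = self.open_files[handle]
        data = self.files[name][offset:offset + count]
        self.open_files[handle][1] = offset + len(data)  # server tracks position
        return data

class StatelessFileServer:
    """Keeps no per-client state: every request carries the name and offset."""

    def __init__(self, files):
        self.files = files

    def read(self, name, offset, count):
        return self.files[name][offset:offset + count]

files = {"notes.txt": b"distributed systems"}

stateful = StatefulFileServer(files)
handle = stateful.open("notes.txt")
part1 = stateful.read(handle, 11)  # server remembers the offset
part2 = stateful.read(handle, 8)

stateless = StatelessFileServer(files)
same_part2 = stateless.read("notes.txt", 11, 8)  # client supplies the offset
```

Because the stateful server knows the read position, it could prefetch the next block. The cost is that this state is lost if the server crashes.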

Decentralized Architectures: P2P Systems
- P2P architectures use horizontal distribution:
  - Split up clients and servers into logically equivalent parts and let each part operate on its own share of the data
- Processes are equal and interactions are symmetric
  - Each process acts as both client and server
- Tremendous growth in the last couple of years
- An example: BitTorrent

Decentralized Architectures: An Example
[Figure: peers 1 through N, each running an application and holding sharable objects; example objects are labeled "Star War", "The Beatles", and "Roman Holiday"]

Decentralized Architectures: P2P Systems
- Key question:
  - How to organize processes in an overlay network, where links are usually TCP channels
- Three approaches to organizing nodes into overlay networks through which data is routed
  - Structured P2P: nodes are organized following a specific distributed data structure and deterministic algorithms
  - Unstructured P2P: nodes have randomly selected neighbors
  - Hybrid P2P: some nodes are appointed special functions in a well-organized fashion

Structured P2P Systems
- The distributed hash table (DHT) is the most widely used structure
  - Assume we have a large ID space Ω (e.g., 128-bit)
  - Assign random keys to data items (from Ω)
  - Assign random IDs to nodes (from Ω)
  - The crux of a DHT: implement an efficient and deterministic scheme that maps the key of a data item to a node ID
  - When looking up a data item, the system should route the request to the associated node and return the network address of that node
- Example: Chord

A DHT Example: Chord
- A data item with key k is mapped to the node with the smallest ID >= k
- This node is called the successor of key k and is denoted by succ(k)
- LOOKUP(key=8)? This should return succ(8), which is node 12
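
The successor rule can be sketched in a few lines of Python. The node IDs and the 5-bit identifier space below are assumptions chosen so that succ(8) is node 12, as on the slide; the slide's figure may use a different set of nodes. The `key_for` helper, which derives a key by hashing a name with SHA-1, is likewise only one common way to obtain well-spread keys. The sketch computes succ(k) from a global list of nodes and does not model Chord's routing through finger tables.

```python
import hashlib
from bisect import bisect_left

def key_for(name, m):
    """Map a name to a key in the m-bit ID space (illustrative choice)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

def succ(key, node_ids, m):
    """Return the node with the smallest ID >= key, wrapping around the ring."""
    ring = sorted(node_ids)
    key %= 2 ** m
    i = bisect_left(ring, key)
    # Past the largest ID the ring wraps around to the smallest node.
    return ring[i] if i < len(ring) else ring[0]

M = 5                          # assumed 5-bit ID space: IDs 0..31
NODES = [1, 5, 12, 20, 27]     # hypothetical node IDs

lookup_8 = succ(8, NODES, M)    # smallest node ID >= 8
lookup_12 = succ(12, NODES, M)  # a key equal to a node ID maps to that node
lookup_30 = succ(30, NODES, M)  # no ID >= 30, so wrap around
```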

Unstructured P2P Architectures
Basic principle: each node maintains a random list of neighbors
- Each peer maintains a partial view of the network, consisting of c other nodes
- Each node P periodically selects a node Q from its partial view; P and Q exchange information and exchange members from their respective partial views
- The robustness of the network depends on this random exchange
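
The periodic view exchange can be sketched as follows. This is a simplified illustration under stated assumptions: the merge-and-sample rule, the initial ring topology, and the seeded random generator are choices made here, not the specific protocol from the slides.

```python
import random

def exchange(views, p, q, c, rng):
    """P and Q pool what they know and each keeps at most c random entries."""
    known = set(views[p]) | set(views[q]) | {p, q}
    for node in (p, q):
        candidates = sorted(known - {node})  # a node never lists itself
        views[node] = rng.sample(candidates, min(c, len(candidates)))

def gossip_round(views, c, rng):
    """Every node P picks a random Q from its partial view and exchanges."""
    for p in list(views):
        q = rng.choice(views[p])
        exchange(views, p, q, c, rng)

rng = random.Random(0)  # seeded so that runs are repeatable
n, c = 10, 3            # 10 nodes, partial views of at most 3 entries
# Assumed starting point: each node only knows its ring neighbor.
views = {i: [(i + 1) % n] for i in range(n)}

for _ in range(5):
    gossip_round(views, c, rng)
```

After a few rounds the views no longer follow the initial ring, yet each node still stores only a bounded number of neighbors.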

Superpeers
- Peers maintain an index (for search)
- Peers monitor the state of the network
- Peers set up connections

Hybrid Structures: Edge Servers
Edge-server architectures are often used for Content Delivery Networks

Collaborative Hybrid Structures: BitTorrent
- Combines a P2P architecture with a client-server architecture
- Basic idea: a node identifies where to download a file from and joins a swarm of downloaders, who get file chunks in parallel from the source and distribute these chunks among each other

Summary
- Distributed systems
  - Distributed computing systems
  - Distributed information systems
  - Distributed pervasive systems
- OS in distributed systems
  - Distributed OS vs. network OS vs. middleware
- Design objectives of distributed systems
  - Transparency, openness, and scalability
- Architecture of distributed systems
  - Software vs. system architectures