Distributed Systems. Examples. Advantages and disadvantages. CIS 505: Software Systems. Introduction to Distributed Systems



Similar documents
Principles and characteristics of distributed systems and environments

Distributed Operating Systems

Distributed Systems LEEC (2005/06 2º Sem.)

How To Understand The Concept Of A Distributed System

Distribution transparency. Degree of transparency. Openness of distributed systems

Distributed Systems. REK s adaptation of Prof. Claypool s adaptation of Tanenbaum s Distributed Systems Chapter 1

Virtual machine interface. Operating system. Physical machine interface

CS550. Distributed Operating Systems (Advanced Operating Systems) Instructor: Xian-He Sun

2.1 What are distributed systems? What are systems? Different kind of systems How to distribute systems? 2.2 Communication concepts

Contents. Chapter 1. Introduction

How To Make A Distributed System Transparent

Distributed System: Definition

Chapter 18: Database System Architectures. Centralized Systems

Chapter 1: Distributed Systems: What is a distributed system? Fall 2008 Jussi Kangasharju

Client/Server and Distributed Computing

Agenda. Distributed System Structures. Why Distributed Systems? Motivation

Distributed Databases

Centralized Systems. A Centralized Computer System. Chapter 18: Database System Architectures

Software Concepts. Uniprocessor Operating Systems. System software structures. CIS 505: Software Systems Architectures of Distributed Systems

Making Multicore Work and Measuring its Benefits. Markus Levy, president EEMBC and Multicore Association

CORBA and object oriented middleware. Introduction

LinuxWorld Conference & Expo Server Farms and XML Web Services

Chapter 1: Introduction. What is an Operating System?

A Comparison of Distributed Systems: ChorusOS and Amoeba

Sensor Devices and Sensor Network Applications for the Smart Grid/Smart Cities. Dr. William Kao

Chapter 7: Distributed Systems: Warehouse-Scale Computing. Fall 2011 Jussi Kangasharju

Chapter 17: Distributed Systems

Operating Systems 4 th Class

CS 5523 Operating Systems: Intro to Distributed Systems

Network Architecture and Topology

COMP 422, Lecture 3: Physical Organization & Communication Costs in Parallel Machines (Sections 2.4 & 2.5 of textbook)

DISTRIBUTED SYSTEMS AND CLOUD COMPUTING. A Comparative Study

Network Attached Storage. Jinfeng Yang Oct/19/2015

3 - Introduction to Operating Systems

Basics. Topics to be covered. Definition of a Distributed System. Definition of a Distributed System

Mobile and Heterogeneous databases Database System Architecture. A.R. Hurson Computer Science Missouri Science & Technology

Virtual Routing: What s The Goal? And What s Beyond? Peter Christy, NetsEdge Research Group, August 2001

Distributed RAID Architectures for Cluster I/O Computing. Kai Hwang

SCALABILITY AND AVAILABILITY

independent systems in constant communication what they are, why we care, how they work

Introduction to Cloud Computing

How To Understand The Power Of The Internet Of Things

The Sierra Clustered Database Engine, the technology at the heart of

Content Delivery Networks. Shaxun Chen April 21, 2009

Write a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical

Quality of Service in Industrial Ethernet Networks

CHAPTER 2 MODELLING FOR DISTRIBUTED NETWORK SYSTEMS: THE CLIENT- SERVER MODEL

SAN Conceptual and Design Basics

POWER ALL GLOBAL FILE SYSTEM (PGFS)

Microkernels, virtualization, exokernels. Tutorial 1 CSC469

Cisco Wide Area Application Services Software Version 4.1: Consolidate File and Print Servers

The Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets

Distributed Data Management

Big data management with IBM General Parallel File System

Building a Highly Available and Scalable Web Farm

Star System Deitel & Associates, Inc. All rights reserved.

Proactive, Resource-Aware, Tunable Real-time Fault-tolerant Middleware

Scalability of web applications. CSCI 470: Web Science Keith Vertanen

List of courses MEngg (Computer Systems)

Design and Implementation of the Heterogeneous Multikernel Operating System

Interconnection Networks

CHAPTER 1: OPERATING SYSTEM FUNDAMENTALS

High Performance Computing. Course Notes HPC Fundamentals

This document describes how the Meraki Cloud Controller system enables the construction of large-scale, cost-effective wireless networks.

Distributed System Principles

Symmetric Multiprocessing

Exploiting peer group concept for adaptive and highly available services

SPI I2C LIN Ethernet. u Today: Wired embedded networks. u Next lecture: CAN bus u Then: wireless embedded network

Last time. Data Center as a Computer. Today. Data Center Construction (and management)

1 Organization of Operating Systems

Solving I/O Bottlenecks to Enable Superior Cloud Efficiency

Agenda. Enterprise Application Performance Factors. Current form of Enterprise Applications. Factors to Application Performance.

Cisco Application Networking for IBM WebSphere

Region 10 Videoconference Network (R10VN)

Client/Server Computing Distributed Processing, Client/Server, and Clusters

A Brief Analysis on Architecture and Reliability of Cloud Based Data Storage

Protect Data... in the Cloud

1.4 SOFTWARE CONCEPTS

System Models for Distributed and Cloud Computing

Purpose-Built Load Balancing The Advantages of Coyote Point Equalizer over Software-based Solutions

Integrated Application and Data Protection. NEC ExpressCluster White Paper

Load Balancing and Maintaining the Qos on Cloud Partitioning For the Public Cloud

EMC VPLEX FAMILY. Continuous Availability and data Mobility Within and Across Data Centers

Overview and History of Operating Systems

Distributed Systems. Security concepts; Cryptographic algorithms; Digital signatures; Authentication; Secure Sockets

Ground up Introduction to In-Memory Data (Grids)

<Insert Picture Here> Oracle In-Memory Database Cache Overview

How To Virtualize A Storage Area Network (San) With Virtualization

Data Management in Sensor Networks

RAMCloud and the Low- Latency Datacenter. John Ousterhout Stanford University

On Cloud Computing Technology in the Construction of Digital Campus

M.Sc. IT Semester III VIRTUALIZATION QUESTION BANK Unit 1 1. What is virtualization? Explain the five stage virtualization process. 2.

Core Syllabus. Version 2.6 C OPERATE KNOWLEDGE AREA: OPERATION AND SUPPORT OF INFORMATION SYSTEMS. June 2006

CISCO WIDE AREA APPLICATION SERVICES (WAAS) OPTIMIZATIONS FOR EMC AVAMAR

Lecture 2 Parallel Programming Platforms

Chapter 14: Distributed Operating Systems

Scala Storage Scale-Out Clustered Storage White Paper

CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level. -ORACLE TIMESTEN 11gR1

How To Speed Up A Flash Flash Storage System With The Hyperq Memory Router

Transcription:

CIS 505: Software Systems Introduction to Distributed Systems Insup Lee Department of Computer and Information Science University of Pennsylvania Distributed Systems Why distributed systems? o availability of powerful yet cheap microprocessors (PCs, workstations, PDAs, embedded systems, etc.) o continuing advances in communication technology What is a distributed system? o A distributed system is a collection of independent computers that appear to the users of the system as a single coherent system. CIS 505, Spring 2007 1 CIS 505, Spring 2007 Distributed Systems 2 Examples The world wide web information, resource sharing Clusters, Network of workstations Distributed manufacturing system (e.g., automated assembly line) Network of branch office computers - Information system to handle automatic processing of orders Network of embedded systems New Cell processor (PlayStation 3) Advantages and disadvantages Advantages o Economics o Speed o Inherent distribution o Reliability o Incremental growth Disadvantages o Software o Network o More components to fail o Security CIS 505, Spring 2007 Distributed Systems 3 CIS 505, Spring 2007 Distributed Systems 4 1

Organization of a Distributed System Goals of Distributed Systems 1.1 Transparency Openness Reliability Performance Scalability A distributed system organized as middleware. Note that the middleware layer extends over multiple machines. CIS 505, Spring 2007 Distributed Systems 5 CIS 505, Spring 2007 Distributed Systems 6 Transparency How to achieve the single-system image, i.e., how to make a collection of computers appear as a single computer. Hiding all the distribution from the users as well as the application programs can be achieved at two levels: 1) hide the distribution from users 2) at a lower level, make the system look transparent to programs. 1) and 2) requires uniform interfaces such as access to files, communication. Transparency in a Distributed System Transparency Access Location Migration Relocation Replication Concurrency Failure Persistence Description Hide differences in data representation and how a resource is accessed Hide where a resource is located Hide that a resource may move to another location Hide that a resource may be moved to another location while in use Hide that a resource may be shared by several competitive users Hide that a resource may be shared by several competitive users Hide the failure and recovery of a resource Hide whether a (software) resource is in memory or on disk Different forms of transparency in a distributed system. CIS 505, Spring 2007 Distributed Systems 7 CIS 505, Spring 2007 Distributed Systems 8 2

Openness Make it easier to build and change Monolithic Kernel: systems calls are trapped and executed by the kernel. All system calls are served by the kernel, e.g., UNIX. Microkernel: provides minimal services. IPC some memory management some low-level process management and scheduling low-level i/o (E.g., Mach can support multiple file systems, multiple system interfaces.) Standard interface, separation of policy from mechanism Reliability Distributed system should be more reliable than single system. Availability: fraction of time the system is usable. Redundancy improves it. Need to maintain consistency Need to be secure Fault tolerance: need to mask failures, recover from errors. Example: 3 machines with.95 probability of being up (1-.05)**3 vs 1-.05**3 probability of being up CIS 505, Spring 2007 Distributed Systems 9 CIS 505, Spring 2007 Distributed Systems 10 Performance Scalability Without gain on this, why bother with distributed systems. Performance loss due to communication delays: fine-grain parallelism: high degree of interaction coarse-grain parallelism Performance loss due to making the system fault tolerant. Systems grow with time or become obsolete. Techniques that require resources linearly in terms of the size of the system are not scalable. (e.g., broadcast based query won't work for large distributed systems.) Examples of bottlenecks (i.e., scalability limitations) o Centralized components: a single mail server o Centralized tables/data: a single URL address book o Centralized algorithms: routing based on complete information CIS 505, Spring 2007 Distributed Systems 11 CIS 505, Spring 2007 Distributed Systems 12 3

Scalability Problems Characteristics of decentralized algorithms: No machine has complete information about the system state. Machines make decisions based only on local information. Failure of one machine does not ruin the algorithm. There is no implicit assumption that a global clock exists. CIS 505, Spring 2007 Distributed Systems 13 Scaling Techniques (1) 1.4 The difference between letting: a) a server or b) a client check forms as they are being filled CIS 505, Spring 2007 Distributed Systems 14 Scaling Techniques (2) 1.5 An example of dividing the DNS name space into zones. Pitfalls when Developing Distributed Systems False assumptions made by first time developer: The network is reliable. The network is secure. The network is homogeneous. The topology does not change. Latency is zero. Bandwidth is infinite. Transport cost is zero. There is one administrator. CIS 505, Spring 2007 Distributed Systems 15 CIS 505, Spring 2007 Distributed Systems 16 4

Hardware Concepts Multiprocessors (1) 1.6 1.7 Different basic organizations and memories in distributed computer systems: multiprocessors vs. multicomputers CIS 505, Spring 2007 Distributed Systems 17 A bus-based multiprocessor o Cache memory, hit rate, coherence, write-through cache, snoopy cache CIS 505, Spring 2007 Distributed Systems 18 Multiprocessors (2) a) A crossbar switch b) An omega switching network Homogeneous Multicomputer Systems a) Grid b) Hypercube 1.8 1-9 Tightly coupled vs. loosely coupled CIS 505, Spring 2007 Distributed Systems 19 CIS 505, Spring 2007 Distributed Systems 20 5

How slow is the network? Communication Latency ping www.cis.upenn.edu Round-trip times o Upenn.5ms o Princeton 5ms o Rice 43ms o Stanford 80ms o Tsinghua, Beijing 280ms Latency wire delay o Time to send and recv one byte of data o Depends on distance Bandwidth o Bytes/second o Depends on size of vehicle Latency is the bottleneck o It improves slower than bandwidth Speed of light Routers in the middle (traffic stops) o Request-respond cycles dominate application CIS 505, Spring 2007 Distributed Systems 21 CIS 505, Spring 2007 Distributed Systems 22 The speed pyramid Continuum of Distributed Systems register 1 L2 10 Memory 200 LAN 100,000 Disk 2,000,000 WAN 20,000,000 Will the ratios change? CIS 505, Spring 2007 Distributed Systems 23 small fast Parallel Architectures?? Multiprocessors low latency high bandwidth secure, reliable interconnect no independent failures coordinated resources Issues: naming and sharing performance and scale resource management clusters fast network trusting hosts coordinated Networks big slow Global LAN Internet slow network untrusting hosts high latency autonomy low bandwidth autonomous nodes unreliable network fear and distrust independent failures decentralized administration [J. Chase] CIS 505, Spring 2007 Distributed Systems 24 6

Types of Distributed Systems Cluster Computing Systems Distributed Computing Systems Distributed information systems Distributed Pervasive/Embedded Systems Figure 1-6. An example of a cluster computing system. CIS 505, Spring 2007 Distributed Systems 25 CIS 505, Spring 2007 Distributed Systems 26 Grid Computing Systems Transaction Processing Systems (1) Figure 1-7. A layered architecture for grid computing systems. Figure 1-8. Example primitives for transactions. CIS 505, Spring 2007 Distributed Systems 27 CIS 505, Spring 2007 Distributed Systems 28 7

Transaction Processing Systems (2) Transaction Processing Systems (3) Characteristic properties of transactions: Atomic: To the outside world, the transaction happens indivisibly. Consistent: The transaction does not violate system invariants. Isolated: Concurrent transactions do not interfere with each other. Durable: Once a transaction commits, the changes are permanent. CIS 505, Spring 2007 Distributed Systems 29 Figure 1-9. A nested transaction. CIS 505, Spring 2007 Distributed Systems 30 Enterprise Application Integration Distributed Pervasive Systems Requirements for pervasive systems Embrace contextual changes. Encourage ad hoc composition. Recognize sharing as the default. Figure 1-11. Middleware as a communication facilitator in enterprise application integration. CIS 505, Spring 2007 Distributed Systems 31 CIS 505, Spring 2007 Distributed Systems 32 8

Embedded Home Environment Example: Home and Personal Appliances Volume / Diversity Home care facilities Intelligent devices, tools, appliances and software for assisted living CIS 505, Spring 2007 Distributed Systems 33 2005 Smart homes, home theaters, games, smart cars, etc. Yr ~2015 ~2025 [Liu] CIS 505, Spring 2007 Distributed Systems 34 Justifications Rapid advances in component technologies, e.g., o Smart gadgets, wearable sensors and actuators, robotic helpers, mobile devices o Wireless, wideband interconnects Increasing critical needs due to o Aging baby-boom generation o Long life expectancy o New safety, security, and privacy concerns Observations Number of users: 10 1000 million Types of sensors and actuators: 100 s Number of suppliers: 10 100 s Required reliability: <10,000 recalls/year User tolerance to glitches: minimum Product life cycles: 3 20 yrs Tolerable upgrade effort: minimum The environment must be open and evolvable, & capable of self diagnosis, healing, maintenance CIS 505, Spring 2007 Distributed Systems 35 CIS 505, Spring 2007 Distributed Systems 36 9

Desired Trends Quality & Usability Volume & Diversity Unit cost Maintenance cost 2005 ~2015 ~2025 CIS 505, Spring Distributed 2007 Systems 37 Electronic Health Care Systems (1) Questions to be addressed for health care systems: Where and how should monitored data be stored? How can we prevent loss of crucial data? What infrastructure is needed to generate and propagate alerts? How can physicians provide online feedback? How can extreme robustness of the monitoring system be realized? What are the security issues and how can the proper policies be enforced? CIS 505, Spring 2007 Distributed Systems 38 Electronic Health Care Systems (2) Background: Sensor Networks Types of sensors o Seismic, low sampling rate magnetic, thermal, visual, infrared, acoustic and radar Conditions to monitor o Temperature, humidity, (vehicular) movement, lightning condition, pressure, soil makeup, noise levels o Presence or absence of certain kinds of objects o Mechanical stress levels on attached objects o Current characteristics such as speed, direction, and size of an object Figure 1-12. Monitoring a person in a pervasive electronic health care system, using (a) a local hub or (b) a continuous wireless connection. CIS 505, Spring 2007 Distributed Systems 39 CIS 505, Spring 2007 Distributed Systems 40 10

Technology Trends: Mote Evolution SN Characteristics Environment o connect to physical environment (large numbers, dense, real-time) o Sensor nodes are prone to failures, non-deterministic o wireless communication o massively parallel interfaces (to users and applications) o Limited resources: battery, bandwidth, memory, CPU (power management critical) Network o Topology changes dynamically o sporadic connectivity o new resources entering/leaving o large amounts of redundancy o self-configure/re-configure CIS 505, Spring 2007 Distributed Systems 41 CIS 505, Spring 2007 Distributed Systems 42 SN applications Smart Spaces Infrastructure security, military applications Environmental and Habitat monitoring Health applications Smart space/home applications Other commercial applications o Industrial Sensing o Traffic Control, vehicle tracking and detection o Interactive museums CIS 505, Spring 2007 Distributed Systems 43 Smart School Smart Factory Smart City Other Applications Battlefields/Surveillance Earthquake areas Environmental Monitoring Airport security Emergency Response Location Services CIS 505, Spring 2007 Distributed Systems 44 11

Sensor Networks (1) Sensor Networks (2) Questions concerning sensor networks: How do we (dynamically) set up an efficient tree in a sensor network? How does aggregation of results take place? Can it be controlled? What happens when network links fail? Figure 1-13. Organizing a sensor network database, while storing and processing data (a) only at the operator s site or CIS 505, Spring 2007 Distributed Systems 45 CIS 505, Spring 2007 Distributed Systems 46 Sensor Networks (3) The Challenges of Distributed Systems Figure 1-13. Organizing a sensor network database, while storing and processing data or (b) only at the sensors. CIS 505, Spring 2007 Distributed Systems 47 o Secure communication over public networks ACI: who sent it, did anyone see it, did anyone change it o Fault-tolerance Building reliable systems from unreliable components nodes fail independently; a distributed system can partly fail [Lamport]: A distributed system is one in which the failure of a machine I ve never heard of can prevent me from doing my work. o Replication, caching, naming Placing data and computation for effective resource sharing, hiding latency, and finding it again once you put it somewhere. o Coordination and shared state What should the system components do and when should they do it? Once they ve all done it, can they all agree on what they did and when? CIS 505, Spring 2007 Distributed Systems 48 12