Models of Distributed Computing


COMP 150-IDS: Internet Scale Distributed Systems (Spring 2015)
Models of Distributed Computing
Noah Mendelsohn, Tufts University
Email: noah@cs.tufts.edu
Web: http://www.cs.tufts.edu/~noah

Architecting a universal Web
  Identification: URIs
  Interaction: HTTP
  Data formats: HTML, JPEG, GIF, etc.

Goals
  Introduce basics of distributed system design
  Explore some traditional models of distributed computing
  Prepare for discussion of REST: the Web's model

Communicating systems
[Figure: two machines, each with CPU, Memory, Storage]
  We have multiple programs, running asynchronously, sending messages
  We've got pretty clean higher-level abstractions for use on a single machine
  How can we get a clean model of two communicating machines?
  Reference: Communicating Sequential Processes, http://www.usingcsp.com/cspbook.pdf (very theoretical)

Large scale systems
[Figure: machines connected through the Internet]
  How can we get a clean model of a worldwide network of communicating machines?
  What are the clean abstractions on this scale?

WARNING!!
  This is a very big topic:
    many important approaches have been studied and used
    there is lots of operational experience, and also formalisms
  This presentation does not attempt to be either comprehensive or balanced:
    the goal is to introduce some key concepts

Traditional Models of Distributed Computing - Message Passing

Message passing
[Figure: two machines, each with CPU, Memory, Storage]
  Programs send messages to and from each other's memories
  Half duplex: one way at a time
  Full duplex: both ways at the same time

Message passing
  Data abstraction:
    Low level: bytes (octets)
    Sometimes: agreed metaformat (XML, C struct, etc.)
  Synchronization:
    Wait for message
    Timeout
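The two ideas on this slide, a byte-level data abstraction and synchronization by waiting with a timeout, can be sketched in a few lines of Python. This is only an illustration: the 4-byte length prefix is an assumed framing convention, the helper names (`send_msg`, `recv_exact`, `recv_msg`) are hypothetical, and `socket.socketpair()` stands in for a real network link between two machines.

```python
import socket
import struct


def send_msg(sock, payload: bytes) -> None:
    # Assumed framing: a 4-byte big-endian length, then the payload bytes.
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n: int) -> bytes:
    # A stream socket may deliver fewer bytes than requested, so loop.
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def recv_msg(sock, timeout: float) -> bytes:
    # Synchronization: block until a whole message arrives or the timeout expires.
    sock.settimeout(timeout)
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


# Two connected endpoints standing in for two communicating programs.
a, b = socket.socketpair()
send_msg(a, b"hello")
received = recv_msg(b, timeout=1.0)

# Nothing else was sent, so the next wait ends in a timeout.
try:
    recv_msg(b, timeout=0.1)
    timed_out = False
except socket.timeout:
    timed_out = True

a.close()
b.close()
```

The receiver has to agree with the sender on the framing; that agreement is the "metaformat" the slide mentions, in its simplest form.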

Interaction Patterns

Between pairs of machines
[Figure: one machine sends a Request, the other returns a Response]
  Message passing: no constraints
  Common pattern: request/response

Traditional Models of Distributed Computing - Client Server

Client / server
[Figure: client sends "Request service" to the server; the server returns a Response]
  Request / response is a traffic pattern
  Client / server describes the roles of the nodes
  Server provides service for client

Client / server
  Probably the most common distributed system architecture
  Simple, well understood
  Doesn't explain:
    How to exploit more than 2 machines
    How to make programming easier
    How to prove correctness (though the simple model helps)
  Most client/server systems are request/response
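A minimal sketch of the client and server roles using request/response traffic follows. Everything specific here is an assumption for illustration: the line-oriented text protocol, the `PRICES` table, and the `server` and `client` function names. A thread and `socket.socketpair()` stand in for a second machine and a network connection.

```python
import socket
import threading

# Hypothetical data owned by the server.
PRICES = {"BOS-SFO": 320, "BOS-LHR": 540}


def server(conn: socket.socket) -> None:
    # Server role: wait for a request, answer it, repeat until the client leaves.
    with conn, conn.makefile("r", encoding="utf-8", newline="\n") as rfile, \
            conn.makefile("w", encoding="utf-8", newline="\n") as wfile:
        for line in rfile:
            route = line.strip()
            wfile.write(f"{PRICES.get(route, 'unknown')}\n")
            wfile.flush()


def client(conn: socket.socket, routes) -> list:
    # Client role: send one request, then block until its response arrives.
    replies = []
    with conn, conn.makefile("r", encoding="utf-8", newline="\n") as rfile, \
            conn.makefile("w", encoding="utf-8", newline="\n") as wfile:
        for route in routes:
            wfile.write(route + "\n")
            wfile.flush()
            replies.append(rfile.readline().strip())
    return replies


client_end, server_end = socket.socketpair()
worker = threading.Thread(target=server, args=(server_end,))
worker.start()
replies = client(client_end, ["BOS-SFO", "BOS-XXX"])
worker.join()
```

The client does nothing while it waits for each reply, which is the lack of client/server overlap that becomes a drawback for long-running operations.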

Traditional Models of Distributed Computing - N-Tier

N-tier, also called Multilevel Client/Server
[Figure: three machines in a chain; Requests pass from tier to tier and Responses return]
  Layered
  Each tier provides services for the next higher level
  Reasons:
    Information hiding
    Management
    Scalability

Typical N-tier system: airline reservation
  Browser or Phone App: iPhone or Android reservation application
  Application logic: flight reservation logic
  Reservation records
  Many commercial applications work this way

The Web itself is a 2 or 3 Tier system
  Browser (e.g. Firefox)
  Proxy cache, optional! (e.g. Squid)
  Web server (e.g. Apache)

Web Reservation System
  Browser or Phone App
  Proxy cache, optional! (e.g. Squid)
  Web-based reservation application
  Flight reservation logic (application logic)
  Reservation records
  Links between tiers: HTTP, HTTP, then RPC? ODBC? Proprietary?

Web Publishing System
  Browser or Phone App
  Content distribution network (e.g. Akamai)
  Web site (e.g. cnn.com)
  Content management system
  Content (database or CMS)

Advantages of n-tier system
  Separation of concerns: each layer has its own role
  Parallelism and performance?
    If done right: multiple mid-tier servers work in parallel
    Back-end systems centralize mainly data requiring sharing & synchronization
    Mid tier can provide shared, scalable caching
  Information hiding
    Mid-tier apps shielded from data layout
  Security
    Credit card numbers etc. not stored at mid-tier
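The layering and mid-tier caching points can be illustrated with three small classes, one per tier. This is a single-process sketch under stated assumptions: the class names, the flight numbers, and the seat counts are all hypothetical, and a real system would put each tier on its own machine and would need a policy for expiring cached data.

```python
class DataTier:
    """Back end: the only tier that knows how records are stored."""

    def __init__(self):
        self._seats = {"TU100": 3, "TU200": 0}  # hypothetical records
        self.queries = 0

    def seats_left(self, flight: str) -> int:
        self.queries += 1
        return self._seats[flight]


class LogicTier:
    """Mid tier: application logic plus a cache in front of the data tier."""

    def __init__(self, data: DataTier):
        self._data = data
        self._cache = {}

    def can_book(self, flight: str) -> bool:
        # Only a cache miss reaches the back end.
        if flight not in self._cache:
            self._cache[flight] = self._data.seats_left(flight)
        return self._cache[flight] > 0


class PresentationTier:
    """Front end: talks only to the mid tier, never to the data tier."""

    def __init__(self, logic: LogicTier):
        self._logic = logic

    def render(self, flight: str) -> str:
        status = "available" if self._logic.can_book(flight) else "sold out"
        return f"{flight}: {status}"


data = DataTier()
ui = PresentationTier(LogicTier(data))
first = ui.render("TU100")
second = ui.render("TU100")   # served from the mid-tier cache
third = ui.render("TU200")
```

The presentation tier never sees how seats are stored (information hiding), and the repeated request for the same flight costs the back end only one query.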

Other patterns
  Spanning tree
  Broadcast (send to many nodes at once)
  Flood
  Various P2P
  Etc.

Traditional Models of Distributed Computing - Remote Procedure Call

Remote Procedure Call
  The term RPC was coined by the late Bruce Nelson in his 1981 CMU PhD thesis
  Key idea: an ordinary function call executes remotely
  The trick: the language runtime or helper code must automatically generate code to send parameters and results
  For languages like C: proxies and stubs are generated
    Not needed in dynamic languages like Ruby, JavaScript, etc.
  RPC is often (erroneously IMO) used to describe any request / response system

RPC: Call remote functions automatically
  Interface definition:
    float sqrt(float n);
  Client: the caller writes x = sqrt(4), which runs the generated proxy:
    float sqrt(float n) { send n; read s; return s; }
  Request: invoke sqrt(4)
  Server: the generated stub receives the message and calls the real function:
    void domsg(msg m) { s = sqrt(m.s); send s; }
    float sqrt(float n) { compute sqrt; return result; }
  Response: result=2 (no exception thrown)
  Proxies and stubs generated automatically
  RPC provides transparent remote invocation
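The proxy and stub roles from the slide can be sketched in Python, where, as noted for dynamic languages, no code generation is needed: the proxy intercepts attribute access at run time. The JSON message layout, the `PROCEDURES` table, and the names `stub` and `Proxy` are assumptions made for this sketch, and the "transport" is a direct function call standing in for a network round trip.

```python
import json
import math


def sqrt(n: float) -> float:
    # The real implementation, which lives on the server.
    return math.sqrt(n)


PROCEDURES = {"sqrt": sqrt}  # procedures the server agrees to expose


def stub(request_bytes: bytes) -> bytes:
    # Server-side stub: unmarshal the request, call the function, marshal the reply.
    request = json.loads(request_bytes)
    try:
        reply = {"result": PROCEDURES[request["name"]](*request["args"])}
    except Exception as exc:
        # Exceptions must be carried back in the reply as data.
        reply = {"error": str(exc)}
    return json.dumps(reply).encode("utf-8")


class Proxy:
    # Client-side proxy: makes a remote procedure look like a local method.
    def __init__(self, transport):
        self._transport = transport

    def __getattr__(self, name):
        def remote_call(*args):
            request = json.dumps({"name": name, "args": args}).encode("utf-8")
            reply = json.loads(self._transport(request))
            if "error" in reply:
                raise RuntimeError(reply["error"])
            return reply["result"]
        return remote_call


remote = Proxy(transport=stub)
x = remote.sqrt(4)   # reads like a local call, executes "remotely"
```

Note that a server-side failure reaches the caller as a generic `RuntimeError` rather than the original exception type, a small instance of how hard it is to get language details such as exceptions right.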

RPC: Pros and Cons
  Pros:
    Transparency is very appealing
    Simple programming model
    Useful as organizing principle even when not fully automated
  Cons:
    Getting language details right is tricky (e.g. exceptions)
    No client/server overlap: doesn't work well for long-running operations
    May not optimize large transfers well
    Not all APIs make sense to remote: e.g. answer = search(tree)
    Versioning can be a problem: client and server need to agree exactly on interface (or have rules for dealing with differences)

Traditional Models of Distributed Computing - Distributed Object Systems

How do you build an RPC for this?

  class Point {
    int x, y;
    int getx() { return x; }
    int gety() { return y; }
  }

  class Rectangle {
    // members and constructors not shown
    Point getupperleft() { }
    Point getlowerright() { }
  }

  int area(Rectangle r) {
    int width = r.getlowerright().getx() - r.getupperleft().getx();
    int height = r.getlowerright().gety() - r.getupperleft().gety();
    return width * height;
  }

  Rectangle myrect = new Rectangle();  // assume position set here
  int a = area(myrect);                // REMOTE THIS CALL!

  Remoting area(myrect) passes an object to a remote method
  Inside area, each r.get...() is a method call on a remoted object
  Distributed object systems make this work!

Distributed object systems
  In the 1990s, seemed like a great idea
  Advantages of OO encapsulation & inheritance + RPC
  Examples:
    CORBA (industry standard)
    DCOM (Microsoft)
  Still quite widely used within enterprises
  Complicated:
    Marshalling object references
    Distributed object lifetime management
    Brokering: which object provides the service today
    Remote "new": creating objects on remote systems
    All the pros & cons of RPC, plus the above
  Generally not appropriate at Internet scale
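To make "marshalling object references" and "distributed object lifetime management" concrete, here is a toy, single-process sketch. It is not how CORBA or DCOM work internally; the `ObjectServer` table, the integer handles, and the `RemoteRef` class are all assumptions invented for illustration. The idea shown is that an object never crosses the wire: the caller receives a handle, and every method call on the handle goes back to the machine that owns the object.

```python
import itertools


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def get_x(self):
        return self.x

    def get_y(self):
        return self.y


class Rectangle:
    def __init__(self, upper_left, lower_right):
        self.upper_left, self.lower_right = upper_left, lower_right

    def get_upper_left(self):
        return self.upper_left

    def get_lower_right(self):
        return self.lower_right


class ObjectServer:
    # Owns the real objects; hands out integer handles instead of the objects.
    def __init__(self):
        self._objects = {}
        self._ids = itertools.count(1)

    def export(self, obj) -> int:
        ref = next(self._ids)
        self._objects[ref] = obj
        return ref

    def invoke(self, ref, method, *args):
        result = getattr(self._objects[ref], method)(*args)
        if isinstance(result, (int, float, str)):
            return ("value", result)           # plain data is copied
        return ("ref", self.export(result))    # objects are passed by reference

    def live_count(self) -> int:
        return len(self._objects)


class RemoteRef:
    # Client-side stand-in for an object that lives on the server.
    def __init__(self, server, ref):
        self._server, self._ref = server, ref

    def __getattr__(self, method):
        def call(*args):
            kind, payload = self._server.invoke(self._ref, method, *args)
            return payload if kind == "value" else RemoteRef(self._server, payload)
        return call


def area(r):
    width = r.get_lower_right().get_x() - r.get_upper_left().get_x()
    height = r.get_lower_right().get_y() - r.get_upper_left().get_y()
    return width * height


server = ObjectServer()
rect = RemoteRef(server, server.export(Rectangle(Point(0, 0), Point(4, 3))))
a = area(rect)
```

After one call to `area`, the server is tracking five handles (the rectangle plus four exported points) and nothing ever releases them. Deciding when such handles can be reclaimed is the lifetime management problem listed on the slide.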

Traditional Models of Distributed Computing - Some Other Options

Special Purpose Models
  Remote File System
    Network provides transparent access to remote files
    Examples: NFS, CIFS
  Remote Database
    Examples: ODBC, JDBC
  Remote Device
    Remote printing, disk drive, etc.
  Virtual terminal
    One computer simulates an interactive terminal to another

Some other interesting models
  Broadcast / multicast
    Send messages to everyone (broadcast) / named group (multicast)
  Publish / subscribe (pub/sub)
    Subscribe to named events or based on query filter
    "Call me whenever Pepsi's stock price changes"
    Implements a distributed associative memory
  Reliable queuing
    Examples: IBM MQSeries, Java Message Service (JMS)
    Model: queued messages, preserved across hardware crashes
    Widely used for bank machine transactions and long-running (multi-day) ecommerce transactions
    Depends on disk-based transaction systems at each node to keep queues
  Tuple spaces
    Pioneered by Gelernter at Yale (Linda kernel), picked up by Jini (Sun) and TSpaces (IBM)
    Network-scale shared variable space, with synchronization
    Good for queues of work to do: some cloud architectures use a related model to distribute work to servers
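Of the models above, publish/subscribe is the easiest to sketch briefly. The in-process `Broker` class and the topic names below are hypothetical, chosen to mirror the stock price example; a real pub/sub system would deliver events across the network and might filter on message content rather than on a topic name.

```python
from collections import defaultdict


class Broker:
    # Routes each published event to every callback subscribed to its topic.
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, callback) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, event) -> int:
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(event)
        return len(callbacks)   # number of subscribers that were notified


broker = Broker()
pepsi_prices = []

# "Call me whenever Pepsi's stock price changes"
broker.subscribe("stock/PEP", pepsi_prices.append)

delivered = broker.publish("stock/PEP", 101.5)
ignored = broker.publish("stock/KO", 60.0)   # nobody subscribed to this topic
```

The publisher never names its receivers; it only names the topic, which is what decouples the two sides.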

Stateful and Stateless Protocols

Stateful and Stateless Protocols
  Stateful: server knows which step (state) has been reached
  Stateless:
    Client remembers the state, sends to server each time
    Server processes each request independently
  Can vary with level
    Many systems like the Web run stateless protocols (e.g. HTTP) over streams; at the packet level, TCP streams are stateful
    HTTP itself is mostly stateless, but many HTTP requests (typically POSTs) update persistent state at the server

Advantages of stateless protocols
  Protocol usually simpler
  Server processes each request independently
  Load balancing and restart easier
  Typically easier to scale and make fault-tolerant
  Visibility: individual requests more self-describing

Advantages of stateful protocols
  Individual messages carry less data
  Server does not have to re-establish context each time
  There's usually some changing state at the server at some level, except for completely static publishing systems
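The tradeoff between the two styles shows up even in a tiny paging example. The sketch below is illustrative only: `ITEMS`, `PAGE`, the session id, and both function names are made up. In the stateful version the server remembers each session's position; in the stateless version the client sends its position with every request.

```python
ITEMS = ["a", "b", "c", "d", "e"]   # hypothetical result set
PAGE = 2


class StatefulServer:
    # Remembers, per session, which step has been reached.
    def __init__(self):
        self._cursor = {}   # session id -> next offset

    def next_page(self, session_id: str) -> list:
        start = self._cursor.get(session_id, 0)
        self._cursor[session_id] = start + PAGE
        return ITEMS[start:start + PAGE]


def stateless_page(offset: int):
    # The request carries all needed state; the reply tells the client what to send next.
    return ITEMS[offset:offset + PAGE], offset + PAGE


# Stateful: small requests, but the context lives in one server's memory.
server = StatefulServer()
s1 = server.next_page("session-1")
s2 = server.next_page("session-1")
restarted = StatefulServer()               # simulate a crash and restart
s3 = restarted.next_page("session-1")      # the position was lost

# Stateless: any replica, or a restarted server, can answer any request.
p1, next_offset = stateless_page(0)
p2, _ = stateless_page(next_offset)
```

The stateful requests carry only a session id, while the stateless ones carry the offset each time; in exchange, the stateless version survives the restart that resets the stateful session to the first page.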

Text vs. Binary Protocols

Protocols can be text or binary on the wire
  Text: messages are encoded characters
  Binary: any bit patterns
  Pros and cons quite similar to those for text vs. binary file formats
  When sending between compatible machines, binary can be much faster because no conversion needed
  Most Internet-scale application protocols (HTTP, SMTP) use text for protocol elements and for all content except photo/audio/video
  HTTP 2.0 moving to binary (for msg size and parsing speed)
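As a small illustration of the two encodings, the sketch below sends the same record, an integer id and a floating-point reading, once as text and once as binary. The record layout, the comma-separated text format, and the sample values are assumptions for this example; Python's standard `struct` module does the binary packing, and the `!` prefix selects network byte order so that machines with different native layouts still agree.

```python
import struct

sensor_id, reading = 1234567, 20.5   # hypothetical record

# Text: readable on the wire, but the receiver must parse characters into numbers.
text_msg = f"{sensor_id},{reading}\n".encode("ascii")
id_field, reading_field = text_msg.decode("ascii").strip().split(",")
text_decoded = (int(id_field), float(reading_field))

# Binary: fixed layout of a 4-byte unsigned int followed by an 8-byte double.
binary_msg = struct.pack("!Id", sensor_id, reading)
binary_decoded = struct.unpack("!Id", binary_msg)

text_size = len(text_msg)
binary_size = len(binary_msg)
```

Here the sizes are nearly equal; the larger difference is that the binary form needs no character parsing and always has the same length, while the text form can be read and debugged without tools.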

Summary

Summary
  The machine-level model is complex: multiple CPUs, memories
  A number of abstractions are widely used for limited-scale distribution
  RPC is among the most interesting and successful
  Statefulness / statelessness is a key design tradeoff
  We'll see next time why a new model was needed for the Web