Introduction. The Evolution of the Grid. Introduction (cont.) Grid Computing Fall 2004 Paul A. Farrell 9/2/2004



Similar documents
Classic Grid Architecture

A Taxonomy and Survey of Grid Resource Planning and Reservation Systems for Grid Enabled Analysis Environment

How does the Grid extend the Internet, and what is the future vision for this development?

Survey and Taxonomy of Grid Resource Management Systems

Data Management in an International Data Grid Project. Timur Chabuk 04/09/2007

Concepts and Architecture of the Grid. Summary of Grid 2, Chapter 4

A Survey Study on Monitoring Service for Grid

Grid Scheduling Dictionary of Terms and Keywords

THE CCLRC DATA PORTAL

An Evaluation of Economy-based Resource Trading and Scheduling on Computational Power Grids for Parameter Sweep Applications

Distributed Systems and Recent Innovations: Challenges and Benefits

Resource Management on Computational Grids

An approach to grid scheduling by using Condor-G Matchmaking mechanism

Introduction. MCSN N. Tonellotto Complements of Distributed Enabling Platforms

Principles and Foundations of Web Services: An Holistic View (Technologies, Business Drivers, Models, Architectures and Standards)

Service-Oriented Architecture and Software Engineering

A Web Services Data Analysis Grid *

The Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets

Concepts and Architecture of Grid Computing. Advanced Topics Spring 2008 Prof. Robert van Engelen

Web Service Based Data Management for Grid Applications

CHAPTER 2 MODELLING FOR DISTRIBUTED NETWORK SYSTEMS: THE CLIENT- SERVER MODEL

Basic Scheduling in Grid environment &Grid Scheduling Ontology

Grid Scheduling Architectures with Globus GridWay and Sun Grid Engine

Building Problem Solving Environments with Application Web Service Toolkits

System Models for Distributed and Cloud Computing

An Economy Driven Resource Management Architecture for Global Computational Power Grids

IBM Solutions Grid for Business Partners Helping IBM Business Partners to Grid-enable applications for the next phase of e-business on demand

FIPA agent based network distributed control system

Introduction to Service Oriented Architectures (SOA)

Evolution of an Inter University Data Grid Architecture in Pakistan

Monitoring Clusters and Grids

Resource Management and Scheduling. Mechanisms in Grid Computing

LinuxWorld Conference & Expo Server Farms and XML Web Services

Distributed systems. Distributed Systems Architectures

Global Grid Forum: Grid Computing Environments Community Practice (CP) Document

An Experience in Accessing Grid Computing Power from Mobile Device with GridLab Mobile Services

Emerging Technologies Shaping the Future of Data Warehouses & Business Intelligence

G-Monitor: Gridbus web portal for monitoring and steering application execution on global grids

Research on the Model of Enterprise Application Integration with Web Services

e-science Technologies in Synchrotron Radiation Beamline - Remote Access and Automation (A Case Study for High Throughput Protein Crystallography)

A Web Services Data Analysis Grid *

GridFTP: A Data Transfer Protocol for the Grid

Analyses on functional capabilities of BizTalk Server, Oracle BPEL Process Manger and WebSphere Process Server for applications in Grid middleware

Grid Sun Carlo Nardone. Technical Systems Ambassador GSO Client Solutions

Service Oriented Architectures

Client/server is a network architecture that divides functions into client and server

G-Monitor: A Web Portal for Monitoring and Steering Application Execution on Global Grids

Introduction to WebSphere Process Server and WebSphere Enterprise Service Bus

Service Oriented Distributed Manager for Grid System

What Is the Java TM 2 Platform, Enterprise Edition?

2 (18) - SOFTWARE ARCHITECTURE Service Oriented Architecture - Sven Arne Andreasson - Computer Science and Engineering.

E-Business Technologies for the Future

A High-Performance Virtual Storage System for Taiwan UniGrid

Motivation Definitions EAI Architectures Elements Integration Technologies. Part I. EAI: Foundations, Concepts, and Architectures

Building Platform as a Service for Scientific Applications

Deploying a distributed data storage system on the UK National Grid Service using federated SRB

Technical Guide to ULGrid

Globus Striped GridFTP Framework and Server. Raj Kettimuthu, ANL and U. Chicago

The Lattice Project: A Multi-Model Grid Computing System. Center for Bioinformatics and Computational Biology University of Maryland

Jini and JXTA Based Lightweight Grids

The glite File Transfer Service

ELIXIR LOAD BALANCER 2

Multi-Agent Support for Internet-Scale Grid Management

Web Service Robust GridFTP

Data Management System for grid and portal services

SOA Myth or Reality??

Service Oriented Architecture

GRIP:Creating Interoperability between Grids

3-Tier Architecture. 3-Tier Architecture. Prepared By. Channu Kambalyal. Page 1 of 19

MIGRATING DESKTOP AND ROAMING ACCESS. Migrating Desktop and Roaming Access Whitepaper

Software design (Cont.)

Distributed Systems Architectures

A Taxonomy and Survey of Grid Resource Management Systems

An IDL for Web Services

Enterprise Application Integration (Middleware)

Grid based Integration of Real-Time Value-at-Risk (VaR) Services. Abstract

SOFT 437. Software Performance Analysis. Ch 5:Web Applications and Other Distributed Systems

Relational Join Queries Over Grid-aware Architectures

Introduction to CORBA. 1. Introduction 2. Distributed Systems: Notions 3. Middleware 4. CORBA Architecture

Distributed File Systems

Web DNS Peer-to-peer systems (file sharing, CDNs, cycle sharing)

GRID COMPUTING: A NEW DIMENSION OF THE INTERNET

How To Understand The Concept Of A Distributed System

IT Architecture Review. ISACA Conference Fall 2003

GRID RESOURCE MANAGEMENT (GRM) BY ADOPTING SOFTWARE AGENTS (SA)

GridSolve: : A Seamless Bridge Between the Standard Programming Interfaces and Remote Resources

Contents. Client-server and multi-tier architectures. The Java 2 Enterprise Edition (J2EE) platform

Users-Grid: A Unique and Transparent Grid-Operating System

Collaborative & Integrated Network & Systems Management: Management Using Grid Technologies

Data Grids. Lidan Wang April 5, 2007

Building Grids with Jini and JavaSpaces

Web Services and Service Oriented Architectures. Thomas Soddemann, RZG

KNOWLEDGE GRID An Architecture for Distributed Knowledge Discovery

WebSphere Portal Server and Web Services Whitepaper

CSF4:A WSRF Compliant Meta-Scheduler

Grid Computing: A Ten Years Look Back. María S. Pérez Facultad de Informática Universidad Politécnica de Madrid mperez@fi.upm.es

TUTORIAL. Rebecca Breu, Bastian Demuth, André Giesler, Bastian Tweddell (FZ Jülich) {r.breu, b.demuth, a.giesler,

Oracle WebLogic Foundation of Oracle Fusion Middleware. Lawrence Manickam Toyork Systems Inc

Digital Library for Multimedia Content Management

Service Virtualization: Managing Change in a Service-Oriented Architecture

Transcription:

The Evolution of the Grid The situation has changed Introduction David De Roure, Mark A. Baker, Nicholas R. Jennings, Nigel R. Shadbolt Grid Computing Making the Global Infrastructure a Reality Fran Berman, Geoffrey Fox, and Tony Hey 2002 John Wiley & Sons, Ltd. Chapter 3 Based on slides by Kwangwon, Koh One s computing needs to be serviced by localized computing platforms and infrastructures Commodity computer, Network components, Other factors The capability for effective and efficient utilization of widely distributed resources to fulfill a range of needs The issues in designing, building and deploying distributed computer systems have now been explored over many years Metacomputing, Scalable computing, Global computing, Internet computing, and Grid computing Grid Computing 1 Grid Computing 2 Introduction (cont.) Grid problem (Foster, Kesselman, Tuecke, 2001) Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources Emphasizes the importance of information aspects to essential for resource discovery and interoperability XML, Resource Description Framework (RDF) The Evolution of the Grid: The First Generation The early Grid efforts started as projects to link supercomputing sites FAFNER and I-WAY had to overcome a number of similar hurdles, including communications, resource management, and the manipulation of remote data, to be able to work efficiently and effectively Grid Computing 3 Grid Computing 4 Kent State University 1

The First Generation: FAFNER 1995 FAFNER Factoring via Network-Enabled Recursion To factor RSA130 using a new numerical technique called the Number Field Sieve (NFS) factoring method using computational Web servers Contributors downloaded a sieving software daemon that used HTTP GET/POST to get values and post results CGI scripts supported cluster management minimizing impact on owners by regulating day/night usage Three factors combined to make this approach successful: The NFS implementation allowed even workstations with 4Mbytes of memory to perform useful work FAFNER supported anonymous registration A hierarchical network of servers reduced the potential administration bottleneck The First Generation: I-WAY I-WAY was an experimental high performance network linking many high performance computers (17 sites) and advanced visualization environments Not to build a network but to integrate existing high bandwidth networks (ATM) To help standardize the I-WAY software interface and management, the key sites installed I-POP servers to act as gateways to I-WAY The I-POP servers were UNIX workstations configured uniformly and possessing a standard software environment called I-Soft The I-POP server mechanisms allowed uniform I-WAY authentication, resource reservation, process creation, and communication functions I-WAY developed resource scheduler Computational Resource Broker (CRB) Security using telnet modified to use Kerberos Used Andrew File System (AFS) To support user-level tools, a low-level communications library, Nexus, was adapted to execute in the I-WAY environment Grid Computing 5 Grid Computing 6 A Summary of Early Experience FAFNER and I-Way attempted to produce metacomputing environments by integrating resources from opposite ends of the computing spectrum FAFNER was tailored to a particular factoring application that was in itself trivially parallel and was not dependent on a fast interconnect Forerunner of the likes of SETI@home and Distributed.Net I-WAY was designed to cope with a range of diverse high performance applications that typically needed a fast interconnect and powerful resources Forerunner of the likes of Globus and Legion Had a few shortcomings I-POP had to be specially set up and was single point of failure The Evolution of the Grid: The Second Generation Today the grid infrastructure is capable of binding together more than just a few specialized supercomputing centers Grid: a viable distributed infrastructure on a global scale that can support diverse applications requiring large-scale computation and data Three main issues Heterogeneity Scalability - latency tolerance, handle authentication over organizational boundaries Adaptability tolerant of resource failure Middleware is used to hide the heterogeneous nature and provide users and applications with a homogeneous and seamless environment by providing a set of standardized interfaces to a variety of services on the Grid Grid Computing 7 Grid Computing 8 Kent State University 2

Requirements for the data and computation infrastructure Administrative Hierarchy Communication Services various types needed, QoS Information Services dynamic info on location/type of services Naming Services uniform namespace (X500 or DNS) Distributed File Systems and Caching uniform global namespace Security and Authorization System Status and Fault Tolerance - monitoring tools required Resource Management and Scheduling efficient global scheduling that interact with local schedulers User and Administrative GUI intuitive/heterogeneous, usually Web-based Second Generation Core Technologies Globus A software infrastructure that enables applications to handle distributed heterogeneous computing resources as a single virtual machine The GTK provides a bag of services which developers of specific tools or applications can use to meet their own particular needs The GTK is modular and consists of the following: GRAM: Globus Toolkit Resource Allocation Manager HTTP based GridFTP security, parallelism GSI: Grid Security Infrastructure - authentication LDAP: Lightweight Directory Access Protocol distributed access to structure/state information GASS: Global Access to Secondary Storage GEM: Globus Executable Manager construction, caching, location of executables GARA: Globus Advanced Reservation and Allocation Grid Computing 9 Grid Computing 10 Second Generation Core Technologies (cont.) Legion - UVa An object-based metasystem A system to enable heterogeneous, geographically distributed, high performance machines to interact seamlessly Provide users with a single integrated infrastructure regardless of scale, physical location, language and underlying operating system Difference from Globus Encapsulate all of its components as objects Core object types : classes/metaclasses, host objects, vault objects, implementation objects/caches, binding agents, context objects/spaces Distributed Object Systems Common Object Resource Broker Architecture (CORBA) Open distributed object-computing infrastructure Automates network programming tasks such as object registration, location activation; request de-multiplexing; framing/error handling; parameter marshalling; operation dispatching Negatives: may be difficulty in crossing organizational boundaries; real-time/multimedia not in original design Java/JVM Single implementation framework; mask heterogeneity Can have Java wrappers around non-java code Negatives: computational speed; concurrency Grid Computing 11 Grid Computing 12 Kent State University 3

Distributed Object Systems Jini and RMI To provide a software infrastructure that can form a distributed computing environment that offers network plug and play In Jini, applications will normally be written in Java and communicate using the Java Remote Method Invocation (RMI) mechanism Concepts Lookup: search for a service and download the code needed to access it; Discovery: spontaneously find a community and join; Leasing: time-bounded access to a service; Remote Events: service A notifies service B of A s state change. Transactions: used to ensure that a system s distributed state stays consistent Does not provide file system, scheduling or logins Grid Resource Brokers and Schedulers Batch and Scheduling Systems Condor A software package for executing batch jobs on a variety of UNIX platforms, in particular those that would otherwise be idle Features: resource location, job allocation, check pointing, migration PBS (Portable Batch System) A batch queuing and workload management system Allows sites to determine own scheduling policies SGE (Sun Grid Engine) Jobs (with requirements profile) wait in a holding area and queues located on servers provide the services for jobs LSF (Load Sharing Facility) A commercial system from Platform Computing Corp. LSF comprises distributed load sharing/ batch queuing and monitoring of resources/workloads on a network of heterogeneous computers Grid Computing 13 Grid Computing 14 Grid Resource Brokers and Schedulers (cont.) Storage Resource Broker (SRB) To provide uniform access to distributed storage across a range of storage devices Key feature SRB supports metadata associated with a distributed file system, such as location, size and creation date information SRB is attractive for Grid applications in that it deals with large volumes of data Grid Resource Brokers and Schedulers (cont.) Nimrod/G Resource Broker and GRACE Nimrod-G : a Grid broker that performs resource management and scheduling of parameter sweep and task-farming applications Components A task- farming engine: allows user- defined schedulers, customized applications or problem solving environments A scheduler: support resource discovery, selection, scheduling, and the execution of user jobs on remote resources A dispatcher: uses Globus for deploying Nimrod- Gagents on remote resources in order to manage the execution of assigned jobs Resource agents Supports user-defined deadline and budget constraints for scheduling optimization and manages the supply and demand of resources in the Grid using a set of resource trading services called GRACE (GRid Architecture for Computational Economy) Scheduling algorithms: Cost, time, cost- time optimization, conservative Grid Computing 15 Grid Computing 16 Kent State University 4

Grid Portals A Grid portal provides access to grid resources specific to a particular domain of interest The NPACI HotPage hotpage.npaci.edu A user portal that has been designed to be a single point-of-access to computer-based resources, to simplify access to resources that are distributed across member organizations and allows them to be viewed either as an integrated Grid system or as individual systems Provides information, resource access/management The SDSC Grid Port Toolkit gridport.npaci.edu A reusable portal toolkit that uses HotPage infrastructure Web portal to allow the execution of secure portal services and an application API to enable development of customized science portals Integrated Systems Cactus An open source problem-solving environment Modular structure Provide a front-end to many core backend services Globus, HDF5 file library, PETSc, visualization EU DataGrid eu-datagrid.web.cern.ch A computational and data-intensive Grid of resources for the analysis of data coming from scientific exploration LHC (Large Hadron Collider), which will operate at CERN from about 2005 to 2015 Main challenge of the LHC is providing the means to share many petabytes of distributed data over the current network infrastructure Grid Computing 17 Grid Computing 18 Integrated Systems (cont.) UNICORE - German Ministry of Education& Research Design goal A uniform and easy to use GUI An open architecture based on the concept of an abstract job A consistent security architecture Minimal interference with local administrative procedures Exploitation of existing and emerging technologies through standard Java and Web technologies Protocol Between the components is defined in terms of Java objects Low- level layer called UPL(UNICORE Protocol Layer) handles authentication, SSL communication and transfer of data as in- lined byte- streams High- level layer contains the classes to define UNICORE jobs, tasks and resource requests Integrated Systems (cont.) WebFlow A computational extension of the Web model that can act as a framework for wide-area distributed computing Design goal Build a seamless framework for publishing and reusing computational modules on the Web End users can compose distributed applications using WebFlow modules via Web browser The front-end uses applets for authoring, visualization, and control of the environment The backend of WebFlow was implemented using the Globus Toolkit Grid Computing 19 Grid Computing 20 Kent State University 5

Peer-to-Peer Computing Decentralisation To overcome a performance bottleneck and a single point of failure of client-server model P2P Computing Napster, Gnutella, Freenet, JXTA Internet Computing SETI@home, Parabon, Entropia Machines share data and resources via Internet or private networks Obstacles Require more computing power to handle communication & security Security Heterogeneous Find resources and services Peer-to-Peer Computing (cont.) JXTA An open source development community for P2P infrastructure and applications Application Service Platform Network-Enabled Operating Environment Device A specification rather than software Grid Computing 21 Grid Computing 22 The Evolution of the Grid: The Third Generation Emphasize autonomic Needs detailed knowledge of its components and status Must configure and reconfigure itself dynamically Seeks to optimize its behaviour to achieve its goal Is able to recover from malfunction Protect itself against attack Be aware of its environment Implement open standards Make optimized use of resources Grid Computing 23 Service-oriented architectures Grid Computing e-science Agents E-Business Web Services Web Services - W3C SOAP - Simple Object Access Protocol Envelope for XML data, RPC Web Services Description Language (WSDL) describes services in XML Universal Description Discovery and Integration (UDDI) Distributed registeries Grid Computing 24 Kent State University 6

Service-oriented architectures (cont.) The Open Grid Services Architecture (OGSA) Framework (Globus/IBM) Supports the creation, maintenance, and application of ensembles of services maintained by VOs (virtual organization) The standard interfaces Discovery Dynamic service creation Lifetime management Notification Manageability Simple hosting environment Globus GRAM - resource allocation & management MDS- 2 - metadirectory services GSI - Grid security infrastructure (single sign- on) Service-oriented architectures (cont.) Agents Characteristics Autonomy Social ability interact with other agents Reactivity perceive/respond to environment Pro-activeness - goal directed behavior FIPA (Foundation for Intelligent Physical Agents) Agents communicate by exchanging messages which represent speech acts, and which are encoded in an agentcommunication-language Services provide support agents Services may be implemented either as agents or as software Grid Computing 25 Grid Computing 26 Information aspects: relationship with the WWW The Web as a Grid Information Infrastructure Does information distribution architecture meet grid requirements? Version control - none in www QoS link fragility Provenance - no certified publication date Digital Rights Management - no copy protection Curation www delivery oriented, no content management Information aspects: relationship with the WWW (cont.) Expressing Content and Meta-content XML Information exchange is facilitated by the XML Designed to mark up documents The tags are defined for each application using a DTD (Document Type Definition) or an XML Schema RDF A standard way of expressing metadata RDF Schema permit definition of a vocabulary XML and RDF enable the standard expression of content and metacontent Grid Computing 27 Grid Computing 28 Kent State University 7

Live Information Systems Access Grid A collection of resources that support human collaboration across the Grid, including large-scale distributed meetings and training Current Access grid infrastructure is based on IP multicast The combination of Semantic Web technologies with live information flows is highly relevant to grid computing Summary and Discussion First generation systems involved proprietary solutions for sharing high performance computing resources Second generation systems introduced middleware to cope with scale and heterogeneity, with a focus on large scale computational power and large volumes of data Third generation systems are adopting a service-oriented approach, adopt a more holistic view of the e-science infrastructure, are metadata-enabled and may exhibit autonomic features Grid Computing 29 Grid Computing 30 The Semantic Grid Richer Semantics Semantic Web Web Semantic Grid Grid Greater Computation Grid Computing 31 Kent State University 8