Simplest Scalable Architecture



Similar documents
Simplest Scalable Architecture

Distributed Operating Systems. Cluster Systems

Linux High Availability

MOSIX: High performance Linux farm

Automatic load balancing and transparent process migration

Load Balancing in Beowulf Clusters

Infrastructure for Load Balancing on Mosix Cluster

Scalable Cluster Computing with MOSIX for LINUX

Overlapping Data Transfer With Application Execution on Clusters

System Software for High Performance Computing. Joe Izraelevitz

- Behind The Cloud -

Clusters Systems based on OS. Abstract

OpenMosix Presented by Dr. Moshe Bar and MAASK [01]

Condor and the Grid Authors: D. Thain, T. Tannenbaum, and M. Livny. Condor Provide. Why Condor? Condor Kernel. The Philosophy of Flexibility

Distributed Systems. Clusters. Paul Krzyzanowski

PARALLEL & CLUSTER COMPUTING CS 6260 PROFESSOR: ELISE DE DONCKER BY: LINA HUSSEIN

Rodrigo Fernandes de Mello, Evgueni Dodonov, José Augusto Andrade Filho

OVERVIEW. CEP Cluster Server is Ideal For: First-time users who want to make applications highly available

Building scalable and reliable systems

CS550. Distributed Operating Systems (Advanced Operating Systems) Instructor: Xian-He Sun

Open Single System Image (openssi) Linux Cluster Project Bruce J. Walker, Hewlett-Packard Abstract

The Lattice Project: A Multi-Model Grid Computing System. Center for Bioinformatics and Computational Biology University of Maryland

- An Essential Building Block for Stable and Reliable Compute Clusters

DB2 Connect for NT and the Microsoft Windows NT Load Balancing Service

Cloud Computing. Lectures 3 and 4 Grid Schedulers: Condor

RevoScaleR Speed and Scalability

3 - Introduction to Operating Systems

High Performance Cluster Support for NLB on Window

Client/Server Computing Distributed Processing, Client/Server, and Clusters

Petascale Software Challenges. Piyush Chaudhary High Performance Computing

Datacenter Operating Systems

COS 318: Operating Systems. Virtual Machine Monitors

Distributed File Systems

High Performance Computing with Linux Clusters

IT service for life science

The Google File System

Cloud Computing. Up until now

Pros and Cons of HPC Cloud Computing

The MOSIX Cluster Management System for Distributed Computing on Linux Clusters and Multi-Cluster Private Clouds

An Oracle White Paper July Oracle Primavera Contract Management, Business Intelligence Publisher Edition-Sizing Guide

Agenda. HPC Software Stack. HPC Post-Processing Visualization. Case Study National Scientific Center. European HPC Benchmark Center Montpellier PSSC

Batch Scheduling and Resource Management

QUADRICS IN LINUX CLUSTERS

Terminal Server Software and Hardware Requirements. Terminal Server. Software and Hardware Requirements. Datacolor Match Pigment Datacolor Tools

1.5 Distributed Systems

Chapter 18: Database System Architectures. Centralized Systems

Batch Queueing in the WINNER Resource Management System

Kerrighed: use cases. Cyril Brulebois. Kerrighed. Kerlabs

Virtuoso and Database Scalability

Cluster, Grid, Cloud Concepts

ADAM 5.5. System Requirements

Web Application s Performance Testing

Exploring Oracle E-Business Suite Load Balancing Options. Venkat Perumal IT Convergence

SERVER CLUSTERING TECHNOLOGY & CONCEPT

Bosch Video Management System High Availability with Hyper-V

Multilevel Load Balancing in NUMA Computers

Hadoop and Map-Reduce. Swati Gore

Glosim: Global System Image for Cluster Computing

Brian LaGoe, Systems Administrator Benjamin Jellema, Systems Administrator Eastern Michigan University

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture

Virtualization Technologies and Blackboard: The Future of Blackboard Software on Multi-Core Technologies

Parallels Virtuozzo Containers

Scaling from 1 PC to a super computer using Mascot

Distributed RAID Architectures for Cluster I/O Computing. Kai Hwang

Final Report. Cluster Scheduling. Submitted by: Priti Lohani

A Comparison of Distributed Systems: ChorusOS and Amoeba

REM-Rocks: A Runtime Environment Migration Scheme for Rocks based Linux HPC Clusters

Priority Pro v17: Hardware and Supporting Systems

Using Linux Clusters as VoD Servers

Best Practices for Data Sharing in a Grid Distributed SAS Environment. Updated July 2010

Distributed Systems LEEC (2005/06 2º Sem.)

Client/Server and Distributed Computing

Ralf Gerhards presented detailed plans to install and test LSF batch system on the current H1 PC-Farm and make it available to H1 users this summer.

INTRODUCTION ADVANTAGES OF RUNNING ORACLE 11G ON WINDOWS. Edward Whalen, Performance Tuning Corporation

Stream Processing on GPUs Using Distributed Multimedia Middleware

Scale and Availability Considerations for Cluster File Systems. David Noy, Symantec Corporation

Dynamic Load Balancing in a Network of Workstations

Cluster Implementation and Management; Scheduling

Virtual Machines.

SINGLE SYSTEM IMAGE (SSI)

Using Parallel Computing to Run Multiple Jobs

Active/Active HA Clustering on Shared Storage with Samba

Why Computers Are Getting Slower (and what we can do about it) Rik van Riel Sr. Software Engineer, Red Hat

Improve Business Productivity and User Experience with a SanDisk Powered SQL Server 2014 In-Memory OLTP Database

PARALLELS SERVER BARE METAL 5.0 README

Principles and characteristics of distributed systems and environments

IOS110. Virtualization 5/27/2014 1

Centralized Systems. A Centralized Computer System. Chapter 18: Database System Architectures

Contents Introduction... 5 Deployment Considerations... 9 Deployment Architectures... 11

Best Practices for Virtualised SharePoint

Oracle9i Release 2 Database Architecture on Windows. An Oracle Technical White Paper April 2003

Amoeba Distributed Operating System

Eliminate SQL Server Downtime Even for maintenance

LS DYNA Performance Benchmarks and Profiling. January 2009

Network File System (NFS) Pradipta De

Distributed Operating Systems

WITH A FUSION POWERED SQL SERVER 2014 IN-MEMORY OLTP DATABASE

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney

Building Clusters for Gromacs and other HPC applications

Transcription:

Simplest Scalable Architecture NOW Network Of Workstations

Many types of Clusters (form HP s Dr. Bruce J. Walker) High Performance Clusters Beowulf; 1000 nodes; parallel programs; MPI Load-leveling Clusters Move processes around to borrow cycles (eg. Mosix) Web-Service Clusters LVS; load-level tcp connections; Web pages and applications Storage Clusters parallel filesystems; same view of data from each node Database Clusters Oracle Parallel Server; High Availability Clusters ServiceGuard, Lifekeeper, Failsafe, heartbeat, failover clusters

Many types of Clusters (form HP s Dr. Bruce J. Walker) High Performance Clusters Beowulf; 1000 nodes; parallel programs; MPI Load-leveling Clusters Move processes around to borrow cycles (eg. Mosix) Web-Service Clusters LVS; load-level tcp connections; Web pages and applications Storage Clusters parallel filesystems; same view of data from each node Database Clusters Oracle Parallel Server; High Availability Clusters ServiceGuard, Lifekeeper, Failsafe, heartbeat, failover clusters NOW type architectures

NOW Approaches Single System View Shared Resources Virtual Machine Single Address Space

Shared System View Cluster Computing Loadbalancing clusters High availability clusters High Performance High throughput High capability

Berkeley NOW

NOW Philosophies Commodity is cheaper In 1994 1 MB RAM was $40/MB for a PC $600/MB for a Cray M90

NOW Philosophies Commodity is faster CPU 150 MHz Alpha 50MHz i860 32 MHz SS-1 MPP year 93-94 92-93 91-92 WS year 92-93 ~91 89-90

Network RAM Swapping to disk is extremely expensive 16-24 ms for a page swap on disk Network performance is much higher 700 us for page swap over the net

Network RAM 600 500 min 400 300 200 32MB+disk Network RAM All RAM 100 0 1 32 64 128 Problem Size

NOW or SuperComputer? Machine Time Cost C-90 (16) RS6000 (256) +ATM +Parallel FS +NOW protocol 27 27374 2211 205 21 $30M $4M $5M $5M $5M

NOW Projects Condor DIPC MOSIX GLUnix PVM MUNGI Amoeba

The Condor System Cluster Computing Unix and NT Operational since 1986 More than 1300 CPUs at UW-Madison Available on the web More than 150 clusters worldwide in academia and industry

What is Condor? Cluster Computing Condor converts collections of distributively owned workstations and dedicated clusters into a highthroughput computing facility. Condor uses matchmaking to make sure that everyone is happy.

What is High-Throughput Cluster Computing Computing? High-performance: CPU cycles/second under ideal circumstances. How fast can I run simulation X on this machine? High-throughput: CPU cycles/day (week, month, year?) under non-ideal circumstances. How many times can I run simulation X in the next month using all available machines?

What is High-Throughput Cluster Computing Computing? Condor does whatever it takes to run your jobs, even if some machines Crash! (or are disconnected) Run out of disk space Don t have your software installed Are frequently needed by others Are far away & admin ed by someone else

A Submit Description File Cluster Computing # Example condor_submit input file # (Lines beginning with # are comments) # NOTE: the words on the left side are not # case sensitive, but filenames are! Universe = vanilla Executable = /home/wright/condor/my_job.condor Input = my_job.stdin Output = my_job.stdout Error = my_job.stderr Arguments = -arg1 -arg2 InitialDir = /home/wright/condor/run_1 Queue

What is Matchmaking? Cluster Computing Condor uses Matchmaking to make sure that work gets done within the constraints of both users and owners. Users (jobs) have constraints: I need an Alpha with 256 MB RAM Owners (machines) have constraints: Only run jobs when I am away from my desk and never run jobs owned by Bob.

Process Checkpointing Cluster Computing Condor s Process Checkpointing mechanism saves all the state of a process into a checkpoint file Memory, CPU, I/O, etc. The process can then be restarted from right where it left off Typically no changes to your job s source code needed however, your job must be relinked with Condor s Standard Universe support library

Remote System Calls Cluster Computing I/O System calls trapped and sent back to submit machine Allows Transparent Migration Across Administrative Domains Checkpoint on machine A, restart on B No Source Code changes required Language Independent Opportunities for Application Steering Example: Condor tells customer process how to open files

DIPC DIPC Distributed Inter Process Communication Provides Sys V IPC in distributed environments (including SHMEM)

MOSIX and its characteristics Cluster Computing Software that can transform a Linux cluster of x86 based workstations and servers to run almost like an SMP Has the ability to distribute and redistribute the processes among the nodes

MOSIX Dynamic migration added to the BSD kernel Uses TCP/IP for communication between workstations Requires Homogeneous networks

MOSIX All processes start their life at the users workstation Migration is transparent and preemptive Migrated processes use local resources as much as possible and the resources on the home workstation otherwise

Process Migration in MOSIX Cluster Computing User-level User-level Local process Deputy Link Layer Remote Link Layer Kernel Kernel A local process and a migrated process

MOSIX

Mosix Make

GLUnix Global Layer Unix Pure user-level layer that takes over the role of the operating system from the point of the user New processes can then be placed where there is most available memory (CPU)

PVM Provides a virtual machine on top of existing OS on a network Processes can still access the native OS resources PVM supports heterogeneous environments!

PVM The primary concern of PVM is to provide Dynamic process creation Process management including signals Communication between processes The machine can be handled during runtime

PVM PVM Demon PVM Demon PVM Demon PVM PVM Task PVM Task PVM Task PVM Task PVM Task PVM Task

MUNGI Single Address Space Operating system Requires 64 bit architecture Designed as an object based OS Protection is implemented as capabilities, to ensure scalability MUNGI uses capability trees rather than lists

Amoeba The computer is modeled as a network of resources Processes are started where they best fit Protection is implemented as capability lists Amoeba is centered around an efficient broadcast mechanish

Amoeba

Programming NOW Dynamic load balancing Dynamic orchestration

Dynamic Load Balancing Base your applications on redundant parallelism Rely on the OS to balance the application over the CPUs Rather few applications can be orchestrated in this way

Barnes Hut Cluster Computing Galaxy simulations are still quite interresting Basic formula is: G m m 1 r 2 2 Naïve algorithm is O(n 2 )

Barnes Hut

Barnes Hut Cluster Computing O(n log n)

Balancing Barnes Hut

Dynamic Orchestration Divide your application into a job-queue Spawn workers Let the workers take and execute jobs from the queue Not all applications can be orchestrated in this way Does not scale well job-queue process may become a bottleneck

Parallel integration Cluster Computing x 2 y 2 z 2 f ( x, y, z) x 1 y 1 z 1

Parallel integration Cluster Computing Split the outer integral Jobs = range(x 1, x 2, interval) Tasks = integral with x 1 = Jobs i, x 2 =Jobs i+1 ; for i in len(jobs -1) Result = Sum(Execute(Tasks))