ADVANCED COMPUTER ARCHITECTURE: Parallelism, Scalability, Programmability


Kai Hwang
Professor of Electrical Engineering and Computer Science
University of Southern California

McGraw-Hill, Inc.
New York, St. Louis, San Francisco, Auckland, Bogota, Caracas, Lisbon, London, Madrid, Mexico, Milan, Montreal, New Delhi, Paris, San Juan, Singapore, Sydney, Tokyo, Toronto

Contents

Foreword
Preface

PART I THEORY OF PARALLELISM

Chapter 1 Parallel Computer Models
  1.1 The State of Computing
      Computer Development Milestones
      Elements of Modern Computers
      Evolution of Computer Architecture
      System Attributes to Performance
  1.2 Multiprocessors and Multicomputers
      Shared-Memory Multiprocessors
      Distributed-Memory Multicomputers
      A Taxonomy of MIMD Computers
  1.3 Multivector and SIMD Computers
      Vector Supercomputers
      SIMD Supercomputers
  1.4 PRAM and VLSI Models
      Parallel Random-Access Machines
      VLSI Complexity Model
  1.5 Architectural Development Tracks
      Multiple-Processor Tracks
      Multivector and SIMD Tracks
      Multithreaded and Dataflow Tracks
  1.6 Bibliographic Notes and Exercises

Chapter 2 Program and Network Properties
  2.1 Conditions of Parallelism
      Data and Resource Dependences
      Hardware and Software Parallelism
      The Role of Compilers
  2.2 Program Partitioning and Scheduling
      Grain Sizes and Latency
      Grain Packing and Scheduling
      Static Multiprocessor Scheduling
  2.3 Program Flow Mechanisms
      Control Flow Versus Data Flow
      Demand-Driven Mechanisms
      Comparison of Flow Mechanisms
  2.4 System Interconnect Architectures
      Network Properties and Routing
      Static Connection Networks
      Dynamic Connection Networks
  2.5 Bibliographic Notes and Exercises

Chapter 3 Principles of Scalable Performance
  3.1 Performance Metrics and Measures
      Parallelism Profile in Programs
      Harmonic Mean Performance
      Efficiency, Utilization, and Quality
      Standard Performance Measures
  3.2 Parallel Processing Applications
      Massive Parallelism for Grand Challenges
      Application Models of Parallel Computers
      Scalability of Parallel Algorithms
  3.3 Speedup Performance Laws
      Amdahl's Law for a Fixed Workload
      Gustafson's Law for Scaled Problems
      Memory-Bounded Speedup Model
  3.4 Scalability Analysis and Approaches
      Scalability Metrics and Goals
      Evolution of Scalable Computers
      Research Issues and Solutions
  3.5 Bibliographic Notes and Exercises

PART II HARDWARE TECHNOLOGIES

Chapter 4 Processors and Memory Hierarchy
  4.1 Advanced Processor Technology
      Design Space of Processors
      Instruction-Set Architectures
      CISC Scalar Processors
      RISC Scalar Processors
  4.2 Superscalar and Vector Processors
      Superscalar Processors
      The VLIW Architecture
      Vector and Symbolic Processors
  4.3 Memory Hierarchy Technology
      Hierarchical Memory Technology
      Inclusion, Coherence, and Locality
      Memory Capacity Planning
  4.4 Virtual Memory Technology
      Virtual Memory Models
      TLB, Paging, and Segmentation
      Memory Replacement Policies
  4.5 Bibliographic Notes and Exercises

Chapter 5 Bus, Cache, and Shared Memory
  5.1 Backplane Bus Systems
      Backplane Bus Specification
      Addressing and Timing Protocols
      Arbitration, Transaction, and Interrupt
      The IEEE Futurebus+ Standards
  5.2 Cache Memory Organizations
      Cache Addressing Models
      Direct Mapping and Associative Caches
      Set-Associative and Sector Caches
      Cache Performance Issues
  5.3 Shared-Memory Organizations
      Interleaved Memory Organization
      Bandwidth and Fault Tolerance
      Memory Allocation Schemes
  5.4 Sequential and Weak Consistency Models
      Atomicity and Event Ordering
      Sequential Consistency Model
      Weak Consistency Models
  5.5 Bibliographic Notes and Exercises

Chapter 6 Pipelining and Superscalar Techniques
  6.1 Linear Pipeline Processors
      Asynchronous and Synchronous Models
      Clocking and Timing Control
      Speedup, Efficiency, and Throughput
  6.2 Nonlinear Pipeline Processors
      Reservation and Latency Analysis
      Collision-Free Scheduling
      Pipeline Schedule Optimization
  6.3 Instruction Pipeline Design
      Instruction Execution Phases
      Mechanisms for Instruction Pipelining
      Dynamic Instruction Scheduling
      Branch Handling Techniques
  6.4 Arithmetic Pipeline Design
      Computer Arithmetic Principles
      Static Arithmetic Pipelines
      Multifunctional Arithmetic Pipelines
  6.5 Superscalar and Superpipeline Design
      Superscalar Pipeline Design
      Superpipelined Design
      Supersymmetry and Design Tradeoffs
  6.6 Bibliographic Notes and Exercises

PART III PARALLEL AND SCALABLE ARCHITECTURES

Chapter 7 Multiprocessors and Multicomputers
  7.1 Multiprocessor System Interconnects
      Hierarchical Bus Systems
      Crossbar Switch and Multiport Memory
      Multistage and Combining Networks
  7.2 Cache Coherence and Synchronization Mechanisms
      The Cache Coherence Problem
      Snoopy Bus Protocols
      Directory-Based Protocols
      Hardware Synchronization Mechanisms
  7.3 Three Generations of Multicomputers
      Design Choices in the Past
      Present and Future Development
      The Intel Paragon System
  7.4 Message-Passing Mechanisms
      Message-Routing Schemes
      Deadlock and Virtual Channels
      Flow Control Strategies
      Multicast Routing Algorithms
  7.5 Bibliographic Notes and Exercises

Chapter 8 Multivector and SIMD Computers
  8.1 Vector Processing Principles
      Vector Instruction Types
      Vector-Access Memory Schemes
      Past and Present Supercomputers
  8.2 Multivector Multiprocessors
      Performance-Directed Design Rules
      Cray Y-MP, C-90, and MPP
      Fujitsu VP2000 and VPP500
      Mainframes and Minisupercomputers
  8.3 Compound Vector Processing
      Compound Vector Operations
      Vector Loops and Chaining
      Multipipeline Networking
  8.4 SIMD Computer Organizations
      Implementation Models
      The CM-2 Architecture
      The MasPar MP-1 Architecture
  8.5 The Connection Machine CM-5
      A Synchronized MIMD Machine
      The CM-5 Network Architecture
      Control Processors and Processing Nodes
      Interprocessor Communications
  8.6 Bibliographic Notes and Exercises

Chapter 9 Scalable, Multithreaded, and Dataflow Architectures
  9.1 Latency-Hiding Techniques
      Shared Virtual Memory
      Prefetching Techniques
      Distributed Coherent Caches
      Scalable Coherence Interface
      Relaxed Memory Consistency
  9.2 Principles of Multithreading
      Multithreading Issues and Solutions
      Multiple-Context Processors
      Multidimensional Architectures
  9.3 Fine-Grain Multicomputers
      Fine-Grain Parallelism
      The MIT J-Machine
      The Caltech Mosaic C
  9.4 Scalable and Multithreaded Architectures
      The Stanford Dash Multiprocessor
      The Kendall Square Research KSR-1
      The Tera Multiprocessor System
  9.5 Dataflow and Hybrid Architectures
      The Evolution of Dataflow Computers
      The ETL/EM-4 in Japan
      The MIT/Motorola *T Prototype
  9.6 Bibliographic Notes and Exercises

PART IV SOFTWARE FOR PARALLEL PROGRAMMING

Chapter 10 Parallel Models, Languages, and Compilers
  10.1 Parallel Programming Models
      Shared-Variable Model
      Message-Passing Model
      Data-Parallel Model
      Object-Oriented Model
      Functional and Logic Models
  10.2 Parallel Languages and Compilers
      Language Features for Parallelism
      Parallel Language Constructs
      Optimizing Compilers for Parallelism
  10.3 Dependence Analysis of Data Arrays
      Iteration Space and Dependence Analysis
      Subscript Separability and Partitioning
      Categorized Dependence Tests
  10.4 Code Optimization and Scheduling
      Scalar Optimization with Basic Blocks
      Local and Global Optimizations
      Vectorization and Parallelization Methods
      Code Generation and Scheduling
      Trace Scheduling Compilation
  10.5 Loop Parallelization and Pipelining
      Loop Transformation Theory
      Parallelization and Wavefronting
      Tiling and Localization
      Software Pipelining
  10.6 Bibliographic Notes and Exercises

Chapter 11 Parallel Program Development and Environments
  11.1 Parallel Programming Environments
      Software Tools and Environments
      Y-MP, Paragon, and CM-5 Environments
      Visualization and Performance Tuning
  11.2 Synchronization and Multiprocessing Modes
      Principles of Synchronization
      Multiprocessor Execution Modes
      Multitasking on Cray Multiprocessors
  11.3 Shared-Variable Program Structures
      Locks for Protected Access
      Semaphores and Applications
      Monitors and Applications
  11.4 Message-Passing Program Development
      Distributing the Computation
      Synchronous Message Passing
      Asynchronous Message Passing
  11.5 Mapping Programs onto Multicomputers
      Domain Decomposition Techniques
      Control Decomposition Techniques
      Heterogeneous Processing
  11.6 Bibliographic Notes and Exercises

Chapter 12 UNIX, Mach, and OSF/1 for Parallel Computers
  12.1 Multiprocessor UNIX Design Goals
      Conventional UNIX Limitations
      Compatibility and Portability
      Address Space and Load Balancing
      Parallel I/O and Network Services
  12.2 Master-Slave and Multithreaded UNIX
      Master-Slave Kernels
      Floating-Executive Kernels
      Multithreaded UNIX Kernel
  12.3 Multicomputer UNIX Extensions
      Message-Passing OS Models
      Cosmic Environment and Reactive Kernel
      Intel NX/2 Kernel and Extensions
  12.4 Mach/OS Kernel Architecture
      Mach/OS Kernel Functions
      Multithreaded Multitasking
      Message-Based Communications
      Virtual Memory Management
  12.5 OSF/1 Architecture and Applications
      The OSF/1 Architecture
      The OSF/1 Programming Environment
      Improving Performance with Threads
  12.6 Bibliographic Notes and Exercises

Bibliography
Index
Answers to Selected Problems

Computer Organization

Computer Organization Computer Organization and Architecture Designing for Performance Ninth Edition William Stallings International Edition contributions by R. Mohan National Institute of Technology, Tiruchirappalli PEARSON

More information

UNIT 2 CLASSIFICATION OF PARALLEL COMPUTERS

UNIT 2 CLASSIFICATION OF PARALLEL COMPUTERS UNIT 2 CLASSIFICATION OF PARALLEL COMPUTERS Structure Page Nos. 2.0 Introduction 27 2.1 Objectives 27 2.2 Types of Classification 28 2.3 Flynn s Classification 28 2.3.1 Instruction Cycle 2.3.2 Instruction

More information

A Lab Course on Computer Architecture

A Lab Course on Computer Architecture A Lab Course on Computer Architecture Pedro López José Duato Depto. de Informática de Sistemas y Computadores Facultad de Informática Universidad Politécnica de Valencia Camino de Vera s/n, 46071 - Valencia,

More information

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip.

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip. Lecture 11: Multi-Core and GPU Multi-core computers Multithreading GPUs General Purpose GPUs Zebo Peng, IDA, LiTH 1 Multi-Core System Integration of multiple processor cores on a single chip. To provide

More information

Lecture 2 Parallel Programming Platforms

Lecture 2 Parallel Programming Platforms Lecture 2 Parallel Programming Platforms Flynn s Taxonomy In 1966, Michael Flynn classified systems according to numbers of instruction streams and the number of data stream. Data stream Single Multiple

More information

Concurrent Programming

Concurrent Programming Concurrent Programming Principles and Practice Gregory R. Andrews The University of Arizona Technische Hochschule Darmstadt FACHBEREICH INFCRMATIK BIBLIOTHEK Inventar-Nr.:..ZP.vAh... Sachgebiete:..?r.:..\).

More information

PARALLEL PROGRAMMING

PARALLEL PROGRAMMING PARALLEL PROGRAMMING TECHNIQUES AND APPLICATIONS USING NETWORKED WORKSTATIONS AND PARALLEL COMPUTERS 2nd Edition BARRY WILKINSON University of North Carolina at Charlotte Western Carolina University MICHAEL

More information

High Performance Computing

High Performance Computing High Performance Computing Trey Breckenridge Computing Systems Manager Engineering Research Center Mississippi State University What is High Performance Computing? HPC is ill defined and context dependent.

More information

Principles and characteristics of distributed systems and environments

Principles and characteristics of distributed systems and environments Principles and characteristics of distributed systems and environments Definition of a distributed system Distributed system is a collection of independent computers that appears to its users as a single

More information

Chapter 2 Parallel Computer Architecture

Chapter 2 Parallel Computer Architecture Chapter 2 Parallel Computer Architecture The possibility for a parallel execution of computations strongly depends on the architecture of the execution platform. This chapter gives an overview of the general

More information

COMP 422, Lecture 3: Physical Organization & Communication Costs in Parallel Machines (Sections 2.4 & 2.5 of textbook)

COMP 422, Lecture 3: Physical Organization & Communication Costs in Parallel Machines (Sections 2.4 & 2.5 of textbook) COMP 422, Lecture 3: Physical Organization & Communication Costs in Parallel Machines (Sections 2.4 & 2.5 of textbook) Vivek Sarkar Department of Computer Science Rice University vsarkar@rice.edu COMP

More information

BLM 413E - Parallel Programming Lecture 3

BLM 413E - Parallel Programming Lecture 3 BLM 413E - Parallel Programming Lecture 3 FSMVU Bilgisayar Mühendisliği Öğr. Gör. Musa AYDIN 14.10.2015 2015-2016 M.A. 1 Parallel Programming Models Parallel Programming Models Overview There are several

More information

Lecture 23: Multiprocessors

Lecture 23: Multiprocessors Lecture 23: Multiprocessors Today s topics: RAID Multiprocessor taxonomy Snooping-based cache coherence protocol 1 RAID 0 and RAID 1 RAID 0 has no additional redundancy (misnomer) it uses an array of disks

More information

Oracle Backup & Recovery

Oracle Backup & Recovery ORACLG«Oracle Press Oracle Backup & Recovery Rama Velpuri Osborne McGraw-Hill Berkeley New York St. Louis San Francisco Auckland Bogota Hamburg London Madrid Mexico City Milan Montreal New Delhi Panama

More information

Introduction to Cloud Computing

Introduction to Cloud Computing Introduction to Cloud Computing Parallel Processing I 15 319, spring 2010 7 th Lecture, Feb 2 nd Majd F. Sakr Lecture Motivation Concurrency and why? Different flavors of parallel computing Get the basic

More information

Architecture of Hitachi SR-8000

Architecture of Hitachi SR-8000 Architecture of Hitachi SR-8000 University of Stuttgart High-Performance Computing-Center Stuttgart (HLRS) www.hlrs.de Slide 1 Most of the slides from Hitachi Slide 2 the problem modern computer are data

More information

Chapter 2 Parallel Architecture, Software And Performance

Chapter 2 Parallel Architecture, Software And Performance Chapter 2 Parallel Architecture, Software And Performance UCSB CS140, T. Yang, 2014 Modified from texbook slides Roadmap Parallel hardware Parallel software Input and output Performance Parallel program

More information

Chapter 07: Instruction Level Parallelism VLIW, Vector, Array and Multithreaded Processors. Lesson 05: Array Processors

Chapter 07: Instruction Level Parallelism VLIW, Vector, Array and Multithreaded Processors. Lesson 05: Array Processors Chapter 07: Instruction Level Parallelism VLIW, Vector, Array and Multithreaded Processors Lesson 05: Array Processors Objective To learn how the array processes in multiple pipelines 2 Array Processor

More information

Operating Systems for Parallel Processing Assistent Lecturer Alecu Felician Economic Informatics Department Academy of Economic Studies Bucharest

Operating Systems for Parallel Processing Assistent Lecturer Alecu Felician Economic Informatics Department Academy of Economic Studies Bucharest Operating Systems for Parallel Processing Assistent Lecturer Alecu Felician Economic Informatics Department Academy of Economic Studies Bucharest 1. Introduction Few years ago, parallel computers could

More information

Centralized Systems. A Centralized Computer System. Chapter 18: Database System Architectures

Centralized Systems. A Centralized Computer System. Chapter 18: Database System Architectures Chapter 18: Database System Architectures Centralized Systems! Centralized Systems! Client--Server Systems! Parallel Systems! Distributed Systems! Network Types! Run on a single computer system and do

More information

Introduction to Parallel Computing. George Karypis Parallel Programming Platforms

Introduction to Parallel Computing. George Karypis Parallel Programming Platforms Introduction to Parallel Computing George Karypis Parallel Programming Platforms Elements of a Parallel Computer Hardware Multiple Processors Multiple Memories Interconnection Network System Software Parallel

More information

Vorlesung Rechnerarchitektur 2 Seite 178 DASH

Vorlesung Rechnerarchitektur 2 Seite 178 DASH Vorlesung Rechnerarchitektur 2 Seite 178 Architecture for Shared () The -architecture is a cache coherent, NUMA multiprocessor system, developed at CSL-Stanford by John Hennessy, Daniel Lenoski, Monica

More information

Study Plan Masters of Science in Computer Engineering and Networks (Thesis Track)

Study Plan Masters of Science in Computer Engineering and Networks (Thesis Track) Plan Number 2009 Study Plan Masters of Science in Computer Engineering and Networks (Thesis Track) I. General Rules and Conditions 1. This plan conforms to the regulations of the general frame of programs

More information

Fundamentals of Mobile and Pervasive Computing

Fundamentals of Mobile and Pervasive Computing Fundamentals of Mobile and Pervasive Computing Frank Adelstein Sandeep K. S. Gupta Golden G. Richard III Loren Schwiebert Technische Universitat Darmstadt FACHBEREICH INFORMATIK B1BLIOTHEK Inventar-Nr.:

More information

Computer Architecture TDTS10

Computer Architecture TDTS10 why parallelism? Performance gain from increasing clock frequency is no longer an option. Outline Computer Architecture TDTS10 Superscalar Processors Very Long Instruction Word Processors Parallel computers

More information

An Introduction to Parallel Computing/ Programming

An Introduction to Parallel Computing/ Programming An Introduction to Parallel Computing/ Programming Vicky Papadopoulou Lesta Astrophysics and High Performance Computing Research Group (http://ahpc.euc.ac.cy) Dep. of Computer Science and Engineering European

More information

LSN 2 Computer Processors

LSN 2 Computer Processors LSN 2 Computer Processors Department of Engineering Technology LSN 2 Computer Processors Microprocessors Design Instruction set Processor organization Processor performance Bandwidth Clock speed LSN 2

More information

NEURAL NETWORK FUNDAMENTALS WITH GRAPHS, ALGORITHMS, AND APPLICATIONS

NEURAL NETWORK FUNDAMENTALS WITH GRAPHS, ALGORITHMS, AND APPLICATIONS NEURAL NETWORK FUNDAMENTALS WITH GRAPHS, ALGORITHMS, AND APPLICATIONS N. K. Bose HRB-Systems Professor of Electrical Engineering The Pennsylvania State University, University Park P. Liang Associate Professor

More information

OPERATING SYSTEMS Internais and Design Principles

OPERATING SYSTEMS Internais and Design Principles OPERATING SYSTEMS Internais and Design Principles FOURTH EDITION William Stallings, Ph.D. Prentice Hall Upper Saddle River, New Jersey 07458 CONTENTS Web Site for Operating Systems: Internais and Design

More information

CMSC 611: Advanced Computer Architecture

CMSC 611: Advanced Computer Architecture CMSC 611: Advanced Computer Architecture Parallel Computation Most slides adapted from David Patterson. Some from Mohomed Younis Parallel Computers Definition: A parallel computer is a collection of processing

More information

Weighted Total Mark. Weighted Exam Mark

Weighted Total Mark. Weighted Exam Mark CMP2204 Operating System Technologies Period per Week Contact Hour per Semester Total Mark Exam Mark Continuous Assessment Mark Credit Units LH PH TH CH WTM WEM WCM CU 45 30 00 60 100 40 100 4 Rationale

More information

Computer System Design. System-on-Chip

Computer System Design. System-on-Chip Brochure More information from http://www.researchandmarkets.com/reports/2171000/ Computer System Design. System-on-Chip Description: The next generation of computer system designers will be less concerned

More information

IBM CELL CELL INTRODUCTION. Project made by: Origgi Alessandro matr. 682197 Teruzzi Roberto matr. 682552 IBM CELL. Politecnico di Milano Como Campus

IBM CELL CELL INTRODUCTION. Project made by: Origgi Alessandro matr. 682197 Teruzzi Roberto matr. 682552 IBM CELL. Politecnico di Milano Como Campus Project made by: Origgi Alessandro matr. 682197 Teruzzi Roberto matr. 682552 CELL INTRODUCTION 2 1 CELL SYNERGY Cell is not a collection of different processors, but a synergistic whole Operation paradigms,

More information

ENTERPRISE RESOURCE PLANNING

ENTERPRISE RESOURCE PLANNING ENTERPRISE RESOURCE PLANNING ~SECOND E DITION~ ENTERPRISE RESOURCE PLANNING ~SECOND E DITION~ Alexis Leon L&L Consultancy Services Pvt Ltd Kochi Tata McGraw-Hill Publishing Company Limited NEW DELHI McGraw-Hill

More information

Scalability and Classifications

Scalability and Classifications Scalability and Classifications 1 Types of Parallel Computers MIMD and SIMD classifications shared and distributed memory multicomputers distributed shared memory computers 2 Network Topologies static

More information

Tools Page 1 of 13 ON PROGRAM TRANSLATION. A priori, we have two translation mechanisms available:

Tools Page 1 of 13 ON PROGRAM TRANSLATION. A priori, we have two translation mechanisms available: Tools Page 1 of 13 ON PROGRAM TRANSLATION A priori, we have two translation mechanisms available: Interpretation Compilation On interpretation: Statements are translated one at a time and executed immediately.

More information

Chapter 18: Database System Architectures. Centralized Systems

Chapter 18: Database System Architectures. Centralized Systems Chapter 18: Database System Architectures! Centralized Systems! Client--Server Systems! Parallel Systems! Distributed Systems! Network Types 18.1 Centralized Systems! Run on a single computer system and

More information

DEC Networks and Architectures

DEC Networks and Architectures DEC Networks and Architectures Carl Malamud Intertext Publications McGraw-Hill Book Company New York St. Louis San Francisco Auckland Bogota Hamburg London Madrid Mexico Milan Montreal New Delhi Panama

More information

FPGA-based Multithreading for In-Memory Hash Joins

FPGA-based Multithreading for In-Memory Hash Joins FPGA-based Multithreading for In-Memory Hash Joins Robert J. Halstead, Ildar Absalyamov, Walid A. Najjar, Vassilis J. Tsotras University of California, Riverside Outline Background What are FPGAs Multithreaded

More information

Systolic Computing. Fundamentals

Systolic Computing. Fundamentals Systolic Computing Fundamentals Motivations for Systolic Processing PARALLEL ALGORITHMS WHICH MODEL OF COMPUTATION IS THE BETTER TO USE? HOW MUCH TIME WE EXPECT TO SAVE USING A PARALLEL ALGORITHM? HOW

More information

Enterprise Java. Where, How, When (and When Not) to Apply Java in Client/Server Business Environments. Jeffrey Savit Sean Wilcox Bhuvana Jayaraman

Enterprise Java. Where, How, When (and When Not) to Apply Java in Client/Server Business Environments. Jeffrey Savit Sean Wilcox Bhuvana Jayaraman Enterprise Java Where, How, When (and When Not) to Apply Java in Client/Server Business Environments Jeffrey Savit Sean Wilcox Bhuvana Jayaraman McGraw-Hill j New York San Francisco Washington, D.C. Auckland

More information

Operating Systems Principles

Operating Systems Principles bicfm page i Operating Systems Principles Lubomir F. Bic University of California, Irvine Alan C. Shaw University of Washington, Seattle PEARSON EDUCATION INC. Upper Saddle River, New Jersey 07458 bicfm

More information

Distributed Systems LEEC (2005/06 2º Sem.)

Distributed Systems LEEC (2005/06 2º Sem.) Distributed Systems LEEC (2005/06 2º Sem.) Introduction João Paulo Carvalho Universidade Técnica de Lisboa / Instituto Superior Técnico Outline Definition of a Distributed System Goals Connecting Users

More information

Parallel Computing. Benson Muite. benson.muite@ut.ee http://math.ut.ee/ benson. https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage

Parallel Computing. Benson Muite. benson.muite@ut.ee http://math.ut.ee/ benson. https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage Parallel Computing Benson Muite benson.muite@ut.ee http://math.ut.ee/ benson https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage 3 November 2014 Hadoop, Review Hadoop Hadoop History Hadoop Framework

More information

Operating System Impact on SMT Architecture

Operating System Impact on SMT Architecture Operating System Impact on SMT Architecture The work published in An Analysis of Operating System Behavior on a Simultaneous Multithreaded Architecture, Josh Redstone et al., in Proceedings of the 9th

More information

Outline. Distributed DBMS

Outline. Distributed DBMS Outline Introduction Background Architecture Distributed Database Design Semantic Data Control Distributed Query Processing Distributed Transaction Management Data server approach Parallel architectures

More information

Client/Server Computing Distributed Processing, Client/Server, and Clusters

Client/Server Computing Distributed Processing, Client/Server, and Clusters Client/Server Computing Distributed Processing, Client/Server, and Clusters Chapter 13 Client machines are generally single-user PCs or workstations that provide a highly userfriendly interface to the

More information

Parallel Computing for Data Science

Parallel Computing for Data Science Parallel Computing for Data Science With Examples in R, C++ and CUDA Norman Matloff University of California, Davis USA (g) CRC Press Taylor & Francis Group Boca Raton London New York CRC Press is an imprint

More information

Effective Computing with SMP Linux

Effective Computing with SMP Linux Effective Computing with SMP Linux Multi-processor systems were once a feature of high-end servers and mainframes, but today, even desktops for personal use have multiple processors. Linux is a popular

More information

MS GRADUATE PROGRAM IN COMPUTER ENGINEERING

MS GRADUATE PROGRAM IN COMPUTER ENGINEERING MS GRADUATE PROGRAM IN COMPUTER ENGINEERING INTRODUCTION The increased interaction between computing and communication in recent years is changing the landscape of computer engineering. There is now an

More information

A Robust Dynamic Load-balancing Scheme for Data Parallel Application on Message Passing Architecture

A Robust Dynamic Load-balancing Scheme for Data Parallel Application on Message Passing Architecture A Robust Dynamic Load-balancing Scheme for Data Parallel Application on Message Passing Architecture Yangsuk Kee Department of Computer Engineering Seoul National University Seoul, 151-742, Korea Soonhoi

More information

WebLogic Server 11g Administration Handbook

WebLogic Server 11g Administration Handbook ORACLE: Oracle Press Oracle WebLogic Server 11g Administration Handbook Sam R. Alapati Mc Graw Hill New York Chicago San Francisco Lisbon London Madrid Mexico City Milan New Delhi San Juan Seoul Singapore

More information

How To Understand The Concept Of A Distributed System

How To Understand The Concept Of A Distributed System Distributed Operating Systems Introduction Ewa Niewiadomska-Szynkiewicz and Adam Kozakiewicz ens@ia.pw.edu.pl, akozakie@ia.pw.edu.pl Institute of Control and Computation Engineering Warsaw University of

More information

Symmetric Multiprocessing

Symmetric Multiprocessing Multicore Computing A multi-core processor is a processing system composed of two or more independent cores. One can describe it as an integrated circuit to which two or more individual processors (called

More information

Principles of Operating Systems CS 446/646

Principles of Operating Systems CS 446/646 Principles of Operating Systems CS 446/646 1. Introduction to Operating Systems a. Role of an O/S b. O/S History and Features c. Types of O/S Mainframe systems Desktop & laptop systems Parallel systems

More information

Making Multicore Work and Measuring its Benefits. Markus Levy, president EEMBC and Multicore Association

Making Multicore Work and Measuring its Benefits. Markus Levy, president EEMBC and Multicore Association Making Multicore Work and Measuring its Benefits Markus Levy, president EEMBC and Multicore Association Agenda Why Multicore? Standards and issues in the multicore community What is Multicore Association?

More information

The SpiceC Parallel Programming System of Computer Systems

The SpiceC Parallel Programming System of Computer Systems UNIVERSITY OF CALIFORNIA RIVERSIDE The SpiceC Parallel Programming System A Dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science

More information

SOC architecture and design

SOC architecture and design SOC architecture and design system-on-chip (SOC) processors: become components in a system SOC covers many topics processor: pipelined, superscalar, VLIW, array, vector storage: cache, embedded and external

More information

An Implementation Of Multiprocessor Linux

An Implementation Of Multiprocessor Linux An Implementation Of Multiprocessor Linux This document describes the implementation of a simple SMP Linux kernel extension and how to use this to develop SMP Linux kernels for architectures other than

More information

Building. Applications. in the Cloud. Concepts, Patterns, and Projects. AAddison-Wesley. Christopher M. Mo^ar. Cape Town Sydney.

Building. Applications. in the Cloud. Concepts, Patterns, and Projects. AAddison-Wesley. Christopher M. Mo^ar. Cape Town Sydney. Building Applications in the Cloud Concepts, Patterns, and Projects Christopher M. Mo^ar Upper Saddle River, NJ Boston AAddison-Wesley New York 'Toronto Montreal London Munich Indianapolis San Francisco

More information

High Performance Computer Architecture

High Performance Computer Architecture High Performance Computer Architecture Volker Lindenstruth Lehrstuhl für Hochleistungsrechner Archittektur Ruth-Moufang Str. 1 email: ti@compeng.de URL: www.compeng.de Telefon: 798-44100 Volker Lindenstruth

More information

The Data Access Handbook

The Data Access Handbook The Data Access Handbook Achieving Optimal Database Application Performance and Scalability John Goodson and Robert A. Steward PRENTICE HALL Upper Saddle River, NJ Boston Indianapolis San Francisco New

More information

High Performance Computing. Course Notes 2007-2008. HPC Fundamentals

High Performance Computing. Course Notes 2007-2008. HPC Fundamentals High Performance Computing Course Notes 2007-2008 2008 HPC Fundamentals Introduction What is High Performance Computing (HPC)? Difficult to define - it s a moving target. Later 1980s, a supercomputer performs

More information

Outline. Introduction. Multiprocessor Systems on Chip. A MPSoC Example: Nexperia DVP. A New Paradigm: Network on Chip

Outline. Introduction. Multiprocessor Systems on Chip. A MPSoC Example: Nexperia DVP. A New Paradigm: Network on Chip Outline Modeling, simulation and optimization of Multi-Processor SoCs (MPSoCs) Università of Verona Dipartimento di Informatica MPSoCs: Multi-Processor Systems on Chip A simulation platform for a MPSoC

More information

A Comparison of Distributed Systems: ChorusOS and Amoeba

A Comparison of Distributed Systems: ChorusOS and Amoeba A Comparison of Distributed Systems: ChorusOS and Amoeba Angelo Bertolli Prepared for MSIT 610 on October 27, 2004 University of Maryland University College Adelphi, Maryland United States of America Abstract.

More information

Software Performance and Scalability

Software Performance and Scalability Software Performance and Scalability A Quantitative Approach Henry H. Liu ^ IEEE )computer society WILEY A JOHN WILEY & SONS, INC., PUBLICATION Contents PREFACE ACKNOWLEDGMENTS xv xxi Introduction 1 Performance

More information

Lecture 1. Course Introduction

Lecture 1. Course Introduction Lecture 1 Course Introduction Welcome to CSE 262! Your instructor is Scott B. Baden Office hours (week 1) Tues/Thurs 3.30 to 4.30 Room 3244 EBU3B 2010 Scott B. Baden / CSE 262 /Spring 2011 2 Content Our

More information

ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-12: ARM

ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-12: ARM 1 The ARM architecture processors popular in Mobile phone systems 2 ARM Features ARM has 32-bit architecture but supports 16 bit

Parallel Programming

Parallel Programming Parallel Architectures Diego Fabregat-Traver and Prof. Paolo Bientinesi HPAC, RWTH Aachen fabregat@aices.rwth-aachen.de WS15/16 Parallel Architectures Acknowledgements Prof. Felix

Parallel Programming Survey

Parallel Programming Survey Christian Terboven 02.09.2014 / Aachen, Germany As of: 26.08.2014 Version 2.3 IT Center der RWTH Aachen University Agenda Overview: Processor Microarchitecture Shared-Memory

Multi-core architectures. Jernej Barbic 15-213, Spring 2007 May 3, 2007

Multi-core architectures Jernej Barbic 15-213, Spring 2007 May 3, 2007 1 Single-core computer 2 Single-core CPU chip the single core 3 Multi-core architectures This lecture is about a new trend in computer

GEDAE TM - A Graphical Programming and Autocode Generation Tool for Signal Processor Applications

GEDAE TM - A Graphical Programming and Autocode Generation Tool for Signal Processor Applications Harris Z. Zebrowitz Lockheed Martin Advanced Technology Laboratories 1 Federal Street Camden, NJ 08102

Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes

Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes Eric Petit, Loïc Thebault, Quang V. Dinh May 2014 EXA2CT Consortium 2 WPs Organization Proto-Applications

Rapid System Prototyping with FPGAs

Rapid System Prototyping with FPGAs By R.C. Cofer and Benjamin F. Harding AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO Newnes is an imprint of

VCP VMware Certified Professional vSphere 4 Study Guide (Exam VCP410). Robert Schmidt. McGraw-Hill

VCP VMware Certified Professional vSphere 4 Study Guide (Exam VCP410) Robert Schmidt McGraw-Hill is an independent entity from VMware Inc. and is not affiliated with VMware Inc. in any manner. This study/training

VISUALIZING DATA with MICROSOFT POWER VIEW. Brian Larson, Mark Davis, Dan English, Paul Purington. McGraw-Hill

VISUALIZING DATA with MICROSOFT POWER VIEW Brian Larson Mark Davis Dan English Paul Purington McGraw-Hill New York Chicago San Francisco Lisbon London Madrid Mexico City Milan New Delhi San Juan Seoul Singapore

Lecture 3: Modern GPUs A Hardware Perspective Mohamed Zahran (aka Z) mzahran@cs.nyu.edu http://www.mzahran.com

CSCI-GA.3033-012 Graphics Processing Units (GPUs): Architecture and Programming Lecture 3: Modern GPUs A Hardware Perspective Mohamed Zahran (aka Z) mzahran@cs.nyu.edu http://www.mzahran.com Modern GPU

Web Server Architectures

Web Server Architectures CS 4244: Internet Programming Dr. Eli Tilevich Based on Flash: An Efficient and Portable Web Server, Vivek S. Pai, Peter Druschel, and Willy Zwaenepoel, 1999 Annual Usenix Technical

High Performance Computing in the Multi-core Area

High Performance Computing in the Multi-core Area Arndt Bode Technische Universität München Technology Trends for Petascale Computing Architectures: Multicore Accelerators Special Purpose Reconfigurable

Chapter 12: Multiprocessor Architectures. Lesson 09: Cache Coherence Problem and Cache synchronization solutions Part 1

Chapter 12: Multiprocessor Architectures Lesson 09: Cache Coherence Problem and Cache synchronization solutions Part 1 Objective To understand cache coherence problem To learn the methods used to solve

Course Development of Programming for General-Purpose Multicore Processors

Course Development of Programming for General-Purpose Multicore Processors Wei Zhang Department of Electrical and Computer Engineering Virginia Commonwealth University Richmond, VA 23284 wzhang4@vcu.edu

Interconnection Networks

Advanced Computer Architecture (0630561) Lecture 15 Interconnection Networks Prof. Kasim M. Al-Aubidy Computer Eng. Dept. Interconnection Networks: Multiprocessors INs can be classified based on: 1. Mode

COS 318: Operating Systems. Virtual Machine Monitors

COS 318: Operating Systems Virtual Machine Monitors Andy Bavier Computer Science Department Princeton University http://www.cs.princeton.edu/courses/archive/fall10/cos318/ Introduction Have been around

Software Project Management (Second Edition)

Software Project Management (Second Edition) Bob Hughes and Mike Cotterell, School of Information Management, University of Brighton The McGraw-Hill Companies London Burr Ridge, IL New York St Louis San

Designing and Building Parallel Programs Ian Foster

Designing and Building Parallel Programs Ian Foster http://www-unix.mcs.anl.gov/dbpp/text/book.html (November 2003) Preface Welcome to Designing and Building Parallel Programs! My goal in this book is

List of courses MEngg (Computer Systems)

List of courses MEngg (Computer Systems) Course No. Course Title Non-Credit Courses CS-401 CS-402 CS-403 CS-404 CS-405 CS-406 Introduction to Programming Systems Design System Design using Microprocessors

YarcData urika Technical White Paper

YarcData urika Technical White Paper 2012 Cray Inc. All rights reserved. Specifications subject to change without notice. Cray is a registered trademark, YarcData, urika and Threadstorm are trademarks

Control 2004, University of Bath, UK, September 2004

Control, University of Bath, UK, September ID- IMPACT OF DEPENDENCY AND LOAD BALANCING IN MULTITHREADING REAL-TIME CONTROL ALGORITHMS M A Hossain and M O Tokhi Department of Computing, The University of

Design and Implementation of the Heterogeneous Multikernel Operating System

Design and Implementation of the Heterogeneous Multikernel Operating System Yauhen KLIMIANKOU Department of Computer Systems and Networks, Belarusian State University of Informatics and Radioelectronics,

How To Write A Diagram

Data Modeling Essentials Third Edition Graeme C. Simsion and Graham C. Witt MORGAN KAUFMANN PUBLISHERS AN IMPRINT OF ELSEVIER AMSTERDAM BOSTON LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE

System Models for Distributed and Cloud Computing

System Models for Distributed and Cloud Computing Dr. Sanjay P. Ahuja, Ph.D. 2010-14 FIS Distinguished Professor of Computer Science School of Computing, UNF Classification of Distributed Computing Systems

Chapter 2: OS Overview

Chapter 2: OS Overview CmSc 335 Operating Systems 1. Operating system objectives and functions Operating systems control and support the usage of computer systems. a. usage users of a computer system:

EE482: Advanced Computer Organization Lecture #11 Processor Architecture Stanford University Wednesday, 31 May 2000. ILP Execution

EE482: Advanced Computer Organization Lecture #11 Processor Architecture Stanford University Wednesday, 31 May 2000 Lecture #11: Wednesday, 3 May 2000 Lecturer: Ben Serebrin Scribe: Dean Liu ILP Execution

This Unit: Putting It All Together. CIS 501 Computer Architecture. Sources. What is Computer Architecture?

This Unit: Putting It All Together CIS 501 Computer Architecture Unit 11: Putting It All Together: Anatomy of the XBox 360 Game Console Slides originally developed by Amir Roth with contributions by Milo

Operating Systems 4th Class

Operating Systems 4th Class Lecture 1 Operating Systems Operating systems are an essential part of any computer system. Therefore, a course in operating systems is an essential part of any computer science

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION 1.1 MOTIVATION OF RESEARCH Multicore processors have two or more execution cores (processors) implemented on a single chip, each having its own set of execution and architectural resources.

OC By Arsene Fansi T. POLIMI 2008

IBM POWER 6 MICROPROCESSOR OC By Arsene Fansi T. POLIMI 2008 WHAT'S IBM POWER 6 MICROPROCESSOR The IBM POWER6 microprocessor powers the new IBM i-series* and p-series* systems. It's based on IBM POWER5

System Interconnect Architectures. Goals and Analysis. Network Properties and Routing. Terminology - 2. Terminology - 1

System Interconnect Architectures CSCI 8150 Advanced Computer Architecture Hwang, Chapter 2 Program and Network Properties 2.4 System Interconnect Architectures Direct networks for static connections Indirect

Oracle Fusion Middleware 11g Architecture and Management. Oracle Press. Reza Shafii, Stephen Lee, Gangadhar Konduri. McGraw-Hill.

ORACLE Oracle Press Oracle Fusion Middleware 11g Architecture and Management Reza Shafii Stephen Lee Gangadhar Konduri McGraw-Hill New York Chicago San Francisco Lisbon London Madrid Mexico City Milan

Operating Systems. 05. Threads. Paul Krzyzanowski. Rutgers University. Spring 2015

Operating Systems 05. Threads Paul Krzyzanowski Rutgers University Spring 2015 February 9, 2015 2014-2015 Paul Krzyzanowski 1 Thread of execution Single sequence of instructions Pointed to by the program