Some Computer Organizations and Their Effectiveness. Michael J Flynn. IEEE Transactions on Computers. Vol. c-21, No.
|
|
|
- Frederick Boone
- 10 years ago
- Views:
Transcription
1 Some Computer Organizations and Their Effectiveness Michael J Flynn IEEE Transactions on Computers. Vol. c-21, No.9, September 1972
2 Introduction Attempts to codify a computer have been from three points of view! Automata Theoritic or microscopic! Individual Problem Oriented! Global or Statistical
3 Microscopic View:! Relationships are described exhaustively! All parameters are considered irrespective of their importance in the problem Individual Poblem Oriented:! Compare Computer Organizations based on their relative performances in a peculiar environment! Such comparisons are limited because of their ad hoc nature
4 Global or Statistical Comparisons:! Made on basis of statistical tabulations of relative performances on various jobs or mixture of jobs Drawback: The analysis is of little consequence in the system as the base premise on which they were based has been changed
5 Objective of this paper: To reexamine the principal interactions within a processor system so as to generate a more macroscopic view, yet without reference to a particular environment Limitations: 1. I/O problems are not treated as a limiting resource 2. No assessment of particular instruction set 3. In spite of the fact that either (1) or (2) will dominate the assessment we will use the notion of efficiency
6 Classification: Forms of Computing Systems Stream : sequence of items (instructions or data) Based on the magnitude of interactions of the instructions and data streams, machines are classified as 1. SISD - Single Instruction, Single Data Stream 2. SIMD - Single Instruction, Multiple Data Stream 3. MISD - Multiple Instruction, Single Data Stream 4. MIMD - Multiple Instruction, Multiple Data Stream
7 Let us formalize the stream view of Computing in order to obtain a better insight. Consider a generalized system model consisting of a server and a requestor. Requestor - a Program Server a Resource In a stream of Requests, a Program at one stage can be a Resource to the Previous stage.
8 Let P be a Program a request for service. P specifies sub requests: R 1, R 2, R 3,... R n Each sub request in turn has its own subrequest resembling a tree. Any request R i is bifurcated as R = {f il,f iv } f il - logical role of a Requestor f iv - physical role of a Server
9 Logical Role of Requestor:! Define a result given an argument! Deine precendence among tasks defined by P! Transition between P and the next levels looked as logical role of R i, while actual physical service is at the leaves of the tree! Logical Role is defined by the term f i l
10 Service Role of Request:! Service role is defined by f i v! It defines the heirarchy of subrequests terminating in a primitive physical service resource. In exploring the Multiple-Stream organizations, we explore about the two considerations, 1. Latency of Interstream Communications 2. Possibilities of high computational bandwidth within a stream.
11 Interstream Communications: Two aspects of communication: In matrix representation, 1. (O X S) - Operational resource accessing Storage useful for MIMD organizations 2.(S X S) Storage accessing Storage useful for SIMD organizations There is also a (O X O ) matrix which is useful for MISD organization.
12 Alternate format of the matrix is the connection matrix An entry in the matrix is the reciprocal of normalized access time t ii / t ij Zero entry when there is no communication
13 Stream Inertia:! It's the concept of pipelining multiple instructions in a single instruction stream! The number of instructions simultaneously processed is reffered to as Confluence Factor (J)! Performance = J / L.!t L.!t Average execution time
14 System Classification: In order to summarize, a system may be classified according to, 1. The number of instruction and data streams 2. Appropriate Communication or Connection matrices 3. Stream Inertia (J) and Instruction Execution Latency (L).
15 Effectiveness in Performing the Computing Process Resolution of Entropy: (using Table Algorithm) In a computing system, let N1 words of p bits each be input, N2 words of p bits each be output. If each of input bit affects each of the output bits, the entropy of each output bit is N1 * p. There are 2 N1*p combinations and N2 * p bits for each entry. Hence the total size is N2 * p * 2 N1*p
16 As a minimum the output bits could be independent of the input bits in which cases, the entrophy Q is just N2 * p Thus, (p * N2) <= Q <= N2 * p * 2 N1*p
17 SISD and Stream Inertia In SISD organization there arises situations such as 1. Direct Composition an instruction requires an unavailable data 2. Indirect Composition - an address calculation is incomplete
18 Anticipation Factor (N): When instruction i determines a data and i+1 instruction requires that data, then there is a latency of (L-1) *!t is required. If we consider the serial latency time as L *!t, then we consider Anticipation Factor N as the number of instructions between the one determining the data and the one using it. For no turbulance, N >= J / L
19 Delay due to Turbulance: If there are M instructions with a probabilty of turbulance causing instruction p, The total time to execute these instructions would be,
20
21 Stream Inertia and effects of branch:
22 SIMD and its Effectiveness SIMD is of three types: 1. Array Processor one control unit and 'm' directly connected processors each element is independent.
23 2. Pipelined Processor same as array processor but each processor has specific operation each unit accepts a pair of operands every!t time unit.
24 3. Associative Processor Processing elements not directly addressed a processing element is activated when a match relation is satisfied between a input register and characteristic data contained in each of the processing element.
25 Difficulties of SIMD Organization: 1. Communications between processing elements 2. Vector Fitting: Matching of size between logical vector to be performed and the size of physical array which will process the vector. 3. Sequential Code: Housekeeping and Bookkeeping operations 4. Degradation due to Branching: When a branching occurs, the instructions that have already started execution will have to be flushed out if the branch is taken. 5. Performance proportional to log 2 m rather than linear due to the above factors.
26 Fitting Problem: Given a source vector of size m, in an array processor of size M, the performance is degraded if m is not divided among M processors. If m is large there is no significant degradation
27 If there are M data streams, SISD instruction takes M.L.!t time units SIMD instruction takes L.!t time units (ignoring the overlap) In order to achieve this 1/M bound, the problem should be divided into M code segments. We find that, the time to execute a block of N * instruction, where there is equal probability to take the branch or not,
28 Where, j is the level of nesting of a branch and N i,j is the number of source instructions in the primary or branch path for ith branch at level j If factor " i N i,j / N* is the probability of encountering a source instruction Minsky's Conjecture:
29 Since the performance is logarithmic and not linear, there is a performance degradation. If we consider this degradation to be due to branching, Let us find the P j and performance when it is equally likely to be at level 1,2,3... log 2 M Since no further degradation occurs beyond this level P 1 = P 2 = P j = P [log M] 1
30 Relative Performance with SISD overlapped processor, Thus SIMD organization is M times faster, Ignoring the effect of non integral values of log 2 M
31 for large M, If we assume that, Then,
32 Referring to the Vector Fitting Problem, we can interpret the number of data streams when comparing a SIMD processor to a pipelined processor. Pipelined Processors categories: There are two main types of Pipelined Processors, 1. Flushed: Until a vector operation is completed, the control unit does not issue the next operation 2. Unflushed: Next vector is issued as soon as the last element of the current operation is issued.
33 MIMD and Its Effectiveness: It is of at least two types. 1. True Multiprocessors: Many independent SI processors share storage at some level for cooperative execution of a multitask program 2. Shared Resource Multiprocessor: Skeleton processors are arranged to share system resource.
34 Problems in MIMD: 1. Communication / Composition overhead 2. Cost increases linearly with additional processors, while performance increases at a lower rate. 3. Providing a method for dynamic reconfiguration. There are two other problems, Precedence Problem and Lockout Problem Software Lockout: In a MIMD environment, A processor cannot access data due to interstream communication problems. It is termed as software lockout.the lockout time for jth processor is,
35 T ij is the communication Time p ij probability of task j accessing data stream i. MIMD Lockout is shown as,
36 Expected number of Locked-out Processors for a given number of processors (n) If a single processor has unit performance then n processors and normalized performance is given by,
37 In order to use the componants of the SISD overlapped computer, a multiple skeleton processors sharing the execution resources is considered. A single Skeleton processor has just enough resource to do load and store operations.
38 An optimal arrangement is space-time switching, which is shown as,
39 When the amount of parallelism is less than the number of processors available, a ring topology of the processor arrangement is followed which can be explained from the following diagram,
40 Some Considerations on System Resources: Execution Resources:! Execution bandwidth is always greater than the maximum performance of large systems! In SISD, Execution bandwidth is given by,
41 P i Pipelining factor for the ith unit in the SISD processor containing n different processing untis. t i - is the time required to fully execute the ith type of operation. It is given by the following diagram,
42 Instruction Control: A control area is responsible for finding the operand sink and source positions. The decision efficiency of this area is possible when it handles minimum number of interlocks and exceptional conditions and when it is being continually utilized. overlapped SIMD organizations has the simplest control structure when compared to SISD and SIMD.
43 Primary and Secondary Storage: Usually it is considered that a storage is most effeciently used when both the program and the data reside in the storage for the minimum time possible. Storage cost is given by i is the level of storage heirarchy c i is the cost per word at that level s i is the average number of words the program used t i is the time spent at that level
44 Finally, MI organizations will require both the data and the instruction to be available simultaneously at the lower levels while SIMD required only simutaneous access to M data sets and SISD has least intrinsic storage demands. Thus in general, This shows that MIMD and SIMD should have mre efficient program execution t i to have optimized the use of storage resource.
Scalability and Classifications
Scalability and Classifications 1 Types of Parallel Computers MIMD and SIMD classifications shared and distributed memory multicomputers distributed shared memory computers 2 Network Topologies static
Computer Architecture TDTS10
why parallelism? Performance gain from increasing clock frequency is no longer an option. Outline Computer Architecture TDTS10 Superscalar Processors Very Long Instruction Word Processors Parallel computers
UNIT 2 CLASSIFICATION OF PARALLEL COMPUTERS
UNIT 2 CLASSIFICATION OF PARALLEL COMPUTERS Structure Page Nos. 2.0 Introduction 27 2.1 Objectives 27 2.2 Types of Classification 28 2.3 Flynn s Classification 28 2.3.1 Instruction Cycle 2.3.2 Instruction
Lecture 2 Parallel Programming Platforms
Lecture 2 Parallel Programming Platforms Flynn s Taxonomy In 1966, Michael Flynn classified systems according to numbers of instruction streams and the number of data stream. Data stream Single Multiple
Systolic Computing. Fundamentals
Systolic Computing Fundamentals Motivations for Systolic Processing PARALLEL ALGORITHMS WHICH MODEL OF COMPUTATION IS THE BETTER TO USE? HOW MUCH TIME WE EXPECT TO SAVE USING A PARALLEL ALGORITHM? HOW
Lecture 23: Multiprocessors
Lecture 23: Multiprocessors Today s topics: RAID Multiprocessor taxonomy Snooping-based cache coherence protocol 1 RAID 0 and RAID 1 RAID 0 has no additional redundancy (misnomer) it uses an array of disks
Parallel Programming
Parallel Programming Parallel Architectures Diego Fabregat-Traver and Prof. Paolo Bientinesi HPAC, RWTH Aachen [email protected] WS15/16 Parallel Architectures Acknowledgements Prof. Felix
Chapter 2 Parallel Architecture, Software And Performance
Chapter 2 Parallel Architecture, Software And Performance UCSB CS140, T. Yang, 2014 Modified from texbook slides Roadmap Parallel hardware Parallel software Input and output Performance Parallel program
Interconnection Networks. Interconnection Networks. Interconnection networks are used everywhere!
Interconnection Networks Interconnection Networks Interconnection networks are used everywhere! Supercomputers connecting the processors Routers connecting the ports can consider a router as a parallel
Introduction to GPU Programming Languages
CSC 391/691: GPU Programming Fall 2011 Introduction to GPU Programming Languages Copyright 2011 Samuel S. Cho http://www.umiacs.umd.edu/ research/gpu/facilities.html Maryland CPU/GPU Cluster Infrastructure
Annotation to the assignments and the solution sheet. Note the following points
Computer rchitecture 2 / dvanced Computer rchitecture Seite: 1 nnotation to the assignments and the solution sheet This is a multiple choice examination, that means: Solution approaches are not assessed
INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET)
INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET) International Journal of Electronics and Communication Engineering & Technology (IJECET), ISSN 0976 ISSN 0976 6464(Print)
LSN 2 Computer Processors
LSN 2 Computer Processors Department of Engineering Technology LSN 2 Computer Processors Microprocessors Design Instruction set Processor organization Processor performance Bandwidth Clock speed LSN 2
CMSC 611: Advanced Computer Architecture
CMSC 611: Advanced Computer Architecture Parallel Computation Most slides adapted from David Patterson. Some from Mohomed Younis Parallel Computers Definition: A parallel computer is a collection of processing
Chapter 07: Instruction Level Parallelism VLIW, Vector, Array and Multithreaded Processors. Lesson 05: Array Processors
Chapter 07: Instruction Level Parallelism VLIW, Vector, Array and Multithreaded Processors Lesson 05: Array Processors Objective To learn how the array processes in multiple pipelines 2 Array Processor
Introduction to Cloud Computing
Introduction to Cloud Computing Parallel Processing I 15 319, spring 2010 7 th Lecture, Feb 2 nd Majd F. Sakr Lecture Motivation Concurrency and why? Different flavors of parallel computing Get the basic
PARALLEL PROGRAMMING
PARALLEL PROGRAMMING TECHNIQUES AND APPLICATIONS USING NETWORKED WORKSTATIONS AND PARALLEL COMPUTERS 2nd Edition BARRY WILKINSON University of North Carolina at Charlotte Western Carolina University MICHAEL
Load balancing in a heterogeneous computer system by self-organizing Kohonen network
Bull. Nov. Comp. Center, Comp. Science, 25 (2006), 69 74 c 2006 NCC Publisher Load balancing in a heterogeneous computer system by self-organizing Kohonen network Mikhail S. Tarkov, Yakov S. Bezrukov Abstract.
Windows Server Performance Monitoring
Spot server problems before they are noticed The system s really slow today! How often have you heard that? Finding the solution isn t so easy. The obvious questions to ask are why is it running slowly
Chapter 2. Multiprocessors Interconnection Networks
Chapter 2 Multiprocessors Interconnection Networks 2.1 Taxonomy Interconnection Network Static Dynamic 1-D 2-D HC Bus-based Switch-based Single Multiple SS MS Crossbar 2.2 Bus-Based Dynamic Single Bus
System Interconnect Architectures. Goals and Analysis. Network Properties and Routing. Terminology - 2. Terminology - 1
System Interconnect Architectures CSCI 8150 Advanced Computer Architecture Hwang, Chapter 2 Program and Network Properties 2.4 System Interconnect Architectures Direct networks for static connections Indirect
A Comparison of Distributed Systems: ChorusOS and Amoeba
A Comparison of Distributed Systems: ChorusOS and Amoeba Angelo Bertolli Prepared for MSIT 610 on October 27, 2004 University of Maryland University College Adelphi, Maryland United States of America Abstract.
Tools Page 1 of 13 ON PROGRAM TRANSLATION. A priori, we have two translation mechanisms available:
Tools Page 1 of 13 ON PROGRAM TRANSLATION A priori, we have two translation mechanisms available: Interpretation Compilation On interpretation: Statements are translated one at a time and executed immediately.
FAULT TOLERANCE FOR MULTIPROCESSOR SYSTEMS VIA TIME REDUNDANT TASK SCHEDULING
FAULT TOLERANCE FOR MULTIPROCESSOR SYSTEMS VIA TIME REDUNDANT TASK SCHEDULING Hussain Al-Asaad and Alireza Sarvi Department of Electrical & Computer Engineering University of California Davis, CA, U.S.A.
Assessment Plan for CS and CIS Degree Programs Computer Science Dept. Texas A&M University - Commerce
Assessment Plan for CS and CIS Degree Programs Computer Science Dept. Texas A&M University - Commerce Program Objective #1 (PO1):Students will be able to demonstrate a broad knowledge of Computer Science
Binary search tree with SIMD bandwidth optimization using SSE
Binary search tree with SIMD bandwidth optimization using SSE Bowen Zhang, Xinwei Li 1.ABSTRACT In-memory tree structured index search is a fundamental database operation. Modern processors provide tremendous
Big Data: Rethinking Text Visualization
Big Data: Rethinking Text Visualization Dr. Anton Heijs [email protected] Treparel April 8, 2013 Abstract In this white paper we discuss text visualization approaches and how these are important
Remote Copy Technology of ETERNUS6000 and ETERNUS3000 Disk Arrays
Remote Copy Technology of ETERNUS6000 and ETERNUS3000 Disk Arrays V Tsutomu Akasaka (Manuscript received July 5, 2005) This paper gives an overview of a storage-system remote copy function and the implementation
COMPUTER ORGANIZATION ARCHITECTURES FOR EMBEDDED COMPUTING
COMPUTER ORGANIZATION ARCHITECTURES FOR EMBEDDED COMPUTING 2013/2014 1 st Semester Sample Exam January 2014 Duration: 2h00 - No extra material allowed. This includes notes, scratch paper, calculator, etc.
Chapter 6: distributed systems
Chapter 6: distributed systems Strongly related to communication between processes is the issue of how processes in distributed systems synchronize. Synchronization is all about doing the right thing at
Data Mining Techniques Chapter 6: Decision Trees
Data Mining Techniques Chapter 6: Decision Trees What is a classification decision tree?.......................................... 2 Visualizing decision trees...................................................
EE361: Digital Computer Organization Course Syllabus
EE361: Digital Computer Organization Course Syllabus Dr. Mohammad H. Awedh Spring 2014 Course Objectives Simply, a computer is a set of components (Processor, Memory and Storage, Input/Output Devices)
COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters
COSC 6374 Parallel Computation Parallel I/O (I) I/O basics Spring 2008 Concept of a clusters Processor 1 local disks Compute node message passing network administrative network Memory Processor 2 Network
An Introduction to Parallel Computing/ Programming
An Introduction to Parallel Computing/ Programming Vicky Papadopoulou Lesta Astrophysics and High Performance Computing Research Group (http://ahpc.euc.ac.cy) Dep. of Computer Science and Engineering European
SOC architecture and design
SOC architecture and design system-on-chip (SOC) processors: become components in a system SOC covers many topics processor: pipelined, superscalar, VLIW, array, vector storage: cache, embedded and external
Scalable Parallel Clustering for Data Mining on Multicomputers
Scalable Parallel Clustering for Data Mining on Multicomputers D. Foti, D. Lipari, C. Pizzuti and D. Talia ISI-CNR c/o DEIS, UNICAL 87036 Rende (CS), Italy {pizzuti,talia}@si.deis.unical.it Abstract. This
Interconnection Network Design
Interconnection Network Design Vida Vukašinović 1 Introduction Parallel computer networks are interesting topic, but they are also difficult to understand in an overall sense. The topological structure
Network (Tree) Topology Inference Based on Prüfer Sequence
Network (Tree) Topology Inference Based on Prüfer Sequence C. Vanniarajan and Kamala Krithivasan Department of Computer Science and Engineering Indian Institute of Technology Madras Chennai 600036 [email protected],
DATA ANALYSIS II. Matrix Algorithms
DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where
How To Create A Multi Disk Raid
Click on the diagram to see RAID 0 in action RAID Level 0 requires a minimum of 2 drives to implement RAID 0 implements a striped disk array, the data is broken down into blocks and each block is written
ALLIED PAPER : DISCRETE MATHEMATICS (for B.Sc. Computer Technology & B.Sc. Multimedia and Web Technology)
ALLIED PAPER : DISCRETE MATHEMATICS (for B.Sc. Computer Technology & B.Sc. Multimedia and Web Technology) Subject Description: This subject deals with discrete structures like set theory, mathematical
IBM CELL CELL INTRODUCTION. Project made by: Origgi Alessandro matr. 682197 Teruzzi Roberto matr. 682552 IBM CELL. Politecnico di Milano Como Campus
Project made by: Origgi Alessandro matr. 682197 Teruzzi Roberto matr. 682552 CELL INTRODUCTION 2 1 CELL SYNERGY Cell is not a collection of different processors, but a synergistic whole Operation paradigms,
IMPROVING PERFORMANCE OF RANDOMIZED SIGNATURE SORT USING HASHING AND BITWISE OPERATORS
Volume 2, No. 3, March 2011 Journal of Global Research in Computer Science RESEARCH PAPER Available Online at www.jgrcs.info IMPROVING PERFORMANCE OF RANDOMIZED SIGNATURE SORT USING HASHING AND BITWISE
AN OVERVIEW OF QUALITY OF SERVICE COMPUTER NETWORK
Abstract AN OVERVIEW OF QUALITY OF SERVICE COMPUTER NETWORK Mrs. Amandeep Kaur, Assistant Professor, Department of Computer Application, Apeejay Institute of Management, Ramamandi, Jalandhar-144001, Punjab,
How to choose the right RAID for your Dedicated Server
Overview of RAID Let's first address, "What is RAID and what does RAID stand for?" RAID, an acronym for "Redundant Array of Independent Disks, is a storage technology that links or combines multiple hard
Project and Production Management Prof. Arun Kanda Department of Mechanical Engineering Indian Institute of Technology, Delhi
Project and Production Management Prof. Arun Kanda Department of Mechanical Engineering Indian Institute of Technology, Delhi Lecture - 9 Basic Scheduling with A-O-A Networks Today we are going to be talking
Advanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2
Lecture Handout Computer Architecture Lecture No. 2 Reading Material Vincent P. Heuring&Harry F. Jordan Chapter 2,Chapter3 Computer Systems Design and Architecture 2.1, 2.2, 3.2 Summary 1) A taxonomy of
Performance Analysis and Comparison of JM 15.1 and Intel IPP H.264 Encoder and Decoder
Performance Analysis and Comparison of 15.1 and H.264 Encoder and Decoder K.V.Suchethan Swaroop and K.R.Rao, IEEE Fellow Department of Electrical Engineering, University of Texas at Arlington Arlington,
A Robust Dynamic Load-balancing Scheme for Data Parallel Application on Message Passing Architecture
A Robust Dynamic Load-balancing Scheme for Data Parallel Application on Message Passing Architecture Yangsuk Kee Department of Computer Engineering Seoul National University Seoul, 151-742, Korea Soonhoi
Parallel and Distributed Computing Programming Assignment 1
Parallel and Distributed Computing Programming Assignment 1 Due Monday, February 7 For programming assignment 1, you should write two C programs. One should provide an estimate of the performance of ping-pong
2010-2011 Assessment for Master s Degree Program Fall 2010 - Spring 2011 Computer Science Dept. Texas A&M University - Commerce
2010-2011 Assessment for Master s Degree Program Fall 2010 - Spring 2011 Computer Science Dept. Texas A&M University - Commerce Program Objective #1 (PO1):Students will be able to demonstrate a broad knowledge
More on Pipelining and Pipelines in Real Machines CS 333 Fall 2006 Main Ideas Data Hazards RAW WAR WAW More pipeline stall reduction techniques Branch prediction» static» dynamic bimodal branch prediction
Visualization methods for patent data
Visualization methods for patent data Treparel 2013 Dr. Anton Heijs (CTO & Founder) Delft, The Netherlands Introduction Treparel can provide advanced visualizations for patent data. This document describes
Big Data Technology Map-Reduce Motivation: Indexing in Search Engines
Big Data Technology Map-Reduce Motivation: Indexing in Search Engines Edward Bortnikov & Ronny Lempel Yahoo Labs, Haifa Indexing in Search Engines Information Retrieval s two main stages: Indexing process
Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 13-1
Slide 13-1 Chapter 13 Disk Storage, Basic File Structures, and Hashing Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files Ordered Files Hashed Files Dynamic and Extendible
Principles of Operating Systems CS 446/646
Principles of Operating Systems CS 446/646 1. Introduction to Operating Systems a. Role of an O/S b. O/S History and Features c. Types of O/S Mainframe systems Desktop & laptop systems Parallel systems
Adaptive Tolerance Algorithm for Distributed Top-K Monitoring with Bandwidth Constraints
Adaptive Tolerance Algorithm for Distributed Top-K Monitoring with Bandwidth Constraints Michael Bauer, Srinivasan Ravichandran University of Wisconsin-Madison Department of Computer Sciences {bauer, srini}@cs.wisc.edu
Instruction Set Architecture
Instruction Set Architecture Consider x := y+z. (x, y, z are memory variables) 1-address instructions 2-address instructions LOAD y (r :=y) ADD y,z (y := y+z) ADD z (r:=r+z) MOVE x,y (x := y) STORE x (x:=r)
Cost Model: Work, Span and Parallelism. 1 The RAM model for sequential computation:
CSE341T 08/31/2015 Lecture 3 Cost Model: Work, Span and Parallelism In this lecture, we will look at how one analyze a parallel program written using Cilk Plus. When we analyze the cost of an algorithm
Parallel Scalable Algorithms- Performance Parameters
www.bsc.es Parallel Scalable Algorithms- Performance Parameters Vassil Alexandrov, ICREA - Barcelona Supercomputing Center, Spain Overview Sources of Overhead in Parallel Programs Performance Metrics for
Chapter 13 Disk Storage, Basic File Structures, and Hashing.
Chapter 13 Disk Storage, Basic File Structures, and Hashing. Copyright 2004 Pearson Education, Inc. Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files Ordered Files
Why the Network Matters
Week 2, Lecture 2 Copyright 2009 by W. Feng. Based on material from Matthew Sottile. So Far Overview of Multicore Systems Why Memory Matters Memory Architectures Emerging Chip Multiprocessors (CMP) Increasing
Interconnection Network
Interconnection Network Recap: Generic Parallel Architecture A generic modern multiprocessor Network Mem Communication assist (CA) $ P Node: processor(s), memory system, plus communication assist Network
TD 271 Rev.1 (PLEN/15)
INTERNATIONAL TELECOMMUNICATION UNION STUDY GROUP 15 TELECOMMUNICATION STANDARDIZATION SECTOR STUDY PERIOD 2009-2012 English only Original: English Question(s): 12/15 Geneva, 31 May - 11 June 2010 Source:
The mathematics of RAID-6
The mathematics of RAID-6 H. Peter Anvin 1 December 2004 RAID-6 supports losing any two drives. The way this is done is by computing two syndromes, generally referred P and Q. 1 A quick
A STUDY OF TASK SCHEDULING IN MULTIPROCESSOR ENVIROMENT Ranjit Rajak 1, C.P.Katti 2, Nidhi Rajak 3
A STUDY OF TASK SCHEDULING IN MULTIPROCESSOR ENVIROMENT Ranjit Rajak 1, C.P.Katti, Nidhi Rajak 1 Department of Computer Science & Applications, Dr.H.S.Gour Central University, Sagar, India, [email protected]
Chapter 13. Disk Storage, Basic File Structures, and Hashing
Chapter 13 Disk Storage, Basic File Structures, and Hashing Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files Ordered Files Hashed Files Dynamic and Extendible Hashing
Middleware and Distributed Systems. Introduction. Dr. Martin v. Löwis
Middleware and Distributed Systems Introduction Dr. Martin v. Löwis 14 3. Software Engineering What is Middleware? Bauer et al. Software Engineering, Report on a conference sponsored by the NATO SCIENCE
2) What is the structure of an organization? Explain how IT support at different organizational levels.
(PGDIT 01) Paper - I : BASICS OF INFORMATION TECHNOLOGY 1) What is an information technology? Why you need to know about IT. 2) What is the structure of an organization? Explain how IT support at different
Solution of Linear Systems
Chapter 3 Solution of Linear Systems In this chapter we study algorithms for possibly the most commonly occurring problem in scientific computing, the solution of linear systems of equations. We start
Beyond the Stars: Revisiting Virtual Cluster Embeddings
Beyond the Stars: Revisiting Virtual Cluster Embeddings Matthias Rost Technische Universität Berlin September 7th, 2015, Télécom-ParisTech Joint work with Carlo Fuerst, Stefan Schmid Published in ACM SIGCOMM
Chapter 13. Chapter Outline. Disk Storage, Basic File Structures, and Hashing
Chapter 13 Disk Storage, Basic File Structures, and Hashing Copyright 2007 Ramez Elmasri and Shamkant B. Navathe Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files
Broadband Networks. Prof. Dr. Abhay Karandikar. Electrical Engineering Department. Indian Institute of Technology, Bombay. Lecture - 29.
Broadband Networks Prof. Dr. Abhay Karandikar Electrical Engineering Department Indian Institute of Technology, Bombay Lecture - 29 Voice over IP So, today we will discuss about voice over IP and internet
High Performance Computing
High Performance Computing Trey Breckenridge Computing Systems Manager Engineering Research Center Mississippi State University What is High Performance Computing? HPC is ill defined and context dependent.
Chapter 11 I/O Management and Disk Scheduling
Operating Systems: Internals and Design Principles, 6/E William Stallings Chapter 11 I/O Management and Disk Scheduling Dave Bremer Otago Polytechnic, NZ 2008, Prentice Hall I/O Devices Roadmap Organization
PROBLEMS #20,R0,R1 #$3A,R2,R4
506 CHAPTER 8 PIPELINING (Corrisponde al cap. 11 - Introduzione al pipelining) PROBLEMS 8.1 Consider the following sequence of instructions Mul And #20,R0,R1 #3,R2,R3 #$3A,R2,R4 R0,R2,R5 In all instructions,
Efficiency of algorithms. Algorithms. Efficiency of algorithms. Binary search and linear search. Best, worst and average case.
Algorithms Efficiency of algorithms Computational resources: time and space Best, worst and average case performance How to compare algorithms: machine-independent measure of efficiency Growth rate Complexity
10 Gbps Line Speed Programmable Hardware for Open Source Network Applications*
10 Gbps Line Speed Programmable Hardware for Open Source Network Applications* Livio Ricciulli [email protected] (408) 399-2284 http://www.metanetworks.org *Supported by the Division of Design Manufacturing
Controlled Random Access Methods
Helsinki University of Technology S-72.333 Postgraduate Seminar on Radio Communications Controlled Random Access Methods Er Liu [email protected] Communications Laboratory 09.03.2004 Content of Presentation
The Shortcut Guide to Balancing Storage Costs and Performance with Hybrid Storage
The Shortcut Guide to Balancing Storage Costs and Performance with Hybrid Storage sponsored by Dan Sullivan Chapter 1: Advantages of Hybrid Storage... 1 Overview of Flash Deployment in Hybrid Storage Systems...
Understanding Web personalization with Web Usage Mining and its Application: Recommender System
Understanding Web personalization with Web Usage Mining and its Application: Recommender System Manoj Swami 1, Prof. Manasi Kulkarni 2 1 M.Tech (Computer-NIMS), VJTI, Mumbai. 2 Department of Computer Technology,
Lumousoft Visual Programming Language and its IDE
Lumousoft Visual Programming Language and its IDE Xianliang Lu Lumousoft Inc. Waterloo Ontario Canada Abstract - This paper presents a new high-level graphical programming language and its IDE (Integration
Chapter 2 Basic Structure of Computers. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan
Chapter 2 Basic Structure of Computers Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan Outline Functional Units Basic Operational Concepts Bus Structures Software
Operating Systems 4 th Class
Operating Systems 4 th Class Lecture 1 Operating Systems Operating systems are essential part of any computer system. Therefore, a course in operating systems is an essential part of any computer science
SAN Conceptual and Design Basics
TECHNICAL NOTE VMware Infrastructure 3 SAN Conceptual and Design Basics VMware ESX Server can be used in conjunction with a SAN (storage area network), a specialized high speed network that connects computer
Input / Ouput devices. I/O Chapter 8. Goals & Constraints. Measures of Performance. Anatomy of a Disk Drive. Introduction - 8.1
Introduction - 8.1 I/O Chapter 8 Disk Storage and Dependability 8.2 Buses and other connectors 8.4 I/O performance measures 8.6 Input / Ouput devices keyboard, mouse, printer, game controllers, hard drive,
Study of a neural network-based system for stability augmentation of an airplane
Study of a neural network-based system for stability augmentation of an airplane Author: Roger Isanta Navarro Annex 3 ANFIS Network Development Supervisors: Oriol Lizandra Dalmases Fatiha Nejjari Akhi-Elarab
Lecture 2: Universality
CS 710: Complexity Theory 1/21/2010 Lecture 2: Universality Instructor: Dieter van Melkebeek Scribe: Tyson Williams In this lecture, we introduce the notion of a universal machine, develop efficient universal
Key Components of WAN Optimization Controller Functionality
Key Components of WAN Optimization Controller Functionality Introduction and Goals One of the key challenges facing IT organizations relative to application and service delivery is ensuring that the applications
Glossary of Object Oriented Terms
Appendix E Glossary of Object Oriented Terms abstract class: A class primarily intended to define an instance, but can not be instantiated without additional methods. abstract data type: An abstraction
Deploying De-Duplication on Ext4 File System
Deploying De-Duplication on Ext4 File System Usha A. Joglekar 1, Bhushan M. Jagtap 2, Koninika B. Patil 3, 1. Asst. Prof., 2, 3 Students Department of Computer Engineering Smt. Kashibai Navale College
CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2007 Lecture 5 - DBMS Architecture
CSE 544 Principles of Database Management Systems Magdalena Balazinska Fall 2007 Lecture 5 - DBMS Architecture References Anatomy of a database system. J. Hellerstein and M. Stonebraker. In Red Book (4th
Brocade Solution for EMC VSPEX Server Virtualization
Reference Architecture Brocade Solution Blueprint Brocade Solution for EMC VSPEX Server Virtualization Microsoft Hyper-V for 50 & 100 Virtual Machines Enabled by Microsoft Hyper-V, Brocade ICX series switch,
