TDT 4260 lecture 11, spring semester 2013: Interconnection network continued

Slide 1: TDT 4260 lecture 11, spring semester 2013
Lasse Natvig, The CARD group, Dept. of computer & information science, NTNU

Slide 2: Lecture overview
- Interconnection network, continued
  - Routing
  - Switch microarchitecture
- Dataflow computing
  - Principles
  - MDM in detail
  - Think differently! Innovation, research method
- Administrativia
  - Reading list is now in its final version
  - Next week: mini project presentations, Room 454 in IT-Building
  - Last lecture 30/4, exam Saturday 25/5 at 09:00
    - Wrap-up
    - Short presentation of projects/master theses offered by EECS/CARD/Lasse
    - Repetition: send to Lasse before 24/4 to ask for special topics

Slide 3: ARM guest lectures Thursday morning
Part of the course TDT Energieffektive datamaskinsystemer (Energy-Efficient Computer Systems)
Thursday 18 April at 08:15 in auditorium F6:
1) Low power HW design. Guest lecturer: Nir Leshem (Hardware Engineering Manager, ARM) (in English)
2) Driver development for Linux, driver architecture and debugging. Guest lecturers: Ørjan Eide (Senior Engineer, ARM) and Mikael Valen-Sendstad (Staff Software Architect, ARM) (in Norwegian)

Slide 4: F.5: Routing, Arbitration, Switching
- Routing: Which of the possible paths are allowable for packets?
  - Set of operations needed to compute a valid path
- Arbitration: When are paths available for packets?
  - Resolves packets requesting the same resources at the same time
  - For every arbitration, there is a winner and possibly many losers
  - Losers are buffered (lossless) or dropped on overflow (lossy)
- Switching: How are paths allocated to packets?
  - The winning packet (from arbitration) proceeds towards its destination
  - Paths can be established one fragment at a time or in their entirety

Slide 5: Routing
- Shared media: broadcast to everyone
- Switched media needs real routing. Options:
  - Source-based routing: message specifies path to the destination (changes of direction)
  - Virtual circuit: circuit established from source to destination, message picks the circuit to follow
  - Destination-based routing: message specifies destination, switch must pick the path
    - Deterministic: always follow same path
    - Adaptive: pick different paths to avoid congestion, failures
    - Randomized routing: pick between several good paths to balance network load

Slide 6: Store & Forward vs Cut-Through Routing
(Figure: time diagrams from source to destination for store & forward routing and for cut-through routing)
Cut-through variants (on blocking):
- Virtual cut-through: spools rest of packet into buffer
- Wormhole: buffers only a few flits, leaves tail along route
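
The difference between the two routing styles on slide 6 can be made concrete with a first-order latency model. The sketch below is not from the slides: it assumes no contention, a fixed routing delay per switch, and the hypothetical parameter names `hops`, `packet_bits`, `bits_per_cycle` and `routing_delay`.

```python
def store_and_forward_latency(hops, packet_bits, bits_per_cycle, routing_delay):
    """Each switch receives the whole packet before forwarding it,
    so the full transfer time is paid once per hop."""
    return hops * (packet_bits / bits_per_cycle + routing_delay)


def cut_through_latency(hops, packet_bits, bits_per_cycle, routing_delay):
    """The header is forwarded as soon as it is routed and the body is
    pipelined behind it, so the transfer time is paid only once."""
    return hops * routing_delay + packet_bits / bits_per_cycle


if __name__ == "__main__":
    # Assumed example: 3 hops, 128-bit packet, 16 bits per cycle, 2 cycles routing delay.
    print(store_and_forward_latency(3, 128, 16, 2))
    print(cut_through_latency(3, 128, 16, 2))
```

Under these assumptions the store & forward latency grows with the product of hop count and packet length, while the cut-through latency grows with their sum, which is the gap the time diagrams illustrate.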

Slide 7: Routing mechanism
- Need to select output port for each input packet, and fast
- Simple arithmetic in regular topologies
- Example: x, y routing in a grid with bi-directional links (first x, then y); x and y are the relative address of the destination
  - west (-x): x < 0
  - east (+x): x > 0
  - south (-y): x = 0, y < 0
  - north (+y): x = 0, y > 0
- Unidirectional links sufficient for torus (+x, +y)
- Dimension-order routing (DOR): reduce the relative address of each dimension in order, to avoid deadlock

Slide 8: Deadlock
- How can it arise? Necessary conditions:
  - shared resources
  - incrementally allocated
  - non-preemptible
- How do you handle it?
  - Constrain how channel resources are allocated (deadlock avoidance)
  - Add a mechanism that detects likely deadlocks and fixes them (deadlock recovery)
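
The x, y routing rule on slide 7 can be sketched in a few lines. This is a minimal illustration for a grid with bi-directional links; the function names, the `STEP` table and the extra `"local"` port (used when the packet has arrived) are assumptions, not part of the slides.

```python
def select_port(dx, dy):
    """Pick the output port from the relative address (dx, dy) of the destination.
    The x dimension is always reduced to zero before y is touched."""
    if dx < 0:
        return "west"
    if dx > 0:
        return "east"
    if dy < 0:
        return "south"
    if dy > 0:
        return "north"
    return "local"  # assumed name for the ejection port at the destination


# Coordinate change caused by leaving through each port.
STEP = {"west": (-1, 0), "east": (1, 0), "south": (0, -1), "north": (0, 1)}


def xy_path(src, dst):
    """Return the list of nodes visited when routing from src to dst."""
    x, y = src
    path = [(x, y)]
    while (x, y) != tuple(dst):
        port = select_port(dst[0] - x, dst[1] - y)
        step_x, step_y = STEP[port]
        x, y = x + step_x, y + step_y
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(xy_path((0, 0), (2, 1)))
```

Because every packet finishes all of its x hops before taking any y hop, no packet ever turns from the y dimension back into x, which is what removes the cyclic channel dependency in deadlock example 1.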

Slide 9: Deadlock example 1
(Figure: four packets, Red: s1 -> d1, Green: s2 -> d2, Blue: s3 -> d3, Black: s4 -> d4)

Slide 10: Deadlock example 1, avoided by DOR

Slide 11: Deadlock example 2
(Figure: 4 x 4 grid of nodes TRC (0,0) to TRC (3,3))
- Deadlock can occur even with DOR if links are uni-directional
- Can be solved by having two (virtual) channels

Slide 12: Arbitration (1/2)
- Several simultaneous requests to a shared resource
- Ideal: maximize usage of network resources
- Problem: starvation; fairness needed
- Figure: two-phase arbitration (request, grant): poor usage

Slide 13: Arbitration (2/2)
- Three phases
- Multiple requests
- Better usage
- But: increased latency

Slide 14: Switching
- Allocating paths for packets
- Two techniques:
  - Circuit switching (connection oriented)
    - Communication channel allocated before first packet
    - Packet headers don't need routing info
    - Wastes bandwidth
  - Packet switching (connectionless)
    - Each packet handled independently
    - Can't guarantee response time
    - Two types, see next slide
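
The arbitration slides name starvation as the problem and fairness as the requirement, but do not give a policy. One common way to get fairness is round-robin arbitration; the sketch below uses that technique as an assumption, and the class and attribute names are hypothetical.

```python
class RoundRobinArbiter:
    """Grants one of n requesters per round. The most recent winner gets the
    lowest priority in the next round, so no requester can starve."""

    def __init__(self, n):
        self.n = n
        self.next = 0  # requester with the highest priority in the next round

    def grant(self, requests):
        """requests is a list of n booleans; returns the winner's index or None."""
        for offset in range(self.n):
            i = (self.next + offset) % self.n
            if requests[i]:
                self.next = (i + 1) % self.n
                return i
        return None  # nobody requested the resource


if __name__ == "__main__":
    arbiter = RoundRobinArbiter(3)
    # All three inputs request the same output port in every round.
    print([arbiter.grant([True, True, True]) for _ in range(4)])
```

Each round still has one winner and possibly many losers, as on slide 4; the rotation only bounds how long any loser has to wait.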

Slide 15: Store & Forward vs. Cut-Through Routing
(Figure labels: Time; Packet switching; Store & Forward Routing; Circuit switching; Cut-Through Routing; Source; Dest)
Cut-through variants (on blocking):
- Virtual cut-through: spools rest of packet into buffer
- Wormhole: buffers only a few flits, leaves tail along route (only one flit in the figure above)

Slide 16: Switch microarchitecture

Slide 17: Pipelined switch

Slide 18: Something else

Slide 19: IDI Open, a challenge for you?

Slide 20: Dataflow computing and MDM

Slide 21: Dataflow computing and computers
- Dataflow computing
  - suitable for highly parallel solutions
  - requires different HW and SW
- Dataflow computers
  - Principles
  - History
  - Static vs. dynamic
  - Typical architecture: pipelined ring with circulating packets
  - Manchester Dataflow Machine (MDM)

Slide 22: Dataflow programs
- Example program (inputs b, c, e):
    a = (b + 1) x (b - c)
    d = c x e
    f = a x d
- Represent computation as a graph (dataflow graph); computation flows through it
- Node = operation = instruction
- Inherently parallel, data driven, no program counter, asynchronous
- Logical processor at each node, activated by availability of operands, executed when a physical processor is available
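
The data-driven firing rule on slide 22 can be tried out on the example program a = (b + 1) x (b - c), d = c x e, f = a x d. The interpreter below is only a sketch: the graph encoding, the intermediate node names `t1` and `t2`, and the idea of grouping firings into "waves" are assumptions made for illustration.

```python
import operator


def run_dataflow(graph, inputs):
    """graph maps a node name to (operation, operands); an operand is either
    the name of another node or input (a string) or a literal constant.
    A node fires as soon as all of its operands are available."""
    values = dict(inputs)
    pending = dict(graph)
    waves = []  # nodes in the same wave have no dependencies on each other
    while pending:
        ready = [name for name, (_, args) in pending.items()
                 if all(not isinstance(a, str) or a in values for a in args)]
        if not ready:
            raise ValueError("no instruction can fire: missing operand or cycle")
        results = {}
        for name in ready:
            op, args = pending.pop(name)
            results[name] = op(*[values[a] if isinstance(a, str) else a for a in args])
        values.update(results)
        waves.append(ready)
    return values, waves


# The example program, with the two sub-expressions of a as separate nodes.
GRAPH = {
    "t1": (operator.add, ["b", 1]),
    "t2": (operator.sub, ["b", "c"]),
    "a": (operator.mul, ["t1", "t2"]),
    "d": (operator.mul, ["c", "e"]),
    "f": (operator.mul, ["a", "d"]),
}

if __name__ == "__main__":
    values, waves = run_dataflow(GRAPH, {"b": 4, "c": 2, "e": 3})
    print(values["f"], waves)
```

There is no program counter in this loop: the order of execution falls out of operand availability alone, and the three nodes in the first wave could run on three physical processors at once.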

Slide 23: Example data flow

Slide 24: Control flow and data flow
- (Traditional) control flow
  - Explicit control flow (manipulation of the program counter (PC))
  - Data are communicated between instructions via shared memory locations
  - Data are referenced via memory address
  - One single control thread
  - Many parallel control threads: explicit parallelism
- Data flow computers
  - Data-driven computation, that is, the selection of instructions for execution is controlled by the availability of operands
  - Implicit parallelism
  - Programs represented as directed graphs
  - Results are sent directly as data packets between instructions
  - Normally/originally no shared memory that more than one instruction may refer to, i.e. no side effects

Slide 25: Data flow computers, history
- Relatively old topic
- Many research projects
  - Fundamentally different -> interesting
  - Link to functional languages gave renewed interest
  - Some prototypes built, none with outstanding performance
- Status 1998
  - Few research projects
  - Data flow principle used in many places
    - In processors (reservation stations, Tomasulo, TDT 4255)
    - Chaining of DSP PEs for high performance

Slide 26: Dataflow computers related to other architectures (anno 1986)
Source: Dataflow machine architecture, Arthur H. Veen, ACM Computing Surveys, December 1986, Volume 18, Issue 4

Slide 27: Dataflow machines, architecture and implementation (anno 1986)

Slide 28: Motivation

Slide 29: Static and dynamic data flow systems
- Static systems do not allow concurrent reactivation of code
  - A given part of a data flow graph can only exist in one instance at a time
  - At most one data packet exists on a line
  - Data packets are communicated directly from instruction to instruction
  - Control packets are used as acknowledge signals from receiver to sender, so the sender knows when it can produce a new result
- Dynamic systems allow concurrent activation, e.g. the same code can be executed at the same time in different contexts
  - What opportunities does this give for program execution?
    - Loop unrolling, unfolding (iteration number)
    - Simultaneous procedure calls
    - Recursive procedures

Slide 30: Dynamic dataflow systems, implementation
- How can it be realized?
  (Figure: a "+" instruction receiving operands from two contexts; context-1: value = 10, context-2: value = 33)
  1) Tag operands with a context identifier (tagged token)
  2) Copying of code
- Needs the ability to have more than one value "on its way" between two instructions at the same time
  - In this case there is not enough storage space in the receiving instruction to store several operands
  - Needs a unit/component where one operand can wait for its "fellow operand"
  - Makes "logical buffering" possible on each line
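
The tagged-token idea on slide 30 (an operand waits in a separate unit until its fellow operand with the same context arrives) can be sketched as a small matching function. This is an illustration only, not the MDM matching unit: it assumes two-operand instructions, operand ports numbered 0 and 1, and a hypothetical token layout `(context, destination, port, value)`.

```python
def match_tokens(tokens):
    """Pair up operands that carry the same context tag and target the same
    instruction. Returns the matched pairs in firing order and whatever is
    still waiting in the matching store."""
    store = {}  # (context, destination) -> (port, value) of the waiting operand
    fired = []
    for context, destination, port, value in tokens:
        key = (context, destination)
        if key in store:
            other_port, other_value = store.pop(key)
            operands = {port: value, other_port: other_value}
            fired.append((context, destination, operands[0], operands[1]))
        else:
            store[key] = (port, value)  # wait for the fellow operand
    return fired, store


if __name__ == "__main__":
    # Two activations of the same "+" instruction are in flight at once.
    # The left operands 10 and 33 are taken from the slide; the rest is made up.
    tokens = [
        (1, "add", 0, 10),
        (2, "add", 0, 33),
        (2, "add", 1, 1),
        (1, "add", 1, 5),
    ]
    print(match_tokens(tokens))
```

Keying the store on the context as well as the destination is what lets two values be "on their way" to the same instruction at the same time without being mixed up.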

Slide 31: Manchester Dataflow Machine (MDM)
- Source: The Manchester Prototype Dataflow Computer, Gurd, Kirkham, Watson, CACM, Jan. 85, pp
- Data flow machine based on dynamic tagging of (small) data packets (tokens)
- Approx 1-2 MIPS in

Slide 32: MDM: Data flow programs
- Three levels
  - SISAL (Fig. 2)
  - Assembler (Fig. 3)
    - variables from SISAL
    - operators from data flow instruction set
  - Machine code (Fig. 1)

Slide 33: MDM machine code
- Graphical presentation
- Some special instructions: CGR, DUP, BRR, ADL

Slide 34: SISAL (Fig. 2)

Slide 35: Template Assembly Language (TASS), fig

Slide 36: Execution sequence, fig. 4

Slide 37: Execution sequence, fig. 4, cont'd

Slide 38: Manchester Dataflow Machine
- Tagged data packets = tokens
- Tag:
  - Iteration level (for loops)
  - Activation name (for simultaneous procedure calls and recursion)
  - Index (when same code operates on different parts of a data structure)
- Implementation: Fig. 7 and 8
  (Figure labels: Output; Token Queue; Matching Unit; Instruction Store; Input; Switch; Processing Unit P0...P19)

Slide 39: MDM Matching Unit (1/2)

Slide 40: MDM Matching Unit (2/2)

Slide 41: Instruction Store & Processing Unit

Slide 42: MDM: System evaluation
- Test method: load program, load input tokens into the input queue, start the clock and release the tokens from the queue. Stop the clock when the first token is received at the host
- Execution time = f(#processors, program, input)
- Research goals; build knowledge about:
  - Hardware utilization and bottlenecks
  - Parallelism in software
  - Data flow MIPS vs. "normal MIPS"
- Reduce number of "variables"
  - Different artificial situations to avoid testing too much at the same time
    - Micro benchmarks, e.g. program with Pby = 1.00
- Test classes
  1) small programs -> do not use the overflow unit
  2a) moderate degree of overflow
  2b) extreme degree of overflow
- Simulator of the computer
  - AvePara = T(1)/T(inf)
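
The measure AvePara = T(1)/T(inf) on slide 42 can be computed for a small dataflow graph. The sketch below assumes every instruction takes one time unit, so that T(1) is the number of instructions and T(inf) is the length of the longest dependency chain; the `deps` encoding and the node names are hypothetical, and the example graph is the program a = (b + 1) x (b - c), d = c x e, f = a x d from slide 22.

```python
from functools import lru_cache


def average_parallelism(deps):
    """deps maps each instruction to the list of instructions it depends on.
    Returns T(1) / T(inf) under a unit-time-per-instruction assumption."""
    t_one = len(deps)  # one processor executes the instructions one by one

    @lru_cache(maxsize=None)
    def depth(node):
        # Earliest step at which node can finish with unlimited processors.
        return 1 + max((depth(p) for p in deps[node]), default=0)

    t_inf = max(depth(node) for node in deps)  # critical path length
    return t_one / t_inf


# t1 = b + 1, t2 = b - c, a = t1 x t2, d = c x e, f = a x d
DEPS = {"t1": [], "t2": [], "a": ["t1", "t2"], "d": [], "f": ["a", "d"]}

if __name__ == "__main__":
    print(average_parallelism(DEPS))
```

For this graph there are five instructions and the critical path t1 -> a -> f has three steps, so on average fewer than two processors can be kept busy regardless of how many are available.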

Slide 43: Test programs (Table II)

Slide 44: Speedup (Performance)

Slide 45: MDM: Problems
- Low efficiency when handling data structures -> special HW
- Arbitration to/from functional units
  - Easily becomes a bottleneck
- Experiments: processors starve when a large fraction of the tokens do not give a "match"
  - larger buffer on the output port of the matching unit
- Needs better compilers

Slide 46: MDM: Retrospective (1992)
- Manchester Data-Flow: A Progress Report, Gurd & Snelling, ICS 92
- MDM history: started in 1976, stopped in 1989
- Included:
  - Structure Store Units
  - Throttle Unit
    - Large programs can generate too much parallel activity, which drowns the system in tokens that cannot be processed for a long time (a kind of thrashing, an OS concept)
    - A unit in the ring that assigns unique activation names (a part of the tag field)
    - Using information about the load of different parts of the system, the throttle unit can slow down the assignment of these names so that the total load is reduced
- Experience
  - Microcoded instruction set was a good choice for research/experimentation

Slide 47: Lecture plan
- Administrativia
  - Reading list is now in its final version
    - Note that all slides are part of the curriculum
  - Next week: mini project presentations, Room 454 in IT-Building
  - Last lecture 30/4, exam Saturday 25/5 at 09:00
    - Wrap-up
    - Short presentation of projects/master theses offered by EECS/CARD/Lasse
    - Short presentation of mini-course TDT1: TDT1 Energy Efficient Multicore Computing
    - Repetition: send to Lasse before 24/4 to ask for special topics


Chapter 14: Distributed Operating Systems Chapter 14: Distributed Operating Systems Chapter 14: Distributed Operating Systems Motivation Types of Distributed Operating Systems Network Structure Network Topology Communication Structure Communication

More information

Module 15: Network Structures

Module 15: Network Structures Module 15: Network Structures Background Topology Network Types Communication Communication Protocol Robustness Design Strategies 15.1 A Distributed System 15.2 Motivation Resource sharing sharing and

More information

A Dynamic Link Allocation Router

A Dynamic Link Allocation Router A Dynamic Link Allocation Router Wei Song and Doug Edwards School of Computer Science, the University of Manchester Oxford Road, Manchester M13 9PL, UK {songw, doug}@cs.man.ac.uk Abstract The connection

More information

Data Center Networks and Basic Switching Technologies

Data Center Networks and Basic Switching Technologies Data Center Networks and Basic Switching Technologies Hakim Weatherspoon Assistant Professor, Dept of Computer Science CS 5413: High Performance Systems and Networking September 15, 2014 Slides used and

More information

Stream Processing on GPUs Using Distributed Multimedia Middleware

Stream Processing on GPUs Using Distributed Multimedia Middleware Stream Processing on GPUs Using Distributed Multimedia Middleware Michael Repplinger 1,2, and Philipp Slusallek 1,2 1 Computer Graphics Lab, Saarland University, Saarbrücken, Germany 2 German Research

More information

COMMUNICATION PERFORMANCE EVALUATION AND ANALYSIS OF A MESH SYSTEM AREA NETWORK FOR HIGH PERFORMANCE COMPUTERS

COMMUNICATION PERFORMANCE EVALUATION AND ANALYSIS OF A MESH SYSTEM AREA NETWORK FOR HIGH PERFORMANCE COMPUTERS COMMUNICATION PERFORMANCE EVALUATION AND ANALYSIS OF A MESH SYSTEM AREA NETWORK FOR HIGH PERFORMANCE COMPUTERS PLAMENKA BOROVSKA, OGNIAN NAKOV, DESISLAVA IVANOVA, KAMEN IVANOV, GEORGI GEORGIEV Computer

More information

Administration. Instruction scheduling. Modern processors. Examples. Simplified architecture model. CS 412 Introduction to Compilers

Administration. Instruction scheduling. Modern processors. Examples. Simplified architecture model. CS 412 Introduction to Compilers CS 4 Introduction to Compilers ndrew Myers Cornell University dministration Prelim tomorrow evening No class Wednesday P due in days Optional reading: Muchnick 7 Lecture : Instruction scheduling pr 0 Modern

More information

Building Blocks for PRU Development

Building Blocks for PRU Development Building Blocks for PRU Development Module 1 PRU Hardware Overview This session covers a hardware overview of the PRU-ICSS Subsystem. Author: Texas Instruments, Sitara ARM Processors Oct 2014 2 ARM SoC

More information

Muhammed F. Mudawwar

Muhammed F. Mudawwar Muhammed F. Mudawwar Computer Science Department The American University in Cairo 113 Kasr el Aini Street, Cairo, Egypt Office: +20 2 797-5305 Email: mudawwar@aucegypt.edu Web: http://www.cs.aucegypt.edu/~mudawwar

More information

Building an Inexpensive Parallel Computer

Building an Inexpensive Parallel Computer Res. Lett. Inf. Math. Sci., (2000) 1, 113-118 Available online at http://www.massey.ac.nz/~wwiims/rlims/ Building an Inexpensive Parallel Computer Lutz Grosz and Andre Barczak I.I.M.S., Massey University

More information

Historically, Huge Performance Gains came from Huge Clock Frequency Increases Unfortunately.

Historically, Huge Performance Gains came from Huge Clock Frequency Increases Unfortunately. Historically, Huge Performance Gains came from Huge Clock Frequency Increases Unfortunately. Hardware Solution Evolution of Computer Architectures Micro-Scopic View Clock Rate Limits Have Been Reached

More information

LOGICAL TOPOLOGY DESIGN Practical tools to configure networks

LOGICAL TOPOLOGY DESIGN Practical tools to configure networks LOGICAL TOPOLOGY DESIGN Practical tools to configure networks Guido. A. Gavilanes February, 2010 1 Introduction to LTD " Design a topology for specific requirements " A service provider must optimize its

More information

COURSE OUTLINE Survey of Operating Systems

COURSE OUTLINE Survey of Operating Systems Butler Community College Career and Technical Education Division Skyler Lovelace New Fall 2014 Implemented Spring 2015 COURSE OUTLINE Survey of Operating Systems Course Description IN 167. Survey of Operating

More information

Smart Queue Scheduling for QoS Spring 2001 Final Report

Smart Queue Scheduling for QoS Spring 2001 Final Report ENSC 833-3: NETWORK PROTOCOLS AND PERFORMANCE CMPT 885-3: SPECIAL TOPICS: HIGH-PERFORMANCE NETWORKS Smart Queue Scheduling for QoS Spring 2001 Final Report By Haijing Fang(hfanga@sfu.ca) & Liu Tang(llt@sfu.ca)

More information

LSN 2 Computer Processors

LSN 2 Computer Processors LSN 2 Computer Processors Department of Engineering Technology LSN 2 Computer Processors Microprocessors Design Instruction set Processor organization Processor performance Bandwidth Clock speed LSN 2

More information

Chapter 4 Register Transfer and Microoperations. Section 4.1 Register Transfer Language

Chapter 4 Register Transfer and Microoperations. Section 4.1 Register Transfer Language Chapter 4 Register Transfer and Microoperations Section 4.1 Register Transfer Language Digital systems are composed of modules that are constructed from digital components, such as registers, decoders,

More information

Chapter 16: Distributed Operating Systems

Chapter 16: Distributed Operating Systems Module 16: Distributed ib System Structure, Silberschatz, Galvin and Gagne 2009 Chapter 16: Distributed Operating Systems Motivation Types of Network-Based Operating Systems Network Structure Network Topology

More information

CCNA R&S: Introduction to Networks. Chapter 5: Ethernet

CCNA R&S: Introduction to Networks. Chapter 5: Ethernet CCNA R&S: Introduction to Networks Chapter 5: Ethernet 5.0.1.1 Introduction The OSI physical layer provides the means to transport the bits that make up a data link layer frame across the network media.

More information

CROSS LAYER BASED MULTIPATH ROUTING FOR LOAD BALANCING

CROSS LAYER BASED MULTIPATH ROUTING FOR LOAD BALANCING CHAPTER 6 CROSS LAYER BASED MULTIPATH ROUTING FOR LOAD BALANCING 6.1 INTRODUCTION The technical challenges in WMNs are load balancing, optimal routing, fairness, network auto-configuration and mobility

More information

Chapter 11 I/O Management and Disk Scheduling

Chapter 11 I/O Management and Disk Scheduling Operating Systems: Internals and Design Principles, 6/E William Stallings Chapter 11 I/O Management and Disk Scheduling Dave Bremer Otago Polytechnic, NZ 2008, Prentice Hall I/O Devices Roadmap Organization

More information

Joint ITU-T/IEEE Workshop on Carrier-class Ethernet

Joint ITU-T/IEEE Workshop on Carrier-class Ethernet Joint ITU-T/IEEE Workshop on Carrier-class Ethernet Quality of Service for unbounded data streams Reactive Congestion Management (proposals considered in IEE802.1Qau) Hugh Barrass (Cisco) 1 IEEE 802.1Qau

More information

Performance Analysis of Storage Area Network Switches

Performance Analysis of Storage Area Network Switches Performance Analysis of Storage Area Network Switches Andrea Bianco, Paolo Giaccone, Enrico Maria Giraudo, Fabio Neri, Enrico Schiattarella Dipartimento di Elettronica - Politecnico di Torino - Italy e-mail:

More information

Module 5. Broadcast Communication Networks. Version 2 CSE IIT, Kharagpur

Module 5. Broadcast Communication Networks. Version 2 CSE IIT, Kharagpur Module 5 Broadcast Communication Networks Lesson 1 Network Topology Specific Instructional Objectives At the end of this lesson, the students will be able to: Specify what is meant by network topology

More information

Communication Protocol

Communication Protocol Analysis of the NXT Bluetooth Communication Protocol By Sivan Toledo September 2006 The NXT supports Bluetooth communication between a program running on the NXT and a program running on some other Bluetooth

More information

Solving Network Challenges

Solving Network Challenges Solving Network hallenges n Advanced Multicore Sos Presented by: Tim Pontius Multicore So Network hallenges Many heterogeneous cores: various protocols, data width, address maps, bandwidth, clocking, etc.

More information

On some Potential Research Contributions to the Multi-Core Enterprise

On some Potential Research Contributions to the Multi-Core Enterprise On some Potential Research Contributions to the Multi-Core Enterprise Oded Maler CNRS - VERIMAG Grenoble, France February 2009 Background This presentation is based on observations made in the Athole project

More information

28 Networks and Communication Protocols

28 Networks and Communication Protocols 113 28 Networks and ommunication Protocols Trend in computer systems: personal computing. Reasons why: ost: economies of scale. lso, avoids large initial investment in timesharing system. Performance:

More information

Graph Analytics in Big Data. John Feo Pacific Northwest National Laboratory

Graph Analytics in Big Data. John Feo Pacific Northwest National Laboratory Graph Analytics in Big Data John Feo Pacific Northwest National Laboratory 1 A changing World The breadth of problems requiring graph analytics is growing rapidly Large Network Systems Social Networks

More information

QoS Switching. Two Related Areas to Cover (1) Switched IP Forwarding (2) 802.1Q (Virtual LANs) and 802.1p (GARP/Priorities)

QoS Switching. Two Related Areas to Cover (1) Switched IP Forwarding (2) 802.1Q (Virtual LANs) and 802.1p (GARP/Priorities) QoS Switching H. T. Kung Division of Engineering and Applied Sciences Harvard University November 4, 1998 1of40 Two Related Areas to Cover (1) Switched IP Forwarding (2) 802.1Q (Virtual LANs) and 802.1p

More information

Design and Verification of Nine port Network Router

Design and Verification of Nine port Network Router Design and Verification of Nine port Network Router G. Sri Lakshmi 1, A Ganga Mani 2 1 Assistant Professor, Department of Electronics and Communication Engineering, Pragathi Engineering College, Andhra

More information

Chapter 2. Multiprocessors Interconnection Networks

Chapter 2. Multiprocessors Interconnection Networks Chapter 2 Multiprocessors Interconnection Networks 2.1 Taxonomy Interconnection Network Static Dynamic 1-D 2-D HC Bus-based Switch-based Single Multiple SS MS Crossbar 2.2 Bus-Based Dynamic Single Bus

More information

Computer Organization and Components

Computer Organization and Components Computer Organization and Components IS1500, fall 2015 Lecture 5: I/O Systems, part I Associate Professor, KTH Royal Institute of Technology Assistant Research Engineer, University of California, Berkeley

More information

Advanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2

Advanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2 Lecture Handout Computer Architecture Lecture No. 2 Reading Material Vincent P. Heuring&Harry F. Jordan Chapter 2,Chapter3 Computer Systems Design and Architecture 2.1, 2.2, 3.2 Summary 1) A taxonomy of

More information

CMSC 611: Advanced Computer Architecture

CMSC 611: Advanced Computer Architecture CMSC 611: Advanced Computer Architecture Parallel Computation Most slides adapted from David Patterson. Some from Mohomed Younis Parallel Computers Definition: A parallel computer is a collection of processing

More information

CPS104 Computer Organization and Programming Lecture 18: Input-Output. Robert Wagner

CPS104 Computer Organization and Programming Lecture 18: Input-Output. Robert Wagner CPS104 Computer Organization and Programming Lecture 18: Input-Output Robert Wagner cps 104 I/O.1 RW Fall 2000 Outline of Today s Lecture The I/O system Magnetic Disk Tape Buses DMA cps 104 I/O.2 RW Fall

More information

Preserving Message Integrity in Dynamic Process Migration

Preserving Message Integrity in Dynamic Process Migration Preserving Message Integrity in Dynamic Process Migration E. Heymann, F. Tinetti, E. Luque Universidad Autónoma de Barcelona Departamento de Informática 8193 - Bellaterra, Barcelona, Spain e-mail: e.heymann@cc.uab.es

More information