Interconnection Networks

Interconnection Networks
Z. Jerry Shi, Assistant Professor of Computer Science and Engineering, University of Connecticut
* Slides adapted from Blumrich&Gschwind/ELE475 03, Peh/ELE475 *

Three questions about interconnection networks
  What is an interconnection network?
    A programmable system that transports data between terminals
  Where do you find interconnection networks?
    Used in almost all digital systems that are large enough to have two components to connect
    The most common applications are in computer systems and communication switches
    Connections between processors and memories, and between I/O devices and I/O controllers
    Simple bus systems are used in many systems, but high processor performance demands fast interconnection networks
  Why are interconnection networks important?
    They are the limiting factor in the performance of many systems

Architecture of Interconnection Networks
  How to connect the nodes (processors, memories, router line cards, SoC modules): TOPOLOGY
  Which path should a message take: ROUTING AND DEADLOCKS
  How is the message actually forwarded from source to destination: FLOW CONTROL
  How to build the routers: ROUTER MICROARCHITECTURE
  How to build the links: LINK ARCHITECTURE
  How do nodes talk to the network: NETWORK INTERFACE

Metrics in Interconnection Networks
  Performance
    Latency: how fast data can be transported through the network
    Throughput: how many pieces of data (messages) can be transported in each time unit
  Power
  Area
  Cost
  Fault tolerance
  Quality of service

Topology
  An interconnection network consists of a set of shared router nodes and channels
  Topology refers to the arrangement of these nodes and channels
  Analogous to a roadmap: channels (roads), packets (cars), router nodes (intersections)

Topological Properties
  Routing distance: number of links on a route
  Average distance: routing distance averaged over all node pairs
  Diameter: maximum routing distance
  Bisection bandwidth: the bandwidth crossing a minimal cut that divides the network in half
    A network is partitioned by a set of links if their removal disconnects the graph
  Degree: number of communication links attached to a node

Linear Arrays and Rings
  [Figure: linear array of nodes ..., N-2, N-1]
  Linear array
    Diameter? Average distance? Bisection bandwidth?
    Route A -> B given by relative address R = B - A
  Ring?
  Examples: Fiber Distributed Data Interface (FDDI), Scalable Coherent Interface (SCI), Fibre Channel Arbitrated Loop

Multidimensional Meshes, Tori, and Hypercubes
  d-dimensional k-ary torus (or k-ary d-cube)
    N = k^d
    Each dimension has k nodes, so a node can be located with a d-element vector
    A k-ary d-cube can be constructed from k k-ary (d-1)-cubes
    The radix in each dimension may be different, for example a 2,3,4-ary 3-cube
  d-dimensional k-ary mesh: similar to a torus
    Cut the channels between the first and last node in every dimension
  Hypercube: binary d-cube
    The radix in every dimension is 2, so each coordinate is either 0 or 1

Hypercubes
  Also called binary n-cubes
  Number of nodes N = 2^n
  Distance: O(log N) hops
  Good bisection bandwidth
  Complexity: out degree is n = log N
  [Figure: hypercubes of dimension 0-D, 1-D, 2-D, 3-D, 4-D, and 5-D]

Real World 2D mesh
  1824-node Paragon: 16 x 114 mesh

Properties
  Routing
    Relative distance: R = (b_{d-1} - a_{d-1}, ..., b_0 - a_0)
    Traverse r_i = b_i - a_i hops in each dimension
    Dimension-order routing
  Degree?
  Diameter?
  Average distance: dk/4 for a cube
  Bisection bandwidth? k^(d-1) bidirectional links
  Physical layout?
    2D in O(N) space
    Higher dimensions?

Embeddings in two dimensions
  [Figure: a 6 x 3 x 2 network embedded in two dimensions]
  Embed multiple logical dimensions in one physical dimension using long wires
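The relative-address rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, nodes are tuples indexed from dimension 0 upward (the slide writes the vector from dimension d-1 down to 0), every dimension has the same radix k, and wraparound links are used whenever the opposite direction is shorter.

```python
def relative_address(a, b, k):
    """Signed hop count per dimension from node a to node b in a k-ary d-cube (torus)."""
    rel = []
    for ai, bi in zip(a, b):
        r = (bi - ai) % k      # hops needed in the + direction
        if r > k // 2:         # the wraparound in the - direction is shorter
            r -= k
        rel.append(r)
    return rel


def dimension_order_route(a, b, k):
    """Nodes visited when the offset is reduced one dimension at a time, lowest first."""
    cur = list(a)
    path = [tuple(cur)]
    for dim, r in enumerate(relative_address(a, b, k)):
        step = 1 if r > 0 else -1
        for _ in range(abs(r)):
            cur[dim] = (cur[dim] + step) % k
            path.append(tuple(cur))
    return path


if __name__ == "__main__":
    print(relative_address((0, 0), (3, 1), 4))
    print(dimension_order_route((0, 0), (3, 1), 4))
```

In a mesh the wraparound branch would simply be dropped, since the channels between the first and last node of each dimension do not exist.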

Topology Summary
  Topology       Degree      Diameter          Ave Dist       Bisection    Diameter (ave dist) at P=1024
  1D Array       2           N-1               N/3            1            huge
  1D Ring        2           N/2               N/4            2
  2D Mesh        4           2(N^(1/2) - 1)    2/3 N^(1/2)    N^(1/2)      63 (21)
  2D Torus       4           N^(1/2)           1/2 N^(1/2)    2 N^(1/2)    32 (16)
  k-ary n-cube   2n          nk/2              nk/4           nk/4         15
  Hypercube      n = log N   n                 n/2            N/2          10 (5)

  All have some bad permutations
    Many popular permutations are very bad for meshes (transpose)
    Randomness in wiring or routing makes it hard to find a bad one!

Trees
  Diameter and average distance are logarithmic
    k-ary tree, height d = log_k N
    Address specified as a d-vector of radix-k coordinates describing the path down from the root
  Fixed degree
  Route up to the common ancestor and then down
    R = B xor A
    Let i be the position of the most significant 1 in R; route up i+1 levels
    Route down in the direction given by the low i+1 bits of B
  H-tree space is O(N) with O(sqrt(N)) long wires
  Bisection BW?
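The tree routing rule (R = B xor A, up i+1 levels, then down by the low bits of B) can be illustrated with a short sketch. It assumes a binary tree whose leaves are numbered 0 to N-1 from left to right; the function name and the encoding of a turn as 0 or 1 are choices made for this example, not something the slides specify.

```python
def tree_route(a, b):
    """Return (levels_up, down_turns) for routing from leaf a to leaf b in a binary tree.

    down_turns lists the child taken at each level on the way down,
    most significant bit first (0 and 1 name the two children).
    """
    r = a ^ b
    if r == 0:
        return 0, []                      # source and destination are the same leaf
    i = r.bit_length() - 1                # position of the most significant 1 in R
    up = i + 1                            # climb to the lowest common ancestor
    down = [(b >> bit) & 1 for bit in range(i, -1, -1)]  # low i+1 bits of B
    return up, down


if __name__ == "__main__":
    print(tree_route(0b0101, 0b0110))
```

For leaves 0101 and 0110 the addresses differ first at bit 1, so the message climbs two levels and then follows the low two bits of the destination, 1 then 0.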

Fat-Trees
  Fatter links (really more of them) as you go up, so bisection BW scales with N

Butterflies
  Tree with lots of roots!
  N log N switches (actually N/2 x log N)
  Exactly one route from any source to any destination
  R = A xor B; at level i use the straight edge if r_i = 0, otherwise the cross edge
  Bisection: N/2
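As a hedged illustration of the butterfly rule, the sketch below lists the edge taken at each level and then replays it to confirm the destination. It assumes that level i corrects address bit i and that there are n = log2(N) levels; real butterflies may process the bits in the opposite order, and both function names are hypothetical.

```python
def butterfly_route(a, b, n):
    """Edge type per level: 'straight' if bit i of A xor B is 0, else 'cross'."""
    r = a ^ b
    return ["cross" if (r >> i) & 1 else "straight" for i in range(n)]


def follow(a, edges):
    """Replay a route: a cross edge at level i flips bit i of the current row."""
    row = a
    for i, edge in enumerate(edges):
        if edge == "cross":
            row ^= 1 << i
    return row


if __name__ == "__main__":
    edges = butterfly_route(0b101, 0b011, 3)
    print(edges, follow(0b101, edges) == 0b011)
```

Because the edge choice at each level is fully determined by A and B, there is exactly one route per source and destination pair, as the slide states.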

Benes network and Fat Tree
  A back-to-back butterfly can route all permutations
    Off line
  What if you just pick a random midpoint?
  [Figure: INPUT -> butterfly network -> inverse butterfly network -> OUTPUT]

Relationship of Butterflies to Hypercubes
  Wiring is isomorphic
  Except that a butterfly always takes log N steps
  Many other types of multistage interconnection networks exist

How Many Dimensions?
  n = 2 or n = 3
    Short wires, easy to build
    Many hops, low bisection bandwidth
    Requires traffic locality
  n >= 4
    Harder to build, more wires, longer average length
    Fewer hops, better bisection bandwidth
    Can handle non-local traffic
  k-ary d-cubes provide a consistent framework for comparison
    N = k^d
    Scale dimension (d) or nodes per dimension (k)

Real Machines
  Wide links, smaller routing delay
  Tremendous variation

Routing

Messages, Packets, Flits, Phits
  A flit (flow control digit) is the basic unit of bandwidth and storage allocation
  A phit (physical transfer digit) is the unit of information that is transferred across a channel in a single clock cycle

Typical Packet Format
  [Figure: packet layout with header (routing and control), data payload, and trailer (error code); a packet is a sequence of digital symbols transmitted over a channel]
  A packet consists of different types of flits: head, body, or tail
    The head flit carries the packet's routing information
    A packet has a format of HB*T*

Routing
  The routing algorithm determines
    which of the possible paths are used as routes
    how the route is determined
    R: N x N -> C, which at each switch maps the destination node to the next channel on the route
  Issues:
    Routing mechanism
      arithmetic
      source-based port select
      table driven
      general computation
    Properties of the routes
    Deadlock free

Taxonomy of Routing Algorithms
  Deterministic
    Route determined by (source, dest), not by intermediate state (i.e. traffic)
    Given two nodes x and y, the path R_{x,y} is always the same
  Oblivious
    Chooses a route without considering any information about the network's current state
    Example: a random algorithm
  Adaptive
    Route influenced by traffic along the way
  Minimal
    Only selects shortest paths

Example: routing on a ring
  Greedy
    Always send the packet in the shortest direction
  Uniform random
    Randomly pick a direction, with equal probability for either direction
  Weighted random
    Randomly pick a direction, but weight the short direction with probability 1 - d/n, where d is the length of the shortest path
  Adaptive
    Send the packet in the direction for which the local channel has the lowest load
    Record how many packets a channel has transmitted over the last T slots
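The three oblivious choices for the ring example can be compared in a small sketch. The names below are hypothetical, directions are encoded as +1 (clockwise) and -1 (counterclockwise), n is taken to be the number of nodes on the ring, and ties between the two directions are broken clockwise; none of these details come from the slides.

```python
import random


def shortest(src, dst, n):
    """Return (direction, hops) of the shortest way round an n-node ring."""
    cw = (dst - src) % n
    ccw = n - cw
    return (+1, cw) if cw <= ccw else (-1, ccw)


def greedy(src, dst, n):
    """Always take the shortest direction."""
    return shortest(src, dst, n)[0]


def uniform_random(src, dst, n, rng=random):
    """Pick either direction with equal probability."""
    return rng.choice((+1, -1))


def weighted_random(src, dst, n, rng=random):
    """Pick the short direction with probability 1 - d/n, the long one otherwise."""
    direction, d = shortest(src, dst, n)
    return direction if rng.random() < 1 - d / n else -direction


if __name__ == "__main__":
    rng = random.Random(0)
    picks = [weighted_random(0, 2, 8, rng) for _ in range(10000)]
    print(picks.count(+1) / len(picks))  # should be close to 1 - 2/8
```

The adaptive variant is not shown because it needs per-channel load counters, which depend on the router model.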

Routing relation (℘ denotes the power set; N nodes, C channels, P paths)
  R: N x N -> ℘(P)
    The output of the relation is an entire path
    There may be multiple paths
  R: N x N -> ℘(C)
    Routing is incremental
    The output only indicates the channels that the packet may take at the current node
  R: C x N -> ℘(C)
    Similar to the second method
    Uses the current channel instead of the current node

Adaptive Routing
  R: C x N x Σ -> C
  Essential for fault tolerance
    At least multipath
  Can improve utilization of the network
  Simple deterministic algorithms easily run into bad permutations
  Fully/partially adaptive, minimal/non-minimal
  Can introduce complexity or anomalies
  A little adaptation goes a long way!

Routing Mechanism
  Need to select the output port for each input packet in a few cycles
  Simple arithmetic in regular topologies
    Example: Δx, Δy routing in a grid
      west (-x):   Δx < 0
      east (+x):   Δx > 0
      south (-y):  Δx = 0, Δy < 0
      north (+y):  Δx = 0, Δy > 0
      processor:   Δx = 0, Δy = 0
    Reduce the relative address of each dimension in order
      Dimension-order routing in k-ary d-cubes
  Calculate preferred directions, then adjust one dimension each time
    Used in the Cray T3D, which connects up to 2048 DEC Alpha processing elements

Routing Mechanism (cont)
  [Figure: message header carrying port selects P3 P2 P1 P0]
  Source-based
    Mainly used in deterministic and oblivious routing
    All routing decisions are made at the source, and the message header carries a series of port selects
    Port selects are used and stripped en route
    Fast, simple, and scalable
    CS-2, Myrinet, MIT Arctic
  Node-table
    More appropriate for adaptive routing
    Decide the output channel based on the incoming channel and the destination
    Can redirect traffic if one output link is congested or fails
    ATM, HIPPI
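The grid example reduces the x offset to zero before touching y. A minimal sketch of that port-select arithmetic follows; the function names are hypothetical, and the convention that east and north increase x and y is an assumption made for the trace helper.

```python
def xy_route(dx, dy):
    """Select the output port from the remaining offset (dx, dy) to the destination."""
    if dx < 0:
        return "west"
    if dx > 0:
        return "east"
    if dy < 0:
        return "south"
    if dy > 0:
        return "north"
    return "processor"          # dx == 0 and dy == 0: deliver locally


def xy_path(src, dst):
    """Trace the ports chosen hop by hop from src to dst on an unbounded grid."""
    moves = {"west": (-1, 0), "east": (1, 0), "south": (0, -1), "north": (0, 1)}
    x, y = src
    ports = []
    while True:
        port = xy_route(dst[0] - x, dst[1] - y)
        ports.append(port)
        if port == "processor":
            return ports
        x, y = x + moves[port][0], y + moves[port][1]


if __name__ == "__main__":
    print(xy_path((0, 0), (2, -1)))
```

Each router only needs a couple of comparisons per packet, which is why this kind of arithmetic fits in the few cycles available for port selection.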

Deadlock
  How can it arise?
    Necessary conditions:
      Shared resource (buffers or channels)
      Incrementally allocated
      Non-preemptible
    Think of a channel as a shared resource that is acquired incrementally
      Source buffer, then destination buffer
      Channels along a route
  How do you avoid it?
    Deadlock avoidance: guarantee no deadlock
      Constrain how channel resources are allocated. Example: dimension order
    Deadlock recovery: deadlock is detected and corrected
  How do you prove that a routing algorithm is deadlock free?

Deadlock Freedom
  Resources are logically associated with channels
  Messages introduce dependences between resources as they move forward
  Need to articulate the possible dependences that can arise between channels
  Show that there are no cycles in the channel dependence graph
    Find a numbering of channel resources such that every legal route follows a monotonic sequence
    => No traffic pattern can lead to deadlock
  The network need not be acyclic, only the channel dependence graph
  All deadlock avoidance techniques use some form of resource ordering
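Checking a routing algorithm for deadlock freedom amounts to checking its channel dependence graph for cycles. The sketch below does this with a depth-first search; the graph encoding (a dict from channel name to the channels a packet holding it may request next), the channel names, and the two example graphs are assumptions chosen for illustration.

```python
def has_cycle(graph):
    """Return True if the directed graph {node: [successors]} contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on the DFS stack, finished
    color = {}

    def visit(u):
        color[u] = GREY
        for v in graph.get(u, ()):
            c = color.get(v, WHITE)
            if c == GREY or (c == WHITE and visit(v)):
                return True           # reached a node that is still on the stack
        color[u] = BLACK
        return False

    return any(color.get(u, WHITE) == WHITE and visit(u) for u in graph)


# Unidirectional 4-node ring: a packet holding channel c_i may request c_(i+1).
ring_cdg = {f"c{i}": [f"c{(i + 1) % 4}"] for i in range(4)}

# Same channels, but routes are never allowed to continue from c3 to c0.
restricted_cdg = {"c0": ["c1"], "c1": ["c2"], "c2": ["c3"], "c3": []}

if __name__ == "__main__":
    print(has_cycle(ring_cdg), has_cycle(restricted_cdg))
```

In the restricted graph the channel indices themselves form the monotonic numbering that the proof technique asks for.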

Deadlock Recovery
  Detection
    Determining exactly whether the network is deadlocked is difficult
    Most practical detection mechanisms are conservative
      May have false positives
    Timeout counters
      Reset when making progress
  Recovery
    Regressive: packets or connections that are deadlocked are removed
    Progressive: keep the packets or connections in an escape buffer
      Potentially has better performance
      Routing using the escape buffer is designed to be deadlock-free

Flow Control
  Flow control determines how a network's resources are allocated
    Resources: channel bandwidth, buffer capacity, etc.
    Good flow control achieves a high fraction of ideal bandwidth and delivers packets with low, predictable latency
  Can also be viewed as a problem of contention resolution
    The problem exists because we are sharing resources
  Processor
    Resources in a processor: ALUs, registers
    How to run as many operations as possible, optimizing use of ALUs and registers
  Network
    Resources in a network: buffers, links
    How to forward as many messages as possible, optimizing use of buffers and links

Contention
  Two packets trying to use the same link at the same time
    Limited buffering
    Drop?

Flow control protocols
  Bufferless
    Dropping
    Misrouting
    Circuit switching
      Header traverses the network and reserves resources
      Data are then sent through the reserved path
  Buffered
    Store-and-forward
    Virtual cut-through
    Wormhole
    Virtual-channel

Simplest Flow Control: Dropping
  If two things arrive and I don't have resources, drop one of them
  Flow control protocol on the Internet
  Not used in interconnection networks. Why?

[Figure: time-space diagram for dropping flow control]

Next Simplest Flow Control: Misrouting
  If only one message can enter the network at each node, and one message can exit the network at each node, the network can never be congested. Right?
  Philosophy behind misrouting: intentionally route away from congestion
  No need for buffering

Circuit Switching
  Bufferless
  A probe sets up a path through the network
    If the request flit is blocked, it is held in place (not dropped)
  Reserve all links
  Data are then sent through the links
  Simple router
    Similar to the dropping case
    Needs only one register to buffer the header
  When is this good? When is it not?

[Figure: time-space diagram for circuit switching]

Store-and-Forward
  Buffered flow control: flits can be stored in routing nodes
    Flits arriving on cycle i do not have to leave on cycle i + 1
  Make intermediate stops and wait until the whole packet has arrived before moving on
  Two resources must be allocated to the packet
    A packet-sized buffer at the other side of the channel
    Exclusive use of the channel
  Other packets can use intermediate links
  Pros and cons?

[Figure: time-space diagram for store-and-forward]
  With store-and-forward, packets do not have to be divided into flits

Virtual Cut-through
  Why wait until the entire message has arrived at each intermediate stop?
  The head of the message can dash off first
    Of course, the two resources must still be allocated
  When the head gets blocked, the whole message gets blocked at the intermediate node

[Figure: time-space diagram for virtual cut-through]

Wormhole
  Similar to virtual cut-through, but channels and buffers are allocated to flits rather than packets
  When the head flit arrives, it must acquire three resources before being forwarded to the next node
    A virtual channel for the packet
      State bits indicating the output channel, the state of the virtual channel (idle, waiting for resources, or active), and other information
    One flit buffer
    One flit of channel bandwidth
  Body flits do not need to acquire virtual channels
    But they still need to allocate a flit buffer and channel bandwidth
  The tail flit releases the virtual channel
  The channel is owned by a packet, but buffers are allocated on a flit-by-flit basis
  When a flit cannot acquire a buffer, the channel goes idle

[Figure: time-space diagram for wormhole flow control]

Virtual Channel
  Associates several virtual channels with a single physical channel
  When a packet blocks, instead of holding on to physical links so that others cannot use them, it holds on to virtual links
  The head flit needs three resources to advance
    A virtual channel, a downstream flit buffer, and channel bandwidth
  Subsequent body flits use the same virtual channel
    But they still need to allocate a flit buffer and channel bandwidth
    However, these flits are not guaranteed access to the channel bandwidth
  Like lanes on a highway
    You have to compete with other cars

[Figure: time-space diagram for virtual-channel flow control]
  Arbitration may not be fair
  It can be winner-take-all

Link-level flow control
  Given that you can't drop packets, how do you manage the buffers?
  When can you send data forward, and when not?
  Three techniques
    Credit-based: the upstream router keeps a count of the number of free flit buffers in each virtual channel downstream
    On/off: a single bit indicates whether the upstream node can send or not
    Ack/nack: the upstream node optimistically sends flits when they are available, and the downstream node sends back an ack or nack
  Flit-reservation
    Reduces buffer turnaround time
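Credit-based flow control can be sketched as a counter on the upstream side that mirrors the free flit buffers downstream. The class below is a deliberately simplified, hypothetical model: it covers a single virtual channel, and a credit becomes usable in the same step that the buffer is freed, so it ignores the buffer turnaround time that real links incur.

```python
class CreditLink:
    """Upstream view of one virtual channel with a fixed number of downstream flit buffers."""

    def __init__(self, buffers):
        self.credits = buffers     # free flit buffers the upstream router believes exist
        self.downstream = []       # flits currently occupying downstream buffers

    def can_send(self):
        return self.credits > 0

    def send(self, flit):
        """Send one flit if a credit is available; return whether it was sent."""
        if not self.can_send():
            return False           # no free buffer downstream: the flit must wait
        self.credits -= 1
        self.downstream.append(flit)
        return True

    def drain(self):
        """Downstream forwards its oldest flit and returns one credit upstream."""
        if not self.downstream:
            return None
        flit = self.downstream.pop(0)
        self.credits += 1
        return flit


if __name__ == "__main__":
    link = CreditLink(buffers=2)
    print(link.send("a"), link.send("b"), link.send("c"))
    print(link.drain(), link.send("c"))
```

The on/off scheme replaces the counter with a single bit driven by a buffer-occupancy threshold, while ack/nack sends without checking and retransmits on a nack.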

Link-level flow control
  Short links
    [Figure: source and destination registers with full/empty (F/E) bits, connected by Req, Ready/Ack, and Data signals]
  Long links
    Several flits on the wire

Buffer turnaround time
  A flit leaves the downstream node.
  A credit is sent to the current node.
  The credit is processed and a flit is sent to the downstream node.
  The downstream node receives the flit.
  [Figure: timeline of buffer turnaround time, showing pipeline delay, wire delay, buffer use and release, and credit delay]

Flit-reservation flow control
  Hides the overhead by separating the control and data networks
  Control flits race ahead to reserve network resources
  Can also streamline the delivery of credits
  Allows zero buffer turnaround time
  Not always possible to reserve resources
  The control head flit is similar to a typical head flit, but has an additional field that gives the time offset to the first data flit
    The routing node knows when the data flit will arrive and starts to prepare a buffer now

Router (switch) microarchitecture: What's in a router?
  It's a system as well
  Logic: state machines, arbiters, allocators
    Control the movement through the router
    Idle, routing, waiting for resources, active
  Memory: buffers
    Store flits before forwarding them
    SRAMs, registers, processor memory
  Communication: switches
    Transfer flits from input to output ports
    Crossbars, multiple crossbars, fully-connected, bus

Typical Router Design
  [Figure: input ports (receiver, input buffer) -> crossbar -> output ports (output buffer, transmitter), with control logic for routing and scheduling]

Router Components
  Output ports
    Transmitter (typically drives clock and data)
  Input ports
    Synchronizer that aligns the data signal with the local clock domain
    Essentially a FIFO buffer
  Crossbar
    Connects each input to any output
    Degree limited by area or pinout
  Buffering
  Control logic
    Complexity depends on routing logic and scheduling algorithm
    Determines the output port for each incoming packet
    Arbitrates among inputs directed at the same output

Buffer Organizations
  Input buffers
    Buffering at each input port; stores flits until they get to leave through the switch to the next hop
  Central buffers
    A central memory shared among all ports
    Functions as the switch as well
  Output buffers
    Flits flow right through to the output port
    Highest throughput, no head-of-line blocking

Input Buffered Router
  [Figure: input ports with routing logic R0, R1, R2, R3, a crossbar, output ports, and scheduling logic]
  Independent routing logic per input
    FSM
  Scheduler logic arbitrates each output
    Priority, FIFO, or random
  Head-of-line blocking problem
    If an earlier flit is blocked, the later flits behind it are held in the buffer

Output Buffered Router
  [Figure: input ports with routing logic R0, R1, R2, R3, each feeding buffers at the output ports, plus control]
  Commit to output: limited adaptivity
  Switch has to handle input line speeds

[Figure: virtual-channel router]

Virtual-channel Router
  Packet: head, body, and tail flits
  Head
    Route to an output port
    Request and arbitrate for the next VC
    Request and arbitrate for a switch path
    Request and arbitrate for a buffer
    Traverse the switch
  Body
    Request and arbitrate for a switch path
    Request and arbitrate for a buffer
    Traverse the switch
  Tail
    Request and arbitrate for a switch path
    Request and arbitrate for a buffer
    Traverse the switch
    Release the switch path

State machines
  Control the state of the router
  Each input channel
    G: Global state: idle? routing? waiting for VC? waiting for buffer?
    R: Output port, filled in by routing
    O: Output VC, filled in by VC allocation
    P: Head and tail queue pointers
    C: Credits
  Each output channel
    G: Global state: idle? active? waiting for credits?
    I: Input VC that is sending flits to this output port
    C: Credit count

Pipelining of a typical virtual channel router
  Stages per flit
    Head flit:    RC  VA  SA  ST
    Body flit 1:  SA  ST
    Body flit 2:  SA  ST
    Tail flit:    SA  ST
  Cycle 0: Head flit arrives. G will change to R on the next cycle
  Cycle 1: RC (routing computation). R and G (=V) will be updated on the next cycle
  Cycle 2: VA (virtual channel allocation). On the next cycle, O and G (=A) will be updated. The state of the output channel will be updated
  Cycle 3: SA (switch allocation)
  Cycle 4: ST (switch traversal)

Output arbiters
  N requesters (inputs) trying to get a single resource under contention (output)
    N:1 arbiter for each output
  Several types of arbiters
    Fixed priority arbiter
    Variable priority arbiter
      Oblivious arbiter
      Round robin arbiter
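The pipeline description gives the cycle for each stage of the head flit. The sketch below extends that to the remaining flits under an assumption that is not stated in the slides: there are no stalls, and each later flit enters switch allocation one cycle after the flit ahead of it. The function name and the dictionary layout are hypothetical.

```python
def pipeline_schedule(num_body, head_rc_cycle=1):
    """Cycle in which each flit occupies each router pipeline stage, assuming no stalls."""
    c = head_rc_cycle
    schedule = {"head": {"RC": c, "VA": c + 1, "SA": c + 2, "ST": c + 3}}
    sa = c + 3                               # first body flit allocates as the head traverses
    for i in range(1, num_body + 1):
        schedule[f"body {i}"] = {"SA": sa, "ST": sa + 1}
        sa += 1
    schedule["tail"] = {"SA": sa, "ST": sa + 1}
    return schedule


if __name__ == "__main__":
    for flit, stages in pipeline_schedule(num_body=2).items():
        print(flit, stages)
```

With two body flits the head traverses the switch in cycle 4, matching the cycle list above, and under the no-stall assumption the tail follows three cycles later.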

[Figure: fixed priority arbiter]

Variable Priority Arbiter
  A one-hot priority signal p selects the highest-priority request
    Only one of the p's can be 1

Variable Priority Arbiters: Oblivious
  Not dependent on previous grants or requests
  Rotating priorities
  Random priorities

Variable Priority Arbiters: Round robin
  The request that was last served should have the lowest priority
    Serve all other requests first before returning to this requester
  If a grant is issued this cycle, the request next to the one receiving the grant will have the highest priority on the next cycle
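The round-robin rule can be modeled in software as a rotating priority pointer. This is a behavioral sketch with hypothetical names, not a gate-level design: the pointer plays the role of the one-hot priority signal, and it only moves when a grant is issued.

```python
class RoundRobinArbiter:
    """N:1 arbiter in which the most recently served requester gets the lowest priority."""

    def __init__(self, n):
        self.n = n
        self.next = 0                        # requester that currently has highest priority

    def arbitrate(self, requests):
        """Return the index of the granted requester, or None if nobody is requesting."""
        for offset in range(self.n):
            i = (self.next + offset) % self.n
            if requests[i]:
                self.next = (i + 1) % self.n  # neighbor of the winner goes first next time
                return i
        return None                          # no grant issued: priority is unchanged


if __name__ == "__main__":
    arb = RoundRobinArbiter(4)
    print([arb.arbitrate([True, False, True, False]) for _ in range(4)])
```

A fixed priority arbiter is the same loop with the pointer pinned at 0, which is why it can starve high-numbered requesters.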

Allocators
  N x M allocator: N requesters fighting for M resources
  Results:
    A grant can be asserted only if the corresponding request is asserted
    At most one grant for each input may be asserted
    At most one grant for each resource may be asserted

Allocators In Routers
  VC allocator
    Input VCs request a range of output VCs
    E.g. a packet on VC0 arrives at the East input port. It is destined for the West output port and would like to get any of the VCs of that output port.
  Switch allocator
    Input VCs of an input port request different output ports (e.g. one is going North, another is going West)

Simplest Allocators: Separable
  Approximate with two stages of arbitration
    One on inputs, one on outputs. They can be in either order.

Separable Allocator Example
  [Figure: separable allocator built from dumb arbiters that always choose the first request]
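A separable allocator with "dumb" first-request arbiters can be sketched as two passes over a request matrix. The version below is input-first (the slides note that either order works); the function names and the matrix encoding, where requests[i][j] is truthy when input i wants output j, are assumptions for this example.

```python
def first_request(bits):
    """Fixed-priority arbiter: index of the first asserted request, or None."""
    for i, b in enumerate(bits):
        if b:
            return i
    return None


def separable_allocate(requests):
    """Input-first separable allocation; returns a list of (input, output) grants."""
    n_in, n_out = len(requests), len(requests[0])
    # Stage 1: each input arbiter picks one of the outputs that input requests.
    chosen = [first_request(row) for row in requests]
    grants = []
    # Stage 2: each output arbiter picks one of the inputs that selected it.
    for out in range(n_out):
        winner = first_request([chosen[i] == out for i in range(n_in)])
        if winner is not None:
            grants.append((winner, out))
    return grants


if __name__ == "__main__":
    requests = [[1, 1, 0],
                [1, 0, 1],
                [1, 0, 0]]
    print(separable_allocate(requests))
```

In this request matrix every input first asks for output 0, so only one grant is issued even though a matching of all three inputs exists. That gap is the sense in which a separable allocator only approximates the best allocation.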

Switches
  The fabric that directs flits from an input port to an output port
  Design issues: number of input and output ports, and speedups
    Speedup: the ratio of the total input bandwidth to the network's ideal capacity (the best throughput)
  Tradeoff between cost (delay, area, power) and performance (throughput)
  Tradeoff between leaving it up to allocation or simplifying the job for allocators

Crossbar switches
  [Figure: crossbar with input speedup = 1 and crossbar with input speedup = 2]

Effect of input speedup
  [Figure: throughput as a fraction of capacity versus input speedup, with a random allocator]

Several flit buffer organizations
  Central
    Simple logical view
    There are actually two switches: mux in and demux out
    Problems: bandwidth and latency
  Separate memory per input port
    Virtual channels associated with a physical channel can share a buffer

Virtual Channel (VC) Buffer Organization
  One buffer per VC
    Allows switches to access multiple VCs associated with one physical channel (PC), but leads to poor memory utilization
  Approximations:
    A small number of output ports on a single buffer
    Divide VCs among buffers
      Memory interleaving!

Case Study: Alpha router

Alpha router
  Torus
  Virtual cut-through (316 packet buffers)
  Adaptive routing: prefer to continue in the same dimension
  Deadlock avoidance
    Coherence: requests may fill up buffers, stalling acks (solution: virtual channel class, order)
    Network: escape virtual channel

[Figure: router microarchitecture]

[Figure: router microarchitecture]

Network Interface
  How a processor sends data to the network
  Shared memory cache-coherent multiprocessors
    Interfaces caches with networks
  Message-passing multiprocessors
    Interfaces the processor pipeline with networks
    Dedicated register (or two registers)
    Register map
    Memory map
    Virtual memory map
    I/O interrupt + DMA

Cache-coherent SMP processor-network interface
  Highly optimized interface: from load/store to messages in a few cycles
  Request is placed in the memory request register
    Tag: how to handle the reply, e.g., store the data in R24
    Type: cacheable or not; read or write
  Cache hit: place in the reply register right away
  Cache miss: enter a miss status holding register (MSHR)
    Use this to merge reads/writes as well
    Number of MSHRs == number of pending memory references (4 to 32)

Cache-coherent SMP memory-network interface
  Messages from the network initialize a transaction status holding register (TSHR)
    Messages may be queued
  The TSHR tracks the status of pending memory operations
  Example: for a non-cacheable read, the TSHR status changes as follows
    Read pending (waiting for bank)
    Bank activated (waiting for data)
    Read complete (preparing message)
    Idle (the reply message sent)

Message-passing multiprocessors: Dedicated register
  Send
    Move a value to the network out register
    Special MOV instruction for the last word to terminate the packet
  Read
    Block on the register until a packet arrives, or test the register and retry later
  Pros: fast
  Cons:
    Long messages: the processor becomes a DMA engine!
    Security: a process can hold the register forever

Register map
  Send a message atomically from a subset of the processor's general purpose registers
  Cons:
    Long messages have to be segmented
    Pressure on general purpose registers
    Processors are still DMA engines

I/O interface
  Most common interface today, in PCs and clusters of workstations (e.g. Infiniband, Myrinet, PCI)
  Software-level messaging:
    Interrupt triggers handler
    Handler sets up DMA
    DMA engine constructs packets from memory and sends them out to the network
  Physical-memory-mapped or virtual-memory-mapped

Case Study: Princeton SHRIMP
  Where: I/O bus
  How: virtual memory map

Virtual memory mapping
  Map_network(My_virtual_addr_range, Your_virtual_addr_range)
  Each virtual page -> local physical page -> remote physical page -> remote virtual address
  Store to these virtual addresses => network

[Figure: virtual memory map (SHRIMP)]

Case Study: M-Machine Multicomputer
  Experimental multicomputer built at MIT and Stanford
  2-D torus
  Multi-ALU processor (MAP) chip


More information

On-Chip Interconnection Networks Low-Power Interconnect

On-Chip Interconnection Networks Low-Power Interconnect On-Chip Interconnection Networks Low-Power Interconnect William J. Dally Computer Systems Laboratory Stanford University ISLPED August 27, 2007 ISLPED: 1 Aug 27, 2007 Outline Demand for On-Chip Networks

More information

Components: Interconnect Page 1 of 18

Components: Interconnect Page 1 of 18 Components: Interconnect Page 1 of 18 PE to PE interconnect: The most expensive supercomputer component Possible implementations: FULL INTERCONNECTION: The ideal Usually not attainable Each PE has a direct

More information

A Dynamic Link Allocation Router

A Dynamic Link Allocation Router A Dynamic Link Allocation Router Wei Song and Doug Edwards School of Computer Science, the University of Manchester Oxford Road, Manchester M13 9PL, UK {songw, doug}@cs.man.ac.uk Abstract The connection

More information

TDT 4260 lecture 11 spring semester 2013. Interconnection network continued

TDT 4260 lecture 11 spring semester 2013. Interconnection network continued 1 TDT 4260 lecture 11 spring semester 2013 Lasse Natvig, The CARD group Dept. of computer & information science NTNU 2 Lecture overview Interconnection network continued Routing Switch microarchitecture

More information

Quality of Service (QoS) for Asynchronous On-Chip Networks

Quality of Service (QoS) for Asynchronous On-Chip Networks Quality of Service (QoS) for synchronous On-Chip Networks Tomaz Felicijan and Steve Furber Department of Computer Science The University of Manchester Oxford Road, Manchester, M13 9PL, UK {felicijt,sfurber}@cs.man.ac.uk

More information

Low-Cost Router Microarchitecture for On-Chip Networks

Low-Cost Router Microarchitecture for On-Chip Networks Low-Cost Router Microarchitecture for On-Chip Networks John Kim KAIST Department of Computer Science Daejeon, Korea jjk12@cs.kaist.ac.kr ABSTRACT On-chip networks are critical to the scaling of future

More information
