Lecture 23: Interconnection Networks. Topics: communication latency, centralized and decentralized switches (Appendix E)



Similar documents
Topological Properties

Lecture 24: WSC, Datacenters. Topics: network-on-chip wrap-up, warehouse-scale computing and datacenters (Sections )

System Interconnect Architectures. Goals and Analysis. Network Properties and Routing. Terminology - 2. Terminology - 1

Interconnection Networks Programmierung Paralleler und Verteilter Systeme (PPV)

Components: Interconnect Page 1 of 18

Interconnection Network

Interconnection Networks. Interconnection Networks. Interconnection networks are used everywhere!

Interconnection Network Design

Scalability and Classifications

Parallel Programming

COMP 422, Lecture 3: Physical Organization & Communication Costs in Parallel Machines (Sections 2.4 & 2.5 of textbook)

Lecture 2 Parallel Programming Platforms

Why the Network Matters

Lecture 18: Interconnection Networks. CMU : Parallel Computer Architecture and Programming (Spring 2012)

Interconnection Networks

Introduction to Parallel Computing. George Karypis Parallel Programming Platforms

Chapter 2. Multiprocessors Interconnection Networks

Hyper Node Torus: A New Interconnection Network for High Speed Packet Processors

Interconnection Networks

Chapter 12: Multiprocessor Architectures. Lesson 04: Interconnect Networks

Scaling 10Gb/s Clustering at Wire-Speed

Distributed Computing over Communication Networks: Topology. (with an excursion to P2P)

From Hypercubes to Dragonflies a short history of interconnect

Asynchronous Bypass Channels

Annotation to the assignments and the solution sheet. Note the following points

Interconnection Networks

Data Center Network Topologies: FatTree

On-Chip Interconnection Networks Low-Power Interconnect

Data Center Switch Fabric Competitive Analysis

Chapter 2 Parallel Architecture, Software And Performance

Introduction to Exploration and Optimization of Multiprocessor Embedded Architectures based on Networks On-Chip


Large-Scale Distributed Systems. Datacenter Networks. COMP6511A Spring 2014 HKUST. Lin Gu

Cray Gemini Interconnect. Technical University of Munich Parallel Programming Class of SS14 Denys Sobchyshak

Interconnection Network of OTA-based FPAA

Large Scale Clustering with Voltaire InfiniBand HyperScale Technology

causeddroppingofthetos-basedroutingrequirementfromtheospfspecication.

InfiniBand Clustering

Ring Protection: Wrapping vs. Steering

Advanced Computer Networks. Datacenter Network Fabric

A Source Identification Scheme against DDoS Attacks in Cluster Interconnects

Behavior Analysis of Multilayer Multistage Interconnection Network With Extra Stages

Definition. A Historical Example

Introduction to LAN/WAN. Network Layer

Routing in packet-switching networks

Introduction to Infiniband. Hussein N. Harake, Performance U! Winter School

Module 5. Broadcast Communication Networks. Version 2 CSE IIT, Kharagpur

Data Center Networks

Parallel Computing. Benson Muite. benson.

Network Structure or Topology

Lecture 6 Types of Computer Networks and their Topologies Three important groups of computer networks: LAN, MAN, WAN

Infrastructure Components: Hub & Repeater. Network Infrastructure. Switch: Realization. Infrastructure Components: Switch

How To Build A Low Cost Data Center Network With Two Ports And A Backup Port

How To Understand The Concept Of A Distributed System

Industrial Ethernet How to Keep Your Network Up and Running A Beginner s Guide to Redundancy Standards

Switched Interconnect for System-on-a-Chip Designs

CHAPTER 8 CONCLUSION AND FUTURE ENHANCEMENTS

Overview. Swarms in nature. Fish, birds, ants, termites, Introduction to swarm intelligence principles Particle Swarm Optimization (PSO)

2. What is the maximum value of each octet in an IP address? A. 28 B. 255 C. 256 D. None of the above

The fat-stack and universal routing in interconnection networks

The Advantages and Disadvantages of Network Computing Nodes

Dual-Centric Data Center Network Architectures

Lecture 2.1 : The Distributed Bellman-Ford Algorithm. Lecture 2.2 : The Destination Sequenced Distance Vector (DSDV) protocol

Network Design. Yiannos Mylonas

Overview of Network Hardware and Software. CS158a Chris Pollett Jan 29, 2007.

Installation Guide for Dolphin PCI-SCI Adapters

LOAD-BALANCED ROUTING IN INTERCONNECTION NETWORKS

Datagram-based network layer: forwarding; routing. Additional function of VCbased network layer: call setup.

OpenFlow based Load Balancing for Fat-Tree Networks with Multipath Support

Advanced Computer Networks. High Performance Networking I

COMPUTERS ARE YOUR FUTURE CHAPTER 7 NETWORKS: COMMUNICATING AND SHARING RESOURCES

ECE 358: Computer Networks. Solutions to Homework #4. Chapter 4 - The Network Layer

Scalable Source Routing

Hyacinth An IEEE based Multi-channel Wireless Mesh Network

Optimizing Enterprise Network Bandwidth For Security Applications. Improving Performance Using Antaira s Management Features

Designing HP SAN Networking Solutions

The Butterfly, Cube-Connected-Cycles and Benes Networks

Towards a Design Space Exploration Methodology for System-on-Chip

Parallel Architectures and Interconnection

Small-World Datacenters

3D On-chip Data Center Networks Using Circuit Switches and Packet Switches

College 5, Routing, Internet. Host A. Host B. The Network Layer: functions

SAN Conceptual and Design Basics

Networks. Connecting Computers. Measures for connection speed. Ethernet. Collision detection. Ethernet protocol

Technology-Driven, Highly-Scalable Dragonfly Topology

A network is a group of devices (Nodes) connected by media links. A node can be a computer, printer or any other device capable of sending and

Router Architectures

Bandwidth-based load-balancing with failover. The easy way. We need more bandwidth.

Wide Area Networks. Learning Objectives. LAN and WAN. School of Business Eastern Illinois University. (Week 11, Thursday 3/22/2007)

QUALITY OF SERVICE METRICS FOR DATA TRANSMISSION IN MESH TOPOLOGIES

Hardware Implementation of Improved Adaptive NoC Router with Flit Flow History based Load Balancing Selection Strategy

Physical Data Organization

Level 2 Routing: LAN Bridges and Switches

WAN Wide Area Networks. Packet Switch Operation. Packet Switches. COMP476 Networked Computer Systems. WANs are made of store and forward switches.

Transcription:

Lecture 23: Interconnection Networks Topics: communication latency, centralized and decentralized switches (Appendix E) 1

Topologies Internet topologies are not very regular they grew incrementally Supercomputers have regular interconnect topologies and trade off cost for high bandwidth Nodes can be connected with centralized switch: all nodes have input and output wires going to a centralized chip that internally handles all routing decentralized switch: each node is connected to a switch that routes data to one of a few neighbors 2

Centralized Crossbar Switch P0 P1 P2 P3 P4 Crossbar switch P5 P6 P7 3

Centralized Crossbar Switch P0 P1 P2 P3 P4 P5 P6 P7 4

Crossbar Properties Assuming each node has one input and one output, a crossbar can provide maximum bandwidth: N messages can be sent as long as there are N unique sources and N unique destinations Maximum overhead: WN 2 internal switches, where W is data width and N is number of nodes To reduce overhead, use smaller switches as building blocks trade off overhead for lower effective bandwidth 5

Switch with Omega Network 000 001 010 011 100 101 110 P0 P1 P2 P3 P4 P5 P6 000 001 010 011 100 101 110 111 P7 111 6

Omega Network Properties The switch complexity is now O(N log N) Contention increases: P0 P5 and P1 P7 cannot happen concurrently (this was possible in a crossbar) To deal with contention, can increase the number of levels (redundant paths) by mirroring the network, we can route from P0 to P5 via N intermediate nodes, while increasing complexity by a factor of 2 7

Tree Network Complexity is O(N) Can yield low latencies when communicating with neighbors Can build a fat tree by having multiple incoming and outgoing links P0 P1 P2 P3 P4 P5 P6 P7 8

Bisection Bandwidth Split N nodes into two groups of N/2 nodes such that the bandwidth between these two groups is minimum: that is the bisection bandwidth Why is it relevant: if traffic is completely random, the probability of a message going across the two halves is ½ if all nodes send a message, the bisection bandwidth will have to be N/2 The concept of bisection bandwidth confirms that the tree network is not suited for random traffic patterns, but for localized traffic patterns 9

Distributed Switches: Ring Each node is connected to a 3x3 switch that routes messages between the node and its two neighbors Effectively a repeated bus: multiple messages in transit Disadvantage: bisection bandwidth of 2 and N/2 hops on average 10

Distributed Switch Options Performance can be increased by throwing more hardware at the problem: fully-connected switches: every switch is connected to every other switch: N 2 wiring complexity, N 2 /4 bisection bandwidth Most commercial designs adopt a point between the two extremes (ring and fully-connected): Grid: each node connects with its N, E, W, S neighbors Torus: connections wrap around Hypercube: links between nodes whose binary names differ in a single bit 11

Topology Examples Grid Torus Hypercube Criteria Bus Ring 2Dtorus 6-cube Fully connected Performance Bisection bandwidth Cost Ports/switch Total links 12

Topology Examples Grid Torus Hypercube Criteria Bus Ring 2Dtorus 6-cube Fully connected Performance Bisection bandwidth Cost Ports/switch Total links 1 1 2 16 32 1024 3 128 5 192 7 256 64 2080 13

k-ary d-cube Consider a k-ary d-cube: a d-dimension array with k elements in each dimension, there are links between elements that differ in one dimension by 1 (mod k) Number of nodes N = k d Number of switches : Switch degree : Number of links : Pins per node : Avg. routing distance: Diameter : Bisection bandwidth : Switch complexity : Should we minimize or maximize dimension? 14

k-ary d-cube Consider a k-ary d-cube: a d-dimension array with k elements in each dimension, there are links between elements that differ in one dimension by 1 (mod k) Number of nodes N = k d (with no wraparound) Number of switches : Switch degree : Number of links : Pins per node : N 2d + 1 Nd 2wd Avg. routing distance: Diameter : Bisection bandwidth : Switch complexity : Should we minimize or maximize dimension? d(k-1)/2 d(k-1) 2wk d-1 (2d + 1) 2 15

Title Bullet 16