Interconnection Networks Programmierung Paralleler und Verteilter Systeme (PPV)

Similar documents
Components: Interconnect Page 1 of 18

System Interconnect Architectures. Goals and Analysis. Network Properties and Routing. Terminology - 2. Terminology - 1

Interconnection Networks. Interconnection Networks. Interconnection networks are used everywhere!

Topological Properties

Lecture 23: Interconnection Networks. Topics: communication latency, centralized and decentralized switches (Appendix E)

Introduction to Parallel Computing. George Karypis Parallel Programming Platforms

Lecture 2 Parallel Programming Platforms

Interconnection Network

Parallel Programming

Lecture 18: Interconnection Networks. CMU : Parallel Computer Architecture and Programming (Spring 2012)

COMP 422, Lecture 3: Physical Organization & Communication Costs in Parallel Machines (Sections 2.4 & 2.5 of textbook)

Interconnection Network Design

Interconnection Networks

Scalability and Classifications


Why the Network Matters

Chapter 2. Multiprocessors Interconnection Networks

Behavior Analysis of Multilayer Multistage Interconnection Network With Extra Stages

Scaling 10Gb/s Clustering at Wire-Speed

Chapter 12: Multiprocessor Architectures. Lesson 04: Interconnect Networks

Chapter 15: Distributed Structures. Topology

Hyper Node Torus: A New Interconnection Network for High Speed Packet Processors

Parallel Architectures and Interconnection

Annotation to the assignments and the solution sheet. Note the following points

Architectural Level Power Consumption of Network on Chip. Presenter: YUAN Zheng

Interconnection Networks

Communication Networks. MAP-TELE 2011/12 José Ruela

Network Structure or Topology

Module 5. Broadcast Communication Networks. Version 2 CSE IIT, Kharagpur

MULTISTAGE INTERCONNECTION NETWORKS: A TRANSITION TO OPTICAL

InfiniBand Clustering

Interconnection Networks

Lecture 6 Types of Computer Networks and their Topologies Three important groups of computer networks: LAN, MAN, WAN

Principles and characteristics of distributed systems and environments

Chapter 2 Parallel Architecture, Software And Performance

Data Center Network Topologies: FatTree

Mixed-Criticality Systems Based on Time- Triggered Ethernet with Multiple Ring Topologies. University of Siegen Mohammed Abuteir, Roman Obermaisser

Lecture 24: WSC, Datacenters. Topics: network-on-chip wrap-up, warehouse-scale computing and datacenters (Sections )

Large-Scale Distributed Systems. Datacenter Networks. COMP6511A Spring 2014 HKUST. Lin Gu

Infrastructure Components: Hub & Repeater. Network Infrastructure. Switch: Realization. Infrastructure Components: Switch

INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET)

Large Scale Clustering with Voltaire InfiniBand HyperScale Technology

Industrial Ethernet How to Keep Your Network Up and Running A Beginner s Guide to Redundancy Standards

Distributed Computing over Communication Networks: Topology. (with an excursion to P2P)

Non-blocking Switching in the Cloud Computing Era

Introduction to Exploration and Optimization of Multiprocessor Embedded Architectures based on Networks On-Chip

On-Chip Interconnection Networks Low-Power Interconnect

Distributed Elastic Switch Architecture for efficient Networks-on-FPGAs

Chapter 14: Distributed Operating Systems

Tolerating Multiple Faults in Multistage Interconnection Networks with Minimal Extra Stages

A network is a group of devices (Nodes) connected by media links. A node can be a computer, printer or any other device capable of sending and

Interconnection Network of OTA-based FPAA

Real-time Processor Interconnection Network for FPGA-based Multiprocessor System-on-Chip (MPSoC)

Reconfigurable Computing. Reconfigurable Architectures. Chapter 3.2

Improved Irregular Augmented Shuffle Multistage Interconnection Network

Systolic Computing. Fundamentals

Chapter 16: Distributed Operating Systems

NETWORKING TECHNOLOGIES

Design and Implementation of an On-Chip timing based Permutation Network for Multiprocessor system on Chip

How To Understand The Concept Of A Distributed System

Optimizing Enterprise Network Bandwidth For Security Applications. Improving Performance Using Antaira s Management Features

Load balancing in a heterogeneous computer system by self-organizing Kohonen network

Hardware Implementation of Improved Adaptive NoC Router with Flit Flow History based Load Balancing Selection Strategy

Module 15: Network Structures

JOB READY ASSESSMENT BLUEPRINT COMPUTER NETWORKING FUNDAMENTALS - PILOT. Test Code: 4514 Version: 01

Data Center Switch Fabric Competitive Analysis

SDN and Data Center Networks


On the Placement of Management and Control Functionality in Software Defined Networks

Computer Networking: A Survey

Solution: start more than one instruction in the same clock cycle CPI < 1 (or IPC > 1, Instructions per Cycle) Two approaches:

Computer Network. Interconnected collection of autonomous computers that are able to exchange information

Operating System Concepts. Operating System 資 訊 工 程 學 系 袁 賢 銘 老 師

CMSC 611: Advanced Computer Architecture

The Butterfly, Cube-Connected-Cycles and Benes Networks

ECE 358: Computer Networks. Solutions to Homework #4. Chapter 4 - The Network Layer

Synchronization. Todd C. Mowry CS 740 November 24, Topics. Locks Barriers

Local Area Networks transmission system private speedy and secure kilometres shared transmission medium hardware & software

2 Basic Concepts. Contents

Fault-Tolerant Routing Algorithm for BSN-Hypercube Using Unsafety Vectors

SOC architecture and design

Distributed communication-aware load balancing with TreeMatch in Charm++

A Source Identification Scheme against DDoS Attacks in Cluster Interconnects

TRILL Large Layer 2 Network Solution

PortLand:! A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric

Installation Guide for Dolphin PCI-SCI Adapters

Router Architectures

Computer Systems Structure Input/Output

Computer Networks. Definition of LAN. Connection of Network. Key Points of LAN. Lecture 06 Connecting Networks

524 Computer Networks

Using High Availability Technologies Lesson 12

Intel Ethernet Switch Load Balancing System Design Using Advanced Features in Intel Ethernet Switch Family

Computer Networks Vs. Distributed Systems

Transcription:

Interconnection Networks Programmierung Paralleler und Verteilter Systeme (PPV) Sommer 2015 Frank Feinbube, M.Sc., Felix Eberhardt, M.Sc., Prof. Dr. Andreas Polze

Interconnection Networks 2 SIMD systems demand structured connectivity Processor-to-processor interaction Processor-to-memory interaction Static network Point-to-point links, fixed route Dynamic network Consists of links and switching elements Flexible configuration of processor interaction

Interconnection networks Optimization criteria Connectivity ideally direct links between any two stations High number of parallel connections Cost model Production cost - # connections operational cost distance among PEs Bus networks, switching networks, point-to-point interconnects 3

4 Interconnection Networks

Interconnection Networks 5 [Peter Newman] Dynamic networks are built from a graph of configurable switching elements General packet switching network counts as irregular static network

Interconnection Networks Network Interfaces Processors talk to the network via a network interface connector (NIC) hardware Network interfaces attached to the interconnect Cluster vs. tightly-coupled multi-computer SIMD hardware bundles NIC with the processor Switching elements map a fixed number of inputs to outputs Total number of ports is the degree of the switch The cost of a switch grows as square of the degree The peripheral hardware grows linearly as the degree

Interconnection Networks A variety of network topologies proposed and implemented Each topology has a performance / cost tradeoff Commercial machines often implement hybrids Optimize packaging and costs Metrics for an interconnection network graph Diameter: Maximum distance between any two nodes Connectivity: Minimum number of edges that must be removed to get two independent graphs Link width / weight: Transfer capacity of an edge Bisection width: Minimum transfer capacity given between any two halves of the graph Costs: Number of edges in the network Often optimization for connectivity metric

Bus Systems 8 Static interconnect technology Shared communication path, broadcasting of information Diameter: O(1) Connectivity: O(1) Bisection width: O(1) Costs: O(p)

Bus network Optimal #connection per PE: 1 Constant distance among any two PEs 9

Crossbar switch " (Kreuzschienenverteiler) Arbitrary number of permutations Collision-free data exchange High cost, quadratic growth n * (n-1) connection points 10

11 Crossbar Switch

Multistage Interconnection Networks 12 Connection by switching elements Typical solution to connect processing and memory elements Can implement sorting or shuffling in the network routing

Omega Network 13 Inputs are crossed or not, depending on routing logic Destination-tag routing: Use positional bit for switch decision XOR-tag routing: Use positional bit of XOR result for decision For N PE s, N/2 switches per stage, log 2 N stages Decrease bottleneck probability on parallel communication

Delta networks Only n/2 log n deltaswitches Limited cost Not all possible permutations operational in parallel 14

Delta Networks operation 15 Stage n checks bit k of the destination tag Possible effect of output port contention and path contention 0 1 2 3 4 5 6 7

Clos coupling networks Combination of delta network and crossbar C.Clos, A Study of Nonblocking Switching Networks, Bell System Technical Journal, vol. 32, no. 2, 1953, pp. 406-424(19) 16

Fat-Tree networks PEs arranged as leafs on a binary tree Capacity of tree (links) doubles on each layer 17

Point-to-point networks: " ring and fully connected graph Ring has only two connections per PE (almost optimal) Fully connected graph optimal connectivity (but high cost) 18

Mesh and Torus Compromise between cost and connectivity 19

Cubic Mesh PEs are arranged in a cubic fashion Each PE has 6 links to neighbors 20

Hypercube Dimensions 0-4, recursive definition 21

Binary tree, quadtree Logarithmic cost Problem of bottleneck at root node 22

Shuffle-Exchange network Logarithmic cost Uni-directional shuffle network + bi-directional exchange network 23

Plus-Minus-Network PM 2i 2*m-1 separate unidirectional interconnection networks 24

ParProg Hardware 53 PT 2010 Systolic Arrays 25 Experimental " architecture Approaches Data flow Common clock Maximum signal Systolic Arrays path restricted by Data frequency flow architectures Problem: Single faulty common clock maximum element signal breaks path the restricted complete by array frequency Fault contention: single faulty processing element will break entire machine

Comparison Network Diameter Bisection Width Arc Connectivity Cost (No. of links) Completely-connected Star Complete binary tree Linear array 2-D mesh, no wraparound 2-D wraparound mesh Hypercube Wraparound k-ary d-cube

Comparison Network Diameter Bisection Width Arc Connectivity Cost (No. of links) Crossbar Omega Network Dynamic Tree

Comparison of networks 28