On-Chip Interconnection Networks Low-Power Interconnect
|
|
- Grant Flowers
- 8 years ago
- Views:
Transcription
1 On-Chip Interconnection Networks Low-Power Interconnect William J. Dally Computer Systems Laboratory Stanford University ISLPED August 27, 2007 ISLPED: 1 Aug 27, 2007
2 Outline Demand for On-Chip Networks What is Unique about On-Chip Networks The Power Problem Enabling Technologies Circuits - set the constraints Topology Micro-Architecture Summary ISLPED: 2 Aug 27, 2007
3 Urgent Demand for OCINs ISLPED: 3 Aug 27, 2007
4 The Future is CMPs OCINs are a Critical Component ISLPED: 4 Aug 27, 2007
5 Example CMP OCIN ISLPED: 5 Aug 27, 2007
6 Growing Complexity of SoCs Demands an On-Chip Interconnection Network Avner Goren TI EPF 2004 ISLPED: 6 Aug 27, 2007
7 So, what s different about on-chip networks? ISLPED: 7 Aug 27, 2007
8 State of Off-Chip Networks ISLPED: 8 Aug 27, 2007
9 Technology Trends bandwidth per router node (Gb/s) BlackWidow Torus Routing Chip Intel ipsc/2 J-Machine CM-5 Intel Paragon XP Cray T3D MIT Alewife IBM Vulcan Cray T3E SGI Origin 2000 AlphaServer GS320 IBM SP Switch2 Quadrics QsNet Cray X1 Velio 3003 IBM HPS SGI Altix 3000 Cray XT3 YARC year ISLPED: 9 Aug 27, 2007
10 Some History MARS Router 1984 Torus Routing Chip 1985 MDP 1991 Network Design Frame 1988 Robert Mullins Reliable Router 1994 ISLPED: 10 MAP 1998 Imagine 2002 YARC 2006 Aug 27, 2007
11 Some very good books ISLPED: 11 Aug 27, 2007
12 Topology Summary of Off-Chip Networks* Fit to packaging and signaling technology High-radix - Clos or FlatBfly gives lowest cost Routing Global adaptive routing balances load w/o destroying locality Flow control Virtual channels/virtual cut-through *oversimplified ISLPED: 12 Aug 27, 2007
13 So, what s different about on-chip networks? ISLPED: 13 Aug 27, 2007
14 Cost, Channels, Workload are Different Cost Off-chip: cost is channels - pins, connectors, cables, optics On-chip: cost is Si area and Power (storage and switches), wires plentiful Drives networks with many long, wide channels, few buffers Channel Characteristics On-chip RC lines - need a repeater every 1mm (or less) Short distance - low latency Can put logic in repeaters, motivates low-latency routers Workload CMP cache traffic SoC isochronous flows Design issues Floorplanning Different constraints motivate some surprising differences in design. ISLPED: 14 Aug 27, 2007
15 The Power Problem ISLPED: 15 Aug 27, 2007
16 Power is Dominated by Interconnect In 65nm A 32 bit add takes 0.8pJ Moving a 32 bit word 1mm takes 13pJ A 64-core CMP at 1GHz Estimated BW demand of 6.4GW/s (~200Gb/s) Average distance for each transfer of 10mm (each way) Power due to network ~50W Power due to 64 adds at 1GHz ~50mW Power due to 64 RISC processors at 1GHz ~13W Need to reduce network power by at least 10x ISLPED: 16 Aug 27, 2007
17 OCINs are an Enabling Technology Modular Standard interface enables design reuse Optimized Encapsulates circuits and wiring. Enables optimization. Efficient Just the wires needed to reach the destination are toggled. ISLPED: 17 Aug 27, 2007
18 OCINs vs Buses Most CMPs and SOCs today use one or more buses for interconnect A bus is just one type of OCIN Low bandwidth (one transation at a time) High concentration (all processors share one link) Inefficient (all wire segments toggled even for short communication) For both buses and other OCINs Control path is dominated by arbiters/allocators Data path is dominated by data movement and buffering Can do much better by searching more of the network design space ISLPED: 18 Aug 27, 2007
19 Circuits set Cost & Area Constraints for Architecture Can do substantially (10x-100x) better than default circuits ISLPED: 19 Aug 27, 2007
20 Enabling Technology is a Prerequisite Channels, Buffers, Switches Topology Routing Flow Control Microarchitecture ISLPED: 20 Aug 27, 2007
21 Channels 10x to 100x power reduction Eq signaling for faster propagation and increased repeater distance (D & P Chapter 8, Heaton 01) Elastic channels provide free buffers (Mizuno 01) Send 4-8 bits per cycle per wire (assuming 20FO4 cycle) ISLPED: 21 Aug 27, 2007
22 Buffers Dense arrays (vs. Flip-Flops or Latches) 1/10 area/bit 1/10 power for low-swing read Low-swing write 1/10 power for writes. Low-swing read - can keep swing low through muxes. ISLPED: 22 Aug 27, 2007
23 Switches Low-swing bit lines Operate at channel rate Reduces area and hence power Equalized drive Buffered crosspoints Integral allocation ISLPED: 23 Aug 27, 2007
24 Circuits Impact Architecture With standard-cell approach Power is approximately evenly split between channels, buffers, and routers With efficient circuits Channels 1/30, buffers 1/3 Routers dominate Routing >> Buffering >> Propagating Motivates topologies with fewer hops, longer channels. Just propagate bits - avoid buffering, really avoid routing ISLPED: 24 Aug 27, 2007
25 Properties of these elements drives optimal network organization ISLPED: 25 Aug 27, 2007
26 On-Chip Interconnection Network System = Processor Tiles Source: Balfour and Dally, ICS 06 ISLPED: 26 Aug 27, 2007
27 On-Chip Interconnection Network (2) System = Processor Tiles + Channels Source: Balfour and Dally, ICS 06 ISLPED: 27 Aug 27, 2007
28 Interconnection Network (3) System = Processor Tiles + Channels + Routers Source: Balfour and Dally, ICS 06 ISLPED: 28 Aug 27, 2007
29 Router Architecture Input-queued Virtual Channel Speculative Pipeline Source: Balfour and Dally, ICS 06 ISLPED: 29 Aug 27, 2007
30 Router Area Accurate modeling requires floorplan Source: Balfour and Dally, ICS 06 ISLPED: 30 Aug 27, 2007
31 Torus Source: Balfour and Dally, ICS 06 ISLPED: 31 Aug 27, 2007
32 Concentrated Mesh Source: Balfour and Dally, ICS 06 ISLPED: 32 Aug 27, 2007
33 Express Links Source: Balfour and Dally, ICS 06 ISLPED: 33 Aug 27, 2007
34 Network Replication Abundant wire resources build second network Resource allocation tradeoff Wide: [+] Serialization Latency [+] Router Energy Efficiency [ ] Router Area Replicated: [+] Decoupled Resources [+] Area Efficiency [?] Energy Efficiency [ ] Serialization Latency [+] SCALABLE Source: Balfour and Dally, ICS 06 ISLPED: 34 Aug 27, 2007
35 Energy Efficiency Network Energy Completion Time (normalized to Torus network) Concentrated Mesh (Replicated) Torus Fat-Tree Fat-Tree with Taper Mesh (Replicated) Source: Balfour and Dally, ICS 06 ISLPED: 35 Aug 27, 2007
36 Large differences in efficiency. Optimal topology not obvious, not regular and very sensitive to properties of network elements ISLPED: 36 Aug 27, 2007
37 Where is Energy Expended? Power [W] Buffers Switch Channels 0 Concentrated Mesh (Replicated) Torus Source: Balfour and Dally, ICS 06 ISLPED: 37 Aug 27, 2007
38 Flow Control Trade channel bandwidth (cheap) for buffer space (expensive) Make buffers shallow Compensate for lower duty factor by overprovisioning channels Little cost in energy Circuit switching (no buffers) Elastic buffers - use free buffers in the channels ISLPED: 38 Aug 27, 2007
39 Flow Control in an On-Chip FlatBfly S X D ISLPED: 39 Aug 27, 2007
40 View as Two Buffered Links S X D S X D ISLPED: 40 Aug 27, 2007
41 Channels Have Repeaters S X D ISLPED: 41 Aug 27, 2007
42 Buffers Decouple Channel Allocation in Time S X D S X D ISLPED: 42 Aug 27, 2007
43 Circuit Switching S X D S X D ISLPED: 43 Aug 27, 2007
44 With Elastic Buffers S X D S X D ISLPED: 44 Aug 27, 2007
45 Summary ISLPED: 45 Aug 27, 2007
46 Comments on Shekar s Talk There is more to life than meshes and buses Optimal networks, like flattened butterflies are better than either Buses weren t even good on boards Slow, no parallelism, excess power, no locality Even Shekar s numbers say buses are worse W vs 20-80W Differential signaling can be applied to networks too. Routers don t take 3-5 clocks - many examples of one clock routers See Mullins, ISCA 2004 Arbiters for circuit switched network no different than routers for packet switched But drops packets on collisions - packet switching more efficient Yes, embrace parallelism - with well-designed networks We re making progress - in December Shekar was saying buses, now he s advanced to circuit switching ISLPED: 46 Aug 27, 2007
47 Summary OCINs critically important Vital component of CMPs, SoCs Less mature technology than other components Naïve implementations consume significant Power Power is dominated by interconnect - operations are cheap Very different than off-chip networks Cost, Channels, Workloads, Design Issues Efficient network elements are enabling technology Energy and area efficient channels, buffers, & switches Change the equation for network design: channels << buffers << routers Topology Minimize diameter Concentrated Mesh with Express Channels Flattened Butterfly Bus is one extreme, but almost never the right answer Flow Control Minimize buffers at switch points Use elastic buffers to minimize latency Efficient OCIN designs yield significant gains ISLPED: 47 Aug 27, 2007
48 Backup Slides ISLPED: 48 Aug 27, 2007
49 On-Chip Flattened Butterfly Conventional 2D Mesh 2D Flattened Butterfly Source: Kim and Dally, to appear ISLPED: 49 Aug 27, 2007
50 On-Chip Flattened Butterfly dimension 1 Layout Mapping dimension 2 Source: Kim and Dally, to appear ISLPED: 50 Aug 27, 2007
51 Bypass Channels Conventional Flattened Butterfly Flattened Butterfly with Bypass Channels connected to local router Source: Kim and Dally, to appear ISLPED: 51 Aug 27, 2007
52 Latency Comparison Latency (normalized to mesh network) bitrev tornado UR MESH CMESH FBFLY-MIN FBFLY- NONMIN transpose randperm bitcomp FBFLY-BYP Source: Kim and Dally, to appear ISLPED: 52 Aug 27, 2007
53 Power Comparison Power (W) Memory Crossbar Channel 2 0 MESH CMESH FBFLY-MIN FBFLY- NONMIN Source: Kim and Dally, to appear FBFLY-BYP ISLPED: 53 Aug 27, 2007
From Hypercubes to Dragonflies a short history of interconnect
From Hypercubes to Dragonflies a short history of interconnect William J. Dally Computer Science Department Stanford University IAA Workshop July 21, 2008 IAA: # Outline The low-radix era High-radix routers
More informationLecture 18: Interconnection Networks. CMU 15-418: Parallel Computer Architecture and Programming (Spring 2012)
Lecture 18: Interconnection Networks CMU 15-418: Parallel Computer Architecture and Programming (Spring 2012) Announcements Project deadlines: - Mon, April 2: project proposal: 1-2 page writeup - Fri,
More informationInterconnection Networks. Interconnection Networks. Interconnection networks are used everywhere!
Interconnection Networks Interconnection Networks Interconnection networks are used everywhere! Supercomputers connecting the processors Routers connecting the ports can consider a router as a parallel
More informationNaveen Muralimanohar Rajeev Balasubramonian Norman P Jouppi
Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0 Naveen Muralimanohar Rajeev Balasubramonian Norman P Jouppi University of Utah & HP Labs 1 Large Caches Cache hierarchies
More informationArchitectural Level Power Consumption of Network on Chip. Presenter: YUAN Zheng
Architectural Level Power Consumption of Network Presenter: YUAN Zheng Why Architectural Low Power Design? High-speed and large volume communication among different parts on a chip Problem: Power consumption
More informationInterconnection Network
Interconnection Network Recap: Generic Parallel Architecture A generic modern multiprocessor Network Mem Communication assist (CA) $ P Node: processor(s), memory system, plus communication assist Network
More informationTechnology-Driven, Highly-Scalable Dragonfly Topology
Technology-Driven, Highly-Scalable Dragonfly Topology By William J. Dally et al ACAL Group Seminar Raj Parihar parihar@ece.rochester.edu Motivation Objective: In interconnect network design Minimize (latency,
More informationSystem Interconnect Architectures. Goals and Analysis. Network Properties and Routing. Terminology - 2. Terminology - 1
System Interconnect Architectures CSCI 8150 Advanced Computer Architecture Hwang, Chapter 2 Program and Network Properties 2.4 System Interconnect Architectures Direct networks for static connections Indirect
More informationFrom Bus and Crossbar to Network-On-Chip. Arteris S.A.
From Bus and Crossbar to Network-On-Chip Arteris S.A. Copyright 2009 Arteris S.A. All rights reserved. Contact information Corporate Headquarters Arteris, Inc. 1741 Technology Drive, Suite 250 San Jose,
More informationCray Gemini Interconnect. Technical University of Munich Parallel Programming Class of SS14 Denys Sobchyshak
Cray Gemini Interconnect Technical University of Munich Parallel Programming Class of SS14 Denys Sobchyshak Outline 1. Introduction 2. Overview 3. Architecture 4. Gemini Blocks 5. FMA & BTA 6. Fault tolerance
More informationHyper Node Torus: A New Interconnection Network for High Speed Packet Processors
2011 International Symposium on Computer Networks and Distributed Systems (CNDS), February 23-24, 2011 Hyper Node Torus: A New Interconnection Network for High Speed Packet Processors Atefeh Khosravi,
More informationPhotonic Networks for Data Centres and High Performance Computing
Photonic Networks for Data Centres and High Performance Computing Philip Watts Department of Electronic Engineering, UCL Yury Audzevich, Nick Barrow-Williams, Robert Mullins, Simon Moore, Andrew Moore
More informationLecture 23: Interconnection Networks. Topics: communication latency, centralized and decentralized switches (Appendix E)
Lecture 23: Interconnection Networks Topics: communication latency, centralized and decentralized switches (Appendix E) 1 Topologies Internet topologies are not very regular they grew incrementally Supercomputers
More informationWhy the Network Matters
Week 2, Lecture 2 Copyright 2009 by W. Feng. Based on material from Matthew Sottile. So Far Overview of Multicore Systems Why Memory Matters Memory Architectures Emerging Chip Multiprocessors (CMP) Increasing
More informationDistributed Elastic Switch Architecture for efficient Networks-on-FPGAs
Distributed Elastic Switch Architecture for efficient Networks-on-FPGAs Antoni Roca, Jose Flich Parallel Architectures Group Universitat Politechnica de Valencia (UPV) Valencia, Spain Giorgos Dimitrakopoulos
More informationPower Reduction Techniques in the SoC Clock Network. Clock Power
Power Reduction Techniques in the SoC Network Low Power Design for SoCs ASIC Tutorial SoC.1 Power Why clock power is important/large» Generally the signal with the highest frequency» Typically drives a
More informationTRACKER: A Low Overhead Adaptive NoC Router with Load Balancing Selection Strategy
TRACKER: A Low Overhead Adaptive NoC Router with Load Balancing Selection Strategy John Jose, K.V. Mahathi, J. Shiva Shankar and Madhu Mutyam PACE Laboratory, Department of Computer Science and Engineering
More informationInterconnection Network Design
Interconnection Network Design Vida Vukašinović 1 Introduction Parallel computer networks are interesting topic, but they are also difficult to understand in an overall sense. The topological structure
More informationTopological Properties
Advanced Computer Architecture Topological Properties Routing Distance: Number of links on route Node degree: Number of channels per node Network diameter: Longest minimum routing distance between any
More informationLecture 2 Parallel Programming Platforms
Lecture 2 Parallel Programming Platforms Flynn s Taxonomy In 1966, Michael Flynn classified systems according to numbers of instruction streams and the number of data stream. Data stream Single Multiple
More informationInterconnection Networks
Advanced Computer Architecture (0630561) Lecture 15 Interconnection Networks Prof. Kasim M. Al-Aubidy Computer Eng. Dept. Interconnection Networks: Multiprocessors INs can be classified based on: 1. Mode
More informationCOMP 422, Lecture 3: Physical Organization & Communication Costs in Parallel Machines (Sections 2.4 & 2.5 of textbook)
COMP 422, Lecture 3: Physical Organization & Communication Costs in Parallel Machines (Sections 2.4 & 2.5 of textbook) Vivek Sarkar Department of Computer Science Rice University vsarkar@rice.edu COMP
More informationAsynchronous Bypass Channels
Asynchronous Bypass Channels Improving Performance for Multi-Synchronous NoCs T. Jain, P. Gratz, A. Sprintson, G. Choi, Department of Electrical and Computer Engineering, Texas A&M University, USA Table
More informationWhat is a System on a Chip?
What is a System on a Chip? Integration of a complete system, that until recently consisted of multiple ICs, onto a single IC. CPU PCI DSP SRAM ROM MPEG SoC DRAM System Chips Why? Characteristics: Complex
More informationDesign and Implementation of an On-Chip timing based Permutation Network for Multiprocessor system on Chip
Design and Implementation of an On-Chip timing based Permutation Network for Multiprocessor system on Chip Ms Lavanya Thunuguntla 1, Saritha Sapa 2 1 Associate Professor, Department of ECE, HITAM, Telangana
More informationInterconnection Networks Programmierung Paralleler und Verteilter Systeme (PPV)
Interconnection Networks Programmierung Paralleler und Verteilter Systeme (PPV) Sommer 2015 Frank Feinbube, M.Sc., Felix Eberhardt, M.Sc., Prof. Dr. Andreas Polze Interconnection Networks 2 SIMD systems
More informationLow Power AMD Athlon 64 and AMD Opteron Processors
Low Power AMD Athlon 64 and AMD Opteron Processors Hot Chips 2004 Presenter: Marius Evers Block Diagram of AMD Athlon 64 and AMD Opteron Based on AMD s 8 th generation architecture AMD Athlon 64 and AMD
More informationInterconnection Networks
Interconnection Networks Z. Jerry Shi Assistant Professor of Computer Science and Engineering University of Connecticut * Slides adapted from Blumrich&Gschwind/ELE475 03, Peh/ELE475 * Three questions about
More informationUse-it or Lose-it: Wearout and Lifetime in Future Chip-Multiprocessors
Use-it or Lose-it: Wearout and Lifetime in Future Chip-Multiprocessors Hyungjun Kim, 1 Arseniy Vitkovsky, 2 Paul V. Gratz, 1 Vassos Soteriou 2 1 Department of Electrical and Computer Engineering, Texas
More informationComponents: Interconnect Page 1 of 18
Components: Interconnect Page 1 of 18 PE to PE interconnect: The most expensive supercomputer component Possible implementations: FULL INTERCONNECTION: The ideal Usually not attainable Each PE has a direct
More informationIntroduction to Exploration and Optimization of Multiprocessor Embedded Architectures based on Networks On-Chip
Introduction to Exploration and Optimization of Multiprocessor Embedded Architectures based on Networks On-Chip Cristina SILVANO silvano@elet.polimi.it Politecnico di Milano, Milano (Italy) Talk Outline
More informationIntel Itanium Quad-Core Architecture for the Enterprise. Lambert Schaelicke Eric DeLano
Intel Itanium Quad-Core Architecture for the Enterprise Lambert Schaelicke Eric DeLano Agenda Introduction Intel Itanium Roadmap Intel Itanium Processor 9300 Series Overview Key Features Pipeline Overview
More informationPerformance Evaluation of 2D-Mesh, Ring, and Crossbar Interconnects for Chip Multi- Processors. NoCArc 09
Performance Evaluation of 2D-Mesh, Ring, and Crossbar Interconnects for Chip Multi- Processors NoCArc 09 Jesús Camacho Villanueva, José Flich, José Duato Universidad Politécnica de Valencia December 12,
More informationDesign and Implementation of an On-Chip Permutation Network for Multiprocessor System-On-Chip
Design and Implementation of an On-Chip Permutation Network for Multiprocessor System-On-Chip Manjunath E 1, Dhana Selvi D 2 M.Tech Student [DE], Dept. of ECE, CMRIT, AECS Layout, Bangalore, Karnataka,
More informationPCI Express Overview. And, by the way, they need to do it in less time.
PCI Express Overview Introduction This paper is intended to introduce design engineers, system architects and business managers to the PCI Express protocol and how this interconnect technology fits into
More informationA Low Latency Router Supporting Adaptivity for On-Chip Interconnects
A Low Latency Supporting Adaptivity for On-Chip Interconnects Jongman Kim Dongkook Park T. Theocharides N. Vijaykrishnan Chita R. Das Department of Computer Science and Engineering The Pennsylvania State
More informationLow-Overhead Hard Real-time Aware Interconnect Network Router
Low-Overhead Hard Real-time Aware Interconnect Network Router Michel A. Kinsy! Department of Computer and Information Science University of Oregon Srinivas Devadas! Department of Electrical Engineering
More informationChapter 1 Reading Organizer
Chapter 1 Reading Organizer After completion of this chapter, you should be able to: Describe convergence of data, voice and video in the context of switched networks Describe a switched network in a small
More informationComputer Systems Structure Input/Output
Computer Systems Structure Input/Output Peripherals Computer Central Processing Unit Main Memory Computer Systems Interconnection Communication lines Input Output Ward 1 Ward 2 Examples of I/O Devices
More informationinter-chip and intra-chip Harm Dorren and Oded Raz
Will photonics penetrate into inter-chip and intra-chip communications? Harm Dorren and Oded Raz On-chip interconnect networks R X R X R X R X R T X R XT X T X T X T X X Electronic on-chip interconnect
More informationA Low-Radix and Low-Diameter 3D Interconnection Network Design
A Low-Radix and Low-Diameter 3D Interconnection Network Design Yi Xu,YuDu, Bo Zhao, Xiuyi Zhou, Youtao Zhang, Jun Yang Dept. of Electrical and Computer Engineering Dept. of Computer Science University
More informationParallel Programming
Parallel Programming Parallel Architectures Diego Fabregat-Traver and Prof. Paolo Bientinesi HPAC, RWTH Aachen fabregat@aices.rwth-aachen.de WS15/16 Parallel Architectures Acknowledgements Prof. Felix
More informationLeveraging Torus Topology with Deadlock Recovery for Cost-Efficient On-Chip Network
Leveraging Torus Topology with Deadlock ecovery for Cost-Efficient On-Chip Network Minjeong Shin, John Kim Department of Computer Science KAIST Daejeon, Korea {shinmj, jjk}@kaist.ac.kr Abstract On-chip
More information- Hubs vs. Switches vs. Routers -
1 Layered Communication - Hubs vs. Switches vs. Routers - Network communication models are generally organized into layers. The OSI model specifically consists of seven layers, with each layer representing
More information8 Gbps CMOS interface for parallel fiber-optic interconnects
8 Gbps CMOS interface for parallel fiberoptic interconnects Barton Sano, Bindu Madhavan and A. F. J. Levi Department of Electrical Engineering University of Southern California Los Angeles, California
More informationInterconnection Network of OTA-based FPAA
Chapter S Interconnection Network of OTA-based FPAA 5.1 Introduction Aside from CAB components, a number of different interconnect structures have been proposed for FPAAs. The choice of an intercmmcclion
More informationChapter 2 Heterogeneous Multicore Architecture
Chapter 2 Heterogeneous Multicore Architecture 2.1 Architecture Model In order to satisfy the high-performance and low-power requirements for advanced embedded systems with greater fl exibility, it is
More informationA Dynamic Link Allocation Router
A Dynamic Link Allocation Router Wei Song and Doug Edwards School of Computer Science, the University of Manchester Oxford Road, Manchester M13 9PL, UK {songw, doug}@cs.man.ac.uk Abstract The connection
More informationGlobal Foundation Services
Global Foundation Services Introduction I work in Global Foundations Services (GFS) Lead R&D for Microsoft s next-generation end-to-end solutions for the cloud infrastructure We take a long term view
More informationA Detailed and Flexible Cycle-Accurate Network-on-Chip Simulator
A Detailed and Flexible Cycle-Accurate Network-on-Chip Simulator Nan Jiang Stanford University qtedq@cva.stanford.edu James Balfour Google Inc. jbalfour@google.com Daniel U. Becker Stanford University
More informationARCHITECTING EFFICIENT INTERCONNECTS FOR LARGE CACHES
... ARCHITECTING EFFICIENT INTERCONNECTS FOR LARGE CACHES WITH CACTI 6.0... INTERCONNECTS PLAY AN INCREASINGLY IMPORTANT ROLE IN DETERMINING THE POWER AND PERFORMANCE CHARACTERISTICS OF MODERN PROCESSORS.
More informationParallel Programming Survey
Christian Terboven 02.09.2014 / Aachen, Germany Stand: 26.08.2014 Version 2.3 IT Center der RWTH Aachen University Agenda Overview: Processor Microarchitecture Shared-Memory
More informationScaling 10Gb/s Clustering at Wire-Speed
Scaling 10Gb/s Clustering at Wire-Speed InfiniBand offers cost-effective wire-speed scaling with deterministic performance Mellanox Technologies Inc. 2900 Stender Way, Santa Clara, CA 95054 Tel: 408-970-3400
More informationOpenSoC Fabric: On-Chip Network Generator
OpenSoC Fabric: On-Chip Network Generator Using Chisel to Generate a Parameterizable On-Chip Interconnect Fabric Farzad Fatollahi-Fard, David Donofrio, George Michelogiannakis, John Shalf MODSIM 2014 Presentation
More informationSAN Conceptual and Design Basics
TECHNICAL NOTE VMware Infrastructure 3 SAN Conceptual and Design Basics VMware ESX Server can be used in conjunction with a SAN (storage area network), a specialized high speed network that connects computer
More informationCCNA R&S: Introduction to Networks. Chapter 5: Ethernet
CCNA R&S: Introduction to Networks Chapter 5: Ethernet 5.0.1.1 Introduction The OSI physical layer provides the means to transport the bits that make up a data link layer frame across the network media.
More informationNetwork Architecture and Topology
1. Introduction 2. Fundamentals and design principles 3. Network architecture and topology 4. Network control and signalling 5. Network components 5.1 links 5.2 switches and routers 6. End systems 7. End-to-end
More information- Nishad Nerurkar. - Aniket Mhatre
- Nishad Nerurkar - Aniket Mhatre Single Chip Cloud Computer is a project developed by Intel. It was developed by Intel Lab Bangalore, Intel Lab America and Intel Lab Germany. It is part of a larger project,
More informationIntroduction to Infiniband. Hussein N. Harake, Performance U! Winter School
Introduction to Infiniband Hussein N. Harake, Performance U! Winter School Agenda Definition of Infiniband Features Hardware Facts Layers OFED Stack OpenSM Tools and Utilities Topologies Infiniband Roadmap
More informationOC By Arsene Fansi T. POLIMI 2008 1
IBM POWER 6 MICROPROCESSOR OC By Arsene Fansi T. POLIMI 2008 1 WHAT S IBM POWER 6 MICROPOCESSOR The IBM POWER6 microprocessor powers the new IBM i-series* and p-series* systems. It s based on IBM POWER5
More informationRecursive Partitioning Multicast: A Bandwidth-Efficient Routing for Networks-On-Chip
Recursive Partitioning Multicast: A Bandwidth-Efficient Routing for Networks-On-Chip Lei Wang, Yuho Jin, Hyungjun Kim and Eun Jung Kim Department of Computer Science and Engineering Texas A&M University
More informationLatency Considerations for 10GBase-T PHYs
Latency Considerations for PHYs Shimon Muller Sun Microsystems, Inc. March 16, 2004 Orlando, FL Outline Introduction Issues and non-issues PHY Latency in The Big Picture Observations Summary and Recommendations
More informationArchitectures and Platforms
Hardware/Software Codesign Arch&Platf. - 1 Architectures and Platforms 1. Architecture Selection: The Basic Trade-Offs 2. General Purpose vs. Application-Specific Processors 3. Processor Specialisation
More informationPacketization and routing analysis of on-chip multiprocessor networks
Journal of Systems Architecture 50 (2004) 81 104 www.elsevier.com/locate/sysarc Packetization and routing analysis of on-chip multiprocessor networks Terry Tao Ye a, *, Luca Benini b, Giovanni De Micheli
More informationInterconnection Networks
CMPT765/408 08-1 Interconnection Networks Qianping Gu 1 Interconnection Networks The note is mainly based on Chapters 1, 2, and 4 of Interconnection Networks, An Engineering Approach by J. Duato, S. Yalamanchili,
More informationSeaMicro SM10000-64 Server
SeaMicro SM10000-64 Server Building Datacenter Servers Using Cell Phone Chips Ashutosh Dhodapkar, Gary Lauterbach, Sean Lie, Ashutosh Dhodapkar, Gary Lauterbach, Sean Lie, Dhiraj Mallick, Jim Bauman, Sundar
More informationSOC architecture and design
SOC architecture and design system-on-chip (SOC) processors: become components in a system SOC covers many topics processor: pipelined, superscalar, VLIW, array, vector storage: cache, embedded and external
More informationHow PCI Express Works (by Tracy V. Wilson)
1 How PCI Express Works (by Tracy V. Wilson) http://computer.howstuffworks.com/pci-express.htm Peripheral Component Interconnect (PCI) slots are such an integral part of a computer's architecture that
More informationInfrastructure Components: Hub & Repeater. Network Infrastructure. Switch: Realization. Infrastructure Components: Switch
Network Infrastructure or building computer networks more complex than e.g. a short bus, some additional components are needed. They can be arranged hierarchically regarding their functionality: Repeater
More informationIntel Ethernet Switch Load Balancing System Design Using Advanced Features in Intel Ethernet Switch Family
Intel Ethernet Switch Load Balancing System Design Using Advanced Features in Intel Ethernet Switch Family White Paper June, 2008 Legal INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL
More informationClocking. Figure by MIT OCW. 6.884 - Spring 2005 2/18/05 L06 Clocks 1
ing Figure by MIT OCW. 6.884 - Spring 2005 2/18/05 L06 s 1 Why s and Storage Elements? Inputs Combinational Logic Outputs Want to reuse combinational logic from cycle to cycle 6.884 - Spring 2005 2/18/05
More informationDriving force. What future software needs. Potential research topics
Improving Software Robustness and Efficiency Driving force Processor core clock speed reach practical limit ~4GHz (power issue) Percentage of sustainable # of active transistors decrease; Increase in #
More informationICRI-CI Retreat Architecture track
ICRI-CI Retreat Architecture track Uri Weiser June 5 th 2015 - Funnel: Memory Traffic Reduction for Big Data & Machine Learning (Uri) - Accelerators for Big Data & Machine Learning (Ran) - Machine Learning
More informationNetwork Design. Yiannos Mylonas
Network Design Yiannos Mylonas Physical Topologies There are two parts to the topology definition: the physical topology, which is the actual layout of the wire (media), and the logical topology, which
More informationQuality of Service (QoS) for Asynchronous On-Chip Networks
Quality of Service (QoS) for synchronous On-Chip Networks Tomaz Felicijan and Steve Furber Department of Computer Science The University of Manchester Oxford Road, Manchester, M13 9PL, UK {felicijt,sfurber}@cs.man.ac.uk
More informationTesting of Digital System-on- Chip (SoC)
Testing of Digital System-on- Chip (SoC) 1 Outline of the Talk Introduction to system-on-chip (SoC) design Approaches to SoC design SoC test requirements and challenges Core test wrapper P1500 core test
More informationApplying the Benefits of Network on a Chip Architecture to FPGA System Design
Applying the Benefits of on a Chip Architecture to FPGA System Design WP-01149-1.1 White Paper This document describes the advantages of network on a chip (NoC) architecture in Altera FPGA system design.
More informationCommunicating with devices
Introduction to I/O Where does the data for our CPU and memory come from or go to? Computers communicate with the outside world via I/O devices. Input devices supply computers with data to operate on.
More informationComputer Network. Interconnected collection of autonomous computers that are able to exchange information
Introduction Computer Network. Interconnected collection of autonomous computers that are able to exchange information No master/slave relationship between the computers in the network Data Communications.
More informationDesigning HP SAN Networking Solutions
Exam : HP0-J65 Title : Designing HP SAN Networking Solutions Version : Demo 1 / 6 1.To install additional switches, you must determine the ideal ISL ratio. Which ISL ratio range is recommended for less
More informationMaximizing Server Storage Performance with PCI Express and Serial Attached SCSI. Article for InfoStor November 2003 Paul Griffith Adaptec, Inc.
Filename: SAS - PCI Express Bandwidth - Infostor v5.doc Maximizing Server Storage Performance with PCI Express and Serial Attached SCSI Article for InfoStor November 2003 Paul Griffith Adaptec, Inc. Server
More informationSwitched Interconnect for System-on-a-Chip Designs
witched Interconnect for ystem-on-a-chip Designs Abstract Daniel iklund and Dake Liu Dept. of Physics and Measurement Technology Linköping University -581 83 Linköping {danwi,dake}@ifm.liu.se ith the increased
More informationCommunication Networks. MAP-TELE 2011/12 José Ruela
Communication Networks MAP-TELE 2011/12 José Ruela Network basic mechanisms Introduction to Communications Networks Communications networks Communications networks are used to transport information (data)
More informationInfiniBand Clustering
White Paper InfiniBand Clustering Delivering Better Price/Performance than Ethernet 1.0 Introduction High performance computing clusters typically utilize Clos networks, more commonly known as Fat Tree
More informationEfficient Built-In NoC Support for Gather Operations in Invalidation-Based Coherence Protocols
Universitat Politècnica de València Master Thesis Efficient Built-In NoC Support for Gather Operations in Invalidation-Based Coherence Protocols Author: Mario Lodde Advisor: Prof. José Flich Cardo A thesis
More informationStorage at a Distance; Using RoCE as a WAN Transport
Storage at a Distance; Using RoCE as a WAN Transport Paul Grun Chief Scientist, System Fabric Works, Inc. (503) 620-8757 pgrun@systemfabricworks.com Why Storage at a Distance the Storage Cloud Following
More informationReconfigurable Computing. Reconfigurable Architectures. Chapter 3.2
Reconfigurable Architectures Chapter 3.2 Prof. Dr.-Ing. Jürgen Teich Lehrstuhl für Hardware-Software-Co-Design Coarse-Grained Reconfigurable Devices Recall: 1. Brief Historically development (Estrin Fix-Plus
More informationThe OSI & Internet layering models
CSE 123 Computer Networks Fall 2009 Lecture 2: Protocols & Layering Today What s a protocol? Organizing protocols via layering Encoding layers in packets The OSI & Internet layering models The end-to-end
More informationPCI Express* Ethernet Networking
White Paper Intel PRO Network Adapters Network Performance Network Connectivity Express* Ethernet Networking Express*, a new third-generation input/output (I/O) standard, allows enhanced Ethernet network
More informationA Heuristic for Peak Power Constrained Design of Network-on-Chip (NoC) based Multimode Systems
A Heuristic for Peak Power Constrained Design of Network-on-Chip (NoC) based Multimode Systems Praveen Bhojwani, Rabi Mahapatra, un Jung Kim and Thomas Chen* Computer Science Department, Texas A&M University,
More informationIntroduction to System-on-Chip
Introduction to System-on-Chip COE838: Systems-on-Chip Design http://www.ee.ryerson.ca/~courses/coe838/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer Engineering Ryerson University
More informationOpenSPARC T1 Processor
OpenSPARC T1 Processor The OpenSPARC T1 processor is the first chip multiprocessor that fully implements the Sun Throughput Computing Initiative. Each of the eight SPARC processor cores has full hardware
More informationAnnotation to the assignments and the solution sheet. Note the following points
Computer rchitecture 2 / dvanced Computer rchitecture Seite: 1 nnotation to the assignments and the solution sheet This is a multiple choice examination, that means: Solution approaches are not assessed
More informationCHIP MULTIPROCESSOR COHERENCE AND INTERCONNECT SYSTEM DESIGN
CHIP MULTIPROCESSOR COHERENCE AND INTERCONNECT SYSTEM DESIGN by Natalie D. Enright Jerger A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Electrical
More informationPeak Power Control for a QoS Capable On-Chip Network
Peak Power Control for a QoS Capable On-Chip Network Yuho Jin 1, Eun Jung Kim 1, Ki Hwan Yum 2 1 Texas A&M University 2 University of Texas at San Antonio {yuho,ejkim}@cs.tamu.edu yum@cs.utsa.edu Abstract
More informationSCSI vs. Fibre Channel White Paper
SCSI vs. Fibre Channel White Paper 08/27/99 SCSI vs. Fibre Channel Over the past decades, computer s industry has seen radical change in key components. Limitations in speed, bandwidth, and distance have
More informationRedundancy in enterprise storage networks using dual-domain SAS configurations
Redundancy in enterprise storage networks using dual-domain SAS configurations technology brief Abstract... 2 Introduction... 2 Why dual-domain SAS is important... 2 Single SAS domain... 3 Dual-domain
More informationLizy Kurian John Electrical and Computer Engineering Department, The University of Texas as Austin
BUS ARCHITECTURES Lizy Kurian John Electrical and Computer Engineering Department, The University of Texas as Austin Keywords: Bus standards, PCI bus, ISA bus, Bus protocols, Serial Buses, USB, IEEE 1394
More informationScalability and Classifications
Scalability and Classifications 1 Types of Parallel Computers MIMD and SIMD classifications shared and distributed memory multicomputers distributed shared memory computers 2 Network Topologies static
More informationChapter 2. Multiprocessors Interconnection Networks
Chapter 2 Multiprocessors Interconnection Networks 2.1 Taxonomy Interconnection Network Static Dynamic 1-D 2-D HC Bus-based Switch-based Single Multiple SS MS Crossbar 2.2 Bus-Based Dynamic Single Bus
More information