High Performance Computer Architecture




Volker Lindenstruth
Lehrstuhl für Hochleistungsrechner Architektur
Ruth-Moufang Str. 1
Email: ti@compeng.de
URL: www.compeng.de
Phone: 798-44100
22 April 2010
Copyright Goethe Uni, all rights reserved

Goals for the course

In-depth understanding of the architecture and design of modern high-performance computers and their efficient programming:
  technology forces
  fundamental architectural issues
    » naming, replication, communication, synchronization
  basic design techniques
    » cache coherence, protocols, networks, pipelining, methods of evaluation
  underlying engineering trade-offs

Programming models and methods:
  from moderate to very large scale
  across the hardware/software boundary
  learn to use parallel computers (projects in class)
  learn to use the message-passing (MP) and shared-address-space (SAS) programming models

Contents

Fundamentals and Introduction
  Why Parallel Architecture; Evolution of Parallel Machines; Parallel Software Basics; Programming for Performance

Scaling Parallel Programs for Multiprocessors
  Vectorization, Methodology and Examples; Working Sets, Cache Sizes, and Node Granularity Issues for Large-Scale Multiprocessors; Workload-Driven Architectural Evaluation; Scaling

Small-Scale Shared Memory
  Cache Coherence; Memory Consistency; Snooping Protocols; Synchronization; Design Tradeoffs; Implementation

Large-Scale Scalable Distributed-Memory Multiprocessors
  Realizing Programming Models on Large-Scale Distributed-Memory Multiprocessors; Design of Large-Scale Distributed-Memory Multiprocessors; Architecture of Intel Paragon; Design of Large-Scale Shared Physical Address Space; Architecture of T3D; Large-Scale Shared Address Space Multiprocessors; Memory Consistency Models; Large-Scale CC Designs; Case Studies: Large-Scale CC-NUMA Machines, COMA

Latency Tolerance
  In message passing and distributed shared memory; block data transfers; long-latency events; precommunication in SAS; multithreading

Scalable Interconnection Networks
  Design Space of Interconnection Networks; Routing; Synchronization; Case Studies: Myrinet, SCI, Reflective Memories

Cluster Computing
  Applications, distributed mass storage, fault tolerance, autonomous computing

Literature

In preparation for this course the following books have been used:
  D. Culler, J.P. Singh, A. Gupta, Parallel Computer Architecture: A Hardware/Software Approach, Morgan Kaufmann Publishers, Inc., ISBN 1-55860-343-3
  G. Coulouris et al., Distributed Systems, 3rd ed., Addison Wesley, 2001
  A. Tanenbaum, M. v. Steen, Distributed Systems, Prentice Hall, 2002
  N.A. Lynch, Distributed Algorithms, Morgan Kaufmann Publ., 1996
  R. Guerraoui, L. Rodrigues, Introduction to Reliable Distributed Programming, Springer, 2006
  G. Weikum, G. Vossen, Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control, Morgan Kaufmann Publ.
  J.L. Hennessy, D. Patterson, Computer Architecture: A Quantitative Approach, ISBN 1-55880-069-8

Acknowledgement

This lecture is based on the book and corresponding course by Prof. Dr. David Culler, UC Berkeley. The vast majority of the slides and course material have been borrowed from his course. Additional material has been taken from Prof. Dr. Alexander Reinefeld, ZIB Berlin. Further contributors are Mathias Bach, Mathias Kretz and other members of the chair.