Introduction to Multi-Core

1 Introduction to Multi-Core Baskaran Ganesan Sr. Design Engineer Digital Enterprise Group, Intel Corporation Foundation for Advancement of Education and Research (FAER) 1

2 Topics
1. CPU (semiconductor) HISTORY (SESSION-1)
   a. Moore's Law
   b. Transistor scaling
   c. Scaling limitations & impact
   d. What then? - Dual core
   e. The new era
2. ARCHITECTURE (SESSION-2)
   a. Core Architecture - Core basics, Platform architecture, Core architecture
   b. Multi-core architecture
   c. Multi-core challenges
   d. Closing notes

3 Moore's Law

4 Moore's law at work [Diagram labels: Transistor Size, Transistor Count, CPU Arch technology, Manufacturing technology, Compute Power, SW/IT eco-system, Volume Market, CPU Cost]

5 Historical Driving Forces: Shrinking Geometry, Increased Frequency [Chart: Feature Size (um) and Frequency (MHz) over time; annotations: Processor, 2300 Transistors; Processor, IBM PC; i386 Processor; Pentium Processor; 32-bit; 3.1M transistors; 2005 Montecito, 1.7B Transistors]

6 Scale Factors (loosely defined)
- Voltage scale-factor: Rate at which the transistor voltage decreases with respect to a change in transistor dimensions
- Frequency scale-factor: Rate at which the transistor frequency increases with respect to a change in transistor dimensions
- Cost scale-factor: Rate at which the per-transistor cost decreases with respect to a change in transistor dimensions
- Count scale-factor: Rate at which the transistor count increases with respect to a change in transistor dimensions

7 Scaling: More data

8 The Act of Balancing
- Delivered Performance = Instructions Per Cycle (IPC) * Frequency
- Goal is higher performance and lower power
- Power ∝ C_dynamic * V * V * Frequency
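The two relations on this slide can be tried out numerically. The sketch below is illustrative only: the function names and the sample values for IPC, capacitance, voltage, and frequency are assumptions, not figures from the presentation.

```python
def delivered_performance(ipc, frequency_hz):
    """Performance = IPC * Frequency (instructions per second)."""
    return ipc * frequency_hz


def dynamic_power(c_dynamic, voltage, frequency_hz):
    """Dynamic power is proportional to C_dynamic * V * V * Frequency."""
    return c_dynamic * voltage * voltage * frequency_hz


# Hypothetical baseline design (values chosen only for illustration).
base_perf = delivered_performance(ipc=2.0, frequency_hz=2.0e9)
base_power = dynamic_power(c_dynamic=1.0e-9, voltage=1.2, frequency_hz=2.0e9)

# Option A: raise frequency by 20% at the same voltage and IPC.
perf_a = delivered_performance(2.0, 2.4e9) / base_perf
power_a = dynamic_power(1.0e-9, 1.2, 2.4e9) / base_power

# Option B: raise IPC by 20% at the same voltage and frequency
# (assuming, optimistically, that C_dynamic does not grow).
perf_b = delivered_performance(2.4, 2.0e9) / base_perf
power_b = dynamic_power(1.0e-9, 1.2, 2.0e9) / base_power

print(f"+20% frequency: perf x{perf_a:.2f}, power x{power_a:.2f}")
print(f"+20% IPC:       perf x{perf_b:.2f}, power x{power_b:.2f}")
```

Under these assumptions both options deliver the same performance gain, but only the frequency increase costs extra dynamic power. In practice a higher frequency usually also needs a higher voltage, which makes the gap larger.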

9 Scaling at its best [Comparison: 386 Processor (~1.2 SPECint) vs. Pentium 4 Processor (55 Million 0.13µ transistors, 1249 SPECint2000); ratios shown: 200x, 200x/11x, 1000x]

10 Architectural Innovations
- Serial, sequential execution
- Overlapped execution (pipelining)
- Multi-stage, deep pipelining
- Control-speculative execution
- Data-speculative execution
- Super-scalar execution
- Out-of-order execution
- Vector computing
- Addressing extensions
- Application specific instructions
- Multi-level on-chip caching
- Memory disambiguation
- Register renaming
- Score-boarding
- Hardware data prefetching
Many decades of computer architecture focused on Instruction-Level Parallelism (ILP) enhancement

11 The Challenges: Power Limitations, Diminishing Voltage Scaling
[Chart: Supply Voltage (V) vs. process generation (0.7um, 0.5um, 0.35um, 0.25um, 0.18um, 0.13um, 90nm, 65nm, 45nm, 30nm); annotations: "~30%", "slowing"]
- Power = Capacitance x Voltage^2 x Frequency
- also Power ~ Voltage^3

12 Heat Dissipation [Chart: Power Density (W/cm2), projected, for 386, 486 and Pentium processors, against reference points: Hot Plate, Nuclear Reactor, Rocket Nozzle, Sun's Surface]

13 What then? [Bar chart: Performance and Power, both 1.00x at Max Frequency]

14 Over-clocking [Bar chart: Over-clocked (+20%): Performance 1.13x, Power 1.73x; Max Frequency: 1.00x]

15 Under-clocking [Bar chart: Over-clocked (+20%): Performance 1.13x, Power 1.73x; Max Frequency: 1.00x; Under-clocked (-20%): Performance 0.87x, Power 0.51x]

16 Multi-Core Energy-Efficient Performance [Bar chart, relative single-core frequency and Vcc: Over-clocked (+20%): Performance 1.13x, Power 1.73x; Max Frequency: 1.00x; Dual-core (-20%): Performance 1.73x, Power 1.02x]
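The power figures on slides 14 to 16 are consistent with a simple model in which supply voltage is scaled together with frequency, so that dynamic power goes roughly as the cube of the frequency ratio. The sketch below checks that arithmetic. The cubic model is an assumption made here for illustration, and the single-core performance ratios (1.13x and 0.87x) are copied from the slides rather than derived.

```python
def relative_power(freq_ratio, cores=1):
    """Assumed model: V scales with f, so P ~ C * V^2 * f ~ f^3 per core."""
    return cores * freq_ratio ** 3


# Single-core performance ratios taken from the slides, keyed by frequency ratio.
SLIDE_PERF = {1.2: 1.13, 1.0: 1.00, 0.8: 0.87}

over_clocked = relative_power(1.2)          # about 1.73
under_clocked = relative_power(0.8)         # about 0.51
dual_core = relative_power(0.8, cores=2)    # about 1.02

# Upper bound on dual-core throughput, assuming both cores stay busy.
dual_core_perf = 2 * SLIDE_PERF[0.8]        # about 1.74

print(f"over-clocked (+20%):  power x{over_clocked:.2f}, perf x{SLIDE_PERF[1.2]:.2f}")
print(f"under-clocked (-20%): power x{under_clocked:.2f}, perf x{SLIDE_PERF[0.8]:.2f}")
print(f"dual-core (-20%):     power x{dual_core:.2f}, perf x{dual_core_perf:.2f}")
```

The dual-core throughput here assumes a perfectly parallel workload. Real workloads with serial sections or shared-resource contention would see less.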

17 Dual core with voltage scaling
RULE OF THUMB: A 15% Reduction In Voltage Yields
- Frequency Reduction: 15%
- Power Reduction: 45%
- Performance Reduction: 10%
SINGLE CORE: Area = 1, Voltage = 1, Freq = 1, Power = 1, Perf = 1
DUAL CORE: Area = 2, Voltage = 0.85, Freq = 0.85, Power = 1, Perf = ~1.8
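The dual-core column follows from applying the rule-of-thumb ratios to each core and adding the cores up. The sketch below is a minimal check of that arithmetic; the `scale_out` helper and the assumption that power and throughput add linearly across cores are illustrative, not part of the original slide.

```python
# Rule-of-thumb ratios from the slide for a 15% voltage reduction.
FREQ_RATIO = 0.85    # frequency reduced by 15%
POWER_RATIO = 0.55   # power reduced by 45%
PERF_RATIO = 0.90    # performance reduced by 10%


def scale_out(cores, power_per_core, perf_per_core):
    """Assume power and throughput add up linearly across identical cores."""
    return cores * power_per_core, cores * perf_per_core


single_power, single_perf = scale_out(1, 1.0, 1.0)
dual_power, dual_perf = scale_out(2, POWER_RATIO, PERF_RATIO)

print(f"single core: power {single_power:.2f}, perf {single_perf:.2f}")
print(f"dual core:   power {dual_power:.2f}, perf {dual_perf:.2f}")
```

With these ratios the dual-core design comes out at about 1.1x power and 1.8x performance, which the slide rounds to "Power = 1, Perf = ~1.8".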

18 Intel: Dual & Quad Cores

19 A New Era
- THE OLD: Performance Equals Frequency; Unconstrained Power; Voltage Scaling
- THE NEW: Performance Equals IPC; Multi-Core; Power Efficiency; Microarchitecture Advancements

20 Trade-off equations
- Power is costly; transistors are relatively cheap
- Frequency alone is not important; efficiency IS
- Performance-per-watt is critical; per-core performance is less so
- Computation is relatively easy; memory accesses are NOT

21 Q & A

22 Topics
1. CPU (semiconductor) HISTORY (SESSION-1)
   a. Moore's Law
   b. Transistor scaling
   c. Scaling limitations & impact
   d. What then? - Dual core
   e. The new era
2. ARCHITECTURE (SESSION-2)
   a. Core Architecture - Core basics, Platform architecture, Core architecture
   b. Multi-core architecture
   c. Multi-core challenges
   d. Closing notes

23 Typical PC Architecture

24 Processor Resources
- Caches: L0, L1, L2, etc. (different levels of caches)
- General Purpose Registers (for SW programming)
- Segment Registers & TLB (for memory management)
- FP registers, XMM registers
- System Flags
- Control and Data registers, Debug registers, MSRs
- Many more

25 CMP/SMP/HT
- CMP: Chip Multi Processing, refers to multiple physical core engines that have unique resources
  - Unique: L0/L1 Cache, TLBs, Instruction Pointer, GP Regs
  - Shared: L2 Cache
- SMP: Refers to multiple threads that share all resources (time muxed)
  - Shared: L0/L1/L2 Caches, TLBs
  - Unique: Instruction Pointer, GP Regs
- Hyper Threading: Refers to multiple threads that share more resources (L0/L1 Cache, for example); may or may not be part of a CMP core
- SW Threading: Application (SW) level threading of processes on one or more physical core engines

26 Core Architecture (Prescott)

27 Core Architecture (Xeon Dual Core)

28 Multi-core platform (Freescale: embedded)

29 Multi-Core platform (RMI-XLR: embedded)

30 Tilera 64 core CPU

31 Tilera Platform

32 Tera-scale Computing [Chart: Performance (IPS = Instructions per second; KIPS, MIPS, GIPS, TIPS) vs. Dataset Size (Kilobytes, Megabytes, Gigabytes, Terabytes); regions: Single Core, Multi-core, Tera-scale; workloads: Text, Multi-Media, 3D & Video, RMS (Recognition, Mining, Synthesis); RMS Applications: Entertainment, Learning & Travel, Personal Media Creation and Management, Health]

33 Intel Polaris (80-core)

35 Multi-Core: what next?

36 Connecting multiple cores

37 Platform Architecture (multi-core) [Diagram label: External I/F]

38 Multi-core: Architectural Challenges
- Instruction-level parallelism vs. Thread-level parallelism tradeoffs and balance
- Shared resource management (functional units, caches, TLB, BTB)
- Multi-threading vs. Multi-core tradeoffs
- On and Off-chip bandwidth requirements
- Latencies (execution, cache, and memory) reduction
- Memory Coherence/Consistency (for high speed on-die cache hierarchies)
- Multiple domains (and crossing) in clocking, voltage, reset, ...
- Partitioning resources (between threads/cores)
- Fault tolerance (at device, storage, execution, core level) (aka reliability)
- On-die interconnect (optimized along latency, bw, modularity, power, ...)
- Integration (of system components, and/or fixed function devices)

39 Multi-core: Design Challenges
- Design Complexity, Productivity: Tools / Methods advance, but at a slower rate than Moore's Law; replicating cores improves productivity
- Visibility for Test & Debug: Pin Bandwidth/Transistor continues to decline; shrinking dimensions, increasing speeds; increased test time adding to cost
- Power: Power Delivery (di/dt of Amps/nano-second); Thermals (overall power and thermal density)

40 Multi-core: Eco-system challenges
- Underlying Software assumptions on resource sharing
- Lack of standard mechanisms to share resource sharing info between HW and OS
- Lack of Resource sharing aware SW: Compilers, Schedulers, Configuration/Management (Power!), etc.
- Legacy SW architectural requirements left on Multi-Core CPUs: Compatibility requirements
- Many more unknowns (to CPU Design world)

41 Multi-core: Software Challenges
- Scalability of O/S Data Structures and Policies: Synchronization and locking, Scheduling, Process management, Data structure sizing and management limitations, Threading granularity and primitives
- Memory Hierarchy Awareness: Impact of coherency policy, Efficiency of Data-sharing and Process migration effects, SW visibility to High speed on-die interconnect, SW control of Cache hierarchy, NUCA Awareness
- High Bandwidth I/O Support: Light weight Interrupts, Data movement and transformation engines, I/O Affinity
- Algorithms, Programming Languages, Compilers, Operating Systems, Architectures, and Libraries are not ready for 100s of CPUs / chip

42 More than the cores

43 Closing notes
- Single and Multi-core architectures presented
- Multi-Core CPU is the next generation CPU Architecture
- 2Core and Intel Quad-Core designs plenty on market already; many more are on their way
- Several old paradigms ineffective; several new problems to be addressed
- Chip Level Multiprocessing and large caches can exploit Moore's Law
- Thread/Core count in future microprocessor systems to increase
- Eco-system immature/non-existent
- Numerous domains in arch/design awaiting research & innovation, and here is where you come in!!!
- Multi-Core Architecture and Design ready for research, development and innovation!

44 Acknowledgements
- Gautam Doshi [Principal Engineer, Digital Enterprise Group]
- Ajay Bhatt [Intel Fellow, Digital Enterprise Group]
- Dileep Bhandarkar [Architect, Digital Enterprise Group]
- Sunit Tyagi [Sr. Principal Engineer, Digital Enterprise Group]
- and countless foil-wares

45 Resources: Intel Tech/Research; Energy Efficient Performance; Intel Core Microarchitecture; Dual-core processor; Multi/Many Core; Intel Platforms; Threading

46 Q & A

47 Backup: Core uarch

48 Intel Core Microarchitecture
- Low Power, High Performance, Scalable
- Intel Wide Dynamic Execution; Intel Intelligent Power Capability; Intel Advanced Smart Cache; Intel Smart Memory Access; Intel Advanced Digital Media Boost
- 65nm: Server Optimized (Xeon) Woodcrest; Desktop Optimized (Core2 Duo) Conroe; Mobile Optimized (Core2 Duo) Merom
*Graphics not representative of actual die photo or relative size

49 Intel Intelligent Power Capability
- [Diagram labels: Process, Coarse Grained, Ultra Fine Grained, Transistor; 65nm, Strained Silicon, Low-K Dielectric, More Metal Layers; Aggressive Clock Gating, Enhanced Speed-Step; Low VCC Arrays, Blocks Controlled Via Sleep Transistors; Low Leakage Transistors, Sleep Transistors]
- ADVANTAGE (Energy): Mobile-Level Power Management; Energy Efficient Performance
*Graphics not representative of actual die photo or relative size

50 Intel Wide Dynamic Execution
- EACH CORE: EFFICIENT 14 STAGE PIPELINE; DEEPER BUFFERS; 4 WIDE - DECODE TO EXECUTE; 4 WIDE - MICRO-OP EXECUTE; MICRO and MACRO FUSION; ENHANCED ALUs
- [Pipeline diagram, shown for both CORE 1 and CORE 2: INSTRUCTION FETCH AND PRE-DECODE, INSTRUCTION QUEUE, DECODE, RENAME / ALLOC, RETIREMENT UNIT (REORDER BUFFER), SCHEDULERS, EXECUTE]
- ADVANTAGE (Perf, Energy): 33% Wider Execution over Previous Gen; Comprehensive Advancements Enabled In Each Core

51 Intel Wide Dynamic Execution: Micro and Macro Fusion
- MACRO FUSION EXAMPLE: CMP+JMP IN 1 CLOCK
- [Diagram, WITHOUT MACRO FUSION: INSTRUCTION 1, 2 and 3 pass through DECODE as INTERNAL INST 1, 2 and 3 and through EXECUTE as COMPLETED INST 1, 2 and 3]
- [Diagram, WITH MACRO FUSION: DECODE produces INTERNAL INST 1 and a COMBINED INST 2 & 3; other labels: ucode ROM, EXECUTE]
- ADVANTAGE (Perf, Energy): Instruction Load Reduced ~15%**; Micro-Ops Reduced ~10%**
*Graphics not representative of actual die photo or relative size
** Workload dependent

52 Intel Advanced Smart Cache: Dynamic L2 Cache Usage
- Core Microarchitecture: Shared L2, Dynamically, Bi-Directionally Available; Decreased Traffic
- Independent L2: Not Shareable; Increased Traffic
- [Diagram: CORE 1 and CORE 2, each with its own L1 CACHE, in both configurations]
- ADVANTAGE (Perf, Energy): Higher Cache Hit Rate; Reduced BUS Traffic; Lower Latency to Data
*Graphics not representative of actual die photo or relative size

53 Intel Smart Memory Access: Hardware-based Memory Disambiguation
- Instruction sequence in both cases: INST 1 STORE [X], then INST 2 LOAD [Y]
- Core Microarchitecture: HARDWARE Mem. Dis. Predictor; Inst. 2 Load Can Occur Before Inst. 1 Store
- Other: Inst. 2 Must Wait For Inst. 1 Store To Complete (STALL)
- [Diagram stages: DECODE/SCHEDULE, EXECUTE; labels: IN ORDER, OUT OF ORDER]
- ADVANTAGE (Perf, Energy): Higher Utilization of Pipeline; Masks latency to data access; Higher Performance

54 Intel Advanced Digital Media Boost: Single Cycle SSE In Each Core, Fusion Support
- SSE Operation (SSE/SSE2/SSE3) on 128-bit operands (bits 127 to 0): SOURCE X4 X3 X2 X1, DEST Y4 Y3 Y2 Y1
- Core µarch: CLOCK CYCLE 1: X4opY4, X3opY3, X2opY2, X1opY1
- Previous: CLOCK CYCLE 1: X2opY2, X1opY1; CLOCK CYCLE 2: X4opY4, X3opY3
- ADVANTAGE (Perf, Energy): Increased Performance; 128 bit Single Cycle in each core; Improved Energy Efficiency
*Graphics not representative of actual die photo or relative size

55 Backup: Next Gen Technologies

56 Traditional Operating Systems (Time-mux)

57 What is Virtualization?
- Without VMs: Single OS owns all hardware resources [Diagram: Apps on one Operating System on Physical Host Hardware]
- With VMs: Multiple OSes share hardware resources [Diagram: VM 0 ... VM 1, each with Apps on a Guest OS, on a VM Monitor (VMM), "a new layer of software", on Physical Host Hardware: Processors, Memory, Graphics, Network, Storage, Keyboard / Mouse]
- Virtualization enables multiple operating systems to run on the same platform

58 Types of Virtualization
- Hosted: VMM launched from within an OS, e.g., VMplayer, WSX, GSX, Virtual PC, Virtual Server; cheap but lower performance
- Hypervisor: A bootable layer on BIOS
  - Thick: embeds all the drivers, e.g., ESX
  - Thin: has a service VM, e.g., Xen derivatives
- Virtual Appliances: dedicated virtual machines, e.g., MojoPC

59 Intel Virtualization Technology (VT)
- [Diagram: Apps on multiple OSes on a Virtual Machine Monitor on Processors with Intel Virtualization Technology]
- Intel VT: First to market with native virtualization support; 1st VT-based SW Solutions; Broadest HW and SW ecosystem support
- Core Microarchitecture based systems: Significant increase in performance and improved VT performance over all segments
  - Mobile: Intel Core 2 Duo Mobile Processor for Intel Centrino Duo Mobile Technology
  - Desktop: Intel Core 2 Duo Desktop Processor E6000 sequence
  - Server: Dual and Quad Core Intel Xeon Processor 5000 series
- Get More Done On Every Server; Get More Capabilities On Client

60 Trusted Execution Technology

61 LT Hardware Ingredients: LT = CPU + Chipset + TPM + Protected I/O
- CPU Extensions: Enables domain separation; Sets policy for protected memory
- Protected Graphics: Trusted channel between graphics and trusted SW; Integrated or third party discrete graphics
- Protected Keyboard & Mouse: Trusted channel between keyboard/mouse and trusted SW
- Protected Memory Mgmt: Enforces access policy to protected memory
- TPM (Trusted Platform Module v1.2): Protects keys, digital certificates & attestation credentials; Provides platform authentication
- [Diagram: Intel CPU, Intel (G)MCH, ICH, RAM, USB, LPC, TPM; legend marks LT-specific enhancements]

62 Backup: Misc

63 Moore's Law Moving Forward
[Table, ACTUAL vs. FORECAST; rows: Production, Generation (nm), Gate Length, Wafer Size (mm), Integration Capacity; Integration Capacity values: <100M, 100M, 200M, 500M, 1B, >1B, >2B, >4B, >8B]
"Another decade is probably straight-forward. There is certainly no end to creativity." - Gordon Moore, speaking of extending Moore's Law at ISSCC, Feb 2003

64 Multi-Core Power Efficiency
- [Diagram: one Big core with Cache vs. several Small cores (labeled C1, C2, C3) with Cache; bar charts of Power and Performance]
- Small core: Power = 1/4, Performance = 1/2
- Many core is more power efficient
- Power ~ area
- Single thread performance ~ area**.5
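The two scaling rules on this slide (power proportional to area, single-thread performance proportional to the square root of area) are enough to reproduce the small-core numbers. The sketch below is a minimal illustration; normalizing the big core to area 1 and assuming that throughput adds linearly across small cores are assumptions made here.

```python
import math


def core_power(area):
    """Slide rule: power ~ area."""
    return area


def core_performance(area):
    """Slide rule: single-thread performance ~ area ** 0.5."""
    return math.sqrt(area)


BIG_AREA = 1.0      # big core, normalized (assumption)
SMALL_AREA = 0.25   # small core occupying a quarter of the big core's area

small_power = core_power(SMALL_AREA)        # 0.25
small_perf = core_performance(SMALL_AREA)   # 0.5

# Fill the big core's area budget with small cores.
n_small = int(BIG_AREA / SMALL_AREA)
many_power = n_small * small_power
# Aggregate throughput, assuming a perfectly parallel workload.
many_throughput = n_small * small_perf

print(f"big core:      power {core_power(BIG_AREA):.2f}, perf {core_performance(BIG_AREA):.2f}")
print(f"{n_small} small cores: power {many_power:.2f}, throughput {many_throughput:.2f}")
```

Under these assumptions the same area and power budget yields twice the throughput, at the price of halving single-thread performance.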

65 Multi-Core and Memory Gap
- Growing Performance Gap [Chart: Peak Instructions Per DRAM Access, LOGIC vs. MEMORY, for Pentium 66MHz, Pentium-Pro 200MHz, PentiumIII 1100MHz, Pentium4 2 GHz]
- Reduce DRAM access with large caches; extra benefit: power savings (cache is lower power than logic)
- Tolerate memory latency with multiple threads: Multiple cores, Hyper-threading

66 Multi-threading tolerates memory latency
- [Timeline, Serial Execution: A i, Idle, A i+1, then B i, Idle, B i+1]
- [Timeline, Multi-threaded Execution: B i and B i+1 run during thread A's Idle period]
- Execute thread B while thread A waits for memory
- Multi-core has a similar effect

67 Multi-core tolerates memory latency
- [Timeline, Serial Execution: A i, Idle, A i+1, then B i, Idle, B i+1]
- [Timeline, Multi-core Execution: A i, Idle, A i+1 on one core while B i, Idle, B i+1 run on another]
- Execute thread A and B simultaneously
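The timelines on slides 66 and 67 can be compared with a toy model. The sketch below is an illustration under stated assumptions: each thread is a list of hypothetical ("run", t) and ("wait", t) segments, memory waits proceed off-core, context switches are free, and the single-core scheduler simply runs whichever thread became ready first. None of the durations come from the presentation.

```python
def serial_time(threads):
    """One core, no overlap: every run and wait segment adds to the total."""
    return sum(duration for thread in threads for _, duration in thread)


def multicore_time(threads):
    """One core per thread: the slowest thread determines the finish time."""
    return max(sum(duration for _, duration in thread) for thread in threads)


def multithreaded_time(threads):
    """One core shared by all threads; a thread gives up the core while it waits."""
    n = len(threads)
    next_seg = [0] * n   # index of each thread's next segment
    ready_at = [0] * n   # time at which each thread can run again
    now = 0              # time at which the core becomes free
    while True:
        # Memory waits happen off-core, so they only delay the waiting thread.
        for i in range(n):
            while next_seg[i] < len(threads[i]) and threads[i][next_seg[i]][0] == "wait":
                ready_at[i] += threads[i][next_seg[i]][1]
                next_seg[i] += 1
        pending = [i for i in range(n) if next_seg[i] < len(threads[i])]
        if not pending:
            break
        i = min(pending, key=lambda k: ready_at[k])  # earliest-ready thread first
        now = max(now, ready_at[i]) + threads[i][next_seg[i]][1]
        ready_at[i] = now
        next_seg[i] += 1
    return max([now] + ready_at)


# Hypothetical workload: compute 2 units, wait 4 units for memory, compute 2 units.
thread_a = [("run", 2), ("wait", 4), ("run", 2)]
thread_b = [("run", 2), ("wait", 4), ("run", 2)]
workload = [thread_a, thread_b]

print("serial:        ", serial_time(workload))
print("multi-threaded:", multithreaded_time(workload))
print("multi-core:    ", multicore_time(workload))
```

In this toy case the multi-threaded core finishes in 10 time units instead of 16 by filling thread A's idle period with thread B's work, and two cores finish in 8.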

68 How does Multicore Change Parallel Programming?
- No change in fundamental programming model
- [Diagram: SMP (P1, P2, P3, P4) and CMP (C1, C2, C3, C4), each with caches and Memory]
- Synchronization and communication costs greatly reduced; makes it practical to parallelize more programs
- Resources now shared: Caches, Memory interface; optimization choices may be different

69 Art of the Possible
- Billion transistors realized in 65nm Si process; Multi-Billion transistors possible in future Si process
- Large die sizes can be built: 400 to 600 square millimeters
- What can fit on a single die? For 65nm (rough est.): 30 mm^2 per proc., 15 mm^2 per MB
- [Chart: Die size (core + cache only) in mm^2 for several core counts and cache sizes, e.g. 16 MB cache with 2 cores]
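The rough 65nm estimates on this slide (30 mm^2 per processor core, 15 mm^2 per MB of cache) make it easy to explore what fits in a given die budget. The sketch below is a back-of-the-envelope calculator; the function names and the example configurations other than "16 MB cache, 2 cores" are assumptions, and it counts core and cache area only, as the slide does.

```python
MM2_PER_CORE = 30   # rough 65nm estimate from the slide
MM2_PER_MB = 15     # rough 65nm estimate from the slide


def die_area_mm2(cores, cache_mb):
    """Core + cache area only; I/O, interconnect and pads are ignored."""
    return cores * MM2_PER_CORE + cache_mb * MM2_PER_MB


def max_cache_mb(cores, die_budget_mm2):
    """Whole megabytes of cache that fit once the cores are placed."""
    remaining = die_budget_mm2 - cores * MM2_PER_CORE
    return max(0, remaining // MM2_PER_MB)


print("2 cores + 16 MB:", die_area_mm2(2, 16), "mm^2")
print("4 cores in 400 mm^2 leave room for", max_cache_mb(4, 400), "MB of cache")
print("8 cores in 600 mm^2 leave room for", max_cache_mb(8, 600), "MB of cache")
```

The 2-core, 16 MB configuration comes to 300 mm^2, comfortably inside the 400 to 600 mm^2 range the slide calls buildable.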

70 Quad Cores here a quarter ago already!

71 Multi-Core

Enabling Technologies for Distributed and Cloud Computing

Enabling Technologies for Distributed and Cloud Computing Enabling Technologies for Distributed and Cloud Computing Dr. Sanjay P. Ahuja, Ph.D. 2010-14 FIS Distinguished Professor of Computer Science School of Computing, UNF Multi-core CPUs and Multithreading

More information

More on Pipelining and Pipelines in Real Machines CS 333 Fall 2006 Main Ideas Data Hazards RAW WAR WAW More pipeline stall reduction techniques Branch prediction» static» dynamic bimodal branch prediction

More information

Introduction to Cloud Computing

Introduction to Cloud Computing Introduction to Cloud Computing Parallel Processing I 15 319, spring 2010 7 th Lecture, Feb 2 nd Majd F. Sakr Lecture Motivation Concurrency and why? Different flavors of parallel computing Get the basic

More information

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip.

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip. Lecture 11: Multi-Core and GPU Multi-core computers Multithreading GPUs General Purpose GPUs Zebo Peng, IDA, LiTH 1 Multi-Core System Integration of multiple processor cores on a single chip. To provide

More information

Enabling Technologies for Distributed Computing

Enabling Technologies for Distributed Computing Enabling Technologies for Distributed Computing Dr. Sanjay P. Ahuja, Ph.D. Fidelity National Financial Distinguished Professor of CIS School of Computing, UNF Multi-core CPUs and Multithreading Technologies

More information

Digital Design for Low Power Systems

Digital Design for Low Power Systems Digital Design for Low Power Systems Shekhar Borkar Intel Corp. Outline Low Power Outlook & Challenges Circuit solutions for leakage avoidance, control, & tolerance Microarchitecture for Low Power System

More information

OpenPOWER Outlook AXEL KOEHLER SR. SOLUTION ARCHITECT HPC

OpenPOWER Outlook AXEL KOEHLER SR. SOLUTION ARCHITECT HPC OpenPOWER Outlook AXEL KOEHLER SR. SOLUTION ARCHITECT HPC Driving industry innovation The goal of the OpenPOWER Foundation is to create an open ecosystem, using the POWER Architecture to share expertise,

More information

Multi-Threading Performance on Commodity Multi-Core Processors

Multi-Threading Performance on Commodity Multi-Core Processors Multi-Threading Performance on Commodity Multi-Core Processors Jie Chen and William Watson III Scientific Computing Group Jefferson Lab 12000 Jefferson Ave. Newport News, VA 23606 Organization Introduction

More information

Multi-core architectures. Jernej Barbic 15-213, Spring 2007 May 3, 2007

Multi-core architectures. Jernej Barbic 15-213, Spring 2007 May 3, 2007 Multi-core architectures Jernej Barbic 15-213, Spring 2007 May 3, 2007 1 Single-core computer 2 Single-core CPU chip the single core 3 Multi-core architectures This lecture is about a new trend in computer

More information

Low Power AMD Athlon 64 and AMD Opteron Processors

Low Power AMD Athlon 64 and AMD Opteron Processors Low Power AMD Athlon 64 and AMD Opteron Processors Hot Chips 2004 Presenter: Marius Evers Block Diagram of AMD Athlon 64 and AMD Opteron Based on AMD s 8 th generation architecture AMD Athlon 64 and AMD

More information

Intel Virtualization Technology

Intel Virtualization Technology Intel Virtualization Technology Examining VT-x and VT-d August, 2007 v 1.0 Peter Carlston, Platform Architect Embedded & Communications Processor Division Intel, the Intel logo, Pentium, and VTune are

More information

Intel Itanium Quad-Core Architecture for the Enterprise. Lambert Schaelicke Eric DeLano

Intel Itanium Quad-Core Architecture for the Enterprise. Lambert Schaelicke Eric DeLano Intel Itanium Quad-Core Architecture for the Enterprise Lambert Schaelicke Eric DeLano Agenda Introduction Intel Itanium Roadmap Intel Itanium Processor 9300 Series Overview Key Features Pipeline Overview

More information

Parallel Programming Survey

Parallel Programming Survey Christian Terboven 02.09.2014 / Aachen, Germany Stand: 26.08.2014 Version 2.3 IT Center der RWTH Aachen University Agenda Overview: Processor Microarchitecture Shared-Memory

More information

Intel Itanium Architecture

Intel Itanium Architecture Intel Itanium Architecture Roadmap and Technology Update Dr. Gernot Hoyler Technical Marketing EMEA Intel Itanium Architecture Growth MARKET Over 3x revenue growth Y/Y* More than 10x growth* in shipments

More information

Making Multicore Work and Measuring its Benefits. Markus Levy, president EEMBC and Multicore Association

Making Multicore Work and Measuring its Benefits. Markus Levy, president EEMBC and Multicore Association Making Multicore Work and Measuring its Benefits Markus Levy, president EEMBC and Multicore Association Agenda Why Multicore? Standards and issues in the multicore community What is Multicore Association?

More information

Multi-core and Linux* Kernel

Multi-core and Linux* Kernel Multi-core and Linux* Kernel Suresh Siddha Intel Open Source Technology Center Abstract Semiconductor technological advances in the recent years have led to the inclusion of multiple CPU execution cores

More information

Parallel Algorithm Engineering

Parallel Algorithm Engineering Parallel Algorithm Engineering Kenneth S. Bøgh PhD Fellow Based on slides by Darius Sidlauskas Outline Background Current multicore architectures UMA vs NUMA The openmp framework Examples Software crisis

More information

Desktop Processor Roadmap. Solution Provider Accounts

Desktop Processor Roadmap. Solution Provider Accounts Desktop Processor Roadmap Solution Provider Accounts August 2008 Desktop Division Roadmap Changes since July 2008 Additions Energy-efficient Brisbane 5050e processor to launch in Q408 Desktop Processors

More information

CS 159 Two Lecture Introduction. Parallel Processing: A Hardware Solution & A Software Challenge

CS 159 Two Lecture Introduction. Parallel Processing: A Hardware Solution & A Software Challenge CS 159 Two Lecture Introduction Parallel Processing: A Hardware Solution & A Software Challenge We re on the Road to Parallel Processing Outline Hardware Solution (Day 1) Software Challenge (Day 2) Opportunities

More information

evm Virtualization Platform for Windows

evm Virtualization Platform for Windows B A C K G R O U N D E R evm Virtualization Platform for Windows Host your Embedded OS and Windows on a Single Hardware Platform using Intel Virtualization Technology April, 2008 TenAsys Corporation 1400

More information

Overview. CPU Manufacturers. Current Intel and AMD Offerings

Overview. CPU Manufacturers. Current Intel and AMD Offerings Central Processor Units (CPUs) Overview... 1 CPU Manufacturers... 1 Current Intel and AMD Offerings... 1 Evolution of Intel Processors... 3 S-Spec Code... 5 Basic Components of a CPU... 6 The CPU Die and

More information

MODULE 3 VIRTUALIZED DATA CENTER COMPUTE

MODULE 3 VIRTUALIZED DATA CENTER COMPUTE MODULE 3 VIRTUALIZED DATA CENTER COMPUTE Module 3: Virtualized Data Center Compute Upon completion of this module, you should be able to: Describe compute virtualization Discuss the compute virtualization

More information

Intel Pentium 4 Processor on 90nm Technology

Intel Pentium 4 Processor on 90nm Technology Intel Pentium 4 Processor on 90nm Technology Ronak Singhal August 24, 2004 Hot Chips 16 1 1 Agenda Netburst Microarchitecture Review Microarchitecture Features Hyper-Threading Technology SSE3 Intel Extended

More information

Microkernels, virtualization, exokernels. Tutorial 1 CSC469

Microkernels, virtualization, exokernels. Tutorial 1 CSC469 Microkernels, virtualization, exokernels Tutorial 1 CSC469 Monolithic kernel vs Microkernel Monolithic OS kernel Application VFS System call User mode What was the main idea? What were the problems? IPC,

More information

Exploring the Design of the Cortex-A15 Processor ARM s next generation mobile applications processor. Travis Lanier Senior Product Manager

Exploring the Design of the Cortex-A15 Processor ARM s next generation mobile applications processor. Travis Lanier Senior Product Manager Exploring the Design of the Cortex-A15 Processor ARM s next generation mobile applications processor Travis Lanier Senior Product Manager 1 Cortex-A15: Next Generation Leadership Cortex-A class multi-processor

More information

Scaling in a Hypervisor Environment
