ECE 5745 Complex Digital ASIC Design Course Overview
- Maurice Bailey
1 ECE 5745 Complex Digital ASIC Design: Course Overview
Christopher Batten, School of Electrical and Computer Engineering, Cornell University
2 Complex Digital ASIC Design
- Course goal, structure, motivation
  - What is the goal of the course?
  - How is the course structured?
  - Why should students want to take this course?
- Activity 1: Evaluation of Integer Multiplier
- Case Study: Scalar vs. Vector Processors
  - Example design-space exploration
  - Example real ASIC chips
- Activity 2: Brainstorming for Sorting Accelerator
- Course Logistics
[Figure: computer engineering stack: Application, Algorithm, PL, OS, ISA, Arch, RTL, Gates, Circuits, Devices, Technology]
3 The Computer Engineering Stack
[Figure: the stack from Application at the top to Technology at the bottom. The gap is too large to bridge in one step, but there are exceptions, e.g., a magnetic compass.]
In its broadest definition, computer architecture is the design of the abstraction/implementation layers that allow us to execute information processing applications efficiently using available manufacturing technologies.
4 The Computer Engineering Stack
[Figure: the stack with a Computer Architecture label. Layers: Application, Algorithm, Programming Language, Operating System, Instruction Set Architecture, Microarchitecture, Register-Transfer Level, Gate Level, Circuits, Devices, Technology.]
5 What is Computer Architecture?
[Figure: computer engineering stack, as on the previous slide]
- Application requirements
  - Suggest how to improve architecture
  - Provide revenue to fund development
- Technology constraints
  - Restrict what can be done efficiently
  - New technologies make new architectures possible
- Architecture provides feedback to guide application and technology research directions
6 Key Metrics in Computer Architecture
- Primary metrics
  - Execution time (cycles/task)
  - Energy (Joules/task)
  - Cycle time (ns/cycle)
  - Area (µm²)
- Secondary metrics
  - Performance (ns/task)
  - Average power (Watts)
  - Peak power (Watts)
  - Cost ($)
  - Design complexity
  - Reliability
  - Flexibility
[Figure: processors (P) with instruction caches (I$) and data caches (D$) connected by networks]
Discuss qualitative first-order analysis from ECE 4750 on board.
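The units on the slide suggest how the primary metrics combine: time per task is cycles/task times ns/cycle, and average power is energy per task divided by time per task. The sketch below works one example in Python; the function names and all numbers are illustrative assumptions, not values from the course.

```python
def time_per_task_ns(cycles_per_task, cycle_time_ns):
    """Performance in ns/task = (cycles/task) * (ns/cycle)."""
    return cycles_per_task * cycle_time_ns


def average_power_watts(energy_joules_per_task, time_ns_per_task):
    """Average power in Watts = (Joules/task) / (seconds/task)."""
    return energy_joules_per_task / (time_ns_per_task * 1e-9)


# Hypothetical design point (made-up numbers for illustration only).
cycles = 1000        # execution time in cycles/task
cycle_ns = 2.0       # cycle time in ns/cycle
energy_j = 4e-6      # energy in Joules/task

t_ns = time_per_task_ns(cycles, cycle_ns)
p_w = average_power_watts(energy_j, t_ns)
print(f"time per task: {t_ns} ns, average power: {p_w:.3f} W")
```

With these assumed numbers, a task takes 2000 ns and the average power works out to about 2 W.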
7 Unanswered Questions from ECE 4750
- How can we quantitatively evaluate area, cycle time, and energy?
- How do we actually implement processors, memories, and networks in a real chip?
- How should we analyze application-specific accelerators?
  - Very loosely coupled memory-mapped accelerators
  - More tightly coupled co-processor accelerators
  - Specialized instructions and functional units
[Figure: processors (P), caches (I$, D$), accelerators (Xcel), accelerated instructions, and networks]
8 Application-Specific Accelerators
[Figure: energy (Joules per task) vs. performance (tasks per second), with design points A through F and a processor power constraint. Design labels: Simple Proc, Superscalar w/ Deeper Pipelines, Out-of-Order Superscalar Superpipelined, Multicore, Specialized Accelerators.]
9 Application-Specific Accelerators
[Figure: energy efficiency (tasks per Joule) vs. performance (tasks per second), illustrating flexibility vs. specialization. Labels: Simple Processor, More Flexible Accelerator, Less Flexible Accelerator, Custom ASIC; Embedded Architectures and High-Performance Architectures; Design Performance Constraint and Design Power Constraint.]
10 Goal for ECE 5745 is to answer these questions!
- How can we quantitatively evaluate area, cycle time, and energy?
- How do we actually implement processors, memories, and networks in a real chip?
- How should we analyze application-specific accelerators?
  - Very loosely coupled memory-mapped accelerators
  - More tightly coupled co-processor accelerators
  - Specialized instructions and functional units
11 Full-Custom Design vs. Standard-Cell Design
[Figure: piece of a full-custom multiplier array; full-custom layout in 1.0 µm with 2 metal layers]
- Full-custom design
  - Designer is free to do anything, anywhere, though the team usually imposes some design discipline
  - Requires complete customization of all layers of the wafer
  - Most time-consuming design style; reserved for very high performance or very high volume chips (Intel microprocessors, RF power amps for cellphones)
- Standard-cell design
  - Fixed library of standard cells and SRAM memory generators
  - Register-transfer-level description is automatically mapped to this library of standard cells, then these cells are placed and routed automatically
12 Standard-Cell Design Methodology
- Also called cell-based ICs (CBICs)
- Fixed library of cells plus memory generators
- Cells can be synthesized from HDL or entered in schematics
- Cells are placed and routed automatically
- Requires a complete set of custom masks for each design
- Currently the most popular hard-wired ASIC type (6.884 will use this)
- Cells are arranged in rows; they have a standard height but vary in width, and are designed to connect power, ground, and wells by abutment
[Figure: standard-cell design with rows of cells and generated memory arrays (Mem 1, Mem 2)]
[Figure: standard-cell layout showing the VDD rail, GND rail, power rails in M1, clock rail (not typical), well contact under power rail, cell I/O on M2, a NAND2, a flip-flop, and a ripple-carry adder with the carry chain highlighted]
13 Standard-Cell Design Methodology
[Figure: tool flow using the educational Synopsys 90nm process and standard cells. Flow elements: Verilog RTL, Verilog simulator, synthesis, place-and-route, gate-level model, layout, a second Verilog simulator producing switching activity, and power analysis. Reported metrics: area and cycle time, execution time, and energy.]
14 Example Standard-Cell Chip Plot
[Figure: chip plot of a single-lane vector-thread unit w/ 256 registers]
15 What is Complex Digital ASIC Design?
[Figure: computer engineering stack]
Complex digital ASIC design is the process of quantitatively exploring the area, cycle time, execution time, and energy trade-offs of various general-purpose and application-specific designs using automated standard-cell CAD tools, and then transforming the most promising design into layout ready for fabrication.
17 Course Structure
- Prereq: Computer Architecture
- Part 1: ASIC Design Overview
- Part 2: Digital CMOS Circuits
- Part 3: CAD Algorithms
18 Part 1: ASIC Design Overview
- Topic 1: Hardware Description Languages
- Topic 2: CMOS Devices
- Topic 3: CMOS Circuits
- Topic 4: Full-Custom Design Methodology
- Topic 5: Automated Design Methodologies
- Topic 6: Closing the Gap
- Topic 7: Clocking, Power Distribution, Packaging, and I/O
- Topic 8: Testing and Verification
19 Part 2: Digital CMOS Circuits
- Topic 9: Combinational Logic
- Topic 10: Sequential State
- Topic 11: Interconnect
20 Part 3: CAD Algorithms
- Topic 12: Synthesis Algorithms
  - RTL to logic synthesis
  - Technology-independent synthesis
  - Technology-dependent synthesis
  - Example simplification: x = a'bc + a'bc' and y = b'c' + ab' + ac reduce to x = a'b and y = b'c' + ac
- Topic 13: Physical Design Automation
  - Placement
  - Global routing
  - Detailed routing
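The simplification on this slide can be checked by brute force over all eight input combinations. The following Python sketch does that; the function names are hypothetical, and an exhaustive truth-table comparison is only a sanity check, not how a synthesis tool actually minimizes logic.

```python
from itertools import product


def x_before(a, b, c):
    # x = a'bc + a'bc'
    return (not a and b and c) or (not a and b and not c)


def x_after(a, b, c):
    # x = a'b
    return not a and b


def y_before(a, b, c):
    # y = b'c' + ab' + ac
    return (not b and not c) or (a and not b) or (a and c)


def y_after(a, b, c):
    # y = b'c' + ac  (the ab' term is covered by the other two)
    return (not b and not c) or (a and c)


def equivalent(f, g):
    """Compare two 3-input Boolean functions on every input combination."""
    return all(bool(f(a, b, c)) == bool(g(a, b, c))
               for a, b, c in product([False, True], repeat=3))


print("x equivalent:", equivalent(x_before, x_after))
print("y equivalent:", equivalent(y_before, y_after))
```

Both comparisons hold: the two x terms differ only in c, and the ab' term in y is redundant because b'c' covers it when c = 0 and ac covers it when c = 1.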
21 Five-Week Design Project
[Figure: energy efficiency (tasks per Joule) vs. performance (tasks per second), illustrating flexibility vs. specialization. Labels: Simple Processor, More Flexible Accelerator, Less Flexible Accelerator, Custom ASIC; Embedded Architectures and High-Performance Architectures; Design Performance Constraint and Design Power Constraint.]
23 Course Motivation: Comp Arch Research Perspective
[Figure: processor trend plot of transistors (thousands), frequency (MHz), typical power (Watts), number of cores, sequential processor performance, and parallel processor performance. Labeled chips: MIPS R2K, DEC Alpha, Intel Pentium 4, AMD 4-Core Opteron, Intel 48-Core Prototype.]
Data partially collected by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond.
24 Cross-Layer Interaction is Critical
[Figure: computer engineering stack]
We need to quantitatively understand area, cycle time, and energy trade-offs in order to create new revolutionary architectures that tackle the most challenging physical design issues.
25 Course Motivation: Circuits Research Perspective
[Figure: callouts reading Your Digital Circuit Here and Your Analog Circuit Here; a fighter airplane (~100,000 parts) shown alongside Intel Sandy Bridge E (2.27 billion transistors)]
26 Cross-Layer Interaction is Critical
[Figure: computer engineering stack]
Digital circuit researchers need to appreciate the system-level context for their subsystems; cross-layer interaction can generate some of the most exciting research ideas.
27 Course Motivation: Industry Development Perspective
28 Course Motivation: Industry Development Perspective
- Ever-increasing chip size and complexity
  - Microprocessors: 100M gates growing to 1000M gates
  - ASICs: 5M to 10M gates growing to 50M to 100M gates
- Ever-increasing costs and design team sizes
  - More than $30M for a 10M gate ASIC
  - More than $1M for a re-spin in case of error (not including redesign costs)
- 18 months to design, only an 8-month selling opportunity in market
  - Fewer new chip-starts every year
  - Looking for alternatives (e.g., FPGAs)
- For chip-starts that do happen
  - Conservative design
  - No time for exploration
  - Educated guess and then implement
29 Cross-Layer Interaction is Critical
[Figure: computer engineering stack]
The most significant impact on area, cycle time, execution time, and energy is usually at the architectural and microarchitectural levels.
31 Scalar Processors with Multithreading
[Figure: programmer's logical view with threads T0 through T4 ... Ti sharing memory. Legend: control instructions, memory instructions, execution resources, architectural registers.]
32 Scalar Processors with Multithreading
[Figure: programmer's logical view with threads T0 through T4 ... Ti sharing memory]
Vector-vector add:
  loop: load a, a_ptr
        load b, b_ptr
        add c, a, b
        store c, c_ptr
        a_ptr++
        b_ptr++
        c_ptr++
        branch
Masked filter:
  loop: load a, a_ptr
        a_ptr++
        branch a = 0
        op0
        op1
        ...
        branch
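The two scalar loops on this slide can be paraphrased in Python, one element per iteration. This is only a sketch: `op0` and `op1` are unspecified on the slide, so they are passed in as hypothetical callables, and the reading that `branch a = 0` skips the operations for zero elements is an assumption.

```python
def vector_add_scalar(a, b):
    """Element-by-element add: one load/load/add/store per loop iteration."""
    c = [0] * len(a)
    i = 0
    while i < len(a):
        c[i] = a[i] + b[i]   # load a, load b, add c, store c
        i += 1               # pointer increments, then branch back to loop
    return c


def masked_filter_scalar(a, op0, op1):
    """Apply op0 then op1 to each element, skipping elements equal to zero.

    Assumption: 'branch a = 0' jumps past the ops when the element is zero.
    """
    out = list(a)
    for i, x in enumerate(a):
        if x == 0:
            continue         # data-dependent branch around the ops
        out[i] = op1(op0(x))
    return out


print(vector_add_scalar([1, 2, 3], [10, 20, 30]))
print(masked_filter_scalar([0, 2, 0, 3], lambda x: x + 1, lambda x: x * 2))
```

Each iteration handles a single element, so loop overhead and the data-dependent branch are paid once per element.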
33 Scalar Processors with Multithreading
[Figure: programmer's logical view with threads T0 through T4 ... Ti sharing memory]
[Figure: typical core microarchitecture for multithreaded cores, with instruction memory, threads T0 through T3, and data memory; energy vs. performance plot with a MIMD point]
34 Vector-SIMD Processors
[Figure: programmer's logical view with CT0 ... CTj, elements elm 0 through elm 3 ... elm i, and memory. Legend: vector-SIMD arithmetic instructions, vector-SIMD memory instructions, architectural vector register with 4 elements.]
35 Vector-SIMD Processors
[Figure: programmer's logical view with CT0 ... CTj, elements elm 0 through elm 3 ... elm i, and memory]
Vector-vector add:
  loop: vload A, a_ptr
        vload B, b_ptr
        vadd C, A, B
        vstore C, c_ptr
        a_ptr++
        b_ptr++
        c_ptr++
        branch
Masked filter:
  loop: vload A, a_ptr
        vset F, A = 0
        vop0 under flag F
        vop1 under flag F
        ...
        a_ptr++
        branch
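The vector-SIMD loops can be sketched in Python by processing the arrays in chunks of one vector register. The helper names and the chunk size `VLEN = 4` (taken from the 4-element register in the logical view) are assumptions for illustration. The slide does not say whether flag F enables the zero or the nonzero elements; the sketch enables the nonzero ones, so it matches a scalar branch that skips on zero.

```python
VLEN = 4  # assumed elements per vector register, as in the 4-element logical view


def vload(mem, ptr):
    """Load up to VLEN consecutive elements starting at ptr."""
    return mem[ptr:ptr + VLEN]


def vector_add_simd(a, b):
    """Vector-vector add: one vload/vload/vadd/vstore per VLEN elements."""
    c = []
    for ptr in range(0, len(a), VLEN):
        va, vb = vload(a, ptr), vload(b, ptr)
        vc = [x + y for x, y in zip(va, vb)]   # vadd C, A, B
        c.extend(vc)                           # vstore C, c_ptr
    return c


def masked_filter_simd(a, op0, op1):
    """Masked filter: ops execute only on elements whose flag bit is set."""
    out = []
    for ptr in range(0, len(a), VLEN):
        va = vload(a, ptr)
        flag = [x != 0 for x in va]            # vset F (assumed: nonzero enabled)
        va = [op0(x) if f else x for x, f in zip(va, flag)]  # vop0 under flag F
        va = [op1(x) if f else x for x, f in zip(va, flag)]  # vop1 under flag F
        out.extend(va)
    return out


print(vector_add_simd([1, 2, 3, 4, 5, 6], [10] * 6))
print(masked_filter_simd([0, 2, 0, 3], lambda x: x + 1, lambda x: x * 2))
```

Compared with an element-at-a-time loop, the loop overhead is paid once per chunk, and the per-element branch is replaced by a flag that masks the operations.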
36 Vector-SIMD Processors
[Figure: programmer's logical view with CT0 ... CTj, elements elm 0 through elm 3 ... elm i, and memory]
[Figure: typical core microarchitecture with instruction memory, CP, VIU, vector lanes 0 through 3, VMU, and data memory; energy vs. performance plot with MIMD and vector-SIMD points]
37 Quantitative Area Evaluation
[Figure: chip plot of a single-lane vector-thread unit w/ 256 registers]
38 Quantitative Area Evaluation
[Figure: normalized area broken down into control logic, register file, memory units, floating-point functional units, integer functional units, control processor, instruction cache, and data cache. Configurations: quad-core w/ vertical multithreading (1, 2, 4, and 8 threads) and vector-SIMD (1 core w/ 4 lanes, 4 cores w/ 1 lane each).]
- Single-core multi-lane design reduces area by 15%
- Multi-core single-lane design increases area by 20%
39 Quantitative Performance and Energy Evaluation
[Figure: normalized energy / task vs. normalized tasks / second for single-lane vector (increasing vlen) and multithreaded multicore (increasing number of threads per core)]
[Figure: energy / task (µJ) broken down into control logic, register file, memory units, int func units, control proc, inst cache, data cache, and leakage. Configurations: quad-core w/ vertical multithreading (1, 2, 4, and 8 threads) and vector-SIMD, 4 cores w/ 1 lane (1, 2, 4, and 8 elm/lane).]
40 Simple RISC Processor ASIC
[Figure 2: STC1 chip plot and die photo. On the chip plot, the Exec Unit is colored blue, and the Fetch Unit is colored green and purple. Labeled blocks: SP Control, SP Datapath, SP Regfile, AHIP Controller, RAM Interface, VCO, and four RAM subbanks (2KB each).]
41 Simple RISC Processor ASIC
- RISC processor w/ 8 KB SRAM
- TSMC 0.18 µm process
- 610K transistors
- 450 MHz at 1.8 V
[Figure 5: Shmoo plot for a wide voltage range. The horizontal axis plots supply voltage in mV, and the vertical axis plots frequency in MHz. A * indicates correct operation at that operating point and a dot indicates failure. The shmoo plot shows combined results from two different chips.]
42 Scale Vector-Thread Processor ASIC
43 Scale Vector-Thread Processor ASIC
- TSMC 0.18 µm
- 7.14 million transistors
- 16.6 mm² core area
[Figure: die plot with labeled blocks: cache tag CAMs, CP, cache control, cache data RAM banks, XBAR, VMU, VIU, and vector lanes 0 through 3]
44 Energy-Efficiency vs. Performance Results of the Scale VT Processor
[Figure: energy-efficiency vs. performance results for the Scale VT processor]
46 Brainstorming for Sorting Accelerator
- Sort array of integers stored in memory
- Proc sends Xcel base address and size
- Proc waits for Xcel to finish sorting
- Software baseline
  - Insertion sort, O(n²)
  - 20% of dynamic instructions are lw/sw
- Hardware accelerator
  - What is the ideal speedup assuming we still use insertion sort?
  - What if we use a different algorithm?
  - Sketch a hardware implementation of a sorting accelerator
  - Where will the accelerator be in the energy/perf space?
[Figure: P with I$, Xcel, arbiter, and D$; energy efficiency (tasks per Joule) vs. performance (tasks per second) with the software baseline marked]
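As a starting point for the brainstorm, the software baseline can be sketched as an insertion sort that also counts array loads and stores. This Python version is an illustrative assumption, not the course's baseline code: it counts only accesses to the array itself, so it does not reproduce the 20% lw/sw figure, which depends on the compiled instruction stream.

```python
def insertion_sort_counting(data):
    """Insertion sort that returns (sorted list, array loads, array stores)."""
    arr = list(data)
    loads = stores = 0
    for i in range(1, len(arr)):
        key = arr[i]
        loads += 1                 # load the element to insert
        j = i - 1
        while j >= 0:
            loads += 1             # load arr[j] for the comparison
            if arr[j] <= key:
                break
            arr[j + 1] = arr[j]    # shift the larger element right
            stores += 1
            j -= 1
        arr[j + 1] = key           # place the element
        stores += 1
    return arr, loads, stores


for name, values in [("sorted", [1, 2, 3, 4]), ("reversed", [4, 3, 2, 1])]:
    result, loads, stores = insertion_sort_counting(values)
    print(name, result, "loads:", loads, "stores:", stores)
```

Already-sorted input needs a constant number of accesses per element, while reversed input shifts every earlier element, which is where the O(n²) behavior comes from. Counts like these give a rough sense of how much memory traffic an accelerator using the same algorithm would still have to perform.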
47 Take-Away Points
- Complex digital ASIC design is the process of quantitatively exploring the area, cycle time, execution time, and energy trade-offs of general-purpose and application-specific designs using automated standard-cell CAD tools, and then transforming the most promising design into layout ready for fabrication
- The course can provide useful experience with cross-layer interaction for students interested in pursuing research in computer architecture or circuits, or students interested in pursuing a career in industry development of ASICs
More informationMulti-core architectures. Jernej Barbic 15-213, Spring 2007 May 3, 2007
Multi-core architectures Jernej Barbic 15-213, Spring 2007 May 3, 2007 1 Single-core computer 2 Single-core CPU chip the single core 3 Multi-core architectures This lecture is about a new trend in computer
More informationChapter 2 Logic Gates and Introduction to Computer Architecture
Chapter 2 Logic Gates and Introduction to Computer Architecture 2.1 Introduction The basic components of an Integrated Circuit (IC) is logic gates which made of transistors, in digital system there are
More informationVLSI Design Verification and Testing
VLSI Design Verification and Testing Instructor Chintan Patel (Contact using email: cpatel2@cs.umbc.edu). Text Michael L. Bushnell and Vishwani D. Agrawal, Essentials of Electronic Testing, for Digital,
More informationHardware Implementations of RSA Using Fast Montgomery Multiplications. ECE 645 Prof. Gaj Mike Koontz and Ryon Sumner
Hardware Implementations of RSA Using Fast Montgomery Multiplications ECE 645 Prof. Gaj Mike Koontz and Ryon Sumner Overview Introduction Functional Specifications Implemented Design and Optimizations
More informationIntel Labs at ISSCC 2012. Copyright Intel Corporation 2012
Intel Labs at ISSCC 2012 Copyright Intel Corporation 2012 Intel Labs ISSCC 2012 Highlights 1. Efficient Computing Research: Making the most of every milliwatt to make computing greener and more scalable
More informationLogical Operations. Control Unit. Contents. Arithmetic Operations. Objectives. The Central Processing Unit: Arithmetic / Logic Unit.
Objectives The Central Processing Unit: What Goes on Inside the Computer Chapter 4 Identify the components of the central processing unit and how they work together and interact with memory Describe how
More informationIL2225 Physical Design