Fetch. Decode. Execute. Memory. PC update



[Figure: SEQ hardware structure. Stages: Fetch (PC, instruction memory, PC increment), Decode (register file with ports A, B, M, E; srcA, srcB, dstE, dstM), Execute (ALU with inputs ALU A and ALU B, ALU fun., CC, Bch), Memory (data memory with read/write control, Addr, Data, data out), Write back, and PC update (newPC). Signals: icode, ifun, rA, rB, valC, valP, valA, valB, valE, valM.]

rmmovl rA, D(rB)

  Fetch       icode:ifun <-- M1[PC]
              rA:rB <-- M1[PC+1]
              valC <-- M4[PC+2]
              valP <-- PC+6
  Decode      valA <-- R[rA]
              valB <-- R[rB]
  Execute     valE <-- valB + valC
  Memory      M4[valE] <-- valA
  Write back  (none)
  PC update   PC <-- valP
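The stage computations for rmmovl can be traced in a few lines of Python. This is a minimal sketch, not the hardware description used in the course: the dict-based byte memory, the helper names read_le, write_le, and step_rmmovl, and the instruction encoding (icode 4, register IDs %eax = 0 and %ebx = 3, little-endian 4-byte displacement) are assumptions taken from the usual Y86 description rather than from the slides.

```python
# Sketch of the rmmovl rA, D(rB) stage computations (assumed Y86 encoding).

def read_le(mem, addr, nbytes):
    """Read nbytes starting at addr as a little-endian unsigned integer."""
    return sum(mem.get(addr + i, 0) << (8 * i) for i in range(nbytes))


def write_le(mem, addr, value, nbytes):
    """Store value at addr as nbytes little-endian bytes."""
    for i in range(nbytes):
        mem[addr + i] = (value >> (8 * i)) & 0xFF


def step_rmmovl(pc, regs, mem):
    """Execute one rmmovl instruction and return the new PC."""
    # Fetch
    byte0 = read_le(mem, pc, 1)
    icode, ifun = byte0 >> 4, byte0 & 0xF
    byte1 = read_le(mem, pc + 1, 1)
    rA, rB = byte1 >> 4, byte1 & 0xF
    valC = read_le(mem, pc + 2, 4)
    valP = pc + 6
    if icode != 0x4:
        raise ValueError("this sketch handles rmmovl only")
    # Decode
    valA = regs[rA]
    valB = regs[rB]
    # Execute (32-bit wraparound)
    valE = (valB + valC) & 0xFFFFFFFF
    # Memory
    write_le(mem, valE, valA, 4)
    # Write back: nothing to do for rmmovl
    # PC update
    return valP


# Usage: rmmovl %eax, 8(%ebx) placed at address 0
regs = [0] * 8
regs[0] = 0xABCD   # %eax
regs[3] = 0x100    # %ebx
mem = {0: 0x40, 1: 0x03, 2: 0x08, 3: 0x00, 4: 0x00, 5: 0x00}
new_pc = step_rmmovl(0, regs, mem)
print(hex(new_pc), hex(read_le(mem, 0x108, 4)))
```

Each commented section corresponds to one row of the stage table, so the same skeleton could be extended to other instructions by changing only the per-stage assignments.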

SEQ Operation #1

State
  PC register
  Condition code register
  Data memory
  Register file
  All updated as clock rises

Combinational logic
  ALU
  Control logic
  Memory reads
    Instruction memory
    Register file
    Data memory

SEQ Operation #2

    Cycle 1:  0x000: irmovl $0x100,%ebx   # %ebx <-- 0x100
    Cycle 2:  0x006: irmovl $0x200,%edx   # %edx <-- 0x200
    Cycle 3:  0x00c: addl %edx,%ebx       # %ebx <-- 0x300, CC <-- 000
    Cycle 4:  0x00e: je dest              # Not taken

State shown: CC = 100, PC = 0x00c, %ebx = 0x100
  State set according to second irmovl instruction
  Combinational logic starting to react to state changes

SEQ Operation #3

Same four-instruction program as in SEQ Operation #2.

State shown: CC = 100, PC = 0x00c, %ebx = 0x100
Values waiting at the state inputs: CC = 000, PC = 0x00e, %ebx = 0x300
  State set according to second irmovl instruction
  Combinational logic generates results for addl instruction

SEQ Operation #4

Same four-instruction program as in SEQ Operation #2.

State shown: CC = 000, PC = 0x00e, %ebx = 0x300
  State set according to addl instruction
  Combinational logic starting to react to state changes

SEQ Operation #5

Same four-instruction program as in SEQ Operation #2.

State shown: CC = 000, PC = 0x00e, %ebx = 0x300
Value waiting at the state inputs: PC = 0x013
  State set according to addl instruction
  Combinational logic generates results for je instruction
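The SEQ Operation slides describe a simple discipline: combinational logic computes the next state from the current one, and the state changes only when the clock rises. The following is a rough sketch of that behavior for the four-instruction example. Several details are assumptions: the instruction lengths are inferred from the addresses on the slides, CC is written as the three bits ZF SF OF, the registers are taken to start at 0, and the je target 0x100 is hypothetical (the slides only call it "dest").

```python
# Sketch of SEQ's clocked state updates for the example program.

PROGRAM = {
    0x000: ("irmovl", 0x100, "ebx"),
    0x006: ("irmovl", 0x200, "edx"),
    0x00C: ("addl", "edx", "ebx"),
    0x00E: ("je", 0x100),  # hypothetical target
}
LENGTH = {"irmovl": 6, "addl": 2, "je": 5}
MASK = 0xFFFFFFFF


def next_state(state):
    """Combinational logic: derive the next state, leaving the current one untouched."""
    pc, cc = state["pc"], state["cc"]
    regs = dict(state["regs"])
    op, *args = PROGRAM[pc]
    new_pc = pc + LENGTH[op]
    if op == "irmovl":
        value, dst = args
        regs[dst] = value
    elif op == "addl":
        src, dst = args
        a, b = regs[src], regs[dst]
        result = (a + b) & MASK
        zf = int(result == 0)
        sf = result >> 31
        of = int((a >> 31) == (b >> 31) and (result >> 31) != (a >> 31))
        regs[dst] = result
        cc = f"{zf}{sf}{of}"
    elif op == "je":
        (target,) = args
        if cc[0] == "1":  # taken only when ZF is set
            new_pc = target
    return {"pc": new_pc, "regs": regs, "cc": cc}


def run(cycles):
    """Apply one state update per rising clock edge and record every state."""
    state = {"pc": 0x000, "regs": {"ebx": 0, "edx": 0}, "cc": "100"}
    trace = [state]
    for _ in range(cycles):
        state = next_state(state)
        trace.append(state)
    return trace


for cycle, s in enumerate(run(4)):
    print(f"after {cycle} cycles: PC=0x{s['pc']:03x} "
          f"%ebx=0x{s['regs']['ebx']:x} CC={s['cc']}")
```

The state after two cycles (PC = 0x00c, %ebx = 0x100, CC = 100) and after three cycles (PC = 0x00e, %ebx = 0x300, CC = 000) should match the values shown on the slides, and the untaken je leaves PC at 0x013.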

SEQ Summary

Implementation
  Express every instruction as series of simple steps
  Follow same general flow for each instruction type
  Assemble registers, memories, predesigned combinational blocks
  Connect with control logic

Limitations
  Too slow to be practical
  In one cycle, must propagate through instruction memory, register file, ALU, and data memory
  Would need to run clock very slowly
  Hardware units only active for fraction of clock cycle

Pipeline Overview

General Principles of Pipelining
  Goal
  Difficulties

Creating a Pipelined Y86 Processor
  Rearranging SEQ
  Inserting pipeline registers
  Problems with data and control hazards

Real-World Pipelines: Car Washes

[Figure: three car-wash arrangements labeled Sequential, Parallel, and Pipelined]

Idea
  Divide process into independent stages
  Move objects through stages in sequence
  At any given time, multiple objects being processed

Computational Example

[Figure: combinational logic (300 ps) followed by a register (20 ps). Delay = 320 ps, Throughput = 3.12 GOPS]

System
  Computation requires total of 300 picoseconds
  Additional 20 picoseconds to save result in register
  Must have clock cycle of at least 320 ps
  Delay = Latency = 320 ps = 1/Throughput
  GOPS: Giga-Operations per Second

3-Way Pipelined Version

[Figure: stages A, B, C, each with 100 ps of combinational logic followed by a 20 ps register. Delay = 360 ps, Throughput = 8.33 GOPS]

System
  Divide combinational logic into 3 blocks of 100 ps each
  Can begin new operation as soon as previous one passes through stage A
    Begin new operation every 120 ps
  Overall latency increases
    360 ps from start to finish
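The delay and throughput figures for the unpipelined and 3-way pipelined systems follow from two rules: the clock period is the slowest stage's logic delay plus the register delay, and one operation completes per clock period. A small sketch of that arithmetic (the function name pipeline_metrics is illustrative, not from the slides):

```python
def pipeline_metrics(stage_logic_ps, register_ps=20):
    """Return (clock period in ps, latency in ps, throughput in GOPS)."""
    clock_ps = max(stage_logic_ps) + register_ps   # slowest stage sets the clock
    latency_ps = clock_ps * len(stage_logic_ps)    # one clock period per stage
    throughput_gops = 1000.0 / clock_ps            # 1 op per clock_ps picoseconds
    return clock_ps, latency_ps, throughput_gops


for name, stages in [("unpipelined", [300]), ("3-way", [100, 100, 100])]:
    clock, latency, gops = pipeline_metrics(stages)
    print(f"{name}: clock={clock} ps, latency={latency} ps, "
          f"throughput={gops:.3f} GOPS")
```

For the unpipelined system the exact throughput is 3.125 GOPS, which the slide shows as 3.12.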

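Pipeline Diagrams

Unpipelined
  [Figure: OP1, OP2, OP3 drawn one after another along the time axis]
  Cannot start new operation until previous one completes

3-Way Pipelined
  [Figure: OP1, OP2, OP3 overlapping along the time axis]
  Up to 3 operations in process simultaneously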
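A pipeline diagram of this kind can be generated mechanically: operation i occupies stage s during clock cycle i + s. The sketch below prints a text version for three operations and three stages. The function name and the text layout are illustrative choices, not taken from the slides.

```python
def pipeline_diagram(num_ops, stages=("A", "B", "C")):
    """Return one text row per operation; columns are clock cycles."""
    total_cycles = num_ops + len(stages) - 1
    rows = []
    for op in range(num_ops):
        cells = ["."] * total_cycles
        for s, name in enumerate(stages):
            cells[op + s] = name   # op enters stage s in cycle op + s
        rows.append(f"OP{op + 1} " + " ".join(cells))
    return rows


for row in pipeline_diagram(3):
    print(row)
```

Reading down any column shows which stages are busy in that cycle. In the middle column all three stages are occupied, which is the "up to 3 operations in process" case.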

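Operating a Pipeline

[Figure: snapshots of the three-stage pipeline (stages A, B, C, each 100 ps of logic plus a 20 ps register) at times 239, 241, 300, and 359 as OP1, OP2, and OP3 move through it; time axis marked 0, 120, 240, 360, 480, 640.]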

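Limitations: Nonuniform Delays

[Figure: stages A, B, C with unequal combinational delays, each followed by a 20 ps register; OP1, OP2, OP3 along the time axis. Delay = 510 ps, Throughput = 5.88 GOPS]

  Throughput limited by slowest stage
  Other stages sit idle for much of the time
  Challenging to partition system into balanced stages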
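To see how an unbalanced split leaves stages idle, the sketch below computes the clock period and the fraction of each cycle that every stage is busy. The individual stage delays did not survive in the transcription, so the values 50, 150, and 100 ps are an assumption, chosen because they sum to 300 ps and reproduce the stated 510 ps delay and 5.88 GOPS throughput. Counting the 20 ps register load as busy time is also a modeling choice.

```python
def stage_utilization(stage_logic_ps, register_ps=20):
    """Return (clock period in ps, list of busy fractions per stage)."""
    clock_ps = max(stage_logic_ps) + register_ps
    busy = [(logic + register_ps) / clock_ps for logic in stage_logic_ps]
    return clock_ps, busy


# Assumed stage delays (not recoverable from the slide text)
ASSUMED_STAGES = {"A": 50, "B": 150, "C": 100}

clock_ps, busy = stage_utilization(list(ASSUMED_STAGES.values()))
delay_ps = clock_ps * len(ASSUMED_STAGES)
print(f"clock={clock_ps} ps, delay={delay_ps} ps, "
      f"throughput={1000.0 / clock_ps:.2f} GOPS")
for name, fraction in zip(ASSUMED_STAGES, busy):
    print(f"stage {name}: busy {fraction:.0%} of each cycle")
```

Under these assumed delays the slowest stage is busy for the whole cycle while the fastest is busy for less than half of it.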

Limitations: Register Overhead

[Figure: six-stage pipeline. Delay = 420 ps, Throughput = 14.29 GOPS]

  As try to deepen pipeline, overhead of loading registers becomes more significant
  Percentage of clock cycle spent loading register:
    1-stage pipeline:  6.25%
    3-stage pipeline: 16.67%
    6-stage pipeline: 28.57%
  High speeds of modern processor designs obtained through very deep pipelining
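The percentages above can be reproduced with one line of arithmetic, assuming the 300 ps of combinational logic is split evenly across the stages and each stage ends in a 20 ps register (an assumption consistent with the stated 420 ps delay for six stages). The function name is illustrative.

```python
def register_overhead(num_stages, total_logic_ps=300, register_ps=20):
    """Fraction of each clock cycle spent loading the pipeline register."""
    clock_ps = total_logic_ps / num_stages + register_ps
    return register_ps / clock_ps


for n in (1, 3, 6):
    print(f"{n}-stage pipeline: {register_overhead(n):.2%}")
```

The register delay stays fixed while the useful logic per stage shrinks, which is why the overhead fraction grows as the pipeline deepens.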

What could possibly go wrong?

    irmovl $50, %eax
    addl %eax, %ebx
    mrmovl 100(%ebx), %edx

Data Dependencies

[Figure: unpipelined system (combinational logic and register) executing OP1, OP2, OP3 in sequence over time]

System
  Each operation depends on result from preceding one

Data Hazards

[Figure: three-stage pipeline A, B, C with OP1, OP2, OP3, OP4 overlapping in time]

  Result does not feed back around in time for next operation
  Pipelining has changed behavior of system

Data Dependencies in Processors

    irmovl $50, %eax
    addl %eax, %ebx
    mrmovl 100(%ebx), %edx

  Result from one instruction used as operand for another
    Read-after-write (RAW) dependency
  Very common in actual programs
  Must make sure our pipeline handles these properly
    Get correct results
    Minimize performance impact
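The read-after-write dependencies in the three-instruction example can be found by tracking, for each register, the most recent instruction that wrote it. The sketch below does this over a hand-written description of the program. The read and write sets are an illustrative encoding (for example, addl is taken to read both operands and write the second), and the function name is hypothetical.

```python
def raw_dependencies(program):
    """Return (writer index, reader index, register) for each RAW dependency."""
    last_writer = {}
    deps = []
    for index, (_text, reads, writes) in enumerate(program, start=1):
        # Record reads first, so an instruction never depends on itself
        for reg in sorted(reads):
            if reg in last_writer:
                deps.append((last_writer[reg], index, reg))
        for reg in writes:
            last_writer[reg] = index
    return deps


# (instruction text, registers read, registers written)
PROGRAM = [
    ("irmovl $50, %eax", set(), {"eax"}),
    ("addl %eax, %ebx", {"eax", "ebx"}, {"ebx"}),
    ("mrmovl 100(%ebx), %edx", {"ebx"}, {"edx"}),
]

for writer, reader, reg in raw_dependencies(PROGRAM):
    print(f"instruction {reader} reads %{reg} written by instruction {writer}")
```

Both dependencies here are between adjacent instructions, which is the case a pipeline has the least time to resolve.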