Computer Architecture Lecture 4: Pipelining (Appendix C) Chih Wei Liu 劉志尉 National Chiao Tung University
|
|
- Myron Hicks
- 7 years ago
- Views:
Transcription
1 Computer Architecture Lecture 4: Pipelining (Appendix C) Chih Wei Liu 劉志尉 National Chiao Tung University
2 Why Pipeline? T I 2 I 1 I 2 I 1 Throughput = 1/T I 4 I 3 I 2 I 1 I 4 I 3 I 2 I 1 I 4 I 3 I 2 I 1 I 4 I 3 I 2 I 1 I 4 I 3 I 2 I 1 CA-Lec4 cwliu@twins.ee.nctu.edu.tw 2 Throughput = 4/T
3 Visualizing Pipelining Figure C.3, Page C 9 Time (clock cycles) I n s t r. Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 O r d e r CA-Lec4 cwliu@twins.ee.nctu.edu.tw 3
4 Designing a Pipelined Processor Start with ISA Examine the datapath and control diagram Starting with single or multi cycle datapath? Single or multi cycle control? Partition datapath into steps Insert pipeline registers between successive steps Associate resources with steps Ensure that flows do not conflict, or figure out how to resolve Assert control in appropriate stage CA-Lec4 cwliu@twins.ee.nctu.edu.tw 4
5 Example: 5 Steps of MIPS Datapath Instruction Fetch Instr. Decode. Fetch Execute Addr. Calc Memory Access Write Back Next PC Address 4 Adder Memory Inst Next SEQ PC RS1 RS2 RD File MUX MUX Zero? MUX Data Memory L M D MUX IR <= mem[pc]; PC <= PC + 4 Imm Sign Extend [IR rd ] <= [IR rs ] op IRop [IR rt ] WB Data CA-Lec4 cwliu@twins.ee.nctu.edu.tw 5
6 5 Steps of MIPS Datapath Instruction Fetch Instr. Decode. Fetch Execute Addr. Calc Memory Access Write Back Next PC 4 Adder Next SEQ PC RS1 Next SEQ PC Zero? MUX Address Memory IR <= mem[pc]; PC <= PC + 4 A <= [IR rs ]; B <= [IR rt ] IF/ID RS2 Imm File Sign Extend ID/EX MUX MUX EX/MEM RD RD RD Data Memory MEM/WB MUX WB Data rslt <= A op IRop B WB <= rslt [IR rd ] <= WB CA-Lec4 cwliu@twins.ee.nctu.edu.tw 6
7 The 5 Step of the Lw Instruction Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Load /Dec Exec Mem Wr Lw instruction can be implemented in 5 clock cycles : Instruction Fetch Fetch the instruction from the Instruction Memory /Dec: isters Fetch and Instruction Decode Exec: Execution and calculate the memory address Mem: Read the data from the Data Memory Wr: Write the data back to the register file Branch requires? cycles, Store requires? cycles, others require? cycles CA-Lec4 cwliu@twins.ee.nctu.edu.tw 7
8 The 4 Step of R type Instruction Cycle 1 Cycle 2 Cycle 3 Cycle 4 R-type /Dec Exec Wr : Instruction Fetch Fetch the instruction from the Instruction Memory Update PC /Dec: isters Fetch and Instruction Decode Exec: operates on the two register operands Wr: Write the output back to the register file does not access data memory CA-Lec4 cwliu@twins.ee.nctu.edu.tw 8
9 Important Observation Each functional unit can only be used once per instruction: Load uses ister File s Write Port during its 5th step Load /Dec Exec Mem Wr This s what caused the problem R type uses ister File s Write Port during its 4th step R-type /Dec Exec Wr A structural hazard will be happened!! CA-Lec4 cwliu@twins.ee.nctu.edu.tw 9
10 Pipelining the R type and Load Instructions Clock Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Cycle 9 R-type /Dec Exec Wr Ops! We have a problem! R-type /Dec Exec Wr Load /Dec Exec Mem Wr R-type /Dec Exec Wr R-type /Dec Exec Wr Structural hazard: Two instructions try to write to the register file at the same time! Only one write port CA-Lec4 cwliu@twins.ee.nctu.edu.tw 10
11 Sol 1: Insert Bubble into the Pipeline Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Cycle 9 Clock R-type /Dec Exec Wr Load /Dec Exec Mem Wr R-type /Dec Exec Wr R-type /Dec Pipeline Exec Wr R-type Bubble /Dec Exec Wr /Dec Exec Insert a bubble into the pipeline to prevent 2 writes at the same cycle The control logic can be complex. Lose instruction fetch and issue opportunity. CA-Lec4 cwliu@twins.ee.nctu.edu.tw 11
12 Sol 2: Delay R type s Write by One Cycle 5 step R type instructions: Mem step for R type inst. is a NOOP : nothing is being done R-type /Dec Exec Mem Wr Clock Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Cycle 9 R-type /Dec Exec Mem Wr R-type /Dec Exec Mem Wr Load /Dec Exec Mem Wr R-type A unified pipeline architecture for each type of instructions!! /Dec Exec Mem Wr R-type /Dec Exec Mem Wr CA-Lec4 cwliu@twins.ee.nctu.edu.tw 12
13 Similarly, for Store and Branch Instructions Cycle 1 Cycle 2 Cycle 3 Cycle 4 Store /Dec Exec Mem Wr Cycle 1 Cycle 2 Cycle 3 Cycle 4 Beq /Dec Exec Mem Wr CA-Lec4 cwliu@twins.ee.nctu.edu.tw 13
14 Data Stationary Control Pass control signals along just like the data Main control generates control signals during ID WB Instruction Control M WB EX M WB IF/ID ID/EX EX/MEM MEM/WB CA-Lec4 cwliu@twins.ee.nctu.edu.tw 14
15 Use of Data Stationary Control The Main Control generates the control signals during /Dec Control signals for Exec (ExtOp, Src,...) are used 1 cycle later Control signals for Mem (MemWr Branch) are used 2 cycles later Control signals for Wr (Memto MemWr) are used 3 cycles later /Dec Exec Mem Wr ExtOp ExtOp Src Src IF/ID ister Main Control Op Dst MemWr Branch Memto ID/Ex ister Op Dst MemWr Branch Memto Ex/Mem ister MemWr Branch Memto Mem/Wr ister Memto Wr Wr Wr Wr CA-Lec4 cwliu@twins.ee.nctu.edu.tw 15
16 Pipeline Summary A pipeline is like an hooked assembly line. Pipelining, in general, is not visible to the programmer (vs ILP) Pipelining doesn t help latency of single task, it helps throughput of entire workload Pipeline rate limited by slowest pipeline stage Multiple tasks operating simultaneously using different resources Potential speedup = Number pipe stages, if perfectly balanced stage. Unbalanced lengths of pipe stages reduces speedup Time to fill pipeline and time to drain it reduces speedup Pipeline hazard CA-Lec4 cwliu@twins.ee.nctu.edu.tw 16
17 Pipelining is not quite that easy! Limits to pipelining: Hazards prevent next instruction from executing during its designated clock cycle Structural hazards: HW cannot support this combination of instructions (single person to fold and put clothes away) Data hazards: Instruction depends on result of prior instruction still in the pipeline (missing sock) Control hazards: Caused by delay between the fetching of instructions and decisions about changes in control flow (branches and jumps). CA-Lec4 17
18 One Memory Port/Structural Hazards Time (clock cycles) Figure A.4, Page A 14 Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 I n s t r. O r d e r Load Instr 1 Instr 2 Instr 3 Instr 4 CA-Lec4 cwliu@twins.ee.nctu.edu.tw 18
19 One Memory Port/Structural Hazards Time (clock cycles) Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 I n s t r. O r d e r Load Instr 1 Instr 2 Stall Instr 3 Bubble Bubble Bubble Bubble Bubble What will the bubble impact? CA-Lec4 cwliu@twins.ee.nctu.edu.tw 19
20 Speed Up Equations for Pipelining Speedup Average instruction time unpipelined Average instruction time pipelined CPI CPI unpipelined pipelined Clock cycle unpipelined Clock cycle pipelined CPI pipelined Ideal CPI Average Stall cycles per Inst Speedup Ideal CPI Pipeline depth Ideal CPI Pipeline stall CPI Cycle Time Cycle Time unpipelined pipelined for balanced pipelining For simple RISC pipeline, CPI = 1: Pipeline depth Speedup 1 Pipeline stall CPI Cycle Time Cycle Time unpipelined pipelined CA-Lec4 cwliu@twins.ee.nctu.edu.tw 20
21 Example: Dual port vs. Single port Machine A: Dual ported memory ( Harvard Architecture ) Machine B: Single ported memory, but its pipelined implementation has a 1.05 times faster clock rate Ideal CPI = 1 for both Suppose that Loads are 40% of instructions executed SpeedUp A = Pipeline Depth/(1 + 0) x (clock unpipe /clock pipe ) = Pipeline Depth SpeedUp B = Pipeline Depth/( x 1) x (clock unpipe /(clock unpipe / 1.05) = (Pipeline Depth/1.4) x 1.05 = 0.75 x Pipeline Depth SpeedUp A / SpeedUp B = Pipeline Depth/(0.75 x Pipeline Depth) = 1.33 Machine A is 1.33 times faster CA-Lec4 cwliu@twins.ee.nctu.edu.tw 21
22 Data Hazard Problem on r1 Time (clock cycles) Dependencies backwards in time are hazards IF ID/RF EX MEM WB I n s t r. add r1,r2,r3 sub r4,r1,r3 O r d e r and r6,r1,r7 or r8,r1,r9 xor r10,r1,r11 CA-Lec4 cwliu@twins.ee.nctu.edu.tw 22
23 Types of Data Hazards RAW (read after write): true data dependence Get wrong propagation result WAR (write after read): anti dependence Get wrong operand WAW (write after write): output dependence Leave wrong result CA-Lec4 23
24 Name Dependence Data Hazards Write After Read (WAR) Instr J writes operand before Instr I reads it I: sub r4,r1,r3 J: add r1,r2,r3 K: mul r6,r1,r7 Called an anti dependence by compiler writers. This results from reuse of the name r1. Can t happen in MIPS 5 stage pipeline because: All instructions take 5 stages. Reads are always in stage 2 and Writes are always in stage 5 Can happen in OOO processor (discussed in chapter 3). CA-Lec4 cwliu@twins.ee.nctu.edu.tw 24
25 Name Dependence Data Hazards Write After Write (WAW) Instr J writes operand before Instr I writes it. I: sub r1,r4,r3 J: add r1,r2,r3 K: mul r6,r1,r7 Called an output dependence by compiler writers This also results from the reuse of name r1. Can t happen in MIPS 5 stage pipeline because: All instructions take 5 stages, and Writes are always in stage 5 Will see WAR and WAW in more complicated pipes and OOO processor CA-Lec4 cwliu@twins.ee.nctu.edu.tw 25
26 Forwarding to Avoid Data Hazard I n s t r. add r1,r2,r3 sub r4,r1,r3 Time (clock cycles) Forward result from one stage to another Define read/write properly O r d e r and r6,r1,r7 or r8,r1,r9 xor r10,r1,r11 CA-Lec4 cwliu@twins.ee.nctu.edu.tw 26
27 HW Change for Forwarding Additional hardware is required. NextPC isters ID/EX mux mux EX/MEM Data Memory MEM/WR Immediate mux What circuit detects and resolves this hazard? CA-Lec4 27
28 Forwarding to Avoid LW SW Data Hazard Time (clock cycles) I n s t r. add r1,r2,r3 lw r4, 0(r1) O r d e r sw r4,12(r1) or r8,r6,r9 xor r10,r9,r11 CA-Lec4 cwliu@twins.ee.nctu.edu.tw 28
29 Data Hazard Even with Forwarding Time (clock cycles) I n s t r. lw r1, 0(r2) sub r4,r1,r6 O r d e r and r6,r1,r7 or r8,r1,r9 Must delay/stall instruction dependent on loads: Load stall cycles CA-Lec4 cwliu@twins.ee.nctu.edu.tw 29
30 Data Hazard Even with Forwarding Time (clock cycles) I n s t r. lw r1, 0(r2) sub r4,r1,r6 Bubble O r d e r and r6,r1,r7 or r8,r1,r9 Bubble Bubble CA-Lec4 cwliu@twins.ee.nctu.edu.tw 30
31 Software Scheduling to Avoid Load Hazards Try producing fast code for a = b + c; d = e f; assuming a, b, c, d,e, and f in memory. Slow code: LW Rb,b Fast code: LW Rb,b LW Rc,c LW Rc,c ADD Ra,Rb,Rc LW Re,e SW a,ra ADD Ra,Rb,Rc LW Re,e LW Rf,f SW a,ra LW Rf,f SUB Rd,Re,Rf SUB Rd,Re,Rf SW d,rd SW d,rd Compiler optimizes for performance. Hardware checks for safety. CA-Lec4 cwliu@twins.ee.nctu.edu.tw 31
32 Control Hazard on Branches 10: beq r1,r3,36 14: and r2,r3,r5 18: or r6,r1,r7 22: add r8,r1,r9 36: xor r10,r1,r11 What do you do with the 3 instructions in between? The simplest solution is to stall the pipeline as soon as a branch instruction is detected CA-Lec4 cwliu@twins.ee.nctu.edu.tw 32
33 Branch Stall Impact If CPI = 1, 30% branch, Stall 3 cycles => new CPI = 1.9! Two part solution: Determine branch taken or not sooner, AND Compute taken branch address earlier MIPS branch tests if register = 0 or 0 MIPS Solution: Move Zero test to ID/RF stage Adder to calculate new PC in ID/RF stage 1 clock cycle penalty for branch versus 3 CA-Lec4 cwliu@twins.ee.nctu.edu.tw 33
34 Pipelined MIPS Datapath Instruction Fetch Instr. Decode. Fetch Execute Addr. Calc Memory Access Write Back Next PC 4 Adder Next SEQ PC RS1 Adder MUX Zero? Address Memory IF/ID RS2 File ID/EX MUX EX/MEM Data Memory MEM/WB MUX Imm Sign Extend RD RD RD WB Data Interplay of instruction set design and cycle time. CA-Lec4 cwliu@twins.ee.nctu.edu.tw 34
35 Four Branch Hazard Alternatives #1: Stall until branch direction is clear #2: Predict Branch Not Taken Execute successor instructions in sequence Squash instructions in pipeline if branch actually taken Advantage of late pipeline state update 47% MIPS branches not taken on average PC+4 already calculated, so use it to get next instruction #3: Predict Branch Taken 53% MIPS branches taken on average But haven t calculated branch target address in MIPS MIPS still incurs 1 cycle branch penalty Other machines: branch target known before outcome What happens when hit not taken branch? CA-Lec4 cwliu@twins.ee.nctu.edu.tw 35
36 Four Branch Hazard Alternatives #4: Delayed Branch make the stall cycle useful Define branch to take place AFTER a following instruction branch instruction sequential successor 1 sequential successor 2... sequential successor n branch target if taken Branch delay of length n These insts. are executed!! 1 slot delay allows proper decision and branch target address in 5 stage pipeline MIPS uses this CA-Lec4 cwliu@twins.ee.nctu.edu.tw 36
37 Scheduling Branch Delay Slots A. From before branch B. From branch target C. From fall through add $1,$2,$3 if $2=0 then delay slot sub $4,$5,$6 add $1,$2,$3 if $1=0 then delay slot add $1,$2,$3 if $1=0 then delay slot sub $4,$5,$6 becomes becomes becomes add $1,$2,$3 if $2=0 then if $1=0 then add $1,$2,$3 add $1,$2,$3 if $1=0 then sub $4,$5,$6 sub $4,$5,$6 A is the best choice, fills delay slot & reduces instruction count (IC) In B, the sub instruction may need to be copied, increasing IC In B and C, must be okay to execute sub when branch fails CA-Lec4 cwliu@twins.ee.nctu.edu.tw 37
38 Delay Branch Scheduling Schemes and Their Requirements Scheduling Strategy From before Requirements Branch must not depend on the rescheduled instructions Improve Performance When? Always From target From fall through Must be OK to execute rescheduled instructions if branch is not taken. May need to duplicate instructions Must be OK to execute instructions if branch is taken When branch is taken. May enlarge program if instructions are duplicated When branch is not taken. CA-Lec4 38
39 Delayed Branch Compiler effectiveness for single branch delay slot: Fills about 60% of branch delay slots About 80% of instructions executed in branch delay slots useful in computation About 50% (60% x 80%) of slots usefully filled Delayed Branch downside: As processor go to deeper pipelines and multiple issue, the branch delay grows and need more than one delay slots Delayed branch (a static way) has lost popularity compared to more expensive but more flexible dynamic approaches Growth in available transistors has made dynamic approaches relatively cheaper CA-Lec4 cwliu@twins.ee.nctu.edu.tw 39
40 Performance of Branch Schemes Example For MIPS R4K, a deeper pipeline, it takes at least 3 pipeline stages before the branchtarget address is known and an additional cycle before the branch condition is evaluated. Scheme Penalty Uncond Penalty untaken Penalty taken Stall Predicted taken Predicted not taken 2 0 3
41 Evaluating Branch Alternatives Pipeline speedup = Pipeline depth 1 +Branch frequency Branch penalty Assume: 4% unconditional branch, 6% conditional branch untaken, 10% conditional branch taken Scheduling Branch CPI speedup v. speedup v. scheme penalty unpipelined stall Stall pipeline 2x0.04+3x n/ Predict not taken 2x0.04+3x0.06+2x n/ Predict taken 2x0.04+0x0.06+3x n/ Delayed branch 0.5x( ) 1.10 n/ CA-Lec4 cwliu@twins.ee.nctu.edu.tw 41
42 Problems with Pipelining Exception: An unusual event happens to an instruction during its execution Examples: divide by zero, undefined opcode Interrupt: Hardware signal to switch the processor to a new instruction stream Example: a sound card interrupts when it needs more audio output samples (an audio click happens if it is left waiting) Problem: It must appear that the exception or interrupt must appear between 2 instructions (I i and I i+1 ) The effect of all instructions up to and including I i is totalling complete No effect of any instruction after I i can take place The interrupt (exception) handler either aborts program or restarts at instruction I i+1 CA-Lec4 cwliu@twins.ee.nctu.edu.tw 42
43 Example: Device Interrupt (Say, arrival of network message) External Interrupt add subi slli lw lw add sw r1,r2,r3 r4,r1,#4 r4,r4,#2 Hiccup(!) r2,0(r4) r3,4(r4) r2,r2,r3 8(r4),r2 Raise priority Reenable All Ints Save registers lw r1,20(r0) lw r2,0(r1) addi r3,r0,#5 sw 0(r1),r3 Restore registers Clear current Int Disable All Ints Restore priority RTE Interrupt Handler CA-Lec4 43
44 Alternative: Polling (again, for arrival of network message) External Interrupt no_mess: Disable Network Intr subi r4,r1,#4 slli r4,r4,#2 lw r2,0(r4) lw r3,4(r4) add r2,r2,r3 sw 8(r4),r2 lw r1,12(r0) beq r1,no_mess lw r1,20(r0) lw r2,0(r1) addi r3,r0,#5 sw 0(r1),r3 Clear Network Intr Polling Point (check device register) Handler CA-Lec4 44
45 Polling is faster/slower than Interrupts. Polling is faster than interrupts because Compiler knows which registers in use at polling point. Hence, do not need to save and restore registers (or not as many). Other interrupt overhead avoided (pipeline flush, trap priorities, etc). Polling is slower than interrupts because Overhead of polling instructions is incurred regardless of whether or not handler is run. This could add to inner loop delay. Device may have to wait for service for a long time. When to use one or the other? Multi axis tradeoff Frequent/regular events good for polling, as long as device can be controlled at user level. Interrupts good for infrequent/irregular events Interrupts good for ensuring regular/predictable service of events. CA-Lec4 cwliu@twins.ee.nctu.edu.tw 45
46 Trap/Interrupt Classifications Traps: relevant to the current process Faults, arithmetic traps, and synchronous traps Invoke software on behalf of the currently executing process Interrupts: caused by asynchronous, outside events I/O devices requiring service (DISK, network) Clock interrupts (real time scheduling) Machine Checks: caused by serious hardware failure Not always restartable Indicate that bad things have happened. Non recoverable ECC error Machine room fire Power outage CA-Lec4 46
47 A related classification: Synchronous vs. Asynchronous Synchronous: means related to the instruction stream, i.e. during the execution of an instruction Must stop an instruction that is currently executing Page fault on load or store instruction Arithmetic exception Software Trap Instructions Asynchronous: means unrelated to the instruction stream, i.e. caused by an outside event. Does not have to disrupt instructions that are already executing Interrupts are asynchronous Machine checks are asynchronous SemiSynchronous (or high availability interrupts): Caused by external event but may have to disrupt current instructions in order to guarantee service CA-Lec4 47
48 Interrupt Controller Timer Interrupt Mask Priority Encoder IntID Interrupt CPU Int Disable Network Software Interrupt Control NMI Interrupts invoked with interrupt lines from devices Interrupt controller chooses interrupt request to honor Mask enables/disables interrupts Priority encoder picks highest enabled interrupt Software Interrupt Set/Cleared by Software Interrupt identity specified with ID line CPU can disable all interrupts with internal flag Non maskable interrupt line (NMI) can t be disabled CA-Lec4 cwliu@twins.ee.nctu.edu.tw 48
49 Precise Interrupts/Exceptions An interrupt or exception is considered precise if there is a single instruction (or interrupt point) for which: All instructions before that have committed their state No following instructions (including the interrupting instruction) have modified any state. This means that you can restart execution at the interrupt point and get the right answer Implicit in our previous example of a device interrupt: Interrupt point is at first lw instruction External Interrupt add subi slli lw lw add sw r1,r2,r3 r4,r1,#4 r4,r4,#2 r2,0(r4) r3,4(r4) r2,r2,r3 8(r4),r2 Int handler CA-Lec4 cwliu@twins.ee.nctu.edu.tw 49
50 Why are precise interrupts desirable? Many types of interrupts/exceptions need to be restartable, which is easier to figure out what actually happened: e.g. TLB faults. Need to fix translation, then restart load/store e.g. IEEE gradual underflow, illegal operation, etc Restartability doesn t require preciseness. However, preciseness makes it a lot easier to restart. Simplify the task of the operating system a lot Less state needs to be saved away if unloading process. Quick to restart (making for fast interrupts) CA-Lec4 cwliu@twins.ee.nctu.edu.tw 50
51 Precise Exceptions in simple 5 stage pipeline: Exceptions/interrupts may occur at different stages in pipeline : Arithmetic exceptions occur in execution stage TLB faults can occur in instruction fetch or memory stage What about interrupts? try to interrupt the pipeline as little as possible All of this solved by tagging instructions in pipeline as cause exception or not and wait until end of memory stage to flag exception Interrupts become marked NOPs (like bubbles) that are placed into pipeline instead of an instruction. Assume that interrupt condition persists in case NOP flushed Clever instruction fetch might start fetching instructions from interrupt vector, but this is complicated by need for supervisor mode switch, saving of one or more PCs, etc CA-Lec4 cwliu@twins.ee.nctu.edu.tw 51
52 Another Exception Problem Data TLB IFetch Dcd Exec Mem WB Time Bad Inst Inst TLB fault Overflow Program Flow IFetch Dcd Exec Mem WB Use pipeline to sort this out! Pass exception status along with instruction. Keep track of PCs for every instruction in pipeline. Don t act on exception until it reache WB stage IFetch Dcd Exec Mem WB Handle interrupts through faulting noop in IF stage When instruction reaches WB stage: Save PC EPC, Interrupt vector addr PC Turn all instructions in earlier stages into noops! IFetch Dcd Exec Mem WB CA-Lec4 cwliu@twins.ee.nctu.edu.tw 52
53 Precise Exceptions in Static Pipelines Key observation: architected state only change in memory and register write stages. CA-Lec4 53
Pipeline Hazards. Structure hazard Data hazard. ComputerArchitecture_PipelineHazard1
Pipeline Hazards Structure hazard Data hazard Pipeline hazard: the major hurdle A hazard is a condition that prevents an instruction in the pipe from executing its next scheduled pipe stage Taxonomy of
More informationCS352H: Computer Systems Architecture
CS352H: Computer Systems Architecture Topic 9: MIPS Pipeline - Hazards October 1, 2009 University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell Data Hazards in ALU Instructions
More informationDesign of Pipelined MIPS Processor. Sept. 24 & 26, 1997
Design of Pipelined MIPS Processor Sept. 24 & 26, 1997 Topics Instruction processing Principles of pipelining Inserting pipe registers Data Hazards Control Hazards Exceptions MIPS architecture subset R-type
More informationQ. Consider a dynamic instruction execution (an execution trace, in other words) that consists of repeats of code in this pattern:
Pipelining HW Q. Can a MIPS SW instruction executing in a simple 5-stage pipelined implementation have a data dependency hazard of any type resulting in a nop bubble? If so, show an example; if not, prove
More informationWAR: Write After Read
WAR: Write After Read write-after-read (WAR) = artificial (name) dependence add R1, R2, R3 sub R2, R4, R1 or R1, R6, R3 problem: add could use wrong value for R2 can t happen in vanilla pipeline (reads
More informationComputer Architecture Lecture 2: Instruction Set Principles (Appendix A) Chih Wei Liu 劉 志 尉 National Chiao Tung University cwliu@twins.ee.nctu.edu.
Computer Architecture Lecture 2: Instruction Set Principles (Appendix A) Chih Wei Liu 劉 志 尉 National Chiao Tung University cwliu@twins.ee.nctu.edu.tw Review Computers in mid 50 s Hardware was expensive
More informationINSTRUCTION LEVEL PARALLELISM PART VII: REORDER BUFFER
Course on: Advanced Computer Architectures INSTRUCTION LEVEL PARALLELISM PART VII: REORDER BUFFER Prof. Cristina Silvano Politecnico di Milano cristina.silvano@polimi.it Prof. Silvano, Politecnico di Milano
More informationSolutions. Solution 4.1. 4.1.1 The values of the signals are as follows:
4 Solutions Solution 4.1 4.1.1 The values of the signals are as follows: RegWrite MemRead ALUMux MemWrite ALUOp RegMux Branch a. 1 0 0 (Reg) 0 Add 1 (ALU) 0 b. 1 1 1 (Imm) 0 Add 1 (Mem) 0 ALUMux is the
More informationPipeline Hazards. Arvind Computer Science and Artificial Intelligence Laboratory M.I.T. Based on the material prepared by Arvind and Krste Asanovic
1 Pipeline Hazards Computer Science and Artificial Intelligence Laboratory M.I.T. Based on the material prepared by and Krste Asanovic 6.823 L6-2 Technology Assumptions A small amount of very fast memory
More informationPipelining Review and Its Limitations
Pipelining Review and Its Limitations Yuri Baida yuri.baida@gmail.com yuriy.v.baida@intel.com October 16, 2010 Moscow Institute of Physics and Technology Agenda Review Instruction set architecture Basic
More informationLecture: Pipelining Extensions. Topics: control hazards, multi-cycle instructions, pipelining equations
Lecture: Pipelining Extensions Topics: control hazards, multi-cycle instructions, pipelining equations 1 Problem 6 Show the instruction occupying each stage in each cycle (with bypassing) if I1 is R1+R2
More informationInstruction Set Architecture. or How to talk to computers if you aren t in Star Trek
Instruction Set Architecture or How to talk to computers if you aren t in Star Trek The Instruction Set Architecture Application Compiler Instr. Set Proc. Operating System I/O system Instruction Set Architecture
More informationAdvanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2
Lecture Handout Computer Architecture Lecture No. 2 Reading Material Vincent P. Heuring&Harry F. Jordan Chapter 2,Chapter3 Computer Systems Design and Architecture 2.1, 2.2, 3.2 Summary 1) A taxonomy of
More informationComputer organization
Computer organization Computer design an application of digital logic design procedures Computer = processing unit + memory system Processing unit = control + datapath Control = finite state machine inputs
More informationInstruction Set Architecture
Instruction Set Architecture Consider x := y+z. (x, y, z are memory variables) 1-address instructions 2-address instructions LOAD y (r :=y) ADD y,z (y := y+z) ADD z (r:=r+z) MOVE x,y (x := y) STORE x (x:=r)
More informationExecution Cycle. Pipelining. IF and ID Stages. Simple MIPS Instruction Formats
Execution Cycle Pipelining CSE 410, Spring 2005 Computer Systems http://www.cs.washington.edu/410 1. Instruction Fetch 2. Instruction Decode 3. Execute 4. Memory 5. Write Back IF and ID Stages 1. Instruction
More informationAn Introduction to the ARM 7 Architecture
An Introduction to the ARM 7 Architecture Trevor Martin CEng, MIEE Technical Director This article gives an overview of the ARM 7 architecture and a description of its major features for a developer new
More informationComputer Organization and Components
Computer Organization and Components IS5, fall 25 Lecture : Pipelined Processors ssociate Professor, KTH Royal Institute of Technology ssistant Research ngineer, University of California, Berkeley Slides
More informationMore on Pipelining and Pipelines in Real Machines CS 333 Fall 2006 Main Ideas Data Hazards RAW WAR WAW More pipeline stall reduction techniques Branch prediction» static» dynamic bimodal branch prediction
More informationReview: MIPS Addressing Modes/Instruction Formats
Review: Addressing Modes Addressing mode Example Meaning Register Add R4,R3 R4 R4+R3 Immediate Add R4,#3 R4 R4+3 Displacement Add R4,1(R1) R4 R4+Mem[1+R1] Register indirect Add R4,(R1) R4 R4+Mem[R1] Indexed
More informationPROBLEMS #20,R0,R1 #$3A,R2,R4
506 CHAPTER 8 PIPELINING (Corrisponde al cap. 11 - Introduzione al pipelining) PROBLEMS 8.1 Consider the following sequence of instructions Mul And #20,R0,R1 #3,R2,R3 #$3A,R2,R4 R0,R2,R5 In all instructions,
More informationInstruction Set Design
Instruction Set Design Instruction Set Architecture: to what purpose? ISA provides the level of abstraction between the software and the hardware One of the most important abstraction in CS It s narrow,
More informationCPU Performance Equation
CPU Performance Equation C T I T ime for task = C T I =Average # Cycles per instruction =Time per cycle =Instructions per task Pipelining e.g. 3-5 pipeline steps (ARM, SA, R3000) Attempt to get C down
More information150127-Microprocessor & Assembly Language
Chapter 3 Z80 Microprocessor Architecture The Z 80 is one of the most talented 8 bit microprocessors, and many microprocessor-based systems are designed around the Z80. The Z80 microprocessor needs an
More informationCourse on Advanced Computer Architectures
Course on Advanced Computer Architectures Surname (Cognome) Name (Nome) POLIMI ID Number Signature (Firma) SOLUTION Politecnico di Milano, September 3rd, 2015 Prof. C. Silvano EX1A ( 2 points) EX1B ( 2
More information(Refer Slide Time: 00:01:16 min)
Digital Computer Organization Prof. P. K. Biswas Department of Electronic & Electrical Communication Engineering Indian Institute of Technology, Kharagpur Lecture No. # 04 CPU Design: Tirning & Control
More informationSolution: start more than one instruction in the same clock cycle CPI < 1 (or IPC > 1, Instructions per Cycle) Two approaches:
Multiple-Issue Processors Pipelining can achieve CPI close to 1 Mechanisms for handling hazards Static or dynamic scheduling Static or dynamic branch handling Increase in transistor counts (Moore s Law):
More informationStrongARM** SA-110 Microprocessor Instruction Timing
StrongARM** SA-110 Microprocessor Instruction Timing Application Note September 1998 Order Number: 278194-001 Information in this document is provided in connection with Intel products. No license, express
More informationCS:APP Chapter 4 Computer Architecture. Wrap-Up. William J. Taffe Plymouth State University. using the slides of
CS:APP Chapter 4 Computer Architecture Wrap-Up William J. Taffe Plymouth State University using the slides of Randal E. Bryant Carnegie Mellon University Overview Wrap-Up of PIPE Design Performance analysis
More informationCS521 CSE IITG 11/23/2012
CS521 CSE TG 11/23/2012 A Sahu 1 Degree of overlap Serial, Overlapped, d, Super pipelined/superscalar Depth Shallow, Deep Structure Linear, Non linear Scheduling of operations Static, Dynamic A Sahu slide
More informationLSN 2 Computer Processors
LSN 2 Computer Processors Department of Engineering Technology LSN 2 Computer Processors Microprocessors Design Instruction set Processor organization Processor performance Bandwidth Clock speed LSN 2
More informationEE282 Computer Architecture and Organization Midterm Exam February 13, 2001. (Total Time = 120 minutes, Total Points = 100)
EE282 Computer Architecture and Organization Midterm Exam February 13, 2001 (Total Time = 120 minutes, Total Points = 100) Name: (please print) Wolfe - Solution In recognition of and in the spirit of the
More informationInstruction Set Architecture (ISA)
Instruction Set Architecture (ISA) * Instruction set architecture of a machine fills the semantic gap between the user and the machine. * ISA serves as the starting point for the design of a new machine
More informationUsing Graphics and Animation to Visualize Instruction Pipelining and its Hazards
Using Graphics and Animation to Visualize Instruction Pipelining and its Hazards Per Stenström, Håkan Nilsson, and Jonas Skeppstedt Department of Computer Engineering, Lund University P.O. Box 118, S-221
More informationGiving credit where credit is due
CSCE 230J Computer Organization Processor Architecture VI: Wrap-Up Dr. Steve Goddard goddard@cse.unl.edu http://cse.unl.edu/~goddard/courses/csce230j Giving credit where credit is due ost of slides for
More informationIntroducción. Diseño de sistemas digitales.1
Introducción Adapted from: Mary Jane Irwin ( www.cse.psu.edu/~mji ) www.cse.psu.edu/~cg431 [Original from Computer Organization and Design, Patterson & Hennessy, 2005, UCB] Diseño de sistemas digitales.1
More informationWhat is a bus? A Bus is: Advantages of Buses. Disadvantage of Buses. Master versus Slave. The General Organization of a Bus
Datorteknik F1 bild 1 What is a bus? Slow vehicle that many people ride together well, true... A bunch of wires... A is: a shared communication link a single set of wires used to connect multiple subsystems
More informationException and Interrupt Handling in ARM
Exception and Interrupt Handling in ARM Architectures and Design Methods for Embedded Systems Summer Semester 2006 Author: Ahmed Fathy Mohammed Abdelrazek Advisor: Dominik Lücke Abstract We discuss exceptions
More informationInstruction Set Architecture. Datapath & Control. Instruction. LC-3 Overview: Memory and Registers. CIT 595 Spring 2010
Instruction Set Architecture Micro-architecture Datapath & Control CIT 595 Spring 2010 ISA =Programmer-visible components & operations Memory organization Address space -- how may locations can be addressed?
More informationCPU Organization and Assembly Language
COS 140 Foundations of Computer Science School of Computing and Information Science University of Maine October 2, 2015 Outline 1 2 3 4 5 6 7 8 Homework and announcements Reading: Chapter 12 Homework:
More informationAddressing The problem. When & Where do we encounter Data? The concept of addressing data' in computations. The implications for our machine design(s)
Addressing The problem Objectives:- When & Where do we encounter Data? The concept of addressing data' in computations The implications for our machine design(s) Introducing the stack-machine concept Slide
More informationMICROPROCESSOR AND MICROCOMPUTER BASICS
Introduction MICROPROCESSOR AND MICROCOMPUTER BASICS At present there are many types and sizes of computers available. These computers are designed and constructed based on digital and Integrated Circuit
More informationEC 362 Problem Set #2
EC 362 Problem Set #2 1) Using Single Precision IEEE 754, what is FF28 0000? 2) Suppose the fraction enhanced of a processor is 40% and the speedup of the enhancement was tenfold. What is the overall speedup?
More informationChapter 2 Basic Structure of Computers. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan
Chapter 2 Basic Structure of Computers Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan Outline Functional Units Basic Operational Concepts Bus Structures Software
More informationLet s put together a Manual Processor
Lecture 14 Let s put together a Manual Processor Hardware Lecture 14 Slide 1 The processor Inside every computer there is at least one processor which can take an instruction, some operands and produce
More informationData Dependences. A data dependence occurs whenever one instruction needs a value produced by another.
Data Hazards 1 Hazards: Key Points Hazards cause imperfect pipelining They prevent us from achieving CPI = 1 They are generally causes by counter flow data pennces in the pipeline Three kinds Structural
More informationExceptions in MIPS. know the exception mechanism in MIPS be able to write a simple exception handler for a MIPS machine
7 Objectives After completing this lab you will: know the exception mechanism in MIPS be able to write a simple exception handler for a MIPS machine Introduction Branches and jumps provide ways to change
More informationUNIVERSITY OF CALIFORNIA, DAVIS Department of Electrical and Computer Engineering. EEC180B Lab 7: MISP Processor Design Spring 1995
UNIVERSITY OF CALIFORNIA, DAVIS Department of Electrical and Computer Engineering EEC180B Lab 7: MISP Processor Design Spring 1995 Objective: In this lab, you will complete the design of the MISP processor,
More informationIn the Beginning... 1964 -- The first ISA appears on the IBM System 360 In the good old days
RISC vs CISC 66 In the Beginning... 1964 -- The first ISA appears on the IBM System 360 In the good old days Initially, the focus was on usability by humans. Lots of user-friendly instructions (remember
More informationTraditional IBM Mainframe Operating Principles
C H A P T E R 1 7 Traditional IBM Mainframe Operating Principles WHEN YOU FINISH READING THIS CHAPTER YOU SHOULD BE ABLE TO: Distinguish between an absolute address and a relative address. Briefly explain
More informationA Unified View of Virtual Machines
A Unified View of Virtual Machines First ACM/USENIX Conference on Virtual Execution Environments J. E. Smith June 2005 Introduction Why are virtual machines interesting? They allow transcending of interfaces
More informationCHAPTER 7: The CPU and Memory
CHAPTER 7: The CPU and Memory The Architecture of Computer Hardware, Systems Software & Networking: An Information Technology Approach 4th Edition, Irv Englander John Wiley and Sons 2010 PowerPoint slides
More informationCSC 2405: Computer Systems II
CSC 2405: Computer Systems II Spring 2013 (TR 8:30-9:45 in G86) Mirela Damian http://www.csc.villanova.edu/~mdamian/csc2405/ Introductions Mirela Damian Room 167A in the Mendel Science Building mirela.damian@villanova.edu
More informationEE482: Advanced Computer Organization Lecture #11 Processor Architecture Stanford University Wednesday, 31 May 2000. ILP Execution
EE482: Advanced Computer Organization Lecture #11 Processor Architecture Stanford University Wednesday, 31 May 2000 Lecture #11: Wednesday, 3 May 2000 Lecturer: Ben Serebrin Scribe: Dean Liu ILP Execution
More informationPART B QUESTIONS AND ANSWERS UNIT I
PART B QUESTIONS AND ANSWERS UNIT I 1. Explain the architecture of 8085 microprocessor? Logic pin out of 8085 microprocessor Address bus: unidirectional bus, used as high order bus Data bus: bi-directional
More informationCOMPUTER ORGANIZATION ARCHITECTURES FOR EMBEDDED COMPUTING
COMPUTER ORGANIZATION ARCHITECTURES FOR EMBEDDED COMPUTING 2013/2014 1 st Semester Sample Exam January 2014 Duration: 2h00 - No extra material allowed. This includes notes, scratch paper, calculator, etc.
More informationCPU Organisation and Operation
CPU Organisation and Operation The Fetch-Execute Cycle The operation of the CPU 1 is usually described in terms of the Fetch-Execute cycle. 2 Fetch-Execute Cycle Fetch the Instruction Increment the Program
More informationAdministration. Instruction scheduling. Modern processors. Examples. Simplified architecture model. CS 412 Introduction to Compilers
CS 4 Introduction to Compilers ndrew Myers Cornell University dministration Prelim tomorrow evening No class Wednesday P due in days Optional reading: Muchnick 7 Lecture : Instruction scheduling pr 0 Modern
More informationA SystemC Transaction Level Model for the MIPS R3000 Processor
SETIT 2007 4 th International Conference: Sciences of Electronic, Technologies of Information and Telecommunications March 25-29, 2007 TUNISIA A SystemC Transaction Level Model for the MIPS R3000 Processor
More informationA s we saw in Chapter 4, a CPU contains three main sections: the register section,
6 CPU Design A s we saw in Chapter 4, a CPU contains three main sections: the register section, the arithmetic/logic unit (ALU), and the control unit. These sections work together to perform the sequences
More informationon an system with an infinite number of processors. Calculate the speedup of
1. Amdahl s law Three enhancements with the following speedups are proposed for a new architecture: Speedup1 = 30 Speedup2 = 20 Speedup3 = 10 Only one enhancement is usable at a time. a) If enhancements
More informationLinux Process Scheduling Policy
Lecture Overview Introduction to Linux process scheduling Policy versus algorithm Linux overall process scheduling objectives Timesharing Dynamic priority Favor I/O-bound process Linux scheduling algorithm
More informationComputer Organization and Architecture
Computer Organization and Architecture Chapter 11 Instruction Sets: Addressing Modes and Formats Instruction Set Design One goal of instruction set design is to minimize instruction length Another goal
More information361 Computer Architecture Lecture 14: Cache Memory
1 361 Computer Architecture Lecture 14 Memory cache.1 The Motivation for s Memory System Processor DRAM Motivation Large memories (DRAM) are slow Small memories (SRAM) are fast Make the average access
More informationADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-12: ARM
ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-12: ARM 1 The ARM architecture processors popular in Mobile phone systems 2 ARM Features ARM has 32-bit architecture but supports 16 bit
More informationMACHINE ARCHITECTURE & LANGUAGE
in the name of God the compassionate, the merciful notes on MACHINE ARCHITECTURE & LANGUAGE compiled by Jumong Chap. 9 Microprocessor Fundamentals A system designer should consider a microprocessor-based
More informationLecture 6: Interrupts. CSC 469H1F Fall 2006 Angela Demke Brown
Lecture 6: Interrupts CSC 469H1F Fall 2006 Angela Demke Brown Topics What is an interrupt? How do operating systems handle interrupts? FreeBSD example Linux in tutorial Interrupts Defn: an event external
More informationOverview. CISC Developments. RISC Designs. CISC Designs. VAX: Addressing Modes. Digital VAX
Overview CISC Developments Over Twenty Years Classic CISC design: Digital VAX VAXÕs RISC successor: PRISM/Alpha IntelÕs ubiquitous 80x86 architecture Ð 8086 through the Pentium Pro (P6) RJS 2/3/97 Philosophy
More informationInstruction Set Architecture (ISA) Design. Classification Categories
Instruction Set Architecture (ISA) Design Overview» Classify Instruction set architectures» Look at how applications use ISAs» Examine a modern RISC ISA (DLX)» Measurement of ISA usage in real computers
More informationA Lab Course on Computer Architecture
A Lab Course on Computer Architecture Pedro López José Duato Depto. de Informática de Sistemas y Computadores Facultad de Informática Universidad Politécnica de Valencia Camino de Vera s/n, 46071 - Valencia,
More informationSystems I: Computer Organization and Architecture
Systems I: Computer Organization and Architecture Lecture : Microprogrammed Control Microprogramming The control unit is responsible for initiating the sequence of microoperations that comprise instructions.
More informationPerformance evaluation
Performance evaluation Arquitecturas Avanzadas de Computadores - 2547021 Departamento de Ingeniería Electrónica y de Telecomunicaciones Facultad de Ingeniería 2015-1 Bibliography and evaluation Bibliography
More informationIA-64 Application Developer s Architecture Guide
IA-64 Application Developer s Architecture Guide The IA-64 architecture was designed to overcome the performance limitations of today s architectures and provide maximum headroom for the future. To achieve
More informationComputer Architecture TDTS10
why parallelism? Performance gain from increasing clock frequency is no longer an option. Outline Computer Architecture TDTS10 Superscalar Processors Very Long Instruction Word Processors Parallel computers
More informationChapter 2 Logic Gates and Introduction to Computer Architecture
Chapter 2 Logic Gates and Introduction to Computer Architecture 2.1 Introduction The basic components of an Integrated Circuit (IC) is logic gates which made of transistors, in digital system there are
More informationReduced Instruction Set Computer (RISC)
Reduced Instruction Set Computer (RISC) Focuses on reducing the number and complexity of instructions of the ISA. RISC Goals RISC: Simplify ISA Simplify CPU Design Better CPU Performance Motivated by simplifying
More informationIntroduction to Cloud Computing
Introduction to Cloud Computing Parallel Processing I 15 319, spring 2010 7 th Lecture, Feb 2 nd Majd F. Sakr Lecture Motivation Concurrency and why? Different flavors of parallel computing Get the basic
More informationCS 61C: Great Ideas in Computer Architecture Virtual Memory Cont.
CS 61C: Great Ideas in Computer Architecture Virtual Memory Cont. Instructors: Vladimir Stojanovic & Nicholas Weaver http://inst.eecs.berkeley.edu/~cs61c/ 1 Bare 5-Stage Pipeline Physical Address PC Inst.
More informationA single register, called the accumulator, stores the. operand before the operation, and stores the result. Add y # add y from memory to the acc
Other architectures Example. Accumulator-based machines A single register, called the accumulator, stores the operand before the operation, and stores the result after the operation. Load x # into acc
More informationPROBLEMS. which was discussed in Section 1.6.3.
22 CHAPTER 1 BASIC STRUCTURE OF COMPUTERS (Corrisponde al cap. 1 - Introduzione al calcolatore) PROBLEMS 1.1 List the steps needed to execute the machine instruction LOCA,R0 in terms of transfers between
More informationCS 61C: Great Ideas in Computer Architecture Finite State Machines. Machine Interpreta4on
CS 61C: Great Ideas in Computer Architecture Finite State Machines Instructors: Krste Asanovic & Vladimir Stojanovic hbp://inst.eecs.berkeley.edu/~cs61c/sp15 1 Levels of RepresentaKon/ InterpretaKon High
More informationWeek 1 out-of-class notes, discussions and sample problems
Week 1 out-of-class notes, discussions and sample problems Although we will primarily concentrate on RISC processors as found in some desktop/laptop computers, here we take a look at the varying types
More informationVLIW Processors. VLIW Processors
1 VLIW Processors VLIW ( very long instruction word ) processors instructions are scheduled by the compiler a fixed number of operations are formatted as one big instruction (called a bundle) usually LIW
More informationSwitch Fabric Implementation Using Shared Memory
Order this document by /D Switch Fabric Implementation Using Shared Memory Prepared by: Lakshmi Mandyam and B. Kinney INTRODUCTION Whether it be for the World Wide Web or for an intra office network, today
More informationHardware Assisted Virtualization
Hardware Assisted Virtualization G. Lettieri 21 Oct. 2015 1 Introduction In the hardware-assisted virtualization technique we try to execute the instructions of the target machine directly on the host
More informationCHAPTER 4 MARIE: An Introduction to a Simple Computer
CHAPTER 4 MARIE: An Introduction to a Simple Computer 4.1 Introduction 195 4.2 CPU Basics and Organization 195 4.2.1 The Registers 196 4.2.2 The ALU 197 4.2.3 The Control Unit 197 4.3 The Bus 197 4.4 Clocks
More informationİSTANBUL AYDIN UNIVERSITY
İSTANBUL AYDIN UNIVERSITY FACULTY OF ENGİNEERİNG SOFTWARE ENGINEERING THE PROJECT OF THE INSTRUCTION SET COMPUTER ORGANIZATION GÖZDE ARAS B1205.090015 Instructor: Prof. Dr. HASAN HÜSEYİN BALIK DECEMBER
More informationCentral Processing Unit (CPU)
Central Processing Unit (CPU) CPU is the heart and brain It interprets and executes machine level instructions Controls data transfer from/to Main Memory (MM) and CPU Detects any errors In the following
More informationTIMING DIAGRAM O 8085
5 TIMING DIAGRAM O 8085 5.1 INTRODUCTION Timing diagram is the display of initiation of read/write and transfer of data operations under the control of 3-status signals IO / M, S 1, and S 0. As the heartbeat
More informationwhat operations can it perform? how does it perform them? on what kind of data? where are instructions and data stored?
Inside the CPU how does the CPU work? what operations can it perform? how does it perform them? on what kind of data? where are instructions and data stored? some short, boring programs to illustrate the
More informationThe 104 Duke_ACC Machine
The 104 Duke_ACC Machine The goal of the next two lessons is to design and simulate a simple accumulator-based processor. The specifications for this processor and some of the QuartusII design components
More informationl C-Programming l A real computer language l Data Representation l Everything goes down to bits and bytes l Machine representation Language
198:211 Computer Architecture Topics: Processor Design Where are we now? C-Programming A real computer language Data Representation Everything goes down to bits and bytes Machine representation Language
More informationChapter 2 Topics. 2.1 Classification of Computers & Instructions 2.2 Classes of Instruction Sets 2.3 Informal Description of Simple RISC Computer, SRC
Chapter 2 Topics 2.1 Classification of Computers & Instructions 2.2 Classes of Instruction Sets 2.3 Informal Description of Simple RISC Computer, SRC See Appendix C for Assembly language information. 2.4
More informationQuiz for Chapter 1 Computer Abstractions and Technology 3.10
Date: 3.10 Not all questions are of equal difficulty. Please review the entire quiz first and then budget your time carefully. Name: Course: Solutions in Red 1. [15 points] Consider two different implementations,
More information18-447 Computer Architecture Lecture 3: ISA Tradeoffs. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 1/18/2013
18-447 Computer Architecture Lecture 3: ISA Tradeoffs Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 1/18/2013 Reminder: Homeworks for Next Two Weeks Homework 0 Due next Wednesday (Jan 23), right
More informationMICROPROCESSOR. Exclusive for IACE Students www.iace.co.in iacehyd.blogspot.in Ph: 9700077455/422 Page 1
MICROPROCESSOR A microprocessor incorporates the functions of a computer s central processing unit (CPU) on a single Integrated (IC), or at most a few integrated circuit. It is a multipurpose, programmable
More informationChapter 1 Computer System Overview
Operating Systems: Internals and Design Principles Chapter 1 Computer System Overview Eighth Edition By William Stallings Operating System Exploits the hardware resources of one or more processors Provides
More informationIntel 8086 architecture
Intel 8086 architecture Today we ll take a look at Intel s 8086, which is one of the oldest and yet most prevalent processor architectures around. We ll make many comparisons between the MIPS and 8086
More informationCOS 318: Operating Systems
COS 318: Operating Systems OS Structures and System Calls Andy Bavier Computer Science Department Princeton University http://www.cs.princeton.edu/courses/archive/fall10/cos318/ Outline Protection mechanisms
More informationInterpreters and virtual machines. Interpreters. Interpreters. Why interpreters? Tree-based interpreters. Text-based interpreters
Interpreters and virtual machines Michel Schinz 2007 03 23 Interpreters Interpreters Why interpreters? An interpreter is a program that executes another program, represented as some kind of data-structure.
More information