EC 362 Problem Set #2

Save this PDF as:

Size: px
Start display at page:

Transcription

1 EC 362 Problem Set #2 1) Using Single Precision IEEE 754, what is FF ? 2) Suppose the fraction enhanced of a processor is 40% and the speedup of the enhancement was tenfold. What is the overall speedup? What if the fraction enhanced increased to 90%? 3) MIPS has one of the simpler instruction sets in existence. However, it is possible to imagine even simpler instruction sets. Consider a hypothetical machine call SIC, for Single Instruction Computer. As its name implies, SIC has only one instruction: subtract and branch if negative, or sbn for short. The sbn instruction has three operands, each consisting of the address of a word in memory: sbn a, b, c # Mem[a] = Mem[a] - Mem[b]; if (Mem[a]<0) go to c The instruction will subtract the number in memory location b from the number in location memory location a and place the result back in memory location a, overwriting the previous value. If the result is greater than or equal to 0, the computer will take its next instruction from the memory location just after the current instruction. If the result is less than 0, the next instruction is taken from memory location c. SIC has no registers and no instructions other than sbn. Although it has only one instruction, SIC can imitate many of the operations of more complex instruction sets by using clever sequences of sbn instructions. For example, here is a program to copy a number from location a to location b: start: sbn temp, temp,.+1 # Sets temp to zero sbn temp, a,.+1 # Sets temp to -a sbn b, b,.+1 # Sets b to zero sbn b, temp,.+1 # Sets b to -temp, which is a In the program above, the notation.+1 means the address after this one, so that each instruction in this program goes on to the next in sequence whether or not the result is negative. We assume temp to be the address of a spare memory word that can be used for temporary results. Part a) Write a SIC program to add a and b, leaving the result in a and leaving b unmodified. Your SIC program should only require 3 instructions. Part b) Write a SIC program to multiply a by b, putting the result in c. Assume that memory location "one" contains the number 1. Assume that a and b are greater than 0 and that it s OK to modify a or b. (Hint: What does this program compute?) c = 0; while (b > 0) {b = b 1; c = c + a;} Your SIC program should only require 5 instructions. 4) This problem compares the memory efficiency of four different styles of instruction sets for two code sequences. The architecture styles are the following: Accumulator: All operations occur in a single register called the accumulator (acc). Memory-memory: All three operands of each instruction are in memory. Stack: All operations occur on top of the stack. Only push and pop access memory, and all other instructions remove their operands from the stack and replace them with the result. The implementation uses a stack for the top two entries; accesses that use other stack positions are memory references.

3 Accumulator Memory-memory Stack Load-store For the following C code, write an equivalent assembly language program in each architectural style (assume all variables are initially in memory): a = b + c; b = a + c; d = a b; Assume that the stack instruction set has instructions such as push, pop, add, sub, negate (computes the unary "minus" of the element on the top of the stack), and duplicate (makes a copy of the top of the stack), and assume that the accumulator instruction set has instructions such as load, store, add, sub, and negate. For each code sequence, calculate the instruction bytes fetched and the memory data bytes transferred (read or written). Be efficient in writing your code, i.e. write the code as efficiently as you can so as to minimize the memory traffic. Which architecture is most efficient as measured by code size? Which architecture is most efficient as measured by total memory bandwidth required (code + data)? If the answers are not the same, why are they different? 5) Consider two different implementations, M1 and M2, of the same instruction set. There are three classes of instructions (A, B, and C) in the instruction set. M1 has a clock rate of 4 GHz, and M2 has a clock rate of 2 GHz. The average number of cycles for each instruction class on M1 and M2 is given in the following table. Class CPI on M1 CPI on M2 C1 usage C2 usage Third-Party usage A % 30% 50% B % 20% 30% C % 50% 20% The table also contains a summary of how three different compilers use the instruction set. C1 is a compiler produced by the makers of M1, C2 is a compiler produced by the makers of M2, and the other compiler is a third-party product. Assume that each compiler uses the same number of instructions for a given program but that the instruction mix is as described in the table. Using C1 on both M1 and M2, how much faster can the makers of M1 claim that M1 is compared with M2? Using C2 on both M2 and M1, how much faster can the makers of M2 claim that M2 is compared with M1? If you purchase M1, which compiler would you use? Which machine would you purchase if we assume that all other criteria are identical, including costs?

4 Solutions to Problem Set # *2^( ) = *2^ With 40%, Speedup = 1/( /10) = 1.56 With 90%, Speedup = 1/( /10) = a) sbn temp, temp,.+1 # temp = 0 sbn temp, b,.+1 # temp = -b sbn a, temp,.+1 # a = a temp = a + b 3b) sbn temp, temp,.+1 # temp = 0 sbn b, one,.+2 # if b < 1 jump +2. either way b=b-1 sbn temp, a,.-1 # temp = temp a, go back 1 instruction sbn c, c,.+1 # c = 0 sbn c, temp,.+1 # c = c temp = b*a 4) Total memory traffic = 50 bytes. Total memory traffic = 57 bytes. Acc Code Data Load b 3 4 Add c 3 4 Store a 3 4 Add c 3 4 Store b 3 4 Neg 1 0 Add a 3 4 Store d 3 4 Total Mem/mem Code Data Add a,b,c 7 12 Add b,a,c 7 12 Sub d,a,b 7 12 Total For stack operations, I ll assume that the sub operation computes A-B where B is on top of A on the stack. Stack Code Data Push b 3 4 Push c 3 4 Add 1 0 Dup 1 0 Pop a 3 4 Push c 3 4

5 Add 1 0 Dup 1 0 Pop b 3 4 Neg 1 0 Push a 3 4 Add 1 0 Pop d 3 4 Total Total memory traffic = 55 bytes. Load/store Code Data Load regb,b 4 4 Load regc,c 4 4 Add rega,regb,regc 3 0 Store rega,a 4 4 Add regb,rega,regc 3 0 Store regb,b 4 4 Sub regd,rega,regb 3 0 Store regd,d 4 4 Total Total memory traffic = 49 bytes. The load/store machine has the least memory traffic since everything can be held in registers, once fetched from memory. Of course, it has the largest code size since it cannot combine memory ops with arithmetic. On the other extreme is the mem/mem architecture which has the best code size, but terrible memory traffic since 3 operands are fetched for every statement! 5. Cpu = IC * CPI / Rate Rate M1 = 4 GHz = 4e9 Hz Rate M2 = 2 GHz = 2e9 Hz Class CPI on M1 CPI on M2 C1 usage C2 usage Third-Party usage A % 30% 50% B % 20% 30% C % 50% 20% We are told that the instruction counts (IC) are the same for each compiler on each machine. CPI C1,M1 = (4 * * *.2) = 5.8 CPI C1,M2 = (2 * * *.2) = 3.2 CPI C2,M1 = (4 * * *.5) = 6.4 CPI C2,M2 = (2 * * *.5) = 2.9 CPI C3,M1 = (4 * * *.2) = 5.4 CPI C3,M2 = (2 * * *.2) = 2.8 Perf C1,M1 / Perf C1,M2 = Cpu C1,M2 / Cpu C1,M1 = (IC * 3.2/2e9) / (IC * 5.8/4e9) = 16ns / 14.5ns= 1.10, so that M1 is 1.1 times faster than M2 using compiler C1. Perf C2,M2 / Perf C2,M1 = Cpu C2,M1 / Cpu C2,M2 = (IC * 6.4/4e9) / (IC * 2.9/2e9) = 16ns / 14.5ns = 1.10, so that M2 is 1.1 times faster than M1 using compiler C2.

6 Perf C3,M1 / Perf C3,M2 = Cpu C3,M2 / Cpu C3,M1 = (IC * 2.8/2e9) / (IC * 5.4/4e9) = 14ns / 13.5ns = 1.03, so that M1 is 1.03 times faster than M2 using C3. If you own M1, you should use C3. If you own M2, you should use C3 also. This is poor news for the native compiler writers of these two machines! You should always use C3 as your compiler, and if you buy a machine, buy M1.

Computer Architecture

PhD Qualifiers Examination Computer Architecture Spring 2009 NOTE: 1. This is a CLOSED book, CLOSED notes exam. 2. Please show all your work clearly in legible handwriting. 3. State all your assumptions.

Instruction Set Architecture. or How to talk to computers if you aren t in Star Trek

Instruction Set Architecture or How to talk to computers if you aren t in Star Trek The Instruction Set Architecture Application Compiler Instr. Set Proc. Operating System I/O system Instruction Set Architecture

Advanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2

Lecture Handout Computer Architecture Lecture No. 2 Reading Material Vincent P. Heuring&Harry F. Jordan Chapter 2,Chapter3 Computer Systems Design and Architecture 2.1, 2.2, 3.2 Summary 1) A taxonomy of

Central Processing Unit (CPU)

Central Processing Unit (CPU) CPU is the heart and brain It interprets and executes machine level instructions Controls data transfer from/to Main Memory (MM) and CPU Detects any errors In the following

Chapter 5 Instructor's Manual

The Essentials of Computer Organization and Architecture Linda Null and Julia Lobur Jones and Bartlett Publishers, 2003 Chapter 5 Instructor's Manual Chapter Objectives Chapter 5, A Closer Look at Instruction

A single register, called the accumulator, stores the. operand before the operation, and stores the result. Add y # add y from memory to the acc

Other architectures Example. Accumulator-based machines A single register, called the accumulator, stores the operand before the operation, and stores the result after the operation. Load x # into acc

Instruction Set Architecture (ISA)

Instruction Set Architecture (ISA) * Instruction set architecture of a machine fills the semantic gap between the user and the machine. * ISA serves as the starting point for the design of a new machine

CSE Lecture 04 In Class Example Handout

CSE 30321 Lecture 04 In Class Example Handout Part A: An Initial CPU Time Example Question 1: Preface: We can modify the datapath from Lecture 02-03 of the 3-instruction processor to add an instruction

1. Program A runs in 10 seconds on a machine with a 100 MHz clock. How many clock cycles does program A require?

(5 pts) Exercise 1-51 1. Program A runs in 10 seconds on a machine with a 100 MHz clock. How many clock cycles does program A require? (5 pts) Exercise 1-52 2. ) Our favorite program runs in 10 seconds

LSN 2 Computer Processors

LSN 2 Computer Processors Department of Engineering Technology LSN 2 Computer Processors Microprocessors Design Instruction set Processor organization Processor performance Bandwidth Clock speed LSN 2

CDA 3103 Computer Organization

CDA 3103 Computer Organization Exam 2 (Oct. 27th, 2014) Name: USF ID: Exam Rules Problem Points Score 1 10 2 10 3 15 4 7 5 8 6 10 Total 60 Use the back of the exam paper as necessary. But indicate clearly

Instruction Set Design

Instruction Set Design Instruction Set Architecture: to what purpose? ISA provides the level of abstraction between the software and the hardware One of the most important abstraction in CS It s narrow,

Computer Architecture Lecture 2: Instruction Set Principles (Appendix A) Chih Wei Liu 劉 志 尉 National Chiao Tung University cwliu@twins.ee.nctu.edu.

Computer Architecture Lecture 2: Instruction Set Principles (Appendix A) Chih Wei Liu 劉 志 尉 National Chiao Tung University cwliu@twins.ee.nctu.edu.tw Review Computers in mid 50 s Hardware was expensive

a. Chapter 1: Exercises: =>

Selected Solutions to Problem-Set #1 COE608: Computer Organization and Architecture Introduction, Instruction Set Architecture and Computer Arithmetic Chapters 1, 2 and 3 a. Chapter 1: Exercises: 1.1.1

İSTANBUL AYDIN UNIVERSITY

İSTANBUL AYDIN UNIVERSITY FACULTY OF ENGİNEERİNG SOFTWARE ENGINEERING THE PROJECT OF THE INSTRUCTION SET COMPUTER ORGANIZATION GÖZDE ARAS B1205.090015 Instructor: Prof. Dr. HASAN HÜSEYİN BALIK DECEMBER

MICROPROCESSOR AND MICROCOMPUTER BASICS

Introduction MICROPROCESSOR AND MICROCOMPUTER BASICS At present there are many types and sizes of computers available. These computers are designed and constructed based on digital and Integrated Circuit

Lecture Topics. Pipelining in Processors. Pipelining and ISA ECE 486/586. Computer Architecture. Lecture # 10. Reference:

Lecture Topics ECE 486/586 Computer Architecture Lecture # 10 Spring 2015 Pipelining Hazards and Stalls Reference: Effect of Stalls on Pipeline Performance Structural hazards Data Hazards Appendix C: Sections

Overview. CISC Developments. RISC Designs. CISC Designs. VAX: Addressing Modes. Digital VAX

Overview CISC Developments Over Twenty Years Classic CISC design: Digital VAX VAXÕs RISC successor: PRISM/Alpha IntelÕs ubiquitous 80x86 architecture Ð 8086 through the Pentium Pro (P6) RJS 2/3/97 Philosophy

Quiz for Chapter 1 Computer Abstractions and Technology 3.10

Date: 3.10 Not all questions are of equal difficulty. Please review the entire quiz first and then budget your time carefully. Name: Course: Solutions in Red 1. [15 points] Consider two different implementations,

Introduction to Design of a Tiny Computer

Introduction to Design of a Tiny Computer (Background of LAB#4) Objective In this lab, we will build a tiny computer in Verilog. The execution results will be displayed in the HEX[3:0] of your board. Unlike

CPU- Internal Structure

ESD-1 Elettronica dei Sistemi Digitali 1 CPU- Internal Structure Lesson 12 CPU Structure&Function Instruction Sets Addressing Modes Read Stallings s chapters: 11, 9, 10 esd-1-9:10:11-2002 1 esd-1-9:10:11-2002

1 Classical Universal Computer 3

Chapter 6: Machine Language and Assembler Christian Jacob 1 Classical Universal Computer 3 1.1 Von Neumann Architecture 3 1.2 CPU and RAM 5 1.3 Arithmetic Logical Unit (ALU) 6 1.4 Arithmetic Logical Unit

Assignment 2 Solutions Instruction Set Architecture, Performance, Spim, and Other ISAs

Assignment 2 Solutions Instruction Set Architecture, Performance, Spim, and Other ISAs Alice Liang Apr 18, 2013 Unless otherwise noted, the following problems are from the Patterson & Hennessy textbook

Stack machines The MIPS assembly language A simple source language Stack-machine implementation of the simple language Readings: 9.1-9.

Code Generation I Stack machines The MIPS assembly language A simple source language Stack-machine implementation of the simple language Readings: 9.1-9.7 Stack Machines A simple evaluation model No variables

EE282 Computer Architecture and Organization Midterm Exam February 13, 2001. (Total Time = 120 minutes, Total Points = 100)

EE282 Computer Architecture and Organization Midterm Exam February 13, 2001 (Total Time = 120 minutes, Total Points = 100) Name: (please print) Wolfe - Solution In recognition of and in the spirit of the

Chapter 2 Topics. 2.1 Classification of Computers & Instructions 2.2 Classes of Instruction Sets 2.3 Informal Description of Simple RISC Computer, SRC

Chapter 2 Topics 2.1 Classification of Computers & Instructions 2.2 Classes of Instruction Sets 2.3 Informal Description of Simple RISC Computer, SRC See Appendix C for Assembly language information. 2.4

response (or execution) time -- the time between the start and the finish of a task throughput -- total amount of work done in a given time

Chapter 4: Assessing and Understanding Performance 1. Define response (execution) time. response (or execution) time -- the time between the start and the finish of a task 2. Define throughput. throughput

CSE320 Final Exam Practice Questions

CSE320 Final Exam Practice Questions Single Cycle Datapath/ Multi Cycle Datapath Adding instructions Modify the datapath and control signals to perform the new instructions in the corresponding datapath.

Instruction Set Architectures

Instruction Set Architectures Early trend was to add more and more instructions to new CPUs to do elaborate operations (CISC) VAX architecture had an instruction to multiply polynomials! RISC philosophy

Lecture Handouts Computer Architecture Lecture No. 12 Reading Material Vincent P. Heuring&Harry F. Jordan Chapter 4 Computer Systems Design and Architecture 4.1, 4.2, 4.3 Summary 7) The design process

Unit 5 Central Processing Unit (CPU)

Unit 5 Central Processing Unit (CPU) Introduction Part of the computer that performs the bulk of data-processing operations is called the central processing unit (CPU). It consists of 3 major parts: Register

Chapter 5. Processing Unit Design

Chapter 5 Processing Unit Design 5.1 CPU Basics A typical CPU has three major components: Register Set, Arithmetic Logic Unit, and Control Unit (CU). The register set is usually a combination of general-purpose

Advanced Pipelining Techniques 1. Dynamic Scheduling 2. Loop Unrolling 3. Software Pipelining 4. Dynamic Branch Prediction Units 5. Register Renaming 6. Superscalar Processors 7. VLIW (Very Large Instruction

Chapter 7D The Java Virtual Machine

This sub chapter discusses another architecture, that of the JVM (Java Virtual Machine). In general, a VM (Virtual Machine) is a hypothetical machine (implemented in either hardware or software) that directly

Recap & Perspective. Sequential Logic Circuits & Architecture Alternatives. Register Operation. Building a Register. Timing Diagrams.

Sequential Logic Circuits & Architecture Alternatives Recap & Perspective COMP 251 Computer Organization and Architecture Fall 2009 So Far: ALU Implementation Next: Register Implementation Register Operation

Q. Consider a dynamic instruction execution (an execution trace, in other words) that consists of repeats of code in this pattern:

Pipelining HW Q. Can a MIPS SW instruction executing in a simple 5-stage pipelined implementation have a data dependency hazard of any type resulting in a nop bubble? If so, show an example; if not, prove

MIPS: The Virtual Machine

2 Objectives After this lab you will know: what synthetic instructions are how synthetic instructions expand to sequences of native instructions how MIPS addresses the memory Introduction MIPS is a typical

Basic Computer Organization

SE 292 (3:0) High Performance Computing L2: Basic Computer Organization R. Govindarajan govind@serc Basic Computer Organization Main parts of a computer system: Processor: Executes programs Main memory:

Assembly Language: Overview! Jennifer Rexford!

Assembly Language: Overview! Jennifer Rexford! 1 Goals of this Lecture! Help you learn:! The basics of computer architecture! The relationship between C and assembly language! IA-32 assembly language,

CS 350 COMPUTER ORGANIZATION AND ASSEMBLY LANGUAGE PROGRAMMING Week 5

CS 350 COMPUTER ORGANIZATION AND ASSEMBLY LANGUAGE PROGRAMMING Week 5 Reading: Objectives: 1. D. Patterson and J. Hennessy, Chapter- 4 To understand load store architecture. Learn how to implement Load

A Closer Look at Instruction Set Architectures

00068_CH05_Null.qxd 10/13/10 1:30 PM Page 269 Every program has at least one bug and can be shortened by at least one instruction from which, by induction, one can deduce that every program can be reduced

on an system with an infinite number of processors. Calculate the speedup of

1. Amdahl s law Three enhancements with the following speedups are proposed for a new architecture: Speedup1 = 30 Speedup2 = 20 Speedup3 = 10 Only one enhancement is usable at a time. a) If enhancements

Addressing The problem. When & Where do we encounter Data? The concept of addressing data' in computations. The implications for our machine design(s)

Addressing The problem Objectives:- When & Where do we encounter Data? The concept of addressing data' in computations The implications for our machine design(s) Introducing the stack-machine concept Slide

CPU Performance. Lecture 8 CAP 3103 06-11-2014

CPU Performance Lecture 8 CAP 3103 06-11-2014 Defining Performance Which airplane has the best performance? 1.6 Performance Boeing 777 Boeing 777 Boeing 747 BAC/Sud Concorde Douglas DC-8-50 Boeing 747

The Von Neumann Model. University of Texas at Austin CS310H - Computer Organization Spring 2010 Don Fussell

The Von Neumann Model University of Texas at Austin CS310H - Computer Organization Spring 2010 Don Fussell The Stored Program Computer 1943: ENIAC Presper Eckert and John Mauchly -- first general electronic

Computer organization

Computer organization Computer design an application of digital logic design procedures Computer = processing unit + memory system Processing unit = control + datapath Control = finite state machine inputs

Appendix A Solutions (by Rui Ma and Gregory D. Peterson)

2 Solutions to Case Studies and Exercises Appendix A Solutions (by Rui Ma and Gregory D. Peterson) A.1 The first challenge of this exercise is to obtain the instruction mix. The instruction frequencies

CSCI-GA Multicore Processors: Architecture & Programming Lecture 9: Performance Evaluation

CSCI-GA.3033-012 Multicore Processors: Architecture & Programming Lecture 9: Performance Evaluation Mohamed Zahran (aka Z) mzahran@cs.nyu.edu http://www.mzahran.com What Are We trying to do? Measure, Report,

Processing Unit Design

&CHAPTER 5 Processing Unit Design In previous chapters, we studied the history of computer systems and the fundamental issues related to memory locations, addressing modes, assembly language, and computer

Computer Organization and Architecture

Computer Organization and Architecture Chapter 11 Instruction Sets: Addressing Modes and Formats Instruction Set Design One goal of instruction set design is to minimize instruction length Another goal

Engineering 9859 CoE Fundamentals Computer Architecture

Engineering 9859 CoE Fundamentals Computer Architecture Instruction Set Principles Dennis Peters 1 Fall 2007 1 Based on notes from Dr. R. Venkatesan RISC vs. CISC Complex Instruction Set Computers (CISC)

a storage location directly on the CPU, used for temporary storage of small amounts of data during processing.

CS143 Handout 18 Summer 2008 30 July, 2008 Processor Architectures Handout written by Maggie Johnson and revised by Julie Zelenski. Architecture Vocabulary Let s review a few relevant hardware definitions:

Processing Unit. Backing Store

SYSTEM UNIT Basic Computer Structure Input Unit Central Processing Unit Main Memory Output Unit Backing Store The Central Processing Unit (CPU) is the unit in the computer which operates the whole computer

CS 265. Computer Architecture. Wei Lu, Ph.D., P.Eng.

CS 265 Computer Architecture Wei Lu, Ph.D., P.Eng. Part 5: Processors Our goal: understand basics of processors and CPU understand the architecture of MARIE, a model computer a close look at the instruction

Lecture 3 A Very Simple Processor. Memory MPU I/O. Memory. Memory, registers and ALU. Memory

Lecture 3 A Very Simple Processor Based on von Neumann model Instruction for the microprocessor (stored program) and the data for computation are both stored in memory Microprocessing Unit (MPU) contains:

Lecture 3 Pipeline Hazards

CS 203A Advanced Computer Architecture Lecture 3 Pipeline Hazards 1 Pipeline Hazards Hazards are caused by conflicts between instructions. Will lead to incorrect behavior if not fixed. Three types: Structural:

OAMulator. Online One Address Machine emulator and OAMPL compiler. http://myspiders.biz.uiowa.edu/~fil/oam/

OAMulator Online One Address Machine emulator and OAMPL compiler http://myspiders.biz.uiowa.edu/~fil/oam/ OAMulator educational goals OAM emulator concepts Von Neumann architecture Registers, ALU, controller

Model Answers HW1 Chapter #2 2.2. a. On the IAS, what would the machine code instruction look like to load the contents of memory address 2? b. How many trips to memory does the CPU need to make to complete

Instructions: Assembly Language

Chapter 2 Instructions: Assembly Language Reading: The corresponding chapter in the 2nd edition is Chapter 3, in the 3rd edition it is Chapter 2 and Appendix A and in the 4th edition it is Chapter 2 and

Intel 8086 architecture

Intel 8086 architecture Today we ll take a look at Intel s 8086, which is one of the oldest and yet most prevalent processor architectures around. We ll make many comparisons between the MIPS and 8086

Chapter 2 : Central Processing Unit

Chapter 2 Central Processing Unit The part of the computer that performs the bulk of data processing operations is called the Central Processing Unit (CPU) and is the central component of a digital computer.

Instruction Set Architecture

Instruction Set Architecture Consider x := y+z. (x, y, z are memory variables) 1-address instructions 2-address instructions LOAD y (r :=y) ADD y,z (y := y+z) ADD z (r:=r+z) MOVE x,y (x := y) STORE x (x:=r)

Department of Computer and Mathematical Sciences. Assigment 4: Introduction to MARIE

Department of Computer and Mathematical Sciences CS 3401 Assembly Language 4 Assigment 4: Introduction to MARIE Objectives: The main objective of this lab is to get you familiarized with MARIE a simple

Computing Systems. The Processor: Datapath and Control. The performance of a machine depends on 3 key factors:

Computing Systems The Processor: Datapath and Control claudio.talarico@mail.ewu.edu 1 Introduction The performance of a machine depends on 3 key factors: compiler and ISA instruction count clock cycle

Demo program from Lecture #1

Registers start MOV total, a ; Make the first number the subtotal ADD total, total, b ; Add the second number to the subtotal ADD total, total, c ; Add the third number to the subtotal ADD total, total,

What is Pipelining? Time per instruction on unpipelined machine Number of pipe stages

What is Pipelining? Is a key implementation techniques used to make fast CPUs Is an implementation techniques whereby multiple instructions are overlapped in execution It takes advantage of parallelism

Pope Paul VI College Information and Communication Technology

Pope Paul VI College Information and Communication Technology CPU - Central processing unit (CPU) is the brain of a computer which and instructions of a computer program. - It executes computer instructions

Chapter 3. lw \$s1,100(\$s2) \$s1 = Memory[\$s2+100] sw \$s1,100(\$s2) Memory[\$s2+100] = \$s1

Chapter 3 1 MIPS Instructions Instruction Meaning add \$s1,\$s2,\$s3 \$s1 = \$s2 + \$s3 sub \$s1,\$s2,\$s3 \$s1 = \$s2 \$s3 addi \$s1,\$s2,4 \$s1 = \$s2 + 4 ori \$s1,\$s2,4 \$s2 = \$s2 4 lw \$s1,100(\$s2) \$s1 = Memory[\$s2+100]

In the Beginning... 1964 -- The first ISA appears on the IBM System 360 In the good old days

RISC vs CISC 66 In the Beginning... 1964 -- The first ISA appears on the IBM System 360 In the good old days Initially, the focus was on usability by humans. Lots of user-friendly instructions (remember

CSE 141 Midterm Exam

CSE 141 Midterm Exam 2011 Winter Professor Steven Swanson 1. Please write your name at the top of each page 2. This is a close book, closed notes exam. No outside material may be used. 3. You may use a

CPU Organization and Assembly Language

COS 140 Foundations of Computer Science School of Computing and Information Science University of Maine October 2, 2015 Outline 1 2 3 4 5 6 7 8 Homework and announcements Reading: Chapter 12 Homework:

Compilers I - Chapter 4: Generating Better Code

Compilers I - Chapter 4: Generating Better Code Lecturers: Paul Kelly (phjk@doc.ic.ac.uk) Office: room 304, William Penney Building Naranker Dulay (nd@doc.ic.ac.uk) Materials: Office: room 562 Textbook

Divide: Paper & Pencil. Computer Architecture ALU Design : Division and Floating Point. Divide algorithm. DIVIDE HARDWARE Version 1

Divide: Paper & Pencil Computer Architecture ALU Design : Division and Floating Point 1001 Quotient Divisor 1000 1001010 Dividend 1000 10 101 1010 1000 10 (or Modulo result) See how big a number can be

(Refer Slide Time: 00:01:16 min)

Digital Computer Organization Prof. P. K. Biswas Department of Electronic & Electrical Communication Engineering Indian Institute of Technology, Kharagpur Lecture No. # 04 CPU Design: Tirning & Control

Quiz for Chapter 2 Instructions: Language of the Computer3.10

Date: 3.10 Not all questions are of equal difficulty. Please review the entire quiz first and then budget your time carefully. Name: Course: Solutions in Red 1. [5 points] Prior to the early 1980s, machines

Lecture 3: MIPS Instruction Set

Lecture 3: MIPS Instruction Set Today s topic: More MIPS instructions Procedure call/return Reminder: Assignment 1 is on the class web-page (due 9/7) 1 Memory Operands Values must be fetched from memory

Why study the Alpha (assembly)?

Why study the Alpha (assembly)? The Alpha architecture is the first 64-bit load/store RISC (as opposed to CISC) architecture designed to enhance computer performance by improving clock speeding, multiple

Notes on Assembly Language

Notes on Assembly Language Brief introduction to assembly programming The main components of a computer that take part in the execution of a program written in assembly code are the following: A set of

PART OF THE PICTURE: Computer Architecture

PART OF THE PICTURE: Computer Architecture 1 PART OF THE PICTURE: Computer Architecture BY WILLIAM STALLINGS At a top level, a computer consists of processor, memory, and I/O components, with one or more

CS107 Handout 12 Spring 2008 April 18, 2008 Computer Architecture: Take I

CS107 Handout 12 Spring 2008 April 18, 2008 Computer Architecture: Take I Computer architecture Handout written by Julie Zelenski and Nick Parlante A simplified picture with the major features of a computer

Chapter 4 Integer JAVA Virtual Machine

Processor Block Diagram Chapter 4 Integer JAVA Virtual Machine 1 of 14 ECE 357 Register Definitions PC MBR MAR MDR SP LV CPP TOS OPC H Program Counter: Access Data in Method Area Memory Branch Register:

Computer Organization. and Instruction Execution. August 22

Computer Organization and Instruction Execution August 22 CSC201 Section 002 Fall, 2000 The Main Parts of a Computer CSC201 Section Copyright 2000, Douglas Reeves 2 I/O and Storage Devices (lots of devices,

The ARM11 Architecture

The ARM11 Architecture Ian Davey Payton Oliveri Spring 2009 CS433 Why ARM Matters Over 90% of the embedded market is based on the ARM architecture ARM Ltd. makes over \$100 million USD annually in royalties

Performance Basics; Computer Architectures

8 Performance Basics; Computer Architectures 8.1 Speed and limiting factors of computations Basic floating-point operations, such as addition and multiplication, are carried out directly on the central

ARM & IA-32. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

ARM & IA-32 Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu ARM (1) ARM & MIPS similarities ARM: the most popular embedded core Similar basic set

Lecture 1: Introduction to Microcomputers

Lecture 1: Introduction to Microcomputers Today s Topics What is a microcomputers? Why do we study microcomputers? Two basic types of microcomputer architectures Internal components of a microcomputers

The Java Virtual Machine

The Java Virtual Machine The Java programming language is a high-level language developed by Sun Microsystems, now a wholly owned subsidiary of Oracle, Inc. The Java language is neither interpreted nor

Data Types and Addressing Modes 29

29 This section describes data types and addressing modes available to programmers of the Intel Architecture processors. 29.1 Fundamental Data Types The fundamental data types of the Intel Architecture

HW 5 Solutions. Manoj Mardithaya

HW 5 Solutions Manoj Mardithaya Question 1: Processor Performance The critical path latencies for the 7 major blocks in a simple processor are given below. IMem Add Mux ALU Regs DMem Control a 400ps 100ps

Chapter 4 Accessing and Understanding Performance

Chapter 4 Accessing and Understanding Performance 1 Performance Why do we care about performance evaluation? Purchasing perspective given a collection of machines, which has the best performance? least

Ch 5: Designing a Single Cycle Datapath

Ch 5: Designing a Single Cycle path Computer Systems Architecture CS 365 The Big Picture: Where are We Now? The Five Classic Components of a Computer Processor Control Memory path Input Output Today s

High level language programs versus Machine code

High level language programs versus Machine code You all have experience programming in a high level language. You know that when you write a program in a language such as C or Java, the program is nothing

The conceit is that inside the CPU is a tiny man that runs around organising data and performing the calculations. Inside the box of the CPU are:!

Little Man Computer The Little Man Computer is a conceptual model of a simple CPU, introduced by Dr. Stuart Madnick of M.I.T. in 1965. Although it seems simplistic, in fact the model captures many important

Chapter 4 Lecture 3 The Microarchitecture Level Integer JAVA Virtual Machine

Chapter 4 Lecture 3 The Microarchitecture Level Integer JAVA Virtual Machine This is a limited version of a hardware implementation to execute the JAVA programming language. 1 of 30 Review The Data Path

CS4617 Computer Architecture

1/39 CS4617 Computer Architecture Lecture 20: Pipelining Reference: Appendix C, Hennessy & Patterson Reference: Hamacher et al. Dr J Vaughan November 2013 2/39 5-stage pipeline Clock cycle Instr No 1 2

2 Programs: Instructions in the Computer

2 2 Programs: Instructions in the Computer Figure 2. illustrates the first few processing steps taken as a simple CPU executes a program. The CPU for this example is assumed to have a program counter (PC),

2) Write in detail the issues in the design of code generator.

COMPUTER SCIENCE AND ENGINEERING VI SEM CSE Principles of Compiler Design Unit-IV Question and answers UNIT IV CODE GENERATION 9 Issues in the design of code generator The target machine Runtime Storage