Computer Architecture Examination Paper with Sample Solutions and Marking Scheme. CS3 June 1995

Similar documents
WAR: Write After Read

Course on Advanced Computer Architectures

Using Graphics and Animation to Visualize Instruction Pipelining and its Hazards

CS352H: Computer Systems Architecture

Execution Cycle. Pipelining. IF and ID Stages. Simple MIPS Instruction Formats

Solutions. Solution The values of the signals are as follows:

Q. Consider a dynamic instruction execution (an execution trace, in other words) that consists of repeats of code in this pattern:

Pipeline Hazards. Structure hazard Data hazard. ComputerArchitecture_PipelineHazard1

Solution: start more than one instruction in the same clock cycle CPI < 1 (or IPC > 1, Instructions per Cycle) Two approaches:

Design of Pipelined MIPS Processor. Sept. 24 & 26, 1997

Instruction Set Architecture. or How to talk to computers if you aren t in Star Trek

Computer organization

Week 1 out-of-class notes, discussions and sample problems

COMPUTER ORGANIZATION ARCHITECTURES FOR EMBEDDED COMPUTING

Pipeline Hazards. Arvind Computer Science and Artificial Intelligence Laboratory M.I.T. Based on the material prepared by Arvind and Krste Asanovic

LSN 2 Computer Processors

Data Dependences. A data dependence occurs whenever one instruction needs a value produced by another.

INSTRUCTION LEVEL PARALLELISM PART VII: REORDER BUFFER


CPU Organization and Assembly Language

Computer Organization and Components

PROBLEMS #20,R0,R1 #$3A,R2,R4

VLIW Processors. VLIW Processors

Quiz for Chapter 1 Computer Abstractions and Technology 3.10

A SystemC Transaction Level Model for the MIPS R3000 Processor

Computer Architecture Lecture 2: Instruction Set Principles (Appendix A) Chih Wei Liu 劉 志 尉 National Chiao Tung University

Lecture: Pipelining Extensions. Topics: control hazards, multi-cycle instructions, pipelining equations

UNIVERSITY OF CALIFORNIA, DAVIS Department of Electrical and Computer Engineering. EEC180B Lab 7: MISP Processor Design Spring 1995

Instruction Set Architecture

Central Processing Unit (CPU)

ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-12: ARM

Computer Organization and Architecture. Characteristics of Memory Systems. Chapter 4 Cache Memory. Location CPU Registers and control unit memory

CS:APP Chapter 4 Computer Architecture. Wrap-Up. William J. Taffe Plymouth State University. using the slides of

COMP 303 MIPS Processor Design Project 4: MIPS Processor Due Date: 11 December :59

EE361: Digital Computer Organization Course Syllabus

Introduction to Cloud Computing

Module: Software Instruction Scheduling Part I

CHAPTER 4 MARIE: An Introduction to a Simple Computer

EE282 Computer Architecture and Organization Midterm Exam February 13, (Total Time = 120 minutes, Total Points = 100)

CS521 CSE IITG 11/23/2012

An Overview of Stack Architecture and the PSC 1000 Microprocessor

Intel 8086 architecture

Introduction to RISC Processor. ni logic Pvt. Ltd., Pune

ARM Microprocessor and ARM-Based Microcontrollers

Memory ICS 233. Computer Architecture and Assembly Language Prof. Muhamed Mudawar

Let s put together a Manual Processor

Performance evaluation

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip.

Unit A451: Computer systems and programming. Section 2: Computing Hardware 1/5: Central Processing Unit

MICROPROCESSOR AND MICROCOMPUTER BASICS

Reduced Instruction Set Computer (RISC)

COMPUTER HARDWARE. Input- Output and Communication Memory Systems

Instruction Set Design

Computer Organization and Architecture

CPU Performance Equation

Advanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2

The AVR Microcontroller and C Compiler Co-Design Dr. Gaute Myklebust ATMEL Corporation ATMEL Development Center, Trondheim, Norway

IA-64 Application Developer s Architecture Guide

PROBLEMS. which was discussed in Section

Chapter 01: Introduction. Lesson 02 Evolution of Computers Part 2 First generation Computers

VHDL DESIGN OF EDUCATIONAL, MODERN AND OPEN- ARCHITECTURE CPU

CHAPTER 7: The CPU and Memory

Microprocessor and Microcontroller Architecture

Chapter 2 Basic Structure of Computers. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan

Unit 4: Performance & Benchmarking. Performance Metrics. This Unit. CIS 501: Computer Architecture. Performance: Latency vs.

Binary search tree with SIMD bandwidth optimization using SSE

picojava TM : A Hardware Implementation of the Java Virtual Machine

The Microarchitecture of Superscalar Processors

CPU Organisation and Operation

Contents. System Development Models and Methods. Design Abstraction and Views. Synthesis. Control/Data-Flow Models. System Synthesis Models

Overview. CISC Developments. RISC Designs. CISC Designs. VAX: Addressing Modes. Digital VAX

Chapter 2 Topics. 2.1 Classification of Computers & Instructions 2.2 Classes of Instruction Sets 2.3 Informal Description of Simple RISC Computer, SRC

(Refer Slide Time: 00:01:16 min)

Giving credit where credit is due

Introducción. Diseño de sistemas digitales.1

what operations can it perform? how does it perform them? on what kind of data? where are instructions and data stored?

Lecture 3: Modern GPUs A Hardware Perspective Mohamed Zahran (aka Z) mzahran@cs.nyu.edu

EEM 486: Computer Architecture. Lecture 4. Performance

EE482: Advanced Computer Organization Lecture #11 Processor Architecture Stanford University Wednesday, 31 May ILP Execution

The Quest for Speed - Memory. Cache Memory. A Solution: Memory Hierarchy. Memory Hierarchy

CSE 141 Introduction to Computer Architecture Summer Session I, Lecture 1 Introduction. Pramod V. Argade June 27, 2005

Architecture of Hitachi SR-8000

This Unit: Putting It All Together. CIS 501 Computer Architecture. Sources. What is Computer Architecture?

Pipelining Review and Its Limitations

A Lab Course on Computer Architecture

Administrative Issues

EC 362 Problem Set #2

Design of Digital Circuits (SS16)

The Orca Chip... Heart of IBM s RISC System/6000 Value Servers

Static Scheduling. option #1: dynamic scheduling (by the hardware) option #2: static scheduling (by the compiler) ECE 252 / CPS 220 Lecture Notes

l C-Programming l A real computer language l Data Representation l Everything goes down to bits and bytes l Machine representation Language

on an system with an infinite number of processors. Calculate the speedup of

High Performance Processor Architecture. André Seznec IRISA/INRIA ALF project-team

Lecture 2: Computer Hardware and Ports.

The ARM Architecture. With a focus on v7a and Cortex-A8

GUJARAT TECHNOLOGICAL UNIVERSITY, AHMEDABAD, GUJARAT. COURSE CURRICULUM COURSE TITLE: COMPUTER ORGANIZATION AND ARCHITECTURE (Code: )

Chapter 1 Computer System Overview

Chapter 5 Instructor's Manual

Software Pipelining. for (i=1, i<100, i++) { x := A[i]; x := x+1; A[i] := x

The Big Picture. Cache Memory CSE Memory Hierarchy (1/3) Disk

Transcription:

Computer Architecture Examination Paper with Sample Solutions and Marking Scheme CS3 June 1995 Nigel Topham (January 7, 1996) Instructions to Candidates Answer any two questions. Each question carries 25 marks. Where marks are shown against a section of a question, they indicate the number of marks available for that section. Candidates have 90 minutes to complete the paper.

Question 1 a. Give a precise definition of hit ratio, as applied to cache memory. [2 marks] b. Explain the difference between local and global hit ratios in the context of a multi-level cache hierarchy. c. Explain why cache hit ratios are not an objective measurement of cache effectiveness. d. A certain memory system has a single level of cache, connected to a main memory. The time to access memory at an address which is present in the cache is t h, whereas the time for the same access when the address is not present in the cache is t m. Derive an equation for E, the effectiveness of the cache, defined as the ratio t h /t a, where t a is the mean access time for all hits and all misses over the measurement interval. e. Briefly discuss the similarities between your equation for part (d) and Amdahl s Law. [4 marks] [6 marks] Question 2 a. The CPU time of a program is defined as the product of the CPI (cycles per instruction) for the processor on which it runs, the total number of instructions executed (I), and processor clock period (φ). Desribe the major factors which influence CPI, I and φ. b. For a new architecture to be worth developing it must have a commercial lifespan of at least 10 years. What long-term factors must designers of a new architecture take into consideration during the design process? c. Microprocessor core speeds increase at a rate of 40 60% per annum, compared with speed increases of 30% every ten years for DRAM devices. In the light of this increasing discrepancy between CPU and main memory speeds, what can architects, system designers and memory chip designers do to reduce the harmful effects of high memory latency in future computer systems? [9 marks] 1

Question 3 A certain risc microprocessor has a five-stage pipeline operating according to the register-transfer specification given in the table supplied overleaf. The pipeline hardware detects all data hazard conditions and stalls the pipeline when necessary for correct program behavour. You are asked to consider how such a pipeline would execute the following loop: loop: lw $t0, ($s1) add $s2, $s2, $t0 sub $s3, $s3, $s2 add $s4, $s4, $s2 sw $s4, ($s8) addi $s1, $s1, 4 addi $s8, $s8, 4 slt $t1, $t5, $s2 bne $t1, $0, loop a. (i) Draw a space-time graph showing the progression of the instructions for one iteration of the loop through this pipeline. [Hint: your diagram should show time as processor cycles progressing vertically downwards, and space as pipeline stages progressing left-toright across the page.] (ii) How many cycles elapse between the initiation of successive iterations? b. (i) Describe briefly the meaning of the term register bypass in the context of pipeline design. (ii) What would be the revised initiation interval for successive loop iterations if the pipeline had unlimited bypassing capability? (explain your reasoning) c. (i) Show how the instructions of the above loop can be legally re-ordered to fill the load delay slot and the branch delay slot of the load and the branch instructions respectively. (ii) What is the resulting initiation interval for successive loop iterations? [7 marks] [2 marks] [3 marks] [3 marks] 2

Pipeline specification for Question 3 Stage ALU instructions Load or Store instructions IF ID EX MEM A Rs1; B Rs2; IR1 IR; ALUout A op 2 [B literal]; ALUout1 ALUout A Rs1; B Rs2; IR1 IR; DMAR A+offset; SMDR B; LMDR Mem[DMAR]; or Mem[DMAR] SMDR; WB Rd ALUout1; Rd LMDR; Branch instructions BTA PC+offset; if (Rs1 op 1 0) PC BTA; Notation PC program counter IR instruction register IR1 copy of IR at ID stage Rs1 source register 1 Rs2 source register 2 A,B copies of Rs1 and Rs2 Rd destination register BTA branch target address ALUout ALU result register ALUout1 copy of ALUout at MEM stage DMAR data memory address register SMDR store memory data register LMDR load memory data register offset 16-bit sign-extended offset from IR 16...31 literal 16-bit sign-extended offset from IR 16...31 Mem[a] contents of memory at address a 1 op {=, } 2 op can be any integer or logical operation 3

Marking Scheme and Outline Solutions 4