Chapter 5: A Closer Look at Instruction Set Architectures
- Caitlin Jordan
- 7 years ago
1 Chapter 5: A Closer Look at Instruction Set Architectures
Here we look at a few design decisions for the instruction set of a computer. Some basic issues for consideration today:
- Basic instruction set design
- Fixed-length vs. variable-length instructions
- Word addressing in a byte-addressable architecture
- Big-endian vs. little-endian designs
- Stack machines, accumulator machines, and multi-register machines
- Modern design principles
NOTE: We shall not discuss expanding opcodes, which are covered in the textbook. The discussion is overly complex.
2 Design Considerations: Instruction Size
The MARIE has a fixed instruction size: 16 bits, divided as follows:
  Bits     Content
  15-12    Opcode
  11-0     Operand address (if used)
Instructions of a fixed size are easier to decode. Moreover, they facilitate prefetching of instructions, in which one instruction is fetched while the previous one is executing.
Instructions of variable size make more efficient use of memory. When memory was costly, this was an important design consideration. For example, the VAX-11/780 had a number of addition instructions:
- Add register to register
- Add memory to register
- Add memory to memory
Suppose an 8-bit opcode, 16 registers (so that 4 bits identify each register), and 32-bit addressing.
Instruction lengths:
  Register to register = 8 + 4 + 4 = 16 bits (two bytes)
  Memory to register = 8 + 32 + 4 = 44 bits (padded to six bytes)
  Memory to memory = 8 + 32 + 32 = 72 bits (nine bytes)
With fixed-length instructions, each instruction would have to be at least 9 bytes long, the size of the longest format.
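The instruction-length arithmetic on this slide can be checked with a short sketch. It assumes only the field widths stated above (8-bit opcode, 4-bit register fields, 32-bit addresses) and that each instruction is padded to a whole number of bytes; the function and variable names are illustrative, not part of any real VAX tooling.

```python
import math

OPCODE_BITS = 8      # opcode width assumed on the slide
REGISTER_BITS = 4    # 16 registers, so 4 bits identify one register
ADDRESS_BITS = 32    # 32-bit addressing

def instruction_size(operand_bits):
    """Return (total bits, whole bytes needed) for an opcode plus operand fields."""
    bits = OPCODE_BITS + sum(operand_bits)
    return bits, math.ceil(bits / 8)

formats = {
    "register to register": [REGISTER_BITS, REGISTER_BITS],
    "memory to register": [ADDRESS_BITS, REGISTER_BITS],
    "memory to memory": [ADDRESS_BITS, ADDRESS_BITS],
}

lengths = {name: instruction_size(fields) for name, fields in formats.items()}
for name, (bits, nbytes) in lengths.items():
    print(f"{name}: {bits} bits -> {nbytes} bytes")

# A fixed-length encoding must be as long as the longest format.
fixed_length = max(nbytes for _, nbytes in lengths.values())
print("fixed-length instruction size:", fixed_length, "bytes")
```

Note that the 44-bit format does not fill its six bytes exactly; four bits go unused.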
3 RISC vs. CISC Considerations
The RISC (Reduced Instruction Set Computer) movement advocated a simpler design with fewer options for the instructions. Simpler instructions could execute faster. Machines that were not RISC were called CISC (Complex Instruction Set Computers). The VAX series was definitely CISC, a fact that led to its demise.
One of the main motivations for the RISC movement is the fact that computer memory is no longer a very expensive commodity. In fact, it is a "commodity"; that is, a commercial item that is readily and cheaply available.
If we have fewer and simpler instructions, we can increase the computer's speed significantly. True, each instruction might do less, but they are all faster.
The Load-Store Design
One of the slowest operations is the access of memory, either to read values from it or to write values to it. A load-store design restricts memory access to two instructions:
1) Load a register from memory
2) Store a register into memory
A moment's reflection will show that this works only if we have more than one register, possibly 8, 16, or 32 registers. More on this when discussing the options for operands.
NOTE: It is easier to write a good compiler for an architecture with a lot of registers.
4 Word Addressing vs. Byte Addressing
Each architecture devotes a number of bits to addressing memory. The MARIE has a 12-bit memory address.
Question: What is the size of an addressable unit?
In byte-addressable machines, each byte is individually addressable. In word-addressable machines, bytes are not individually addressable.
Advantages of Word-Addressable Designs
- The CPU can address a larger memory. In the MARIE it is 2^12 words (not 2^12 bytes).
- Word addressing is more natural for a number of problems, such as numerical simulations that use a large amount of real-number arithmetic. The CDC 6600 used 60-bit addressable words to store real numbers.
Advantages of Byte-Addressable Designs
- Direct access to the bytes for easy manipulation. This is good for any string-oriented processing, such as editing and message passing.
Question: How are multi-byte entities (words, longwords, etc.) handled?
5 Word Addressing in a Byte-Addressable Machine
Each 8-bit byte has a distinct address.
- A 16-bit word at address Z contains the bytes at addresses Z and Z + 1.
- A 32-bit word at address Z contains the bytes at addresses Z, Z + 1, Z + 2, and Z + 3.
Note that computer architecture refers to addresses rather than variables. In a high-level programming language, we use the term "variable" to indicate the contents of a specific memory address. Consider the statement Y = X:
1) Go to the memory address associated with variable X.
2) Get the contents.
3) Copy the contents into the address associated with variable Y.
6 Question: What is the Address of the Last Longword (Word)?
We shall ask the question in general for an N-bit address space and specifically for a 16-bit address space. For byte addresses, this is easy to answer.
Byte addresses
N-bit byte addresses run from 0 to 2^N - 1. For N = 16, the byte addresses run from 0 to 65,535.
Word addresses
If a word is at address Z, its last byte is at address Z + 1. The last byte of the last word has address Z + 1 = 2^N - 1, so the last word has address Z = 2^N - 2. With N-bit byte addresses, word addresses run from 0 to 2^N - 2. For N = 16, the word addresses run from 0 to 65,534.
Longword addresses
If a longword is at address Z, its last byte is at address Z + 3. The last byte of the last longword has address Z + 3 = 2^N - 1, so the last longword has address Z = 2^N - 4. With N-bit byte addresses, longword addresses run from 0 to 2^N - 4. For N = 16, the longword addresses run from 0 to 65,532.
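The three cases above follow one pattern: a unit of k bytes must end at or before the last byte address, 2^N - 1, so it can start no later than 2^N - k. A minimal sketch, assuming byte-addressable memory and no alignment requirement (the function name is illustrative):

```python
def last_address(address_bits, unit_bytes):
    """Highest address at which a unit of `unit_bytes` bytes still fits in memory."""
    return 2 ** address_bits - unit_bytes

for name, size in [("byte", 1), ("word", 2), ("longword", 4)]:
    print(f"last {name} address for N = 16: {last_address(16, size)}")
```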
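7 Big-Endian vs. Little-Endian Addressing
A 32-bit longword at address Z occupies the bytes at addresses Z through Z + 3. The two conventions differ only in the order in which those bytes are stored:
  Address   Big-Endian                    Little-Endian
  Z         Byte 3 (most significant)     Byte 0 (least significant)
  Z + 1     Byte 2                        Byte 1
  Z + 2     Byte 1                        Byte 2
  Z + 3     Byte 0 (least significant)    Byte 3 (most significant)
Note that, within the byte, the hexadecimal digits are never reversed.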
8 Words and Longwords in Byte-Addressable Memory: Example #1
You are given a memory map for a byte-addressable memory. All values are given in hexadecimal.
[Memory map: a table of addresses (FC, FD, FE, FF, ...) and their one-byte contents; the contents are not preserved in this transcription.]
Each addressable unit contains one byte, expressed as two hexadecimal digits. A 32-bit longword occupies four bytes.
Question: What is the 32-bit longword associated with address 0x100?
Answer: It occupies addresses 0x100, 0x101, 0x102, and 0x103.
- Big-endian: the value is read from the bytes in increasing address order, starting at 0x100.
- Little-endian: the value is read from the bytes in decreasing address order, starting at 0x103.
Note: In little-endian, we read the bytes backwards. We do not read the hexadecimal digits backwards.
9 Words and Longwords in Byte-Addressable Memory: Example #1 (Continued)
The same memory map and question apply: the 32-bit longword associated with address 0x100 occupies addresses 0x100, 0x101, 0x102, and 0x103. Each of the two answers, big-endian and little-endian, is given both in hexadecimal and in decimal.
[The hexadecimal and decimal values are not preserved in this transcription.]
10 Words and Longwords in Byte-Addressable Memory: Example #2
You are given the same kind of memory map for a byte-addressable memory.
[Memory map: a table of addresses (FC, FD, FE, FF, ...) and their one-byte contents; the answers below imply that address 0x100 holds 0x10 and address 0x101 holds 0x20.]
Question: What is the 16-bit word associated with address 0x100?
Answer: It occupies addresses 0x100 and 0x101.
- Big-endian: the value, in hexadecimal, is 0x1020. In decimal: 4,128.
- Little-endian: the value, in hexadecimal, is 0x2010. In decimal: 8,208.
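Example #2 can be reproduced with Python's built-in `int.from_bytes`. This is a sketch under one assumption: the two byte values (0x10 at address 0x100, 0x20 at address 0x101) are inferred from the answers 0x1020 and 0x2010 rather than read from the memory map itself. The `memory` dictionary and `read_word` helper are illustrative names.

```python
# Byte contents inferred from the answers in Example #2 (an assumption).
memory = {0x100: 0x10, 0x101: 0x20}

def read_word(mem, address, size, byteorder):
    """Assemble `size` consecutive bytes starting at `address` into one integer."""
    raw = bytes(mem[address + offset] for offset in range(size))
    return int.from_bytes(raw, byteorder)

big = read_word(memory, 0x100, 2, "big")
little = read_word(memory, 0x100, 2, "little")
print(hex(big), big)        # 0x1020 4128
print(hex(little), little)  # 0x2010 8208
```

Only the byte order changes between the two reads; the two hexadecimal digits inside each byte stay in place.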
11 One Classification of Architectures
How do we handle the operands? Consider a simple addition, specifically C = A + B.
Stack architecture
In this design, all operands are found on a stack. Stack machines have good code density (they make good use of memory), but have problems with access.
Typical instructions would include:
  Push X    // Push the value at address X onto the top of the stack
  Pop Y     // Pop the top of the stack into address Y
  Add       // Pop the top two values, add them, and push the result
Program implementation of C = A + B:
  Push A
  Push B
  Add
  Pop C
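The three instructions above are enough for a toy interpreter. The sketch below is illustrative only: memory is modeled as a dictionary keyed by variable name, and the values for A and B are made up for the example.

```python
def run_stack_program(program, memory):
    """Execute Push/Pop/Add instructions against a dictionary used as memory."""
    stack = []
    for instruction in program:
        op, *args = instruction.split()
        if op == "Push":
            stack.append(memory[args[0]])      # push the value at address X
        elif op == "Pop":
            memory[args[0]] = stack.pop()      # pop the top of stack into address Y
        elif op == "Add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)         # push the sum of the top two values
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return memory

memory = run_stack_program(
    ["Push A", "Push B", "Add", "Pop C"],
    {"A": 2, "B": 3, "C": 0},
)
print(memory["C"])  # 5
```

The Add instruction names no operands at all, which is where the good code density comes from.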
12 Single Accumulator Architectures
The MARIE is an example of this. There is a single register used to accumulate the results of arithmetic operations.
The MARIE realization of the instruction C = A + B:
  Load A     // Load the AC from address A
  Add B      // Add the value at address B
  Store C    // Store the result into address C
In each of these instructions, the accumulator is implicitly one of the operands and need not be specified in the instruction. This saves space.
Extension of this for multiplication
If we multiply two 16-bit signed integers, we can get a 32-bit result. Consider squaring 24,576 (0x6000), which is 0110 0000 0000 0000 in 16-bit binary. The result is 603,979,776, or 0x2400 0000, which does not fit in 16 bits.
We need two 16-bit registers to hold the result. The PDP-9 had the MQ and AC. The results of this multiplication would be:
  MQ    // High 16 bits
  AC    // Low 16 bits
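The split of a 32-bit product across two 16-bit registers can be sketched as follows. The operand 0x6000 (24,576) is the square root of the product quoted on the slide, 603,979,776; the function name and the masking approach are illustrative, not a description of the PDP-9 hardware.

```python
def multiply_16bit(a, b):
    """Multiply two 16-bit signed integers and split the 32-bit product.

    Returns (high 16 bits, low 16 bits), as would be held in MQ and AC.
    """
    product = (a * b) & 0xFFFFFFFF     # keep 32 bits (two's complement if negative)
    high = (product >> 16) & 0xFFFF
    low = product & 0xFFFF
    return high, low

mq, ac = multiply_16bit(0x6000, 0x6000)
print(0x6000 * 0x6000)        # 603979776
print(hex(mq), hex(ac))       # 0x2400 0x0
```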
13 General-Purpose Register Architectures
These have a number of general-purpose registers, normally identified by number. The number of registers is often a power of 2, with 8, 16, or 32 being common. (The Intel architecture, with its four general-purpose registers, is different. These are called EAX, EBX, ECX, and EDX; there is a lot of history here.)
The names of the registers often follow an assembly language notation designed to differentiate register names from variable names. An architecture with eight general-purpose registers might name them %R0, %R1, ..., %R7. The prefix % here indicates to the assembler that we are referring to a register, not to a variable that has a name such as R0. The latter name would be poor coding practice.
Designers might choose to have register %R0 identically set to 0. Having this constant register considerably simplifies a number of circuits in the CPU control unit. We shall return to this %R0 = 0 convention when discussing addressing modes.
14 General-Purpose Registers with Load-Store
A load-store architecture is one with a number of general-purpose registers in which the only memory references are:
1) Loading a register from memory
2) Storing a register to memory
The realization of our programming statement C = A + B might be something like:
  Load %R1, A           // Load register 1 from memory location A
  Load %R2, B           // Load register 2 from memory location B
  Add %R3, %R1, %R2     // Add the contents of registers %R1 and %R2
                        // and place the result into register %R3
  Store %R3, C          // Store register 3 into memory location C
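A small simulation makes the load-store restriction concrete: only Load and Store touch memory, and Add works on registers alone. This is a sketch with invented values for A and B; the tuple encoding of instructions is an assumption for the example, and instruction fetches are not counted as memory accesses.

```python
def run_load_store(program, memory):
    """Run Load/Add/Store instructions; count data accesses to memory."""
    registers = {}
    memory_accesses = 0
    for op, *operands in program:
        if op == "Load":
            reg, address = operands
            registers[reg] = memory[address]
            memory_accesses += 1
        elif op == "Store":
            reg, address = operands
            memory[address] = registers[reg]
            memory_accesses += 1
        elif op == "Add":
            dest, src1, src2 = operands        # register operands only
            registers[dest] = registers[src1] + registers[src2]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory, memory_accesses

program = [
    ("Load", "%R1", "A"),
    ("Load", "%R2", "B"),
    ("Add", "%R3", "%R1", "%R2"),
    ("Store", "%R3", "C"),
]
memory, accesses = run_load_store(program, {"A": 7, "B": 5, "C": 0})
print(memory["C"], accesses)  # 12 3
```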
15 General-Purpose Registers: Register-Memory
In a load-store architecture, the operands for any arithmetic operation must all be in CPU registers. The register-memory design relaxes this requirement, requiring only one of the operands to be in a register.
We might have two types of addition instructions:
- Add register to register
- Add memory to register
The realization of the above simple program statement, C = A + B, might be:
  Load %R4, A     // Get M[A] into register 4
  Add %R4, B      // Add M[B] to register 4
  Store %R4, C    // Place the result into memory location C
16 General-Purpose Registers: Memory-Memory
In this design, there are no restrictions on the location of operands. Our instruction C = A + B might be encoded simply as:
  Add C, A, B
The VAX series supported this mode. The VAX had at least three different addition instructions for each data type:
- Add register to register
- Add memory to register
- Add memory to memory
There were these three for each of the following data types: 8-bit bytes, 16-bit integers, 32-bit long integers, 32-bit floating-point numbers, and 64-bit floating-point numbers.
Here we see at least 15 different instructions that perform addition (three operand forms for each of five data types). This is complex.
17 Modern Design Principles: Basic Assumptions
Some assumptions that drive current design practice include:
1. Most programs are written in high-level compiled languages. Provision of a large general-purpose register set greatly facilitates compiler design.
2. Current CPU clock cycle times (measured in nanoseconds) are much shorter than memory access times.
3. A simpler instruction set implies a smaller control unit, thus freeing chip area for more registers and on-chip cache.
4. Execution is more efficient when a two-level cache system is implemented on chip. We have a split L1 cache (with an I-cache for instructions and a D-cache for data) and an L2 cache.
5. Memory is so cheap that it is a commodity item.
18 Modern Design Principles
1. Allow only fixed-length operands. This may waste memory, but modern designs have plenty of it, and it is cheap.
2. Minimize the number of instruction formats and make them simpler, so that the instructions are more easily and quickly decoded.
3. Provide plenty of registers and the largest possible on-chip cache memory.
4. Minimize the number of instructions that reference memory. The preferred practice is called Load/Store, in which the only operations to reference primary memory are register loads from memory and register stores into memory.
5. Use pipelined and superscalar approaches that attempt to keep each unit in the CPU busy all the time. At the very least, provide for fetching one instruction while the previous instruction is being executed.
6. Push the complexity onto the compiler. This is called moving the DSI (Dynamic-Static Interface). Let compilation (the static phase) handle any issue that it can, so that execution (the dynamic phase) is simplified.
19 The Fetch-Execute Cycle (Again)
This cycle is the logical basis of all stored-program computers. Instructions are stored in memory as machine language. Instructions are fetched from memory and then executed.
The common fetch cycle can be expressed in the following control sequence:
  MAR ← PC     // The PC contains the address of the instruction; put it into the MAR.
  READ         // Read memory; the instruction arrives in the MBR.
  IR ← MBR     // Copy the instruction from the MBR into the IR.
This cycle is described in many different ways, most of which serve to highlight additional steps required to execute the instruction. Examples of additional steps are: decode the instruction, fetch the arguments, store the result, etc.
A stored-program computer is often called a "von Neumann machine" after one of the originators of the EDVAC. The fetch-execute cycle is the source of what is often called the "von Neumann bottleneck": the necessity of fetching every instruction from memory slows the computer.
20 Avoiding the Bottleneck
In the simple stored-program machine, the following loop is executed:
  Fetch the next instruction
  Loop Until Stop
    Execute the instruction
    Fetch the next instruction
  End Loop
The first attempt to break out of this endless cycle was "instruction prefetch": fetch the next instruction at the same time as the current one is executing.
As we can easily see, this concept can be extended.
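The loop above can be sketched for a toy accumulator machine. Everything specific here is an assumption made for illustration: instructions are stored in memory as (opcode, address) tuples, data as plain integers, and "Halt" plays the role of Stop. It is not the MARIE encoding.

```python
def run(memory):
    """Fetch-execute loop for a toy accumulator machine (illustrative only)."""
    pc = 0    # program counter: address of the next instruction
    ac = 0    # accumulator
    ir = memory[pc]          # fetch the first instruction
    pc += 1
    while ir[0] != "Halt":   # loop until stop
        op, address = ir     # execute the instruction held in the IR
        if op == "Load":
            ac = memory[address]
        elif op == "Add":
            ac += memory[address]
        elif op == "Store":
            memory[address] = ac
        else:
            raise ValueError(f"unknown opcode: {op}")
        ir = memory[pc]      # fetch the next instruction
        pc += 1
    return memory

# Addresses 0-3 hold the program; addresses 4-6 hold A, B, and C.
memory = run([("Load", 4), ("Add", 5), ("Store", 6), ("Halt", None), 2, 3, 0])
print(memory[6])  # 5
```

In this sequential version, every fetch waits for the preceding execute to finish, which is exactly the cost that prefetching tries to hide.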
21 Instruction-Level Parallelism: Instruction Prefetch
Break up the fetch-execute cycle and do the two in parallel. This dates to the IBM Stretch (1959).
The prefetch buffer is implemented in the CPU with on-chip registers, either as a single register or as a queue. The CDC 6600 buffer had a queue of length 8 (I think).
Think of the prefetch buffer as containing the IR (Instruction Register). When the execution of one instruction completes, the next one is already in the buffer and does not need to be fetched.
Any program branch (loop structure, conditional branch, etc.) will invalidate the contents of the prefetch buffer, which must then be reloaded.
22 Instruction-Level Parallelism: Pipelining
Pipelining is better thought of as an assembly line.
Note that the throughput is distinct from the time required for the execution of a single instruction. Here the throughput is five times the single-instruction rate.
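The distinction between throughput and single-instruction time can be made concrete with a cycle count. This sketch assumes an idealized pipeline with five stages of equal length (suggested by the factor of five above) and no stalls; both assumptions are simplifications, and the function names are illustrative.

```python
def unpipelined_cycles(instructions, stages):
    """Each instruction passes through every stage before the next one starts."""
    return instructions * stages

def pipelined_cycles(instructions, stages):
    """The first instruction takes `stages` cycles; then one completes per cycle."""
    return stages + (instructions - 1)

STAGES = 5  # assumed pipeline depth
for n in (1, 10, 1000):
    speedup = unpipelined_cycles(n, STAGES) / pipelined_cycles(n, STAGES)
    print(f"{n} instructions: speedup {speedup:.2f}")
```

A single instruction gains nothing (speedup 1.00), while a long stream approaches the factor of five: the time per instruction is unchanged, but one instruction completes every cycle.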
23 What About Two Pipelines?
Code emitted by a compiler tailored for this architecture has the possibility to run twice as fast as code emitted by a generic compiler.
Some pairs of instructions are not candidates for dual pipelining:
  C = A + B
  D = A - C    // Need the new value of C here
This is called a RAW (Read After Write) dependency, in that the value for C must be written to a register before it can be read for the next operation.
Stopping the pipeline to wait for a needed value is called "stalling."
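Whether two adjacent instructions can be issued to separate pipelines depends on whether the second reads what the first writes. A minimal check, assuming each instruction is described by a hypothetical (destination, sources) pair:

```python
def has_raw_dependency(first, second):
    """True if `second` reads the location that `first` writes."""
    destination, _ = first
    _, sources = second
    return destination in sources

write_c = ("C", ("A", "B"))   # C = A + B
read_c = ("D", ("A", "C"))    # D combines A with the new value of C
no_read = ("E", ("A", "B"))   # E uses only A and B

print(has_raw_dependency(write_c, read_c))   # True: must stall or issue in order
print(has_raw_dependency(write_c, no_read))  # False: can issue together
```

This check covers only the RAW case described on the slide; a real scheduler would also have to consider other hazards.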
24 Superscalar Architectures
Having 2, 4, or 8 completely independent pipelines on a CPU is very resource-intensive and is not justified by careful analysis.
Often, the execution units are the slowest units by a large margin. It is usually a better use of resources to replicate only the execution units.
More informationPROBLEMS. which was discussed in Section 1.6.3.
22 CHAPTER 1 BASIC STRUCTURE OF COMPUTERS (Corrisponde al cap. 1 - Introduzione al calcolatore) PROBLEMS 1.1 List the steps needed to execute the machine instruction LOCA,R0 in terms of transfers between
More informationComputer Architecture TDTS10
why parallelism? Performance gain from increasing clock frequency is no longer an option. Outline Computer Architecture TDTS10 Superscalar Processors Very Long Instruction Word Processors Parallel computers
More informationCentral Processing Unit Simulation Version v2.5 (July 2005) Charles André University Nice-Sophia Antipolis
Central Processing Unit Simulation Version v2.5 (July 2005) Charles André University Nice-Sophia Antipolis 1 1 Table of Contents 1 Table of Contents... 3 2 Overview... 5 3 Installation... 7 4 The CPU
More informationPROBLEMS #20,R0,R1 #$3A,R2,R4
506 CHAPTER 8 PIPELINING (Corrisponde al cap. 11 - Introduzione al pipelining) PROBLEMS 8.1 Consider the following sequence of instructions Mul And #20,R0,R1 #3,R2,R3 #$3A,R2,R4 R0,R2,R5 In all instructions,
More informationIntroduction to Cloud Computing
Introduction to Cloud Computing Parallel Processing I 15 319, spring 2010 7 th Lecture, Feb 2 nd Majd F. Sakr Lecture Motivation Concurrency and why? Different flavors of parallel computing Get the basic
More informationGiving credit where credit is due
CSCE 230J Computer Organization Processor Architecture VI: Wrap-Up Dr. Steve Goddard goddard@cse.unl.edu http://cse.unl.edu/~goddard/courses/csce230j Giving credit where credit is due ost of slides for
More informationOC By Arsene Fansi T. POLIMI 2008 1
IBM POWER 6 MICROPROCESSOR OC By Arsene Fansi T. POLIMI 2008 1 WHAT S IBM POWER 6 MICROPOCESSOR The IBM POWER6 microprocessor powers the new IBM i-series* and p-series* systems. It s based on IBM POWER5
More informationIn the Beginning... 1964 -- The first ISA appears on the IBM System 360 In the good old days
RISC vs CISC 66 In the Beginning... 1964 -- The first ISA appears on the IBM System 360 In the good old days Initially, the focus was on usability by humans. Lots of user-friendly instructions (remember
More informationMICROPROCESSOR. Exclusive for IACE Students www.iace.co.in iacehyd.blogspot.in Ph: 9700077455/422 Page 1
MICROPROCESSOR A microprocessor incorporates the functions of a computer s central processing unit (CPU) on a single Integrated (IC), or at most a few integrated circuit. It is a multipurpose, programmable
More informationGenerations of the computer. processors.
. Piotr Gwizdała 1 Contents 1 st Generation 2 nd Generation 3 rd Generation 4 th Generation 5 th Generation 6 th Generation 7 th Generation 8 th Generation Dual Core generation Improves and actualizations
More informationAn Overview of Stack Architecture and the PSC 1000 Microprocessor
An Overview of Stack Architecture and the PSC 1000 Microprocessor Introduction A stack is an important data handling structure used in computing. Specifically, a stack is a dynamic set of elements in which
More informationSoftware Pipelining. for (i=1, i<100, i++) { x := A[i]; x := x+1; A[i] := x
Software Pipelining for (i=1, i
More informationChapter 4 Lecture 5 The Microarchitecture Level Integer JAVA Virtual Machine
Chapter 4 Lecture 5 The Microarchitecture Level Integer JAVA Virtual Machine This is a limited version of a hardware implementation to execute the JAVA programming language. 1 of 23 Structured Computer
More informationVHDL DESIGN OF EDUCATIONAL, MODERN AND OPEN- ARCHITECTURE CPU
VHDL DESIGN OF EDUCATIONAL, MODERN AND OPEN- ARCHITECTURE CPU Martin Straka Doctoral Degree Programme (1), FIT BUT E-mail: strakam@fit.vutbr.cz Supervised by: Zdeněk Kotásek E-mail: kotasek@fit.vutbr.cz
More informationVLIW Processors. VLIW Processors
1 VLIW Processors VLIW ( very long instruction word ) processors instructions are scheduled by the compiler a fixed number of operations are formatted as one big instruction (called a bundle) usually LIW
More informationIA-64 Application Developer s Architecture Guide
IA-64 Application Developer s Architecture Guide The IA-64 architecture was designed to overcome the performance limitations of today s architectures and provide maximum headroom for the future. To achieve
More informationComparative Performance Review of SHA-3 Candidates
Comparative Performance Review of the SHA-3 Second-Round Candidates Cryptolog International Second SHA-3 Candidate Conference Outline sphlib sphlib sphlib is an open-source implementation of many hash
More informationComputer Organization
Computer Organization and Architecture Designing for Performance Ninth Edition William Stallings International Edition contributions by R. Mohan National Institute of Technology, Tiruchirappalli PEARSON
More informationCOMPUTER ORGANIZATION AND ARCHITECTURE. Slides Courtesy of Carl Hamacher, Computer Organization, Fifth edition,mcgrawhill
COMPUTER ORGANIZATION AND ARCHITECTURE Slides Courtesy of Carl Hamacher, Computer Organization, Fifth edition,mcgrawhill COMPUTER ORGANISATION AND ARCHITECTURE The components from which computers are built,
More informationSolution: start more than one instruction in the same clock cycle CPI < 1 (or IPC > 1, Instructions per Cycle) Two approaches:
Multiple-Issue Processors Pipelining can achieve CPI close to 1 Mechanisms for handling hazards Static or dynamic scheduling Static or dynamic branch handling Increase in transistor counts (Moore s Law):
More informationChapter 4 System Unit Components. Discovering Computers 2012. Your Interactive Guide to the Digital World
Chapter 4 System Unit Components Discovering Computers 2012 Your Interactive Guide to the Digital World Objectives Overview Differentiate among various styles of system units on desktop computers, notebook
More informationThe Central Processing Unit:
The Central Processing Unit: What Goes on Inside the Computer Chapter 4 Objectives Identify the components of the central processing unit and how they work together and interact with memory Describe how
More informationUNIT 2 CLASSIFICATION OF PARALLEL COMPUTERS
UNIT 2 CLASSIFICATION OF PARALLEL COMPUTERS Structure Page Nos. 2.0 Introduction 27 2.1 Objectives 27 2.2 Types of Classification 28 2.3 Flynn s Classification 28 2.3.1 Instruction Cycle 2.3.2 Instruction
More informationHow It All Works. Other M68000 Updates. Basic Control Signals. Basic Control Signals
CPU Architectures Motorola 68000 Several CPU architectures exist currently: Motorola Intel AMD (Advanced Micro Devices) PowerPC Pick one to study; others will be variations on this. Arbitrary pick: Motorola
More informationChapter 4 Register Transfer and Microoperations. Section 4.1 Register Transfer Language
Chapter 4 Register Transfer and Microoperations Section 4.1 Register Transfer Language Digital systems are composed of modules that are constructed from digital components, such as registers, decoders,
More informationManagement Challenge. Managing Hardware Assets. Central Processing Unit. What is a Computer System?
Management Challenge Managing Hardware Assets What computer processing and storage capability does our organization need to handle its information and business transactions? What arrangement of computers
More informationA3 Computer Architecture
A3 Computer Architecture Engineering Science 3rd year A3 Lectures Prof David Murray david.murray@eng.ox.ac.uk www.robots.ox.ac.uk/ dwm/courses/3co Michaelmas 2000 1 / 1 6. Stacks, Subroutines, and Memory
More informationCS101 Lecture 26: Low Level Programming. John Magee 30 July 2013 Some material copyright Jones and Bartlett. Overview/Questions
CS101 Lecture 26: Low Level Programming John Magee 30 July 2013 Some material copyright Jones and Bartlett 1 Overview/Questions What did we do last time? How can we control the computer s circuits? How
More informationA Lab Course on Computer Architecture
A Lab Course on Computer Architecture Pedro López José Duato Depto. de Informática de Sistemas y Computadores Facultad de Informática Universidad Politécnica de Valencia Camino de Vera s/n, 46071 - Valencia,
More informationInstruction Set Architecture
CS:APP Chapter 4 Computer Architecture Instruction Set Architecture Randal E. Bryant adapted by Jason Fritts http://csapp.cs.cmu.edu CS:APP2e Hardware Architecture - using Y86 ISA For learning aspects
More informationEE361: Digital Computer Organization Course Syllabus
EE361: Digital Computer Organization Course Syllabus Dr. Mohammad H. Awedh Spring 2014 Course Objectives Simply, a computer is a set of components (Processor, Memory and Storage, Input/Output Devices)
More informationCISC, RISC, and DSP Microprocessors
CISC, RISC, and DSP Microprocessors Douglas L. Jones ECE 497 Spring 2000 4/6/00 CISC, RISC, and DSP D.L. Jones 1 Outline Microprocessors circa 1984 RISC vs. CISC Microprocessors circa 1999 Perspective:
More informationDriving force. What future software needs. Potential research topics
Improving Software Robustness and Efficiency Driving force Processor core clock speed reach practical limit ~4GHz (power issue) Percentage of sustainable # of active transistors decrease; Increase in #
More informationCS:APP Chapter 4 Computer Architecture Instruction Set Architecture. CS:APP2e
CS:APP Chapter 4 Computer Architecture Instruction Set Architecture CS:APP2e Instruction Set Architecture Assembly Language View Processor state Registers, memory, Instructions addl, pushl, ret, How instructions
More informationThe Java Virtual Machine and Mobile Devices. John Buford, Ph.D. buford@alum.mit.edu Oct 2003 Presented to Gordon College CS 311
The Java Virtual Machine and Mobile Devices John Buford, Ph.D. buford@alum.mit.edu Oct 2003 Presented to Gordon College CS 311 Objectives Review virtual machine concept Introduce stack machine architecture
More informationASSEMBLY PROGRAMMING ON A VIRTUAL COMPUTER
ASSEMBLY PROGRAMMING ON A VIRTUAL COMPUTER Pierre A. von Kaenel Mathematics and Computer Science Department Skidmore College Saratoga Springs, NY 12866 (518) 580-5292 pvonk@skidmore.edu ABSTRACT This paper
More information