Slide Set 8. for ENCM 369 Winter 2015 Lecture Section 01. Steve Norman, PhD, PEng



Similar documents
Logical Operations. Control Unit. Contents. Arithmetic Operations. Objectives. The Central Processing Unit: Arithmetic / Logic Unit.

Chapter 6. Inside the System Unit. What You Will Learn... Computers Are Your Future. What You Will Learn... Describing Hardware Performance

Computer Systems Structure Main Memory Organization

The Central Processing Unit:

CSCA0102 IT & Business Applications. Foundation in Business Information Technology School of Engineering & Computing Sciences FTMS College Global

SDRAM and DRAM Memory Systems Overview

Computer Architecture

Chapter 4 System Unit Components. Discovering Computers Your Interactive Guide to the Digital World

The Quest for Speed - Memory. Cache Memory. A Solution: Memory Hierarchy. Memory Hierarchy

Discovering Computers Living in a Digital World

Understanding Memory TYPES OF MEMORY

CHAPTER 2: HARDWARE BASICS: INSIDE THE BOX

Memory Basics ~ ROM, DRAM, SRAM, Cache Memory The article was added by Kyle Duke

Random Access Memory (RAM) Types of RAM. RAM Random Access Memory Jamie Tees SDRAM. Micro-DIMM SO-DIMM

Homework # 2. Solutions. 4.1 What are the differences among sequential access, direct access, and random access?

RAM. Overview DRAM. What RAM means? DRAM

Module 2. Embedded Processors and Memory. Version 2 EE IIT, Kharagpur 1

Computers. Hardware. The Central Processing Unit (CPU) CMPT 125: Lecture 1: Understanding the Computer

Introducción. Diseño de sistemas digitales.1

Main Memory & Backing Store. Main memory backing storage devices

Introduction To Computers: Hardware and Software

Memory Hierarchy. Arquitectura de Computadoras. Centro de Investigación n y de Estudios Avanzados del IPN. adiaz@cinvestav.mx. MemoryHierarchy- 1

Memory ICS 233. Computer Architecture and Assembly Language Prof. Muhamed Mudawar

With respect to the way of data access we can classify memories as:

RAM & ROM Based Digital Design. ECE 152A Winter 2012

Hardware: Input, Processing, and Output Devices. A PC in Every Home. Assembling a Computer System

Generations of the computer. processors.

The Big Picture. Cache Memory CSE Memory Hierarchy (1/3) Disk

361 Computer Architecture Lecture 14: Cache Memory

Lecture 2: Computer Hardware and Ports.

COMPUTER HARDWARE. Input- Output and Communication Memory Systems

NAND Flash FAQ. Eureka Technology. apn5_87. NAND Flash FAQ

1. Memory technology & Hierarchy

Secondary Storage. Any modern computer system will incorporate (at least) two levels of storage: magnetic disk/optical devices/tape systems

Chapter 5 :: Memory and Logic Arrays

Configuring Memory on the HP Business Desktop dx5150

1 / 25. CS 137: File Systems. Persistent Solid-State Storage

Memory is implemented as an array of electronic switches

COMPUTER ORGANIZATION ARCHITECTURES FOR EMBEDDED COMPUTING

lesson 1 An Overview of the Computer System

Semiconductor Device Technology for Implementing System Solutions: Memory Modules

Dell Reliable Memory Technology

Algorithms and Methods for Distributed Storage Networks 3. Solid State Disks Christian Schindelhauer

Technical Product Specifications Dell Dimension 2400 Created by: Scott Puckett

Introduction to Microprocessors

Computer Performance. Topic 3. Contents. Prerequisite knowledge Before studying this topic you should be able to:

Lecture 9: Memory and Storage Technologies

Unit A451: Computer systems and programming. Section 2: Computing Hardware 1/5: Central Processing Unit

Data storage and high-speed streaming

ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-17: Memory organisation, and types of memory

Writing Assignment #2 due Today (5:00pm) - Post on your CSC101 webpage - Ask if you have questions! Lab #2 Today. Quiz #1 Tomorrow (Lectures 1-7)

Flash s Role in Big Data, Past Present, and Future OBJECTIVE ANALYSIS. Jim Handy

Technical Note. Micron NAND Flash Controller via Xilinx Spartan -3 FPGA. Overview. TN-29-06: NAND Flash Controller on Spartan-3 Overview

OpenSPARC T1 Processor

Computer Architecture

Virtual Memory. How is it possible for each process to have contiguous addresses and so many of them? A System Using Virtual Addressing

Price/performance Modern Memory Hierarchy

Communicating with devices

EE361: Digital Computer Organization Course Syllabus

7a. System-on-chip design and prototyping platforms

快 閃 記 憶 體 的 產 業 應 用 與 製 程

This Unit: Putting It All Together. CIS 501 Computer Architecture. Sources. What is Computer Architecture?

Here is a diagram of a simple computer system: (this diagram will be the one needed for exams) CPU. cache

Linux flash file systems JFFS2 vs UBIFS

The Bus (PCI and PCI-Express)

(Refer Slide Time: 02:39)

A N. O N Output/Input-output connection

DDR4 Memory Technology on HP Z Workstations

Chapter 8 Memory Units

Chapter Introduction. Storage and Other I/O Topics. p. 570( 頁 585) Fig I/O devices can be characterized by. I/O bus connections

Programming NAND devices

Parts of a Computer. Preparation. Objectives. Standards. Materials Micron Technology Foundation, Inc. All Rights Reserved

CHAPTER 7: The CPU and Memory

Cisco MCS 7816-I3 Unified Communications Manager Appliance

Design of Digital Circuits (SS16)

Indexing on Solid State Drives based on Flash Memory

Am186ER/Am188ER AMD Continues 16-bit Innovation

A+ Guide to Managing and Maintaining Your PC, 7e. Chapter 1 Introducing Hardware

Choosing a Computer for Running SLX, P3D, and P5

Memory Basics. SRAM/DRAM Basics

Why Latency Lags Bandwidth, and What it Means to Computing

A3 Computer Architecture

A Computer Glossary. For the New York Farm Viability Institute Computer Training Courses

Introduction to Information System Layers and Hardware. Introduction to Information System Components Chapter 1 Part 1 of 4 CA M S Mehta, FCA

MICROPROCESSOR BCA IV Sem MULTIPLE CHOICE QUESTIONS

Cisco MCS 7825-H3 Unified Communications Manager Appliance

Choosing the Right NAND Flash Memory Technology

Computer Information & Recommendations

Management Challenge. Managing Hardware Assets. Central Processing Unit. What is a Computer System?

New Mexico Broadband Program. Basic Computer Skills. Module 1 Types of Personal Computers Computer Hardware and Software

Database Fundamentals

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 412, University of Maryland. Guest lecturer: David Hovemeyer.

18-548/ Associativity 9/16/98. 7 Associativity / Memory System Architecture Philip Koopman September 16, 1998

Server: Performance Benchmark. Memory channels, frequency and performance

COMPUTER BASICS. Seema Sirpal Delhi University Computer Centre

Advantages of e-mmc 4.4 based Embedded Memory Architectures

Enhancing IBM Netfinity Server Reliability

Memory. The memory types currently in common usage are:

Why Computers Are Getting Slower (and what we can do about it) Rik van Riel Sr. Software Engineer, Red Hat

Transcription:

Slide Set 8 for ENCM 369 Winter 2015 Lecture Section 01 Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary Winter Term, 2015

ENCM 369 W15 Section 01 Slide Set 8 slide 2/42 Contents Final Notes on Textbook Chapter 7 Introduction to Memory Systems Memory System Speed Requirements Introduction to Caches

ENCM 369 W15 Section 01 Slide Set 8 slide 3/42 Outline of Slide Set 8 Final Notes on Textbook Chapter 7 Introduction to Memory Systems Memory System Speed Requirements Introduction to Caches

ENCM 369 W15 Section 01 Slide Set 8 slide 4/42 Final Notes on Textbook Chapter 7 In ENCM 369, we skipped Section 7.4 Multicycle Processor to make sure to have time for other important topics, mostly memory systems and floating-point numbers. We ll also skip... Section 7.6: HDL (hardware description language) representation of processor designs. HDLs are a big topic in ENEL 453. Section 7.7: Exception handling, as an extension of the multicycle processor of Section 7.4. It s somewhat sad to skip this, because the FSM of textbook Figure 7.64 is a really cool FSM.

ENCM 369 W15 Section 01 Slide Set 8 slide 5/42 More about skipped sections in Chapter 7 Section 7.8: Advanced Microarchitecture. Simple pipelining, as described in Section 7.5, is a really important idea in processor design. But the state of the art in processor design has gone far beyond simple pipelining! See Section 7.8 (and ENCM 501) for an overview of how current processor chips work. Section 7.9: x86 Microarchitecture. The history and present status of Intel Corp. is fascinating from both business and technical viewpoints!

ENCM 369 W15 Section 01 Slide Set 8 slide 6/42 Figure 7.67 from Section 7.8 This is a sketch of a pipelined datapath designed to start two intructions per clock cycle, whenever possible... CLK CLK CLK CLK CLK PC A RD Instruction Memory A1 A2 A3 A4 A5 A6 WD3 WD6 Register File RD1 RD4 RD2 RD5 ALUs A1 RD1 A2 RD2 Data Memory WD1 WD2 Image is Figure 7.67 from Harris D. M. and Harris S. L., Digital Design and Computer Architecture, 2nd ed., c 2013, Elsevier, Inc.

ENCM 369 W15 Section 01 Slide Set 8 slide 7/42 Outline of Slide Set 8 Final Notes on Textbook Chapter 7 Introduction to Memory Systems Memory System Speed Requirements Introduction to Caches

ENCM 369 W15 Section 01 Slide Set 8 slide 8/42 Introduction to Memory Technology Chapters 6 and 7 of the textbook, and related lectures and labs suggest that memory is simple. In real-world computers, the memory systems are usually not at all simple! Expect to read Sections 8.1 to 8.4 of the textbook multiple times don t give up if it seems too confusing the first time through!

ENCM 369 W15 Section 01 Slide Set 8 slide 9/42 Know the difference between MEMORY and DISK STORAGE! MEMORY is an ELECTRONIC system with NO MOVING PARTS. It is constructed on silicon chips using millions or billions of tiny transistors, capacitors and wires per chip. It is mainly used for INSTRUCTIONS and DATA of RUNNING PROGRAMS. The next slide shows the main memory circuits of a laptop from 2007...

ENCM 369 W15 Section 01 Slide Set 8 slide 10/42

ENCM 369 W15 Section 01 Slide Set 8 slide 11/42 Know the difference between MEMORY and DISK STORAGE! (continued) DISK STORAGE is partly electronic and partly MECHANICAL with LOTS of MOVING PARTS. Access time to disk is MUCH, MUCH, MUCH LONGER than access time to memory. Cost per bit of disk is MUCH less than cost per bit of memory. Disk storage is mainly used for FILE SYSTEMS. The next slide shows a magnetic disk drive with the top of its case removed...

ENCM 369 W15 Section 01 Slide Set 8 slide 12/42 Image is Figure 8.19 from Harris D. M. and Harris S. L., Digital Design and Computer Architecture, 2nd ed., c 2013, Elsevier, Inc.

ENCM 369 W15 Section 01 Slide Set 8 slide 13/42 Describing Capacity: kilobytes, megabytes, gigabytes, kilobits, megabits, gigabits B means byte (8 bits). b means bit. But often B and b get mixed up, so watch out for mistakes! When describing memory size, powers of two are always used (e.g., 1 KB = 1024 bytes, 1 MB = 1, 048, 576 bytes). When describing disk capacity or data transfer rate, powers of ten are more frequently used (e.g, 500 GB = 500,000,000,000 bytes).

ENCM 369 W15 Section 01 Slide Set 8 slide 14/42 Memory capacity definitions 1 KB = 1 kilobyte = 2 10 bytes = 1, 024 bytes. 1 MB = 1 megabyte = 2 20 bytes = 1, 048, 576 bytes. 1 GB = 1 gigabyte = 2 30 bytes = 1, 073, 741, 824 bytes. So, how many one-bit memory cells does it take to make a 256 MB memory circuit?

ENCM 369 W15 Section 01 Slide Set 8 slide 15/42 Does it bother you that 1 Mb might or might not be exactly 1,000,000 bits? It should! Engineers should try to avoid ambiguity and imprecision in technical communication! IEEE standard 1541 unfortunately not in wide use proposes new prefixes and symbols: kibi- (Ki), mebi- (Mi), gibi- (Gi), etc., for powers of two. Example: 1 Mib = 1 mebibit = 2 20 bits. These prefixes are nice, but won t be used in ENCM 369 we will stick with the more commonly used prefixes and symbols.

ENCM 369 W15 Section 01 Slide Set 8 slide 16/42 ROM and RAM: ROM ROM: read-only memory. ROM is useful in many simple embedded systems, where programs do not change. (The programs in these systems are sometimes called firmware instead of software.) An important use of ROM in smartphones, laptops, desktops, and servers is holding instructions that start the process of loading the operating system kernel into memory.

ENCM 369 W15 Section 01 Slide Set 8 slide 17/42 ROM and RAM: RAM RAM: random-access memory. Random-access means accesses do not have to be in sequence by address. (But ROM allows random access for reads.) A more accurate (but unpronounceable) name would be RWM: readable-writable memory being writable is what makes RAM different from ROM.

ENCM 369 W15 Section 01 Slide Set 8 slide 18/42 Silicon memory technologies Silicon is a semiconductor, so often the term semiconductor memory is used. Important types of silicon memory are DRAM, SRAM, and flash. See Sections 5.5.2 5.5.4 of the textbook for brief notes on DRAM and SRAM. See Section 5.5.6 of textbook for brief notes on flash memory.

ENCM 369 W15 Section 01 Slide Set 8 slide 19/42 DRAM: dynamic RAM DRAM is cheap, compared to SRAM. DRAM has one transistor plus one very, very small capacitor per bit very high density DRAM access time is typically tens of nanoseconds, much longer than a processor clock cycle, which is currently 0.5 ns (500 ps) or less.

ENCM 369 W15 Section 01 Slide Set 8 slide 20/42 DRAM (continued) DRAM is used for main memory in smartphones, tablets, laptops, desktops, and servers, in various flavours such as DDR3 SDRAM (3rd generation double-data-rate synchronous DRAM). DRAM chips are installed on DIMMs (dual inline memory modules) that plug into computer motherboards.

ENCM 369 W15 Section 01 Slide Set 8 slide 21/42 Two SO-DIMMs from a 2007 MacBook SO means small outline. 4 DRAM chips per DIMM are hiding under the white paper labels. These two 256 MB DIMMs were replaced with two 1 GB DIMMs to upgrade RAM from 512 MB to 2 GB.

ENCM 369 W15 Section 01 Slide Set 8 slide 22/42 SRAM: static RAM SRAM typically uses 6 transistors per bit lower density than DRAM. SRAM is more expensive than DRAM. SRAM is much faster than DRAM access time can be as low as 1 or 2 processor clock cycles. These days, SRAM is usually on the same chip as the processor cores. See, for example, the L2 cache in textbook Figure 7.79, page 464, on the next slide. Chip dimensions are about 15 mm by 15 mm. Each computer in ICT 320 has one of these chips. The technology is 6 7 years old, but is still awesome in its complexity!

ENCM 369 W15 Section 01 Slide Set 8 slide 23/42 Core 1 Core 2 2 MB Shared Level 2 Cache Image is Figure 7.79 from Harris D. M. and Harris S. L., Digital Design and Computer Architecture, 2nd ed., c 2013, Elsevier, Inc.

ENCM 369 W15 Section 01 Slide Set 8 slide 24/42 Flash memory Compared to DRAM, flash is cheaper per bit, but much slower. The name flash comes from the electrical process used to erase a block of memory, and has nothing to do with relative speed. Unlike DRAM or SRAM, flash maintains data when power is turned off. Flash is useful for file systems for computers where hard disks would be too slow or too bulky. Solid-State Drives, which have replaced magnetic disks in many current laptop and desktop computer designs, store files in flash memory.

ENCM 369 W15 Section 01 Slide Set 8 Three things containing flash memory slide 25/42

ENCM 369 W15 Section 01 Slide Set 8 slide 26/42 Outline of Slide Set 8 Final Notes on Textbook Chapter 7 Introduction to Memory Systems Memory System Speed Requirements Introduction to Caches

ENCM 369 W15 Section 01 Slide Set 8 slide 27/42 Memory System Speed Requirements instruction addresses processor core instructions data addresses data for loads and stores memory system Suppose we are trying to build a memory system for a MIPS processor with a simple 5-stage pipeline like the one we ve just studied. What requirements does the processor design place on the memory system?

ENCM 369 W15 Section 01 Slide Set 8 slide 28/42 Keeping things relatively simple When we start looking at caches, we ll just look at supporting instruction fetch, LW and SW. We will not worry about these extra complexities... To implemement the entire MIPS instruction set we would need also to support loads and stores of bytes, halfwords (16-bit chunks) and doublewords (64-bit chunks). Multiple issue (starting 2 or more instructions in 1 clock cycle within a single core) and supporting multiple cores both require more complex memory interfaces.

ENCM 369 W15 Section 01 Slide Set 8 slide 29/42 A simple model of memory: A BIG ARRAY OF WORDS A definition for the word model: A description of a system that, while not perfectly accurate, helps explain or predict the behaviour of the system. Let s make a sketch of the memory model introduced at the beginning of ENCM 369... How was this model modified in Chapter 7 of the textbook?

ENCM 369 W15 Section 01 Slide Set 8 slide 30/42 Can the simple big-array-of-words model be used for a hardware design? Could memory be organized as a big array (or two big arrays) of 32-bit groups of one-bit DRAM cells? Could memory be organized as a big array (or two big arrays) of 32-bit groups of one-bit SRAM cells?

ENCM 369 W15 Section 01 Slide Set 8 slide 31/42 What can be done? Can we combine a very fast small-capacity memory (SRAM) and a slower large-capacity memory (DRAM) to make a system that performs almost as well as an impossible-to-build very fast large-capacity memory? Answer: YES, such a system can often but not always perform like very big, very fast memory. The small, very fast part of the system is called a cache.

ENCM 369 W15 Section 01 Slide Set 8 slide 32/42 Memory organization in real computers See page 1 of the Introduction to Memory Organization lecture document. This is way more complicated than the simple big-array-of-words model! We will not study all parts in detail, but will focus on key components: caches (textbook Section 8.3); address translation and virtual memory (Section 8.4).

ENCM 369 W15 Section 01 Slide Set 8 slide 33/42 Outline of Slide Set 8 Final Notes on Textbook Chapter 7 Introduction to Memory Systems Memory System Speed Requirements Introduction to Caches

ENCM 369 W15 Section 01 Slide Set 8 slide 34/42 What is a cache? A cache is a small, very fast memory system that holds copies of some, not all of the instructions or data in main memory. Processor cores look in caches, NOT in main memory, to find instructions and data.

ENCM 369 W15 Section 01 Slide Set 8 slide 35/42 System with one level of cache, no address translation physical address 32 I-cache physical address 32 32 processor core (control, registers, ALUs, f-p hardware) 32 instructions 32 main memory physical address 32 D-cache 32 32 32 32 data instructions or data This is from page 2 of Introduction to Memory Organization. Keep this picture in mind as we discuss cache organization.

ENCM 369 W15 Section 01 Slide Set 8 slide 36/42 Simple examples of cache operation We ll consider the system with one level of cache and no address translation. What happens in instruction fetch? What happens in the data memory access step of a load instruction? What about data memory access in a store instruction?

ENCM 369 W15 Section 01 Slide Set 8 slide 37/42 Definitions: Hits and Misses Cache hit: Event in which the processor finds the instruction or data it needs in the cache. Cache miss: Event in which the instruction or data the processor needs is NOT in the cache. A miss causes a long stall while instructions or data are copied from the main memory to the cache. (These definitions apply to instruction fetches and data memory reads. We will have to modify them a little for data memory writes.)

ENCM 369 W15 Section 01 Slide Set 8 slide 38/42 Capacity: Cache vs. Main Memory A typical recent medium-quality laptop (Apple MacBook Air, March 2015) has 4GB of main memory DRAM. It can cache up to 3MB of instructions and data in SRAM within the processor chip. What fraction of main memory contents can be cached? Does that mean that caches are not very useful?

ENCM 369 W15 Section 01 Slide Set 8 slide 39/42 Locality of Reference This is a property computer programs have, to varying degrees. (It doesn t make sense to say that a processor or memory system has good or bad locality of reference.) Locality of reference is what makes caches valuable.

ENCM 369 W15 Section 01 Slide Set 8 slide 40/42 Temporal locality of reference Temporal means related to time. If a program accesses a memory word, the program is likely to access the same word in the near future. What are some examples of temporal locality?

ENCM 369 W15 Section 01 Slide Set 8 slide 41/42 Spatial locality of reference In this context, spatial means related to the address space of a computer, not related to physical space. If a program accesses a memory word, the program is likely to access other words with nearby addresses in the near future. What are some examples of spatial locality?

ENCM 369 W15 Section 01 Slide Set 8 slide 42/42 Locality of Reference and Cache Hit Rates Programs with high degrees of locality of reference tend to have high cache hit rates 95 to 99% of memory accesses are cache hits. So caches can enable many programs to have performance close to performance of a hypothetical computer in which all memory accesses take 1 2 clock cycles. What about programs with bad locality of reference?