COMPUTERS ORGANIZATION. 2ND YEAR, COMPUTER SCIENCE MANAGEMENT ENGINEERING. JOSÉ GARCÍA RODRÍGUEZ, JOSÉ ANTONIO SERRA PÉREZ




COMPUTERS ORGANIZATION. 2ND YEAR, COMPUTER SCIENCE MANAGEMENT ENGINEERING. UNIT 4 - MEMORY. JOSÉ GARCÍA RODRÍGUEZ, JOSÉ ANTONIO SERRA PÉREZ

Memory
Contents:
- Introduction
- Definitions and concepts
- Semiconductor main memory
- Cache memory
- Associative memory
- Shared memory

Introduction
The memory contains the programs that are executed in the computer and the data they work with. Conceptually the memory is a simple element; however, it comes in a great variety of types, technologies, structures, performance levels and costs. Every computer has a hierarchy of memory elements, some located internally and others externally.

Definitions and concepts. Requirements of the memory
A memory system must have the following elements:
- Medium or support. An element in which the different states that encode the information are stored.
- Transducer. An element that converts one form of energy into another, that is, it transforms physical magnitudes into electrical ones (sensor) or electrical magnitudes into physical ones (actuator). Static and dynamic.
- Addressing mechanism. It must provide a way to read or write information at the desired place and time.

Definitions and concepts. Characteristics of memory: Location
Depending on where the memory is physically located, there are three types:
- Internal to the processor. High-speed memory used for temporary storage.
- Internal (main memory).
- External (secondary memory).

Definitions and concepts. Characteristics of memory: Capacity
Amount of information that the memory system is able to store. Capacity is measured in multiples of the bit:
- 1 bit
- 1 nibble = 4 bits
- 1 byte = 1 octet = 8 bits
- 1 Kb = 1024 bits = 2^10 bits
- 1 Mb = 1024 Kb = 2^20 bits
- 1 Gb = 1024 Mb = 2^30 bits
- 1 Tb = 1024 Gb = 2^40 bits

Definitions and concepts. Characteristics of memory: Unit of transfer
It is equal to the number of input and output data lines of the memory module. Associated concepts:
- Word. The size of a word is generally equal to the number of bits used to represent an integer and to the instruction length.
- Addressable unit. The smallest unit that can be addressed in the memory.
- Transfer unit. For main memory, the number of bits that can be read or written at the same time.

Definitions and concepts. Characteristics of memory: Access method
The way in which information is located in memory. Types:
- SAM: Sequential Access Memory.
- DAM: Direct Access Memory.
- RAM: Random Access Memory.
- CAM: Content Addressable Memory.

Definitions and concepts. Characteristics of memory: Speed
Three parameters are used to measure performance:
- Access time (T_A). For RAM: the time from the instant an address is presented to the memory until the data has either been stored or is available for use. For other memories: the time needed to position the read/write mechanism at the desired location.

Definitions and concepts. Characteristics of memory: Speed
- Memory cycle time (T_C). The time from one read/write request until the next read/write request can be issued.
[Figure: timeline (time axis) marking the read request, the instant the information becomes available (T_A after the request) and the next request (T_C after the first request).]

Definitions and concepts. Characteristics of memory: Speed
- Transfer speed (V_T). The speed at which data can be transferred to or from a memory unit.
In the case of random access: $V_T = 1 / T_C$
In the case of non-random access: $T_N = T_A + N / V_T$
where
T_N = average time to read or write N bits
T_A = access time
N = number of bits
V_T = transfer speed (bits/second)
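The two transfer-speed relations on this slide can be checked numerically. The following is a minimal Python sketch; the function names and the sample figures are illustrative assumptions, not values taken from the course.

```python
import math

def transfer_rate_random(cycle_time_s: float) -> float:
    """Random access: V_T = 1 / T_C (transfers per second)."""
    return 1.0 / cycle_time_s

def time_for_n_bits(access_time_s: float, n_bits: int, rate_bps: float) -> float:
    """Non-random access: T_N = T_A + N / V_T."""
    return access_time_s + n_bits / rate_bps

# Hypothetical figures, chosen only to exercise the formulas.
rate = transfer_rate_random(100e-9)        # T_C = 100 ns
t_n = time_for_n_bits(0.5, 1000, 2000.0)   # T_A = 0.5 s, N = 1000 bits, V_T = 2000 bit/s
print(rate, t_n)
```

With these made-up numbers the access time and the transfer term N / V_T contribute equally to T_N; for a device with a long access time, T_A dominates unless N is large.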

Definitions and concepts. Characteristics of memory: Physical device
Memory systems in computers use different physical devices. Commonly used types are:
- Main memory uses semiconductor devices.
- Secondary memory uses:
  - Magnetic devices: disks, tapes, etc.
  - Optical devices.
  - Magneto-optical devices.

Definitions and concepts. Characteristics of memory: Physical characteristics
The main physical characteristics to consider when working with different types of memory are:
- Changeability. Whether the content of a memory can be altered. ROM (read-only memory) and RWM (read/write memory).

Definitions and concepts. Characteristics of memory: Physical characteristics
- Permanence of the information. Related to the durability of the information stored in memory:
  - Destructive reading. Memories are DRO (Destructive ReadOut) or NDRO (Non-Destructive ReadOut).
  - Volatility. Whether the information stored in a device is destroyed when power is lost. Memory can be volatile or non-volatile.
  - Static/dynamic storage. A memory is static if the stored information does not change over time. A memory is dynamic if the stored information is lost over time; to avoid losing data, the information must be recharged (refreshed). SRAM (Static RAM) and DRAM (Dynamic RAM).

Definitions and concepts. Characteristics of memory: Organization
Organization refers to the physical arrangement of bits to form words. It depends on the type of memory. For semiconductor memory there are three types of organization:
- 2D organization
- 2½D organization
- 3D organization

Definitions and concepts. Characteristics of memory: 2D organization
RAM with 2^m words of n bits each. The cell matrix has 2^m rows and n columns.
- It is used in memories of small capacity.
- Very fast access.
[Figure: 2D organization. An m-bit address register feeds a decoder whose 2^m outputs each select one word (row) of the cell matrix. An n-bit write-data register drives the columns and an n-bit read-data register collects them. Control signals: CS (chip select), W (write), R (read).]

Definitions and concepts. Characteristics of memory: 2½D organization
It uses 2 decoders, each with m/2 inputs and 2^(m/2) outputs.
- It requires fewer logic gates.
[Figure: 2½D organization. The m-bit address register is split into two halves of m/2 bits, X and Y. Decoder X and decoder Y each select one of 2^(m/2) lines, and the addressed cell lies at their intersection. Write-data and read-data registers of n bits; control signals CS (chip select), W (write), R (read).]
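To see why splitting the address between two decoders saves logic, one can compare the number of decoder output lines in each organization. The sketch below uses output lines as a rough proxy for gate count, which is an assumption made for illustration; the function names are hypothetical.

```python
def decoder_outputs_2d(m: int) -> int:
    """2D: a single decoder with m inputs drives 2**m word lines."""
    return 2 ** m

def decoder_outputs_2_5d(m: int) -> int:
    """2.5D: two decoders with m/2 inputs each drive 2**(m/2) lines apiece."""
    if m % 2:
        raise ValueError("this sketch assumes an even number of address bits")
    return 2 * 2 ** (m // 2)

# Example with a 16-bit address (an arbitrary choice).
print(decoder_outputs_2d(16), decoder_outputs_2_5d(16))
```

For 16 address bits the single decoder needs 65536 outputs, while the two half-size decoders need 512 between them.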

Definitions and concepts. Characteristics of memory: 3D organization
Similar to the 2½D organization, but the n-bit word is stored across n planes, and within each plane positions x and y are selected.
- More complicated design.
- Slower access.
[Figure: 3D organization. The m-bit address register is split into X and Y halves of m/2 bits that feed decoder X and decoder Y. The two decoders select the same (x, y) position in each of the n planes. Write-data and read-data registers of n bits, one bit per plane.]

Definitions and concepts. Memory hierarchy
Fundamental parameters that characterize the types of memory in a computer:
- Cost.
- Speed. The memory should not cause wait states in the processor.
- Capacity.
The ideal memory would be fast, large and cheap. Since no single type of memory meets all three goals, the solution is to use several different types of memory, that is, a memory hierarchy.

Definitions and concepts. Memory hierarchy
[Figure: memory hierarchy. Levels from top to bottom: CPU registers, cache memory, main memory, disk cache, disk drive, tape (streamer), grouped into internal memory and external memory. Cost, speed and number of accesses increase toward the top; capacity increases toward the bottom.]
As we move down toward the lower levels of the hierarchy:
- The cost per unit of information (bit) decreases.
- Capacity increases.
- Access time increases.
- The frequency of accesses to the memory by the CPU decreases.
The decreasing frequency of access relies on the principle of locality of reference.

Semiconductor main memory. Types of memory
Depending on the type of operation that a memory allows, we can distinguish the following types:
Read-only memory
- ROM (Read Only Memory). Typically used for microprogramming, frequently used library routines, etc. Manufacturers use it when producing components in large volumes.
- PROM (Programmable Read Only Memory). It is written electrically, either by the supplier or by the customer, after the original chip has been manufactured, unlike ROM, which is recorded during manufacture. PROM can be written only once and is more expensive than ROM.

Semiconductor main memory. Types of memory
Read-mostly memory
- EPROM (Erasable Programmable Read Only Memory). It can be written electrically several times. All of its data can be erased by exposure to ultraviolet light. It is more expensive than PROM.
- EEPROM (Electrically Erasable Programmable Read Only Memory). Data can be erased electrically and selectively at byte level. It is more expensive than EPROM.
- Flash memory. So named because of the speed with which it can be reprogrammed. It uses selective electrical erasure at the level of blocks of bytes. It is cheaper than EEPROM.

Semiconductor main memory. Types of memory
Read/write memory
- RAM (Random Access Memory). Like the previous types, it is a random-access memory. The main RAM types are:
  - Dynamic RAM (DRAM). Data is stored as charge on a capacitor. Because the charge tends to leak away, it must be refreshed periodically. It is simpler and cheaper than SRAM.
  - Static RAM (SRAM). Data is stored in bistable circuits (flip-flops), so no refresh is needed. It is faster and more expensive than DRAM.

Semiconductor main memory. Types of memory: Summary chart

Type     Class         Erasure                    Writing        Volatility
RAM      Read/write    Electrically, by byte      Electrically   Volatile
ROM      Read-only     Not possible               Masks          Non-volatile
PROM     Read-only     Not possible               Electrically   Non-volatile
EPROM    Read-mostly   UV light, whole chip       Electrically   Non-volatile
Flash    Read-mostly   Electrically, by block     Electrically   Non-volatile
EEPROM   Read-mostly   Electrically, by byte      Electrically   Non-volatile

Semiconductor main memory. Design: Memory chip
A memory chip is organized internally as a matrix of n x m cells, where n is the number of words that can be stored in the chip and m is the number of bits per word.
[Figure: a 4Kx8 memory chip drawn as a table of n = 4096 words (addresses 0 to 4095), each holding m = 8 bits (bit 7 to bit 0); the first row is the content of word 0 of the memory.]

Semiconductor main memory. Design
A memory chip is connected through its pins:
- n pins for the address bus, so that 2^n words can be addressed.
- m pins for the data bus, meaning that m bits are transferred in each access.
- R/W (Read/Write). This pin indicates the type of operation: read or write. Some chips have one pin for writing, WE (Write Enable), and another for reading, OE (Output Enable).
- CS (Chip Select) or CE (Chip Enable). Selects the memory chip to be accessed.
- V_CC. Power.
- V_SS. Ground.
[Figure: memory chip with an n-bit address bus, an m-bit data bus, CS, R/W, V_CC and V_SS.]

Semiconductor main memory. Design
For the memory to operate correctly, additional circuitry is needed, such as decoders, multiplexers, buffers, etc.
[Figure: internal block diagram of a memory chip with a 256x32x8 memory matrix. Address lines A12..A5 feed the row decoder and A4..A0 feed the column decoder. The control logic receives CE, WE and OE. An input buffer and output buffers connect the matrix to data lines D7..D0.]

Semiconductor main memory. Design: Memory map
The memory map is the address space that a computer is able to address.
Example for a computer with a 16-bit address bus (64K addresses):

Decimal address   Hexadecimal address
0                 0000
1                 ...
...               ...
2^16 - 1          FFFF

Semiconductor main memory. Design
The memory map is physically implemented using one or several memory chips. Chips are available on the market in different configurations: zKx1, zKx4, zKx8, zKx16, zKx32, zMx1, zMx4, zMx8, zMx16, zMx32, etc., where z is a power of 2. For instance, a 1Kx8 chip can store 1024 words of 8 bits each.

Semiconductor main memory. Design
Example 1: We want to design a main memory of 128 Kwords.
(1) How many 32Kx8 chips do we need, assuming an 8-bit word?
(2) How many 64Kx4 chips do we need, assuming an 8-bit word?

Semiconductor main memory. Design
Solution 1:
(1) We need to address 128K words using chips of 32K words, so we need 4 chips. As the word length equals the width of each chip location (8 bits), no further chips are needed.
(2) To address 128K words with chips of 64K words we need 2 chips. These two chips give a 128Kx4 memory, so we need another 2 chips to reach the desired 128Kx8: 4 chips in total.

Semiconductor main memory. Design
If we want to design a memory with n-bit words and we have chips with t-bit words, we need n/t chips arranged in parallel to reach the desired word width.
Example: suppose we want to design a 64 Kbyte memory (n = 8) and we have 64Kx4 chips (t = 4). We then need 2 chips (8/4). The arrangement has 1 row and 2 columns of chips.
[Figure: two 64Kx4 chips in parallel. Both receive the 16-bit address bus, CS and WE; each supplies 4 bits of the 8-bit data bus.]

Semiconductor main memory. Design
If we want a capacity of cK words and we have chips with a capacity of zK words, we need c/z chips to reach the desired capacity.
Example: we want to design a 64 Kbyte memory and we have 32Kx8 chips, so we need 2 chips. When line A15 is 1 it enables the upper chip; when it is 0 it enables the lower chip. The arrangement has 2 rows and 1 column of chips.
[Figure: two 32Kx8 chips. The 15 low-order address lines and WE go to both chips; A15 and its complement drive the two CS inputs; both chips share the 8-bit data bus.]
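The two rules (n/t chips per row for word width, c/z rows for capacity) can be combined into one small calculation. The following Python sketch is illustrative only; `chips_needed` is a hypothetical helper name, and it assumes the sizes divide exactly.

```python
K = 1024

def chips_needed(words: int, word_bits: int, chip_words: int, chip_bits: int):
    """Return (rows, columns, total chips) for a memory built from identical chips.

    rows    = words / chip_words     (capacity rule, c/z)
    columns = word_bits / chip_bits  (word-width rule, n/t)
    """
    if words % chip_words or word_bits % chip_bits:
        raise ValueError("this sketch assumes the sizes divide exactly")
    rows = words // chip_words
    columns = word_bits // chip_bits
    return rows, columns, rows * columns

print(chips_needed(128 * K, 8, 32 * K, 8))  # Example 1 (1)
print(chips_needed(128 * K, 8, 64 * K, 4))  # Example 1 (2)
print(chips_needed(64 * K, 8, 64 * K, 4))   # 64 Kbytes from 64Kx4 chips
print(chips_needed(64 * K, 8, 32 * K, 8))   # 64 Kbytes from 32Kx8 chips
```

Both parts of Example 1 end up needing 4 chips, but arranged differently: 4 rows of 1 chip in the first case and 2 rows of 2 chips in the second.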

Semiconductor main memory. Design
Example 2: Obtain the memory map and the connection diagram of a 16-bit computer that can address 1 Mword and has 128 Kwords installed, built from 64Kx1 chips.
(1) Obtain the number of bits of the address bus.
(2) Work out the number of bits needed to address the chip we are going to use.
(3) Calculate the number of chips we need.
(4) Obtain the number of address bus bits used to select the chips.
(5) Draw the connection diagram with the selection logic.

Semiconductor main memory. Design
Solution 2:
(1) Obtain the number of bits of the address bus. Since the computer can address 1 Mword, the address bus has 20 bits (1M = 2^20).
(2) Work out the number of bits needed to address the chip we are going to use. As it is a 64K chip, we need 16 bits (64K = 2^16). The bits used to address the chip are the least significant ones, in this case A15 A14 ... A1 A0.

Semiconductor main memory. Design
Solution 2:
(3) Calculate the number of chips we need. As we want 128Kx16, we need 16 chips to obtain a complete 16-bit word. These first 16 chips give 64Kx16, so we need another 64Kx16, that is, 16 more chips. In conclusion, we need 32 chips of 64Kx1 to store 128Kx16.

Semiconductor main memory. Design
Solution 2:
(4) Obtain the number of address bus bits used to select the chips. As we have 2 rows of 16 chips each, we need 1 bit to distinguish one row from the other, so we use bit A16 to select the chips. The remaining address bits are left for future expansion of the computer.

Semiconductor main memory. Design
Solution 2:
(4) Obtain the number of address bus bits used to select the chips.
Address format (20 bits): A19 A18 A17 A16 = chip selection; A15 A14 ... A1 A0 = internal addressing.

Memory map:
Decimal address      Hexadecimal address   Contents
0 to 2^16 - 1        00000 to 0FFFF        Row 0: 16 chips of 64Kx1
2^16 to 2^17 - 1     10000 to 1FFFF        Row 1: 16 chips of 64Kx1
... to 2^20 - 1      ... to FFFFF

Address bits:
A19 A18 A17 A16   A15 ... A0
 0   0   0   0    0000 0000 0000 0000    Row 0, first word
 0   0   0   0    0000 0000 0000 0001
 ...
 0   0   0   0    1111 1111 1111 1111    Row 0, last word
 0   0   0   1    0000 0000 0000 0000    Row 1, first word
 0   0   0   1    0000 0000 0000 0001
 ...
 0   0   0   1    1111 1111 1111 1111    Row 1, last word
 ...
 1   1   1   1    1111 1111 1111 1111    Last address of the map

Semiconductor main memory. Design
Solution 2:
(5) Draw the connection diagram with the selection logic.
[Figure: the CPU drives the address bus A19..A0, WE and the data bus D15..D0. Lines A15..A0 and WE go to every chip. A16 drives the CS inputs so that it selects row 0 or row 1. Each row holds 16 chips of 64Kx1, chip 0 to chip 15; chip i of each row is connected to data line D_i.]
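The five steps of Solution 2 can be reproduced with a few lines of arithmetic. This is a minimal Python sketch of the calculation; the constant and function names (`log2_exact`, `decode`) are hypothetical, and `decode` assumes the row is selected by the address bits directly above the chip's internal address, as in the solution.

```python
K = 1024
ADDRESSABLE_WORDS = 1024 * K             # the computer can address 1 Mword
INSTALLED_WORDS, WORD_BITS = 128 * K, 16
CHIP_WORDS, CHIP_BITS = 64 * K, 1        # 64Kx1 chips

def log2_exact(n: int) -> int:
    """Base-2 logarithm of a power of two."""
    if n <= 0 or n & (n - 1):
        raise ValueError("expected a power of two")
    return n.bit_length() - 1

address_bits = log2_exact(ADDRESSABLE_WORDS)   # step 1
chip_address_bits = log2_exact(CHIP_WORDS)     # step 2: A15..A0
rows = INSTALLED_WORDS // CHIP_WORDS           # step 3
chips_per_row = WORD_BITS // CHIP_BITS
total_chips = rows * chips_per_row
row_select_bits = log2_exact(rows)             # step 4: A16

def decode(address: int):
    """Split an address into (row selected by A16, offset inside the chips)."""
    if not 0 <= address < INSTALLED_WORDS:
        raise ValueError("address outside the installed memory")
    return address >> chip_address_bits, address & (CHIP_WORDS - 1)

print(address_bits, chip_address_bits, total_chips, row_select_bits)
print(decode(0x0FFFF), decode(0x10000))
```

Address 0x0FFFF is the last word of row 0 and 0x10000 the first word of row 1, which matches the boundaries of the memory map.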

Cache memory. Concept
The CPU is faster than main memory. Ideally, main memory would use the same technology as the CPU registers, but because of its high cost the tendency is to look for intermediate solutions.
One solution is to take advantage of the locality principle and place a very fast memory between the CPU and main memory, so that the CPU accesses this memory more often than it accesses main memory.
This very fast memory has to be small so that its cost is not excessive. It is called cache memory.

Cache memory. Concept
The cache works by transferring blocks between main memory and the cache, and words between the cache and the CPU.
[Figure: CPU, cache memory and main memory in a chain. Words are transferred between the CPU and the cache; blocks are transferred between the cache and main memory.]

Cache memory. Concept
A main memory of 2^n words is organized in M blocks of fixed length (K words per block), where M = 2^n / K.
The cache is divided into C lines, or partitions, of K words each, with C much smaller than M (C << M).
[Figure: main memory with addresses 0 to 2^n - 1 grouped into blocks 0 to M-1 of K words each; cache memory with lines 0 to C-1, each holding a tag and one block.]

Cache memory. Concept: Hit rate
When the CPU looks for a word and finds it in the cache, we call it a hit. If the word is not in the cache, it is a miss.
The hit rate is the ratio between the number of hits and the total number of memory references (hits + misses):
hit rate = number of hits / number of references
Well-designed systems usually achieve a hit rate of 0.9.
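As a quick numerical illustration of the ratio defined above, here is a minimal Python sketch. The function name `hit_rate` and the counts are made up for the example.

```python
import math

def hit_rate(hits: int, misses: int) -> float:
    """Hit rate = hits / (hits + misses)."""
    references = hits + misses
    if references == 0:
        raise ValueError("no memory references recorded")
    return hits / references

# Hypothetical counts: 900 hits out of 1000 references.
print(hit_rate(900, 100))
```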

Cache memory. Design parameters: Cache size
Cache size involves a trade-off:
- It should be small enough that the average cost per bit of the computer's internal memory is close to that of main memory alone.
- It should be large enough that the overall average access time is as close as possible to that of the cache alone.
Empirical studies suggest a cache size between 1 Kb and 1 Mb.

Cache memory. Design parameters: Block size
As the block size increases from very small values, the hit rate initially increases. Beyond a certain block size, the hit rate begins to decrease. Two effects come into play:
- The larger the blocks, the fewer blocks fit in the cache and the more often the block replacement algorithm has to run.
- As the block size increases, each additional word in the block is farther from the word requested by the CPU, and is therefore less likely to be needed in the short term.
A block size of 4 to 8 addressable units is suggested.

Cache memory. Design parameters: Cache levels
- Internal cache (level 1). It is physically located on the same chip as the processor. Accesses to this cache are very fast. Its capacity is quite small.
- External cache (level 2). It is physically located outside the processor, so it is slower than the level 1 cache but still faster than main memory. Being outside the processor, its capacity can be larger than that of the level 1 cache.

Cache memory. Design parameters: Cache content
A single cache containing both data and instructions (a unified cache) has the following advantages:
- It has a higher hit rate because it balances the load automatically: if an execution pattern involves more instruction fetches than data accesses, the cache tends to fill with instructions, and vice versa.
- Only one cache has to be designed and implemented, so the cost is lower.
Nowadays, parallel execution of instructions and pipelined designs are used, and there the use of two caches (one for instructions and one for data) gives better performance.
- Advantage: it eliminates contention between the instruction processor and the execution unit.

Cache memory. Design parameters: Write policy
Before a block held in the cache can be replaced, it is necessary to know whether it has been modified or not (clean block versus modified, or dirty, block).
Write through
- Every write operation is performed in both the cache and main memory, which ensures that the contents of main memory are always valid.
- The main disadvantage is that it generates a lot of traffic to main memory and can create a bottleneck.

Cache memory. Design parameters: Write policy
Write back
- Writes are performed only in the cache.
- Each cache line has an associated modification bit. When a write is performed on a cache line, its modification bit is set to 1.
- When a cache line has to be replaced and its modification bit is 1, the line is written back to main memory; if the bit is 0, no write is made.
- Disadvantage: I/O modules are forced to access main memory through the cache. This complicates the circuitry and creates a bottleneck.
- Advantage: it uses less main memory bandwidth, which makes it suitable for multiprocessors. It raises the problem of data coherence.
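The difference in main memory traffic between the two write policies can be illustrated by counting writes. The Python sketch below is a deliberately simplified model, not a cache simulator: the function name is hypothetical, and for write back it assumes every line stays in the cache during the sequence and is replaced exactly once at the end.

```python
def main_memory_writes(line_writes, policy):
    """Count main-memory writes caused by a sequence of CPU writes.

    line_writes: cache line numbers written by the CPU, in order.
    policy: "write-through" or "write-back".
    """
    if policy == "write-through":
        # Every write goes to both the cache and main memory.
        return len(line_writes)
    if policy == "write-back":
        dirty = set()
        for line in line_writes:
            dirty.add(line)      # modification bit of this line set to 1
        # On replacement, each dirty line is written back once.
        return len(dirty)
    raise ValueError("unknown policy")

writes = [0, 0, 0, 1, 1, 2]      # made-up sequence of written lines
print(main_memory_writes(writes, "write-through"))
print(main_memory_writes(writes, "write-back"))
```

Under these assumptions, six CPU writes cost six main memory writes with write through but only three with write back, because repeated writes to the same line are absorbed by the cache.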

Cache memory. Design parameters: Mapping function
Because there are fewer cache lines than main memory blocks, an algorithm (function) is needed to map main memory blocks to cache lines. For a cache with 8 lines and main memory block 12:
- Direct mapping. Block 12 can be stored only in line 4 (= 12 mod 8).
- Associative mapping. The block can be stored in any line.
- Set-associative mapping. The block can be stored in any line of set 0 (= 12 mod (8/2)).
[Figure: cache lines 0 to 7 shown for the direct, associative and set-associative cases. In the set-associative case the lines are grouped in pairs: set 0 = lines 0 and 1, set 1 = lines 2 and 3, set 2 = lines 4 and 5, set 3 = lines 6 and 7.]
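The three placements for block 12 in an 8-line cache can be computed with a single function if direct mapping is treated as 1 line per set and associative mapping as a single set holding all lines. This is a minimal Python sketch; `candidate_lines` is a hypothetical name, and it assumes that the lines of a set are consecutive, as in the figure.

```python
def candidate_lines(block: int, lines: int, ways: int):
    """Return the cache lines in which a main memory block may be stored.

    ways = 1      -> direct mapping
    ways = lines  -> associative mapping
    otherwise     -> set-associative mapping with `ways` lines per set
    """
    sets = lines // ways
    set_index = block % sets
    first = set_index * ways
    return list(range(first, first + ways))

print(candidate_lines(12, 8, 1))   # direct
print(candidate_lines(12, 8, 8))   # associative
print(candidate_lines(12, 8, 2))   # set-associative, 2 lines per set
```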

Cache memory. Design parameters: Mapping function
Next we describe the three techniques listed above. In each case we look at the general structure and a concrete example. For all three cases we work with a system with the following characteristics:
- The cache size is 4 Kbytes.
- Data is transferred between main memory and the cache in blocks of 16 bytes (K). The cache is therefore organized in 256 lines (4096/16) (C).
- Main memory is 64 Kbytes, so the address bus has 16 bits (2^16 = 64K). Main memory is therefore made up of 4096 blocks (M).

Cache memory. Design parameters: Direct mapping
Each block of main memory is mapped to exactly one cache line. The mapping function is:
cache line number = (main memory block number) mod (number of cache lines)

Cache memory. Design parameters: Direct mapping
Each main memory address is divided into three fields:

|<------- s bits ------->|<- w bits ->|
|   Tag    |    Line     |    Word    |

where
w bits = identify each word inside a block
s bits = identify the block number

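Cache memory. Design parameters: Direct mapping
Using part of the address as a line number gives each main memory block a unique location in the cache.
[Figure: direct mapping. The (s+w)-bit address is split into tag (s-r bits), line (r bits) and word (w bits). The line field selects one of the cache lines 0 to 2^r - 1; the tag stored in that line is compared with the tag field of the address. If they match it is a hit and the word field selects the word within the line; otherwise it is a miss and block j (of blocks 0 to 2^s - 1) is fetched from main memory.]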

Cache memory. Design parameters: Direct mapping
Example. Address fields: tag 4 bits, line 8 bits, word 4 bits.

Main memory:
Address         Data               Block
0000 to 000F    12 34 56 ... 78    Block 0
F800 to F80F    AB 0F FF ... 54    Block 3968
FFF0 to FFFF    FF ... 21 33 55    Block 4095

Cache memory:
Tag (4 bits)   Data (16 bytes)   Line number (8 bits)
0              123456...78       0 (00)
...            ...               1 (01)
...            ...               ...
F              AB0FFF...54       128 (80)
...            ...               ...
F              FF...213355       255 (FF)
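The field split of this example (4-bit tag, 8-bit line, 4-bit word) and the tag comparison can be sketched in a few lines of Python. The names `split_direct`, `cache_tags` and `is_hit` are hypothetical; the dictionary holds only the three cache lines shown in the example.

```python
def split_direct(address: int):
    """Split a 16-bit address into (tag, line, word) with 4/8/4-bit fields."""
    word = address & 0xF
    line = (address >> 4) & 0xFF
    tag = address >> 12
    return tag, line, word

# Tags currently stored in the cache, indexed by line number (from the example).
cache_tags = {0x00: 0x0, 0x80: 0xF, 0xFF: 0xF}

def is_hit(address: int) -> bool:
    """Hit if the line selected by the address holds the address's tag."""
    tag, line, _ = split_direct(address)
    return cache_tags.get(line) == tag

print(split_direct(0xF800), is_hit(0xF80F), is_hit(0x7800))
```

Address 0x7800 selects the same line (128) as 0xF800 but carries tag 7, so it misses even though the line is occupied.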

Cache memory. Design parameters: Direct mapping
- The direct mapping technique is simple and inexpensive to implement.
- Disadvantage: each block has one fixed position in the cache, so two blocks that map to the same line cannot be in the cache at the same time.

Cache memory. Design parameters: Associative mapping
It allows a main memory block to be loaded into any cache line. The cache control logic interprets an address as a tag field and a word field:

|<- s bits ->|<- w bits ->|
|    Tag     |    Word    |

where
Word = identifies each word inside a main memory block
Tag = uniquely identifies a main memory block

Cache memory. Design parameters: Associative mapping
To determine whether a block is in the cache, the tags of all cache lines have to be examined simultaneously.
[Figure: associative mapping. The (s+w)-bit address is split into tag (s bits) and word (w bits). A simultaneous comparator checks the tag against the tags of all cache lines 0 to 2^r - 1. On a hit the word field selects the word within the matching line; on a miss block j (of blocks 0 to 2^s - 1) is fetched from main memory.]

Cache memory. Design parameters: Associative mapping
Example. Address fields: tag 12 bits, word 4 bits.

Main memory:
Address         Data               Block
0000 to 000F    12 34 56 ... 78    Block 0
F800 to F80F    AB 0F FF ... 54    Block 3968
FFF0 to FFFF    FF ... 21 33 55    Block 4095

Cache memory:
Tag (12 bits)   Data (16 bytes)   Line number
000             123456...78       0
...             ...               1
...             ...               ...
F80             AB0FFF...54       128
...             ...               ...
FFF             FF...213355       255
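In software, the simultaneous tag comparison can be imitated with a dictionary lookup keyed by tag. The Python sketch below uses the 12-bit tag and 4-bit word of this example; the names `split_associative`, `tag_to_line` and `lookup` are hypothetical, and the dictionary is only a stand-in for the parallel comparator hardware.

```python
def split_associative(address: int):
    """Split a 16-bit address into (tag, word) with 12/4-bit fields."""
    return address >> 4, address & 0xF

# Tags present in the cache and the line each one occupies (from the example).
tag_to_line = {0x000: 0, 0xF80: 128, 0xFFF: 255}

def lookup(address: int):
    """Return the line holding the addressed block, or None on a miss."""
    tag, _ = split_associative(address)
    return tag_to_line.get(tag)

print(split_associative(0xF80F), lookup(0xF80F), lookup(0x1234))
```

Because any block may sit in any line, the line number carries no address information here; the whole block number has to be kept as the tag.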

Cache memory. Design parameters: Associative mapping
- Advantage: very fast access.
- Disadvantage: quite complex circuitry is needed.

Cache memory. Design parameters: Set-associative mapping
This technique is a compromise that tries to combine the advantages of the two techniques described previously. The cache is divided into T sets of L lines each. The relations are:
C = T x L
cache set number = (main memory block number) mod (number of cache sets)
Block B_j can be placed in any of the lines of set i.

Cache memory. Design parameters: associative correspondence by sets

The cache control logic interprets an address as three fields:

  Label: s-d bits | Set: d bits | Word: w bits

where
  w least significant bits = word inside a block
  s bits = identify a main memory block
  d bits = specify one of the cache sets
  s-d bits = label stored with the lines of the set selected by the d bits

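Cache memory. Design parameters: associative correspondence by sets

To find out whether an address is in the cache, direct correspondence is applied first (the set field selects one set) and then associative correspondence (the label is compared simultaneously with the labels of the k lines of that set).

[Figure: associative correspondence by sets. The (s+w)-bit address is split into a label (s-d bits), a set (d bits) and a word (w bits). The set field selects one of the sets 0 to 2^d - 1, and a simultaneous comparator checks the label against the k lines of that set, signalling a hit or a miss. Main memory holds blocks 0 to 2^s - 1.]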
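The two-step check (select the set, then compare labels inside it) can be sketched as follows. The names `split_address`, `load` and `is_hit` are hypothetical, and the field widths (6-bit set, 4-bit word, 16-bit address) and the two lines per set are assumptions made for the illustration.

```python
SET_BITS, WORD_BITS = 6, 4   # assumed d = 6 and w = 4 in a 16-bit address
LINES_PER_SET = 2            # assumed k = 2


def split_address(address):
    """Return the (label, set_index, word) fields of an address."""
    word = address & ((1 << WORD_BITS) - 1)
    set_index = (address >> WORD_BITS) & ((1 << SET_BITS) - 1)
    label = address >> (SET_BITS + WORD_BITS)
    return label, set_index, word


def load(sets, address, way):
    """Store the label of the block containing `address` in line `way` of its set."""
    label, set_index, _ = split_address(address)
    sets[set_index][way] = label


def is_hit(sets, address):
    """Step 1: the set field picks one set. Step 2: compare the label with its k lines."""
    label, set_index, _ = split_address(address)
    return any(stored == label for stored in sets[set_index])


sets = [[None] * LINES_PER_SET for _ in range(1 << SET_BITS)]
load(sets, 0x0000, 0)   # block 0 goes to set 0
load(sets, 0x0400, 1)   # block 64 also maps to set 0 and uses the other line
print(is_hit(sets, 0x040F), is_hit(sets, 0x0005), is_hit(sets, 0x0800))
```

Only the k labels of the selected set are compared, so far fewer comparators are needed than in the fully associative case.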

Cache memory. Design parameters: associative correspondence by sets

Example: a 16-bit address is split into three fields.

  Label: 6 bits | Set: 6 bits | Word: 4 bits

Main memory
  Address        Data             Block
  0000 ... 000F  12 34 56 ... 78  Block 0
  0400 ... 040F  AB 0F FF ... 54  Block 64
  FFF0 ... FFFF  FF ... 21 33 55  Block 4095

Cache memory (two lines per set)
  Set no.  Label (6 bits)  Data (16 bytes)  Label (6 bits)  Data (16 bytes)
  0        00              123456...78      01              AB0FFF...54
  ...      ...             ...              ...             ...
  63       ...             ...              3F              FF...213355

Cache memory. Design parameters: replacement algorithms

When a new block is transferred to the cache, it has to replace one of the existing blocks if the lines where it can be placed are occupied.
With direct correspondence these algorithms make no sense, because each block has only one possible line.
Among the existing algorithms, the most notable are:
  LRU (least recently used)
  FIFO (first in, first out)
  LFU (least frequently used)
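As an illustration of one of these algorithms, the following is a minimal sketch of LRU replacement for a single cache set. The class name `LRUSet` and its interface are hypothetical, and real caches approximate this behaviour with a few status bits per line rather than an ordered structure.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set with a fixed number of lines and LRU replacement."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = OrderedDict()  # label -> block data, least recently used first

    def access(self, label):
        """Return True on a hit; on a miss, load the block, evicting the LRU one if full."""
        if label in self.lines:
            self.lines.move_to_end(label)    # mark as most recently used
            return True
        if len(self.lines) >= self.num_lines:
            self.lines.popitem(last=False)   # evict the least recently used block
        self.lines[label] = None             # block data omitted in this sketch
        return False


s = LRUSet(2)
hits = [s.access(label) for label in (1, 2, 1, 3, 2, 3)]
print(hits)
```

In this trace, block 2 is evicted when block 3 arrives, because block 1 was used more recently. FIFO would have evicted block 1 instead, since it was loaded first.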

Cache memory. Design parameters: cache examples

Intel processors have incorporated cache memory to raise their performance.

  Processor         Level 1                      Level 2
  Before the 80386  No                           No
  80386             No                           16K, 32K, 64K
  80486             8K data/instructions         64K, 128K, 256K
  Pentium           8K data and 8K instructions  256K, 512K, 1M

  Processor  Description
  80486      Internal cache: line size of 16 bytes; 4-way set-associative organization
             External cache: line size of 32, 64 or 128 bytes; 2-way set-associative organization
  Pentium    Internal cache: line size of 32 bytes; 2-way set-associative organization
             External cache: line size of 32, 64 or 128 bytes; 2-way set-associative organization

Associative memory. Concept

An associative memory is characterized by the fact that a position is accessed by specifying its content, or part of it, rather than its address.
Associative memory is also known as Content Addressable Memory (CAM).

Associative memory. Structure of a CAM

An associative memory consists of a set of registers and a matrix of cells, with its associated logic, organized in n words of m bits each.
The set of registers is formed by an argument register (A) of m bits, a mask register (K) of m bits and a mark register (M) of n bits.

Associative memory. Structure of a CAM

Each word of the memory is compared simultaneously with the content of the argument register, and the mark register bit associated with each word whose content matches the argument register is set to 1.
At the end of this process, the bits of the mark register that are 1 indicate which words of the associative memory match the argument register.

Associative memory. Structure of a CAM

The simultaneous comparison is made bit by bit. Bit j (j = 1, 2, ..., m) of the argument register is compared with all the bits of column j if Kj = 1. If all the compared bits of word i match, Mi = 1; otherwise Mi = 0.

Example:

  Argument register:  0     | Alicante | 0
  Mask register:      0     | 1        | 0

  Memory words                    Mark register
  Juan   Alicante  965254512      1
  Pepe   Elda      965383456      0
  Ana    Alicante  965907799      1
  Laura  Elche     965442233      0
  Pepe   Alicante  965223344      1
  Paco   Elche     966664455      0
  Paqui  Petrer    965375566      0
  Pepi   Alicante  965286677      1
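The masked comparison can be reproduced with a short Python sketch. It is a simplification: the function name `cam_search` is hypothetical, whole fields are compared instead of individual bits, and the loop stands in for a comparison that the hardware performs on all words at once.

```python
def cam_search(words, argument, mask):
    """Return the mark register: one bit per stored word.

    A field takes part in the comparison only where the mask is 1.
    """
    marks = []
    for word in words:
        match = all(w == a for w, a, k in zip(word, argument, mask) if k)
        marks.append(1 if match else 0)
    return marks


# Table from the example: (name, city, telephone).
words = [
    ("Juan", "Alicante", "965254512"),
    ("Pepe", "Elda", "965383456"),
    ("Ana", "Alicante", "965907799"),
    ("Laura", "Elche", "965442233"),
    ("Pepe", "Alicante", "965223344"),
    ("Paco", "Elche", "966664455"),
    ("Paqui", "Petrer", "965375566"),
    ("Pepi", "Alicante", "965286677"),
]
argument = (None, "Alicante", None)   # only the city field is meaningful
mask = (0, 1, 0)                      # compare the city field only

print(cam_search(words, argument, mask))
```

The result is 1 0 1 0 1 0 0 1, the same mark register as in the example.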

Associative memory. Structure of a CAM

As a rule, in most applications the associative memory stores a table that does not have, for a given mask, two equal rows.
Associative memory is used mainly in cache memory, so that the labels of all lines can be checked simultaneously.
TAG RAM is a clear example of associative memory used as part of the cache in Intel Pentium systems.

Shared memory. Concept

Shared memory responds to the need for different devices to have access to the same memory unit.

[Figure: elements 1 to n each have a request line and a wait line connected to an arbiter. The arbiter issues the memory request and controls a multiplexer that connects the selected element to the shared memory.]

The arbiter is the element in charge of granting access to the memory unit, at a given moment, to each one of the elements that request this resource.

Shared memory. Concept

The arbiter is designed so that it assigns, on average, a similar service time to all the units that request the resource. There are different strategies:
  Assignment of the lowest priority to the element just served.
  Rotation of priorities. In any given state, the next state is obtained by rotating the current order of priorities until the element just served has the lowest priority.
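The rotation of priorities can be sketched as follows. The class name `RotatingArbiter` and the representation of requests as a set of element numbers are assumptions made for the illustration, not part of any real arbiter design.

```python
class RotatingArbiter:
    """Arbiter that rotates priorities after each grant."""

    def __init__(self, num_elements):
        self.order = list(range(num_elements))  # highest priority first

    def grant(self, requests):
        """Grant access to the highest-priority requesting element, or return None."""
        for position, element in enumerate(self.order):
            if element in requests:
                # Rotate the order until the element just served is last (lowest priority).
                self.order = self.order[position + 1:] + self.order[:position + 1]
                return element
        return None


arbiter = RotatingArbiter(3)
print(arbiter.grant({0, 1, 2}))  # element 0 is served and drops to the lowest priority
print(arbiter.grant({0, 1, 2}))  # element 1 is served next
print(arbiter.grant({0, 1}))     # element 2 is not requesting, so element 0 is served
```

Because the served element always ends up last, no element that keeps requesting can be starved by the others.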

Shared memory. Dual port

It is a shared memory able to work with two elements at the same time.
It is based on duplicating the buses, decoders, etc.

[Figure: dual-port RAM. The left and right sides each have a data bus (I/O L, I/O R) with an I/O buffer, an address bus (A L, A R) with an address decoder, and a control bus (CE L, R/W L, OE L, SEM L on the left; CE R, R/W R, OE R, SEM R on the right) connected to the control and arbitration logic.]

Shared memory. Dual port

Dual-port memory has practically all its components duplicated (left port (LEFT) and right port (RIGHT)).

  Left port  Right port  Description
  I/O L      I/O R       Data bus
  A L        A R         Address bus
  CE L       CE R        Chip select
  R/W L      R/W R       Read/write
  OE L       OE R        Read enable
  SEM L      SEM R       Semaphore enable

VRAM is a clear example of dual-port memory. It can be accessed simultaneously by the monitor controller and the graphics card controller.