Computer Architecture

Computer Architecture Slide Sets WS 2013/2014 Prof. Dr. Uwe Brinkschulte M.Sc. Benjamin Betting Part 11 Memory Management Computer Architecture Part 11 page 1 of 44 Prof. Dr. Uwe Brinkschulte, M.Sc. Benjamin Betting

Main Memory
The main memory of a processor is usually implemented as semiconductor memory in MOS technology. Bits are stored either statically, using flip-flops (SRAM, static random access memory), or dynamically, using capacitors in a so-called one-transistor cell (DRAM, dynamic random access memory). The memory is organized as a matrix, and random access is performed by the decoders.

Main Memory
The access and cycle times of SRAMs are shorter than those of DRAMs. However, SRAMs need considerably more chip area, as six transistors are needed to form a flip-flop. Because of these characteristics, DRAMs are about ten times slower, but also about ten times cheaper, than SRAMs.

Setup Principle of a RAM
[Figure: block diagram of a RAM. The address inputs A0 ... An-1 are split into a row address and a column address. A row (word) decoder and a column (bit) decoder with sense amplifiers select a memory cell in the memory matrix. Control logic (CE, WE, OE) and an I/O buffer connect the matrix to the data lines. Insets show an SRAM cell built from an RS flip-flop and a DRAM cell.]
Legend: CE: Chip Enable; WE: Write Enable; OE: Output Enable; I/O: Input/Output Data; A: Address; D: Data; UDD: Power supply; USS: Ground.

Setup of an SRAM
[Figure: memory matrix of an SRAM. A decoder selects word 0, word 1, ...; every bit (bit 0, bit 1, ...) of a word is stored in an RS flip-flop gated by AND gates, and the outputs of each bit column are combined by a wired OR.]
Legend: A: address; W: write; R: read; i: input; o: output.

General DRAM Principles
In a DRAM, the information (a bit) is stored in a capacitor. After a certain time, or when the cell is read out, the information is lost. This method of storage is therefore called dynamic, as opposed to the static method, where the bit is represented by the state of a flip-flop. Dynamic semiconductor memories require the information to be rewritten to the cell after reading it or after a certain time span (some milliseconds). This procedure is called refresh. Because a refresh is necessary, the access time and the cycle time differ noticeably.

General DRAM Principles
A chip has only a limited number of pins, so a reasonable goal is to save on address lines. This is more critical for DRAMs than for SRAMs, because the simple cell structure allows much larger memory sizes. Most DRAMs therefore multiplex the address and apply it successively in two parts. The two address parts are synchronized by the signals RAS (Row Address Strobe) and CAS (Column Address Strobe). The row access time and the column access time add up to the overall access time.
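The address multiplexing described above can be illustrated with a short Python sketch. It assumes that the upper address bits form the row part and the lower bits the column part; the function name `split_dram_address` and the 10-bit column width are hypothetical and not taken from the slides.

```python
def split_dram_address(address: int, column_bits: int) -> tuple[int, int]:
    """Split a cell address into the two parts applied with RAS and CAS.

    Assumption: the row part is taken from the upper bits and the
    column part from the lower bits of the address.
    """
    row = address >> column_bits                 # applied first, latched with RAS
    column = address & ((1 << column_bits) - 1)  # applied second, latched with CAS
    return row, column


# Hypothetical chip with 20 address bits sent over 10 shared address pins.
row, column = split_dram_address((819 << 10) | 341, column_bits=10)
print(row, column)  # prints "819 341"
```

With this split, a 20-bit address needs only 10 address pins plus the two strobe signals.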

Block Diagram of a DRAM
[Figure: block diagram of a DRAM. The multiplexed address is latched into the row address register on RAS (row address strobe) and into the column address register on CAS (column address strobe). The row address drives the word selection; the column address drives the bit selection and driver stage, which connects the sense amplifiers to the data lines under control of the read/write signal.]

Speeding up DRAM Access
The access time of a DRAM may be shortened by:
- The nibble mode: when the RAS signal is set, the next bits in the row are delivered as well.
- The page mode: when the RAS signal is set, the full row (page) is delivered.

DRAM Variants
The DRAM access characteristics can be improved by several techniques. Newer DRAM variants show much shorter access times than standard DRAMs.
EDO-RAM (Extended Data Out): EDO-RAM is dynamic memory supporting address pipelining. A row that has already been addressed is buffered and can be read using the page mode.

DRAM Variants
SDRAM (Synchronous DRAM) supports burst access to sequential RAM areas. The access time is approximately that of static RAMs. SDRAMs consist of several banks, each having the same bit width as the chip itself. All banks are given the same row address signal simultaneously, so a row (page) is spread over several banks. The same page can be accessed repeatedly without being opened again. If a subsequent access goes to a page that is not open, delays occur.

Structure of an SDRAM Chip
[Figure: structure of an SDRAM chip. A row address buffer with a refresh counter and a column address buffer with a column address counter address four banks (bank0 to bank3); an input buffer and an output buffer connect the banks to the data lines.]

DRAM Variants
RAMBUS (RDRAM): The core of a 64 MB chip consists of, e.g., 16 DRAM banks which can be accessed simultaneously. When a DRAM page miss occurs, other accesses may deliver their results instead. The bus clock is 400 MHz and runs at double data rate (DDR).

Virtual Memory
Modern microprocessor systems working on several applications need large amounts of main memory. A cheap method to enlarge the memory capacity is to integrate a mass memory (such as a hard disk). The main memory and the mass memory are organized so that together they appear as a main memory of nearly unlimited capacity. The available memory area is therefore called virtual memory, and the concept is called virtual memory management.

Virtual Memory
[Figure: the virtual memory (addressable memory) is accessed by virtual addresses. Part of it is mapped to the main memory (physical memory), which is accessed by physical addresses; the rest resides in the mass memory.]

Memory Management Unit (MMU)
A dedicated hardware unit in the processor, the memory management unit (MMU), translates the virtual addresses generated by the processor into physical addresses in the main memory at runtime. The table information needed for this is provided by the operating system. If the requested data is missing from the main memory, the MMU raises an event that tells the operating system to load (swap in) the missing data from the mass memory.
[Figure: the CPU sends a virtual address to the MMU, which sends a physical address to the main memory. The operating system provides the table information and loads missing data.]

Address Translation
To keep the memory management overhead low, the virtual memory is organized in blocks. The MMU's mapping information therefore refers to contiguous address areas instead of single addresses. A virtual address consists of a block number and an offset; address translation replaces the block number, while the offset within the block is kept.
If the size of the blocks is fixed, we talk about paging. If it is variable, depending on the application structure, we talk about segmentation.

Segmentation
[Figure: segments 1 to 4 of tasks 1 to 4 in the virtual address space. Some segments are swapped in to the physical address space, others are swapped out to the mass memory; parts of the physical address space are unused.]
- Variable-size segments usually belong to tasks.
- Segments reflect the logical program structure and can be rather large (MBytes).
- A task might consist of several segments (e.g. code segment, data segment, stack segment, heap segment).
- Segments are either completely swapped in or completely swapped out.

Segmentation Address Translation
[Figure: the virtual address (n bit) consists of a segment address (v bit) and an offset address (p bit). The segment address is added to the physical start address of the descriptor table (m bit) to locate the segment descriptor. A segment descriptor contains the segment type, the physical segment start address (m bit), the segment size, the access rights and a "segment swapped out" flag. The physical segment start address is added to the offset address to form the physical address (m bit). The segment descriptor table is maintained by the operating system in the main memory.]

An Example for Segmentation
The 32-bit virtual address consists of an 8-bit segment number (bits 31 to 24) and a 24-bit offset (bits 23 to 0). The segment table has up to 256 entries (0 to 255):

segment#   segment size (24 bits)   physical segment start address (32 bits)
0          7937                     10258
1          258                      10000
2          3843                     18195
...

The 32-bit physical address is the sum of the physical segment start address and the offset.

An Example for Segmentation
Mapping of three segments to the physical address space. In the virtual address space, each segment number spans 16M addresses:

virtual segment#   size         physical base address   physical end address (exclusive)
1                  258 Bytes    10000                   10258
0                  7937 Bytes   10258                   18195
2                  3843 Bytes   18195                   22038
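The translation in this example can be traced with a minimal Python sketch. The table values are the ones from the example; the function name `translate_segmented` and the choice of exceptions are illustrative assumptions.

```python
SEG_OFFSET_BITS = 24  # 32-bit virtual address = 8-bit segment# + 24-bit offset

# segment# -> (segment size in bytes, physical segment start address)
SEGMENT_TABLE = {0: (7937, 10258), 1: (258, 10000), 2: (3843, 18195)}


def translate_segmented(virtual_address: int) -> int:
    """Translate a virtual address using the example segment table."""
    segment = virtual_address >> SEG_OFFSET_BITS
    offset = virtual_address & ((1 << SEG_OFFSET_BITS) - 1)
    if segment not in SEGMENT_TABLE:
        raise LookupError(f"no descriptor for segment {segment}")
    size, start = SEGMENT_TABLE[segment]
    if offset >= size:  # the offset must lie inside the segment
        raise ValueError(f"offset {offset} exceeds segment size {size}")
    return start + offset  # physical address = start address + offset


print(translate_segmented((1 << 24) | 5))  # segment 1, offset 5 -> 10005
```

The last valid byte of segment 2 (offset 3842) maps to physical address 22037, one below the end of the occupied area at 22038.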

Segmentation: Discussion
Pros:
- Segmentation reflects the logical structure of the application.
- Changing information about a large contiguous memory area represented by a segment (such as its base address, length, access attributes, or status) needs little effort, because only one table entry (the segment descriptor) is affected.
- The tables are small, as the number of segments is usually small.

Segmentation: Discussion
Cons:
- Segments must be swapped in and out as a whole, even if only a part of them is needed in the main memory.
- Since segments are of variable size, a suitable free area in the main memory has to be found when a segment is swapped in.
- This leads to external fragmentation of the main memory into free and occupied chunks of different sizes. Managing the free areas (memory bubbles) therefore needs additional effort, the so-called garbage collection.

Paging
[Figure: the pages of tasks 1 to 3 in the logical address space are assigned to frames in the physical address space; some frames are unused.]
- A task is spread over many fixed-size pages.
- Pages are rather small (e.g. 0.5 kByte, 1 kByte, 2 kByte, 4 kByte).
- Pages are assigned to frames of the same size in the physical address space.
- Consecutive pages are not necessarily assigned to consecutive frames.
- A task might be only partially swapped in.

Paging Address Translation
[Figure: the logical address (n bit) consists of a page address (v bit) and an offset address (p bit). The page address is added to the physical start address of the page table (m bit) to locate the page table entry, which holds the frame number of the page (m-p bit). The frame number is concatenated (c) with the offset address to form the physical address (m bit). The page table resides in the main memory.]
Due to the small page size, the page table might be large.
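The concatenation step can be sketched in Python as follows. The 12-bit offset (4 kByte pages), the dictionary used as page table and the name `translate_paged` are assumptions made for illustration only.

```python
PAGE_OFFSET_BITS = 12  # assumed page size: 4 kByte


def translate_paged(logical_address: int, page_table: dict[int, int]) -> int:
    """Translate a logical address with a single-level page table.

    page_table maps page numbers to frame numbers; a missing entry
    stands for a page that is not in the main memory.
    """
    page = logical_address >> PAGE_OFFSET_BITS
    offset = logical_address & ((1 << PAGE_OFFSET_BITS) - 1)
    if page not in page_table:
        raise LookupError(f"page fault on page {page}")
    frame = page_table[page]
    # Concatenation: the frame number replaces the page number,
    # the offset is copied unchanged.
    return (frame << PAGE_OFFSET_BITS) | offset


print(hex(translate_paged(0x1ABC, {0: 5, 1: 2})))  # page 1 -> frame 2: 0x2abc
```

Unlike segmentation, no addition and no size check are needed here, because every frame starts at a multiple of the page size.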

Hierarchical Page Tables
[Figure: the logical address consists of a page directory address, a page address and an offset address. The page directory address selects an entry in the page directory, which points to a page table. The page address selects an entry in that page table, which is concatenated (c) with the offset address to form the physical address.]
- Avoids large page tables by splitting them.
- Not all page tables must be swapped in.
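A two-level lookup of this kind might look like the following Python sketch. The 10/10/12-bit split of a 32-bit address and all names are assumptions; the slides do not specify field widths.

```python
DIR_INDEX_BITS = 10    # assumed width of the page directory address
TABLE_INDEX_BITS = 10  # assumed width of the page address
HPT_OFFSET_BITS = 12   # assumed width of the offset address


def translate_two_level(logical_address: int,
                        page_directory: dict[int, dict[int, int]]) -> int:
    """Translate a logical address through a page directory and a page table."""
    offset = logical_address & ((1 << HPT_OFFSET_BITS) - 1)
    table_index = (logical_address >> HPT_OFFSET_BITS) & ((1 << TABLE_INDEX_BITS) - 1)
    dir_index = logical_address >> (HPT_OFFSET_BITS + TABLE_INDEX_BITS)

    page_table = page_directory.get(dir_index)
    if page_table is None:  # this page table is not swapped in
        raise LookupError(f"page table {dir_index} not present")
    frame = page_table.get(table_index)
    if frame is None:       # the page itself is not swapped in
        raise LookupError(f"page fault on page {table_index}")
    return (frame << HPT_OFFSET_BITS) | offset


directory = {1: {2: 7}}  # only one small page table exists
address = (1 << 22) | (2 << 12) | 0x34
print(hex(translate_two_level(address, directory)))  # 0x7034
```

Only the page tables that are actually referenced from the directory have to exist, which is where the memory saving comes from.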

Translation Lookaside Buffer (TLB)
[Figure: the page directory address and the page address of the logical address are looked up in the TLB. On a hit, the TLB delivers the entry directly; otherwise the page directory and the page table are consulted. The result is concatenated (c) with the offset address to form the physical address.]
The TLB speeds up address translation by caching the most recently referenced table entries.
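The caching idea behind the TLB can be sketched in Python. The class name `Tlb`, the flat page table and the policy of dropping the least recently used entry are assumptions for illustration; real TLBs are hardware structures with their own replacement policies.

```python
from collections import OrderedDict


class Tlb:
    """Small cache of page -> frame translations in front of a page table."""

    def __init__(self, capacity: int, page_table: dict[int, int]):
        self.capacity = capacity
        self.page_table = page_table
        self.entries: OrderedDict[int, int] = OrderedDict()  # most recent last
        self.hits = 0
        self.misses = 0

    def lookup(self, page: int) -> int:
        if page in self.entries:
            self.hits += 1
            self.entries.move_to_end(page)    # mark as most recently used
            return self.entries[page]
        self.misses += 1
        frame = self.page_table[page]         # slow path: table in main memory
        self.entries[page] = frame
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the least recently used entry
        return frame


tlb = Tlb(capacity=2, page_table={0: 5, 1: 6, 2: 7})
for page in [0, 1, 0, 2, 1]:
    tlb.lookup(page)
print(tlb.hits, tlb.misses)  # prints "1 4"
```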

Paging: Discussion
Pros:
- Pages can be stored non-consecutively, so that the available main memory can be used in an optimal way.
- The management of free memory areas is much simpler, as the pages/frames are all the same size. There is no external fragmentation, and mechanisms like garbage collection are not needed.
- It is easy to change the size of a task at run-time by adding or removing pages.
- Swapping is done more efficiently, as only the pages of a task that are actually needed have to be kept in the main memory.

Paging: Discussion
Cons:
- Changes of information concerning the task (e.g. access attributes) may have to be applied to many page descriptors.
- The translation tables are much larger than those of segmentation.
- The last page of a task is usually only partly filled (internal fragmentation).

Combining Segmentation and Paging
The logical address is first translated by segmentation into a linear address, which is then translated by paging into the physical address. This combines the advantages of both worlds and is used, for example, in the Pentium family.

Replacement Algorithms
When a page or segment fault occurs, the operating system must decide which page/segment should be removed from the main memory to free up space for the page/segment to be swapped in. If the page/segment to be removed was modified in the main memory, it must be written back to the mass memory to keep it up to date. If it was not modified, the new page/segment just overwrites it in the main memory. To keep track of the modification state of a page/segment, a status bit is used. This bit is called the modified bit or dirty bit. Replacement algorithms are needed at other layers of the memory hierarchy as well, e.g. between main memory and cache.

Replacement Algorithms
The system performance depends strongly on the strategy by which the pages or segments to be swapped out are selected. Several strategies are possible, e.g. random selection. However, it has proved preferable to swap out a page/segment which was seldom referenced in the past. A frequently referenced page/segment is more likely to be needed again soon after being swapped out; it would then have to be swapped in again, pushing another page or segment out. This is called the locality principle.

The Optimal Replacement Algorithm
The best possible replacement algorithm is easy to describe, yet impossible to implement: every page/segment residing in the main memory is marked with the number of memory accesses that will happen until it is referenced next. If a page/segment fault occurs, the optimal replacement algorithm just swaps out the page/segment with the highest mark, i.e. the one whose next reference lies furthest in the future. Obviously, this algorithm cannot be implemented, as the operating system has no way to calculate the references in advance. To do this, it would have to foresee the future.
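Although it cannot be used online, the optimal strategy is easy to simulate once the complete reference sequence is known. The following Python sketch counts the faults it produces; the function name `optimal_faults` and the reference string are illustrative assumptions.

```python
def optimal_faults(references: list[int], frame_count: int) -> int:
    """Count the faults of the optimal strategy for a known reference sequence."""
    frames: list[int] = []
    faults = 0
    for position, page in enumerate(references):
        if page in frames:
            continue
        faults += 1
        if len(frames) < frame_count:
            frames.append(page)
            continue
        future = references[position + 1:]

        def accesses_until_next_use(resident: int) -> float:
            # Pages that are never referenced again get the highest mark.
            return future.index(resident) if resident in future else float("inf")

        victim = max(frames, key=accesses_until_next_use)
        frames[frames.index(victim)] = page
    return faults


trace = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(optimal_faults(trace, frame_count=3))  # prints 9
```

The resulting fault count is a lower bound against which implementable algorithms can be compared on the same trace.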

The Optimal Replacement Algorithm
The optimal replacement algorithm has a practical meaning, however: an application can be run on a simulator. During its execution, all accesses are logged, so that afterwards all times of page/segment references are known. They are then used to measure and compare algorithms which can actually be implemented.

Referenced Bit and Modified Bit
Most page replacement algorithms keep track of which pages/segments were referenced and in which mode (read or write). To do this, two status bits R and M are assigned to every page/segment. R is set if a page/segment was referenced. M is set if a page/segment was modified and therefore must be written back to the mass memory if it is to be pushed out. As these bits are set on every access to the main memory, it is necessary to let the hardware do this. A bit stays set until it is reset explicitly by the software. Resetting the R bit introduces a temporal component to the algorithm: aging.

1. The Not-Recently-Used Replacement Algorithm (NRU)
NRU is a simple algorithm:
- When a page/segment is loaded to the main memory, R and M are set to 0.
- R and M are set according to the previously defined rules.
- Periodically, all R bits are reset.
- If a page fault occurs, the operating system classifies the pages/segments according to the table below. The page/segment to be swapped out is chosen randomly from the lowest nonempty class.

class     R (referenced)   M (modified)
class 0   0                0
class 1   0                1
class 2   1                0
class 3   1                1
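The classification step can be written down in a few lines of Python. The function name `nru_victim` and the dictionary layout are assumptions for illustration.

```python
import random


def nru_victim(pages: dict[str, tuple[int, int]], rng=random) -> str:
    """Pick a page to swap out; pages maps a page name to its (R, M) bits."""

    def nru_class(bits: tuple[int, int]) -> int:
        referenced, modified = bits
        return 2 * referenced + modified  # class 0..3 as in the table

    lowest = min(nru_class(bits) for bits in pages.values())
    candidates = [name for name, bits in pages.items() if nru_class(bits) == lowest]
    return rng.choice(candidates)  # random choice within the lowest nonempty class


print(nru_victim({"A": (1, 1), "B": (0, 1), "C": (1, 0)}))  # prints "B" (class 1)
```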

2. The First-In-First-Out Replacement Algorithm (FIFO)
The basic idea of the FIFO algorithm is to keep all pages/segments in a linked list. When a page/segment is loaded to the main memory, it is appended to the tail of this list. If a fault occurs, the page/segment at the head of the list is removed. However, the FIFO principle does not consider the frequency of references. In case of a fault, the oldest page/segment is always swapped out, regardless of whether another page/segment was rarely or even never referenced.
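A minimal Python sketch of FIFO replacement, using a `deque` in place of the linked list; the function name `fifo_faults` and the reference string are illustrative assumptions.

```python
from collections import deque


def fifo_faults(references: list[int], frame_count: int) -> int:
    """Count the faults of FIFO replacement for a reference sequence."""
    queue: deque[int] = deque()  # head (left) = oldest, tail (right) = youngest
    faults = 0
    for page in references:
        if page in queue:
            continue             # a hit does not change the order
        faults += 1
        if len(queue) == frame_count:
            queue.popleft()      # remove the oldest page at the head
        queue.append(page)       # append the new page at the tail
    return faults


print(fifo_faults([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5], frame_count=3))  # prints 9
```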

3. The Second-Chance Replacement Algorithm
The second-chance replacement algorithm enhances the FIFO algorithm. When a fault occurs, the R bit of the oldest page/segment is inspected. If it is set, it is reset and the page/segment is moved to the tail of the list. The page/segment is then treated as if newly loaded and therefore gets a second chance. A page/segment is swapped out only when it is at the head of the list and its R bit is 0.
[Figure: pages A to H with swap-in timestamps 0, 3, 7, 8, 12, 14, 15, 18, ordered from oldest to youngest. After its second chance, A is moved to the tail with timestamp 20 and is treated as if newly loaded.]
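The eviction loop can be sketched in Python as follows. The function name `second_chance_evict` and the dictionary holding the R bits are assumptions for illustration.

```python
from collections import deque


def second_chance_evict(queue: deque, referenced: dict) -> str:
    """Remove and return the page to swap out.

    queue holds the pages from oldest (left) to youngest (right);
    referenced maps each page to its R bit.
    """
    while True:
        page = queue.popleft()
        if referenced[page]:
            referenced[page] = 0  # reset R ...
            queue.append(page)    # ... and treat the page as newly loaded
        else:
            del referenced[page]
            return page


queue = deque("ABCDEFGH")
r_bits = {page: 0 for page in queue}
r_bits["A"] = 1                             # A was referenced since it was loaded
print(second_chance_evict(queue, r_bits))   # prints "B"
print("".join(queue))                       # prints "CDEFGHA"
```

The loop always terminates: after at most one full pass all R bits are 0, so the algorithm degenerates to plain FIFO.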

4. The Clock Replacement Algorithm
The maintenance cost of the second-chance algorithm is very high, as it frequently needs to insert and delete list elements. The clock algorithm is more efficient because it organizes the elements in a circular list. A pointer references the oldest element. If a fault occurs, the R bit of the referenced element is inspected. If it is 0, the element is swapped out; otherwise the bit is reset. In both cases, the pointer advances to the next position.
[Figure: elements A to L arranged in a circle like the numbers on a clock face, with the pointer as the clock hand.]
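A possible Python sketch of the clock algorithm is shown below. The class name `Clock`, its methods, and the choice to load a new page with R = 0 are assumptions for illustration.

```python
class Clock:
    """Circular list of resident pages with a pointer to the oldest element."""

    def __init__(self, pages):
        self.pages = list(pages)
        self.referenced = [0] * len(self.pages)
        self.hand = 0  # position of the oldest element

    def touch(self, page) -> None:
        """Set the R bit of a resident page (done by hardware in a real system)."""
        self.referenced[self.pages.index(page)] = 1

    def replace(self, new_page):
        """Swap out one page, load new_page in its place, return the victim."""
        while True:
            position = self.hand
            self.hand = (self.hand + 1) % len(self.pages)  # advance in both cases
            if self.referenced[position] == 0:
                victim = self.pages[position]
                self.pages[position] = new_page  # assumption: loaded with R = 0
                return victim
            self.referenced[position] = 0        # second chance: reset R


clock = Clock("ABCD")
clock.touch("A")
clock.touch("B")
print(clock.replace("E"))  # prints "C"
```

No element is ever moved; only the pointer and the R bits change, which is what makes this cheaper than the list manipulation of the second-chance algorithm.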

5. The Least-Recently-Used Replacement Algorithm (LRU)
A simple implementation of LRU with hardware assistance can be as follows: the hardware provides a counter of appropriate bit width. Every page/segment descriptor contains a data field big enough to hold the current value of this counter. On every main memory access, the current counter value is written to the descriptor of the affected page/segment. If a fault occurs, the page/segment whose descriptor holds the lowest value is pushed out. However, updating a descriptor on every access and finding the descriptor with the lowest value remain costly, even with hardware assistance.
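The counter scheme can be modelled in Python as follows. The class name `LruCounter` is hypothetical, and the sketch assumes that the counter is incremented on every memory access.

```python
class LruCounter:
    """LRU via a global access counter copied into the page descriptors."""

    def __init__(self):
        self.counter = 0                  # assumption: incremented on every access
        self.stamp: dict[str, int] = {}   # page -> counter value at its last access

    def access(self, page: str) -> None:
        self.counter += 1
        self.stamp[page] = self.counter   # write the counter into the descriptor

    def evict(self) -> str:
        # Linear search for the lowest value: the costly part of the scheme.
        victim = min(self.stamp, key=self.stamp.get)
        del self.stamp[victim]
        return victim


lru = LruCounter()
for page in ["A", "B", "C", "A"]:
    lru.access(page)
print(lru.evict())  # prints "B", the least recently used page
```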

6. The Least-Frequently-Used Replacement Algorithm (LFU)
Another good replacement algorithm follows from this observation: a page/segment which was frequently referenced up to now will probably be referenced again in the near future. Conversely, a page/segment which was only seldom referenced will be referenced in the near future with only a small probability. This observation leads to the so-called least-frequently-used strategy (LFU): if a fault occurs, replace the page/segment which was least frequently referenced.

6. The Least-Frequently-Used Replacement Algorithm (LFU), continued
A full implementation of LFU creates high maintenance costs: it requires keeping a linked list of all pages/segments currently residing in the main memory. The most frequently referenced element is put at the head of the list and the most rarely referenced element at the tail. To do this, a counter is associated with every element, counting the number of references to this page/segment. The high cost arises from the need to update the counter and reorder the complete list on every main memory access. Therefore, special (and expensive) hardware or a good approximation in software is needed.

7. The Not-Frequently-Used Replacement Algorithm (NFU)
If no full hardware implementation of LFU is available, it can be approximated in software. To do this, a counter is associated with every page/segment residing in the main memory. Periodically (not on every main memory access), the R bit of each page/segment is added to the page's or segment's counter. In case of a fault, the page/segment with the lowest counter value is pushed out. This method is called the not-frequently-used algorithm (NFU).
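The periodic counting can be sketched in Python. The class name `Nfu` and its methods are hypothetical, and the sketch assumes that the R bit is reset after it has been added to the counter.

```python
class Nfu:
    """Software approximation of LFU using periodically sampled R bits."""

    def __init__(self, pages):
        self.count = {page: 0 for page in pages}
        self.referenced = {page: 0 for page in pages}

    def access(self, page) -> None:
        self.referenced[page] = 1  # set by hardware in a real system

    def tick(self) -> None:
        """Periodic step: add each R bit to its counter."""
        for page in self.count:
            self.count[page] += self.referenced[page]
            self.referenced[page] = 0  # assumption: R is reset after sampling

    def evict(self):
        victim = min(self.count, key=self.count.get)
        del self.count[victim]
        del self.referenced[victim]
        return victim


nfu = Nfu("ABC")
for page in ["A", "B", "A"]:
    nfu.access(page)
nfu.tick()         # counters: A=1, B=1, C=0
nfu.access("A")
nfu.tick()         # counters: A=2, B=1, C=0
print(nfu.evict())  # prints "C"
```

Several accesses to the same page within one period count only once, which is why this is an approximation of LFU rather than an exact implementation.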

8. The Least-Reference-Density Replacement Algorithm (LRD)
LRD is a combination of LRU and LFU. It tries to maintain the advantage of LFU (keeping currently frequently used elements) while avoiding its disadvantage (also keeping old elements that were used very often a long time ago). LRD calculates the reference density of an element as

Reference density = number of accesses to element / element age

The element with the lowest reference density is replaced. This strategy comes close to the optimal strategy; unfortunately, it is very complex to implement. For each element, the swap-in time and the number of accesses must be stored, using e.g. a register and a counter. Furthermore, a division has to be executed for each element when looking for the element with the lowest reference density.
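The reference density calculation can be illustrated with a small Python sketch. The function name `lrd_victim` and the sample values are assumptions; the element age is taken as the current time minus the swap-in time and is assumed to be greater than zero.

```python
def lrd_victim(elements: dict[str, tuple[int, int]], now: int) -> str:
    """Return the element with the lowest reference density.

    elements maps a name to (swap-in time, number of accesses);
    every swap-in time is assumed to be earlier than now.
    """

    def reference_density(name: str) -> float:
        swap_in_time, accesses = elements[name]
        return accesses / (now - swap_in_time)  # one division per element

    return min(elements, key=reference_density)


# A: old and often used long ago (20 accesses over 100 time units, density 0.2)
# B: young and currently active (5 accesses over 10 time units, density 0.5)
print(lrd_victim({"A": (0, 20), "B": (90, 5)}, now=100))  # prints "A"
```

In this example, plain LFU would replace B because it has fewer accesses in total, while LRD replaces the old element A.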