CS/COE1541: Introduction to Computer Architecture. Memory hierarchy. Sangyeun Cho. Computer Science Department University of Pittsburgh


CPU clock rate
- Apple II, MOS 6502 (1975): 1~2MHz
- Original IBM PC (1981), Intel 8088: 4.77MHz
- DEC Alpha EV4 (1991): 100~200MHz
- DEC Alpha EV5 (1995): 266~500MHz
- DEC Alpha EV6 (1998): 450~600MHz
- DEC Alpha EV67 (1999): 667~850MHz
- Today's processors: 2~5GHz

DRAM access latency
How long would it take to fetch a memory block from main memory?
- Time to determine that data is not on-chip (cache miss resolution time)
- Time to talk to the memory controller (possibly on a separate chip)
- Possible queuing delay at the memory controller
- Time for opening a DRAM row
- Time for selecting a portion of the selected row (CAS latency)
- Time to transfer data from DRAM to the CPU, possibly through a separate chip
- Time to fill caches as needed
The total latency can be anywhere from 100~400 CPU cycles.

Growing gap
(figure: the widening gap between processor and memory speed)

Idea of memory hierarchy
CPU regs -> L1 cache (SRAM) -> L2 cache (SRAM) -> main memory (DRAM) -> hard disk (magnetics)
Moving toward the CPU: smaller, faster, more expensive per byte. Moving away: larger, slower, cheaper per byte.

Memory hierarchy goals
To create an illusion of fast and large main memory:
- As fast as a small SRAM-based memory
- As large (and cheap) as DRAM-based memory
To provide the CPU with necessary data and instructions as quickly as possible:
- Frequently used data must be placed near the CPU
- Cache hit: the CPU finds its data in the cache
- Cache hit rate = # of cache hits / # of cache accesses
- Avg. memory access latency (AMAL) = hit time + (1 - hit rate) x miss penalty
- To decrease AMAL: decrease the hit time, increase the hit rate, and reduce the miss penalty
The cache also helps reduce bandwidth pressure on the memory bus; the cache becomes a filter.

Cache
(From Webster online): 1. a: a hiding place especially for concealing and preserving provisions or implements b: a secure place of storage 3. a computer memory with very short access time used for storage of frequently or recently used instructions or data, called also cache memory
Notions of cache memory were used as early as 1967 in IBM mainframe machines.

Cache operations
(figure: the CPU requests &A[0]; the cache asks "Do I have A[0]?"; on a miss it gets A[0] from main memory, then returns the data)
for (i = 0; i < n; i++) {
  C[i] = A[i] + B[i];
  SUM = SUM + C[i];
}

Cache operations
CPU-cache operations:
- CPU: read request. Cache: on a hit, provide data; on a miss, talk to main memory, hold the CPU until data arrives, fill the cache (if needed, kick out old data), then provide data.
- CPU: write request. Cache: on a hit, update data; on a miss, put data in a buffer (pretend data is written) and let the CPU proceed.
Cache-main memory operations:
- Cache: read request. Memory: provide data.
- Cache: write request. Memory: update memory.

Cache organization
Caches use blocks or lines (block > byte) as their granule of management.
Memory > cache: we can only keep a subset of memory blocks.
A cache is in essence a fixed-width hash table; the memory blocks kept in a cache are thus associated with their addresses (or "tagged").
(figure: an N-bit address indexes a 2^N-word memory directly; in a cache, the same address is looked up in an address (tag) table alongside a data table, producing hit/miss and the specified data word)

L1 cache vs. L2 cache
Their basic parameters are similar: associativity, block size, and cache size (the capacity of the data array).
Address used to index:
- L1: typically the virtual address (to index quickly); using a virtual address causes some complexities
- L2: typically the physical address, which is available by then
System visibility:
- L1: not visible
- L2: page coloring can affect the hit rate
Hardware organization (esp. in multicores):
- L1: private
- L2: often shared among cores

Key questions
- Where to place a block?
- How to find a block?
- Which block to replace for a new block?
- How to handle a write? Writes make a cache design much more complex!

Where to place a block?
Block placement is a matter of mapping: if you have a simple rule to place data, you can find it later using the same rule.

Direct-mapped cache
With 2^B-byte blocks and 2^M entries, an N-bit address splits into a tag (N-B-M bits), an index (M bits), and a block offset (B bits). The index selects one entry; its stored tag is compared (=?) against the address tag to produce hit/miss, and the offset selects the data within the 2^B-byte block.

2-way set-associative cache
With 2^B-byte blocks and 2^M entries total, the cache behaves like two direct-mapped caches of 2^(M-1) entries each, accessed in parallel with the same N-bit address; a hit in either way supplies the data.

Fully associative cache
With 2^B-byte blocks and 2^M entries, the address splits into only a tag (N-B bits) and a block offset (B bits); there is no index. The tag memory is a CAM (content-addressable memory): input a content, output its index. A match yields an M-bit index into the data array, plus hit/miss.

Why caches work (or do not work)
Principle of locality:
- Temporal locality: if location A is accessed now, it'll be accessed again soon.
- Spatial locality: if location A is accessed now, a nearby location (e.g., A+1) will be accessed soon.
Can you explain how locality is manifested in your program (at the source code level), for data and for instructions? Can you write two programs doing the same thing, one with a high degree of locality and one with very low locality?

Which block to replace?
Which block to replace, to make room for a new block on a miss? Goal: minimize the # of total misses.
Trivial in a direct-mapped cache; N choices in an N-way associative cache.
What is the optimal policy? Evicting the "most remotely used" block (the one whose next use lies furthest in the future) is optimal, but this is an oracle scheme: we do not know the future.
Replacement approaches:
- LRU (least recently used): look at the past to predict the future
- FIFO (first in, first out): honor the new ones
- Random: don't remember anything
- Cost-based: what is the cost (e.g., latency) of bringing this block again?

How to handle a write?
Design considerations: performance and design complexity.
Allocation policy (on a miss): write-allocate, no-write-allocate, write-validate.
Update policy: write-through, write-back.
Typical combinations: write-back with write-allocate; write-through with no-write-allocate.

Write-through vs. write-back
L1 cache: advantages of write-through + no-write-allocate:
- Simple control
- No stalls for evicting dirty data on an L1 miss with an L2 hit
- Avoids L1 cache pollution with results that are not read for a while
- Avoids problems with coherence (L2 is consistent with L1)
- Allows efficient transient error handling: parity protection in L1 and ECC in L2
- But what about high traffic between L1 and L2, esp. in a multicore processor?
L2 cache: advantages of write-back + write-allocate:
- Typically reduces overall bus traffic by filtering all L1 write-through traffic
- Better able to capture temporal locality of infrequently written memory locations
- Provides a safety net for programs where write-allocate helps a lot: garbage-collected heaps, write-followed-by-read situations, linking loaders (if a unified cache, it need not be flushed before execution)
Some ISAs/caches support explicitly installing cache blocks with empty contents or common values (e.g., zero).

Write buffer
Waiting for a write to main memory is inefficient for a write-through scheme: every write to the cache causes a write to main memory, and the processor stalls for that write.
Example: base CPI = 1.2, 13% of all instructions are stores, 10 cycles to write to memory.
CPI = 1.2 + 0.13 x 10 = 2.5
The CPI more than doubled by waiting.

Write buffer
A write buffer holds data while it is waiting to be written to (slow) memory, freeing the processor to continue executing instructions; the processor thinks the write was done (quickly).
- When the write buffer is empty: send written data to the buffer, and eventually to main memory.
- When the write buffer is full: wait for the write buffer to become empty, then send data to the write buffer.

Write buffer
Is one write buffer entry enough? Probably not:
- Writes may occur faster than they can be handled by the memory system.
- Writes tend to occur in bursts; memory may be fast but can't handle several writes at once.
Typical buffer size is 4~8 entries, possibly more.

Revisiting the write-back cache
Each entry in the tag array holds a dirty bit (D), a valid bit (V), and an address tag, alongside the data array. The dirty bit is set when data is modified; the valid bit is set when data is brought in; the address tag is the id of the associated data.

Alpha example
64B blocks, 64kB, 2 ways.

More examples
IBM Power5:
- L1I: 64kB, 2-way, 128B block, LRU
- L1D: 32kB, 4-way, 128B block, write-through, LRU
- L2: 1.875MB (3 banks), 10-way, 128B block, pseudo-LRU
Intel Core Duo:
- L1I: 32kB, 8-way, 64B block, LRU
- L1D: 32kB, 8-way, 64B block, LRU, write-through
- L2: 2MB, 8-way, 64B line, LRU, write-back
Sun Niagara:
- L1I: 16kB, 4-way, 32B block, random
- L1D: 8kB, 4-way, 16B block, random, write-through, write no-allocate
- L2: 3MB, 4 banks, 64B block, 12-way, write-back

Impact of caches on performance
Average memory access latency (AMAL) = hit time + (1 - hit rate) x miss penalty
Example 1: hit time = 1 cycle, miss penalty = 100 cycles, miss rate = 2%. Average memory access latency?
Example 2: 1GHz processor; two configurations: 16kB direct-mapped and 16kB 2-way; miss rates 3% and 2%; hit time = 1 cycle, but the clock cycle time is stretched by 1.1x in the 2-way; miss penalty = 100ns (how many cycles?). Average memory access latency?

1. Reducing miss penalty
- Multi-level caches: miss penalty(L1) = hit time(L2) + miss rate(L2) x miss penalty(L2)
- Critical word first and early restart: when the L2-L1 bus width is smaller than the L1 cache block
- Giving priority to read misses: esp. in dynamically scheduled processors
- Merging write buffer
- Victim caches
- Non-blocking caches: esp. in dynamically scheduled processors

Victim cache
(figure: without a victim cache, requests and replacements flow directly between L1 and L2; with a victim cache, a small fully associative buffer sits between L1 and L2, catching replaced blocks and supplying them back to L1 on a near-miss)
Jouppi, "Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers," ISCA.

Categorizing misses
- Compulsory: I've not met this block before
- Capacity: the working set is larger than my cache
- Conflict: the cache has space, but the blocks in use map to busy sets
How can you measure the contributions from the different miss categories?

2. Reducing miss rate
- Larger block size: reduces compulsory misses
- Larger cache size: tackles capacity misses
- Higher associativity: attacks conflict misses
- Prefetching: relevancy issues (due to pollution)
- Pseudo-associative caches
- Compiler optimizations: loop interchange, blocking

3. Reducing hit time
- Small & simple cache (e.g., a direct-mapped cache); test different cache configurations using CACTI
- Avoid address translation during cache indexing
- Pipeline the access

Prefetching
The memory hierarchy generally works well:
- L1 cache: 1~3-cycle latency
- L2 cache: 8~13-cycle latency
- Main memory: 100~300-cycle latency
Cache hit rates are critical to high performance.
Prefetching: if we know a priori what data we'll need from the level-n cache, get the data from level-(n+1) and place it in the level-n cache before it is accessed by the processor.
What are the design goals and issues? E.g., shall we load the prefetched block into L1 or not? E.g., instruction prefetch vs. data prefetch.

Evaluating a prefetching scheme
- Coverage: by applying the prefetching scheme, how many misses are covered?
- Timeliness: do prefetched data arrive early enough that the (otherwise) miss latency is effectively hidden?
- Relevance and usefulness: are prefetched blocks actually used? Do they replace other useful data?
How would program execution time change? Especially, dynamically scheduled processors have an intrinsic ability to tolerate long latencies to a degree.

Hardware schemes
Prefetch data under hardware control, without software modification and without increasing the instruction count.
Hardware prefetch engines:
- Programmable engines: program the engine with data access pattern information; who programs this engine, then?
- Automatic: hardware detects access behavior.

Stride detection
A popular method uses a reference prediction table, indexed by the load instruction's PC. Each entry keeps the last address A(i-1), the last stride S = A(i-1) - A(i-2), and other flags (e.g., confidence, time stamp). The next address predicted for this PC is A(i) = A(i-1) + S.
In practice, simpler stream-buffer-like methods are often used; IBM Power4/5 use 8 stream buffers between L1 & L2 and between L2 & L3 (or main memory).

Power4 example
- 8 stream buffers: ascending/descending
- Requires at least 4 sequential misses to install a stream
- Supports L2 to L1, L3 to L2, and memory to L3
- Based on physical addresses; when a page boundary is met, stop there
- Software interface to explicitly (and quickly) install a stream

Software schemes
Use a prefetch instruction (needs ISA support): check if the desired block is already in the cache; if not, bring the block into the cache; if yes, do nothing.
In any case, a prefetch request is a performance hint, not a correctness requirement: not acting on it must not affect program output.
The compiler or programmer then inserts prefetch instructions into programs.
Hardware prefetch vs. software prefetch?

Software prefetch example
Before:
for (int i = 0; i < 100; ++i) {
  a[i] = b[i] + c[i];
}
After:
prefetch(&b[0]); prefetch(&c[0]);
for (i = 0; i < 100; i++) {
  prefetch(&b[i+4]);
  prefetch(&c[i+4]);
  a[i] = b[i] + c[i];
}

Virtual memory
(figure: the pages of processes 1-3 are scattered across physical memory, with the remaining pages kept on the hard disk)

Virtual memory
Motivation: a large, private memory space view.
- What is the typical physical memory size today?
- Virtual view to programmers; virtual view to each process
Management goals: efficient sharing of physical memory; protection. Paging vs. segmentation.
Architectural issues: fast translation of virtual addresses; support for protection; impact on cache design.

Cache vs. virtual memory
                   Cache                       Virtual memory
Block size         16~128B                     4kB~64kB
Hit time           1~3 cycles (L1)             50~150 cycles (main memory)
Miss penalty       8~150 cycles (L2/memory)    1M~10M cycles (hard drive)
Miss rate          0.1~10%                     ~0.001%
Address mapping    25~45 bits to 14~20 bits    32~64 bits to 25~45 bits
Replacement        hardware-based              OS

Paging vs. segmentation
                          Paging                       Segmentation
Words per address         One                          Two (segment & offset)
Programmer visible?       No                           Maybe
Replacing a block         Trivial (same block size)    Can be difficult
Memory use inefficiency   Internal fragmentation       External fragmentation
Efficient disk traffic    Yes                          Not always

Paging basics
- Block placement: anywhere in memory (fully associative)
- Block identification: the page table keeps the mapping information
- Block replacement: LRU, by the OS
- Write policy: write-back

Page table
Implemented in OS kernel memory space, per process. Table size?
Architectural support for fast address translation: the Translation Look-aside Buffer (TLB), a small cache that keeps frequently used page table entries.

Virtually indexed L1 caches
The address from a load/store instruction is virtual; VA-to-PA translation is done via the TLB. If the L1 cache access is done after the TLB access, the L1 latency becomes longer.
A virtually indexed but physically tagged cache allows simultaneous access to the TLB and the cache, overlapping the cache access time and the TLB access time.

Virtually indexed L1 caches
(figure: the TLB translation and the cache index proceed in parallel; the physical address governs the tag comparison)

Issues with virtually indexed L1 caches
Validity of data across context switches (there are many virtual address spaces!):
- Flush at context switches
- Use physical tags
- Use address space identifiers
Handling of aliases (two VAs for one PA):
- Hardware scheme: limit the cache size
- Software scheme: page coloring
I/O and cache coherence: I/O can result in inconsistency between data in main memory and in the caches, since I/O is between main memory and the outside world. Solutions:
- Flush caches before & after I/O operations
- Address bus snooping for cache line invalidation
- Allocate I/O buffers in a non-cacheable memory region

Protection
Between user and kernel processes: user mode vs. privileged mode.
Between users: address boundary & access type checking (base address, upper bound; write allowed?).
Protections are enforced when accessing the page table (TLB).

Alpha example
8kB page size; 8B page table entry: 4B for the physical page number (PPN) and 4B for other information. The 64-bit address is not fully used; the high bits select the segment:
- seg0: user space (stack, text (code), static data, dynamic data, with reserved and not-yet-allocated regions in between, plus a reserved region for shared libraries)
- seg1: information maintained by the OS, not accessible by users
- kseg: kernel-accessible physical addresses; no address translation performed
(figure: the Alpha virtual address space layout, with $sp marking the stack)

Alpha example (figure only)

DRAM technology
DRAM = dynamic RAM. The memory cell is based on a capacitor: if the capacitor has enough charge, it's a 1; if not, it's a 0.
Memory content decays as capacitors lose their charge over time, so periodic refresh actions are required. Reads are also destructive: the data read must be written back before reading another row.
DRAM designs are driven by standards. SDRAM (synchronous DRAM) is the most widely used; RDRAM (Rambus DRAM) is a high-performance proprietary design (the PlayStation series has been its major consumer).

DRAM internals
(figure: command interface, row address path, column address path, memory banks, and data I/O interface)
1. A row in a bank is selected and read using the row address {BA1, BA0, A13~A0}
2. The data is moved to the I/O latch
3. Part of the data is selected using the column address
4. The selected data is moved to the chip I/O

DRAM-related concepts
- Row address & column address: the address is transferred (e.g., from CPU to DRAM) in two steps, row address first and column address later. Why? It halves the # of address pins.
- Fast page mode: once you've read a row, it's fast to read parts of it (rather than data from other rows)
- Cycle time determines bandwidth
- Synchronous DRAM has a clock; DDR (double data rate) uses both edges of the clock when transferring data

DRAM modules
- SIMM (single in-line memory module): 32-bit data bus; primarily in 386, 486, 586, and Apple computers
- DIMM (dual in-line memory module): 64-bit data bus
- RIMM (Rambus in-line memory module): heat spreader; 1.6 GBps
- FB-DIMM (fully buffered DIMM): a newer standard

FB-DIMM
A serial, packet-based interface (a new design trend). Industry acceptance is still low (10% or lower).
- Scalability and throughput: up to 192 GB; 6 channels with 8 DIMMs/channel; 6.7 GB/s per channel
- Improved board layouts: 69 pins per channel (cf. 240 pins per channel today)
- Reliability considerations: ECC on commands, addresses, and data; automatic retry on errors
- Headroom: a more open-ended serial differential signaling scheme

Memory bank interleaving
An organization to improve memory bandwidth: with a proper # of banks, successive memory access latencies are hidden from the processor.

Page coloring
The OS's virtual-to-physical page mapping affects L2 cache performance.
(figure: virtual pages A and B are mapped to physical pages 0~7, and each physical page falls into one of the L2 cache's bins, bin 0~3)

Page coloring
This is about deciding which physical page to give to a virtual page on a page allocation event. Goal: minimize the # of conflict misses.
Approaches:
- Basic method: match a few lower bits of the PPN with those of the VPN, so that neighboring pages (in the virtual address space) occupy different cache bins
- Bin hopping: consecutive page allocations get the next bin, so that pages accessed in the same time frame occupy different cache bins
- Best bin: track the page count for each bin; choose the bin with the minimum page count
- Use OS knowledge about the page allocations
Example: Solaris supports several different page coloring strategies and lets the user choose one.


More information

Chapter 6. Inside the System Unit. What You Will Learn... Computers Are Your Future. What You Will Learn... Describing Hardware Performance

Chapter 6. Inside the System Unit. What You Will Learn... Computers Are Your Future. What You Will Learn... Describing Hardware Performance What You Will Learn... Computers Are Your Future Chapter 6 Understand how computers represent data Understand the measurements used to describe data transfer rates and data storage capacity List the components

More information

Intel X38 Express Chipset Memory Technology and Configuration Guide

Intel X38 Express Chipset Memory Technology and Configuration Guide Intel X38 Express Chipset Memory Technology and Configuration Guide White Paper January 2008 Document Number: 318469-002 INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE,

More information

Intel 965 Express Chipset Family Memory Technology and Configuration Guide

Intel 965 Express Chipset Family Memory Technology and Configuration Guide Intel 965 Express Chipset Family Memory Technology and Configuration Guide White Paper - For the Intel 82Q965, 82Q963, 82G965 Graphics and Memory Controller Hub (GMCH) and Intel 82P965 Memory Controller

More information

A N. O N Output/Input-output connection

A N. O N Output/Input-output connection Memory Types Two basic types: ROM: Read-only memory RAM: Read-Write memory Four commonly used memories: ROM Flash, EEPROM Static RAM (SRAM) Dynamic RAM (DRAM), SDRAM, RAMBUS, DDR RAM Generic pin configuration:

More information

Configuring CoreNet Platform Cache (CPC) as SRAM For Use by Linux Applications

Configuring CoreNet Platform Cache (CPC) as SRAM For Use by Linux Applications Freescale Semiconductor Document Number:AN4749 Application Note Rev 0, 07/2013 Configuring CoreNet Platform Cache (CPC) as SRAM For Use by Linux Applications 1 Introduction This document provides, with

More information

Byte Ordering of Multibyte Data Items

Byte Ordering of Multibyte Data Items Byte Ordering of Multibyte Data Items Most Significant Byte (MSB) Least Significant Byte (LSB) Big Endian Byte Addresses +0 +1 +2 +3 +4 +5 +6 +7 VALUE (8-byte) Least Significant Byte (LSB) Most Significant

More information

Intel P6 Systemprogrammering 2007 Föreläsning 5 P6/Linux Memory System

Intel P6 Systemprogrammering 2007 Föreläsning 5 P6/Linux Memory System Intel P6 Systemprogrammering 07 Föreläsning 5 P6/Linux ory System Topics P6 address translation Linux memory management Linux page fault handling memory mapping Internal Designation for Successor to Pentium

More information

Putting it all together: Intel Nehalem. http://www.realworldtech.com/page.cfm?articleid=rwt040208182719

Putting it all together: Intel Nehalem. http://www.realworldtech.com/page.cfm?articleid=rwt040208182719 Putting it all together: Intel Nehalem http://www.realworldtech.com/page.cfm?articleid=rwt040208182719 Intel Nehalem Review entire term by looking at most recent microprocessor from Intel Nehalem is code

More information

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 412, University of Maryland. Guest lecturer: David Hovemeyer.

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 412, University of Maryland. Guest lecturer: David Hovemeyer. Guest lecturer: David Hovemeyer November 15, 2004 The memory hierarchy Red = Level Access time Capacity Features Registers nanoseconds 100s of bytes fixed Cache nanoseconds 1-2 MB fixed RAM nanoseconds

More information

We r e going to play Final (exam) Jeopardy! "Answers:" "Questions:" - 1 -

We r e going to play Final (exam) Jeopardy! Answers: Questions: - 1 - . (0 pts) We re going to play Final (exam) Jeopardy! Associate the following answers with the appropriate question. (You are given the "answers": Pick the "question" that goes best with each "answer".)

More information

Virtual vs Physical Addresses

Virtual vs Physical Addresses Virtual vs Physical Addresses Physical addresses refer to hardware addresses of physical memory. Virtual addresses refer to the virtual store viewed by the process. virtual addresses might be the same

More information

A+ Guide to Managing and Maintaining Your PC, 7e. Chapter 1 Introducing Hardware

A+ Guide to Managing and Maintaining Your PC, 7e. Chapter 1 Introducing Hardware A+ Guide to Managing and Maintaining Your PC, 7e Chapter 1 Introducing Hardware Objectives Learn that a computer requires both hardware and software to work Learn about the many different hardware components

More information

What is a bus? A Bus is: Advantages of Buses. Disadvantage of Buses. Master versus Slave. The General Organization of a Bus

What is a bus? A Bus is: Advantages of Buses. Disadvantage of Buses. Master versus Slave. The General Organization of a Bus Datorteknik F1 bild 1 What is a bus? Slow vehicle that many people ride together well, true... A bunch of wires... A is: a shared communication link a single set of wires used to connect multiple subsystems

More information

Intel Q35/Q33, G35/G33/G31, P35/P31 Express Chipset Memory Technology and Configuration Guide

Intel Q35/Q33, G35/G33/G31, P35/P31 Express Chipset Memory Technology and Configuration Guide Intel Q35/Q33, G35/G33/G31, P35/P31 Express Chipset Memory Technology and Configuration Guide White Paper August 2007 Document Number: 316971-002 INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION

More information

W4118: segmentation and paging. Instructor: Junfeng Yang

W4118: segmentation and paging. Instructor: Junfeng Yang W4118: segmentation and paging Instructor: Junfeng Yang Outline Memory management goals Segmentation Paging TLB 1 Uni- v.s. multi-programming Simple uniprogramming with a single segment per process Uniprogramming

More information

Intel Itanium Quad-Core Architecture for the Enterprise. Lambert Schaelicke Eric DeLano

Intel Itanium Quad-Core Architecture for the Enterprise. Lambert Schaelicke Eric DeLano Intel Itanium Quad-Core Architecture for the Enterprise Lambert Schaelicke Eric DeLano Agenda Introduction Intel Itanium Roadmap Intel Itanium Processor 9300 Series Overview Key Features Pipeline Overview

More information

Enabling Technologies for Distributed Computing

Enabling Technologies for Distributed Computing Enabling Technologies for Distributed Computing Dr. Sanjay P. Ahuja, Ph.D. Fidelity National Financial Distinguished Professor of CIS School of Computing, UNF Multi-core CPUs and Multithreading Technologies

More information

Virtuoso and Database Scalability

Virtuoso and Database Scalability Virtuoso and Database Scalability By Orri Erling Table of Contents Abstract Metrics Results Transaction Throughput Initializing 40 warehouses Serial Read Test Conditions Analysis Working Set Effect of

More information

Chapter 11 I/O Management and Disk Scheduling

Chapter 11 I/O Management and Disk Scheduling Operating Systems: Internals and Design Principles, 6/E William Stallings Chapter 11 I/O Management and Disk Scheduling Dave Bremer Otago Polytechnic, NZ 2008, Prentice Hall I/O Devices Roadmap Organization

More information

SOLVING HIGH-SPEED MEMORY INTERFACE CHALLENGES WITH LOW-COST FPGAS

SOLVING HIGH-SPEED MEMORY INTERFACE CHALLENGES WITH LOW-COST FPGAS SOLVING HIGH-SPEED MEMORY INTERFACE CHALLENGES WITH LOW-COST FPGAS A Lattice Semiconductor White Paper May 2005 Lattice Semiconductor 5555 Northeast Moore Ct. Hillsboro, Oregon 97124 USA Telephone: (503)

More information

Memory Systems. Static Random Access Memory (SRAM) Cell

Memory Systems. Static Random Access Memory (SRAM) Cell Memory Systems This chapter begins the discussion of memory systems from the implementation of a single bit. The architecture of memory chips is then constructed using arrays of bit implementations coupled

More information

EE482: Advanced Computer Organization Lecture #11 Processor Architecture Stanford University Wednesday, 31 May 2000. ILP Execution

EE482: Advanced Computer Organization Lecture #11 Processor Architecture Stanford University Wednesday, 31 May 2000. ILP Execution EE482: Advanced Computer Organization Lecture #11 Processor Architecture Stanford University Wednesday, 31 May 2000 Lecture #11: Wednesday, 3 May 2000 Lecturer: Ben Serebrin Scribe: Dean Liu ILP Execution

More information

Configuring and using DDR3 memory with HP ProLiant Gen8 Servers

Configuring and using DDR3 memory with HP ProLiant Gen8 Servers Engineering white paper, 2 nd Edition Configuring and using DDR3 memory with HP ProLiant Gen8 Servers Best Practice Guidelines for ProLiant servers with Intel Xeon processors Table of contents Introduction

More information

Quiz for Chapter 6 Storage and Other I/O Topics 3.10

Quiz for Chapter 6 Storage and Other I/O Topics 3.10 Date: 3.10 Not all questions are of equal difficulty. Please review the entire quiz first and then budget your time carefully. Name: Course: Solutions in Red 1. [6 points] Give a concise answer to each

More information

The Classical Architecture. Storage 1 / 36

The Classical Architecture. Storage 1 / 36 1 / 36 The Problem Application Data? Filesystem Logical Drive Physical Drive 2 / 36 Requirements There are different classes of requirements: Data Independence application is shielded from physical storage

More information

Application Note 195. ARM11 performance monitor unit. Document number: ARM DAI 195B Issued: 15th February, 2008 Copyright ARM Limited 2007

Application Note 195. ARM11 performance monitor unit. Document number: ARM DAI 195B Issued: 15th February, 2008 Copyright ARM Limited 2007 Application Note 195 ARM11 performance monitor unit Document number: ARM DAI 195B Issued: 15th February, 2008 Copyright ARM Limited 2007 Copyright 2007 ARM Limited. All rights reserved. Application Note

More information

Communicating with devices

Communicating with devices Introduction to I/O Where does the data for our CPU and memory come from or go to? Computers communicate with the outside world via I/O devices. Input devices supply computers with data to operate on.

More information

Memory. The memory types currently in common usage are:

Memory. The memory types currently in common usage are: ory ory is the third key component of a microprocessor-based system (besides the CPU and I/O devices). More specifically, the primary storage directly addressed by the CPU is referred to as main memory

More information

CHAPTER 7: The CPU and Memory

CHAPTER 7: The CPU and Memory CHAPTER 7: The CPU and Memory The Architecture of Computer Hardware, Systems Software & Networking: An Information Technology Approach 4th Edition, Irv Englander John Wiley and Sons 2010 PowerPoint slides

More information

AMD Opteron Quad-Core

AMD Opteron Quad-Core AMD Opteron Quad-Core a brief overview Daniele Magliozzi Politecnico di Milano Opteron Memory Architecture native quad-core design (four cores on a single die for more efficient data sharing) enhanced

More information

DDR4 Memory Technology on HP Z Workstations

DDR4 Memory Technology on HP Z Workstations Technical white paper DDR4 Memory Technology on HP Z Workstations DDR4 is the latest memory technology available for main memory on mobile, desktops, workstations, and server computers. DDR stands for

More information

Big Picture. IC220 Set #11: Storage and I/O I/O. Outline. Important but neglected

Big Picture. IC220 Set #11: Storage and I/O I/O. Outline. Important but neglected Big Picture Processor Interrupts IC220 Set #11: Storage and Cache Memory- bus Main memory 1 Graphics output Network 2 Outline Important but neglected The difficulties in assessing and designing systems

More information

MICROPROCESSOR AND MICROCOMPUTER BASICS

MICROPROCESSOR AND MICROCOMPUTER BASICS Introduction MICROPROCESSOR AND MICROCOMPUTER BASICS At present there are many types and sizes of computers available. These computers are designed and constructed based on digital and Integrated Circuit

More information

Chapter 11 I/O Management and Disk Scheduling

Chapter 11 I/O Management and Disk Scheduling Operatin g Systems: Internals and Design Principle s Chapter 11 I/O Management and Disk Scheduling Seventh Edition By William Stallings Operating Systems: Internals and Design Principles An artifact can

More information

1 Storage Devices Summary

1 Storage Devices Summary Chapter 1 Storage Devices Summary Dependability is vital Suitable measures Latency how long to the first bit arrives Bandwidth/throughput how fast does stuff come through after the latency period Obvious

More information

Enabling Technologies for Distributed and Cloud Computing

Enabling Technologies for Distributed and Cloud Computing Enabling Technologies for Distributed and Cloud Computing Dr. Sanjay P. Ahuja, Ph.D. 2010-14 FIS Distinguished Professor of Computer Science School of Computing, UNF Multi-core CPUs and Multithreading

More information

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip.

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip. Lecture 11: Multi-Core and GPU Multi-core computers Multithreading GPUs General Purpose GPUs Zebo Peng, IDA, LiTH 1 Multi-Core System Integration of multiple processor cores on a single chip. To provide

More information

The Orca Chip... Heart of IBM s RISC System/6000 Value Servers

The Orca Chip... Heart of IBM s RISC System/6000 Value Servers The Orca Chip... Heart of IBM s RISC System/6000 Value Servers Ravi Arimilli IBM RISC System/6000 Division 1 Agenda. Server Background. Cache Heirarchy Performance Study. RS/6000 Value Server System Structure.

More information

This Unit: Putting It All Together. CIS 501 Computer Architecture. Sources. What is Computer Architecture?

This Unit: Putting It All Together. CIS 501 Computer Architecture. Sources. What is Computer Architecture? This Unit: Putting It All Together CIS 501 Computer Architecture Unit 11: Putting It All Together: Anatomy of the XBox 360 Game Console Slides originally developed by Amir Roth with contributions by Milo

More information

Computer Performance. Topic 3. Contents. Prerequisite knowledge Before studying this topic you should be able to:

Computer Performance. Topic 3. Contents. Prerequisite knowledge Before studying this topic you should be able to: 55 Topic 3 Computer Performance Contents 3.1 Introduction...................................... 56 3.2 Measuring performance............................... 56 3.2.1 Clock Speed.................................

More information

Chapter 2: Computer-System Structures. Computer System Operation Storage Structure Storage Hierarchy Hardware Protection General System Architecture

Chapter 2: Computer-System Structures. Computer System Operation Storage Structure Storage Hierarchy Hardware Protection General System Architecture Chapter 2: Computer-System Structures Computer System Operation Storage Structure Storage Hierarchy Hardware Protection General System Architecture Operating System Concepts 2.1 Computer-System Architecture

More information

Computer Systems Structure Input/Output

Computer Systems Structure Input/Output Computer Systems Structure Input/Output Peripherals Computer Central Processing Unit Main Memory Computer Systems Interconnection Communication lines Input Output Ward 1 Ward 2 Examples of I/O Devices

More information

In-Memory Databases Algorithms and Data Structures on Modern Hardware. Martin Faust David Schwalb Jens Krüger Jürgen Müller

In-Memory Databases Algorithms and Data Structures on Modern Hardware. Martin Faust David Schwalb Jens Krüger Jürgen Müller In-Memory Databases Algorithms and Data Structures on Modern Hardware Martin Faust David Schwalb Jens Krüger Jürgen Müller The Free Lunch Is Over 2 Number of transistors per CPU increases Clock frequency

More information

OC By Arsene Fansi T. POLIMI 2008 1

OC By Arsene Fansi T. POLIMI 2008 1 IBM POWER 6 MICROPROCESSOR OC By Arsene Fansi T. POLIMI 2008 1 WHAT S IBM POWER 6 MICROPOCESSOR The IBM POWER6 microprocessor powers the new IBM i-series* and p-series* systems. It s based on IBM POWER5

More information

StrongARM** SA-110 Microprocessor Instruction Timing

StrongARM** SA-110 Microprocessor Instruction Timing StrongARM** SA-110 Microprocessor Instruction Timing Application Note September 1998 Order Number: 278194-001 Information in this document is provided in connection with Intel products. No license, express

More information

Secure Cloud Storage and Computing Using Reconfigurable Hardware

Secure Cloud Storage and Computing Using Reconfigurable Hardware Secure Cloud Storage and Computing Using Reconfigurable Hardware Victor Costan, Brandon Cho, Srini Devadas Motivation Computing is more cost-efficient in public clouds but what about security? Cloud Applications

More information

Table Of Contents. Page 2 of 26. *Other brands and names may be claimed as property of others.

Table Of Contents. Page 2 of 26. *Other brands and names may be claimed as property of others. Technical White Paper Revision 1.1 4/28/10 Subject: Optimizing Memory Configurations for the Intel Xeon processor 5500 & 5600 series Author: Scott Huck; Intel DCG Competitive Architect Target Audience:

More information

A Deduplication File System & Course Review

A Deduplication File System & Course Review A Deduplication File System & Course Review Kai Li 12/13/12 Topics A Deduplication File System Review 12/13/12 2 Traditional Data Center Storage Hierarchy Clients Network Server SAN Storage Remote mirror

More information

Module 2. Embedded Processors and Memory. Version 2 EE IIT, Kharagpur 1

Module 2. Embedded Processors and Memory. Version 2 EE IIT, Kharagpur 1 Module 2 Embedded Processors and Memory Version 2 EE IIT, Kharagpur 1 Lesson 5 Memory-I Version 2 EE IIT, Kharagpur 2 Instructional Objectives After going through this lesson the student would Pre-Requisite

More information

Architecture of Hitachi SR-8000

Architecture of Hitachi SR-8000 Architecture of Hitachi SR-8000 University of Stuttgart High-Performance Computing-Center Stuttgart (HLRS) www.hlrs.de Slide 1 Most of the slides from Hitachi Slide 2 the problem modern computer are data

More information

Using Synology SSD Technology to Enhance System Performance Synology Inc.

Using Synology SSD Technology to Enhance System Performance Synology Inc. Using Synology SSD Technology to Enhance System Performance Synology Inc. Synology_SSD_Cache_WP_ 20140512 Table of Contents Chapter 1: Enterprise Challenges and SSD Cache as Solution Enterprise Challenges...

More information

Chapter 6. 6.1 Introduction. Storage and Other I/O Topics. p. 570( 頁 585) Fig. 6.1. I/O devices can be characterized by. I/O bus connections

Chapter 6. 6.1 Introduction. Storage and Other I/O Topics. p. 570( 頁 585) Fig. 6.1. I/O devices can be characterized by. I/O bus connections Chapter 6 Storage and Other I/O Topics 6.1 Introduction I/O devices can be characterized by Behavior: input, output, storage Partner: human or machine Data rate: bytes/sec, transfers/sec I/O bus connections

More information

1. Computer System Structure and Components

1. Computer System Structure and Components 1 Computer System Structure and Components Computer System Layers Various Computer Programs OS System Calls (eg, fork, execv, write, etc) KERNEL/Behavior or CPU Device Drivers Device Controllers Devices

More information

Rethinking SIMD Vectorization for In-Memory Databases

Rethinking SIMD Vectorization for In-Memory Databases SIGMOD 215, Melbourne, Victoria, Australia Rethinking SIMD Vectorization for In-Memory Databases Orestis Polychroniou Columbia University Arun Raghavan Oracle Labs Kenneth A. Ross Columbia University Latest

More information

Parallel Algorithm Engineering

Parallel Algorithm Engineering Parallel Algorithm Engineering Kenneth S. Bøgh PhD Fellow Based on slides by Darius Sidlauskas Outline Background Current multicore architectures UMA vs NUMA The openmp framework Examples Software crisis

More information

Advanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2

Advanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2 Lecture Handout Computer Architecture Lecture No. 2 Reading Material Vincent P. Heuring&Harry F. Jordan Chapter 2,Chapter3 Computer Systems Design and Architecture 2.1, 2.2, 3.2 Summary 1) A taxonomy of

More information

This Unit: Main Memory. CIS 501 Introduction to Computer Architecture. Memory Hierarchy Review. Readings

This Unit: Main Memory. CIS 501 Introduction to Computer Architecture. Memory Hierarchy Review. Readings This Unit: Main Memory CIS 501 Introduction to Computer Architecture Unit 4: Memory Hierarchy II: Main Memory Application OS Compiler Firmware CPU I/O Memory Digital Circuits Gates & Transistors Memory

More information

I/O. Input/Output. Types of devices. Interface. Computer hardware

I/O. Input/Output. Types of devices. Interface. Computer hardware I/O Input/Output One of the functions of the OS, controlling the I/O devices Wide range in type and speed The OS is concerned with how the interface between the hardware and the user is made The goal in

More information