EECS150 - Digital Design Lecture 17 Memory 1. Memory Basics

Similar documents
Memory Basics. SRAM/DRAM Basics

1. Memory technology & Hierarchy

Chapter 5 :: Memory and Logic Arrays

A N. O N Output/Input-output connection

RAM & ROM Based Digital Design. ECE 152A Winter 2012

Chapter 7 Memory and Programmable Logic

Memory. The memory types currently in common usage are:

Computer Architecture

Module 2. Embedded Processors and Memory. Version 2 EE IIT, Kharagpur 1

Modeling Sequential Elements with Verilog. Prof. Chien-Nan Liu TEL: ext: Sequential Circuit

ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-17: Memory organisation, and types of memory

Homework # 2. Solutions. 4.1 What are the differences among sequential access, direct access, and random access?

Memory unit. 2 k words. n bits per word

Exploiting Stateful Inspection of Network Security in Reconfigurable Hardware

Computer Systems Structure Main Memory Organization

Features. DDR SODIMM Product Datasheet. Rev. 1.0 Oct. 2011

With respect to the way of data access we can classify memories as:

GETTING STARTED WITH PROGRAMMABLE LOGIC DEVICES, THE 16V8 AND 20V8

A New Paradigm for Synchronous State Machine Design in Verilog

Chapter 9 Semiconductor Memories. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan

SOLVING HIGH-SPEED MEMORY INTERFACE CHALLENGES WITH LOW-COST FPGAS

Read-only memory Implementing logic with ROM Programmable logic devices Implementing logic with PLDs Static hazards

Computers. Hardware. The Central Processing Unit (CPU) CMPT 125: Lecture 1: Understanding the Computer

Random Access Memory (RAM) Types of RAM. RAM Random Access Memory Jamie Tees SDRAM. Micro-DIMM SO-DIMM

Memory ICS 233. Computer Architecture and Assembly Language Prof. Muhamed Mudawar

Semiconductor Memories

CHAPTER 7: The CPU and Memory

Logical Operations. Control Unit. Contents. Arithmetic Operations. Objectives. The Central Processing Unit: Arithmetic / Logic Unit.

Objectives. Units of Memory Capacity. CMPE328 Microprocessors (Spring ) Memory and I/O address Decoders. By Dr.

Computer Architecture

Technical Note. Micron NAND Flash Controller via Xilinx Spartan -3 FPGA. Overview. TN-29-06: NAND Flash Controller on Spartan-3 Overview

Handout 17. by Dr Sheikh Sharif Iqbal. Memory Unit and Read Only Memories

Switch Fabric Implementation Using Shared Memory

Random-Access Memory (RAM) The Memory Hierarchy. SRAM vs DRAM Summary. Conventional DRAM Organization. Page 1

Table 1: Address Table

Chapter 6. Inside the System Unit. What You Will Learn... Computers Are Your Future. What You Will Learn... Describing Hardware Performance

路 論 Chapter 15 System-Level Physical Design

Microprocessor & Assembly Language

Chapter 2 Logic Gates and Introduction to Computer Architecture

MICROPROCESSOR AND MICROCOMPUTER BASICS

Memory Systems. Static Random Access Memory (SRAM) Cell

Open Flow Controller and Switch Datasheet

ECE410 Design Project Spring 2008 Design and Characterization of a CMOS 8-bit Microprocessor Data Path

DDR SDRAM SODIMM. MT8VDDT3264H 256MB 1 MT8VDDT6464H 512MB For component data sheets, refer to Micron s Web site:

Slide Set 8. for ENCM 369 Winter 2015 Lecture Section 01. Steve Norman, PhD, PEng

Design of a High Speed Communications Link Using Field Programmable Gate Arrays

DDR4 Memory Technology on HP Z Workstations

The Central Processing Unit:

Memory Testing. Memory testing.1

DDR SDRAM UDIMM MT16VDDT6464A 512MB MT16VDDT12864A 1GB MT16VDDT25664A 2GB

Lecture 5: Gate Logic Logic Optimization

Introduction to Programmable Logic Devices. John Coughlan RAL Technology Department Detector & Electronics Division

Understanding Memory TYPES OF MEMORY

Semiconductor Device Technology for Implementing System Solutions: Memory Modules

Lecture 9: Memory and Storage Technologies

USB - FPGA MODULE (PRELIMINARY)

NAND Flash FAQ. Eureka Technology. apn5_87. NAND Flash FAQ

DDR SDRAM SODIMM. MT9VDDT1672H 128MB 1 MT9VDDT3272H 256MB MT9VDDT6472H 512MB For component data sheets, refer to Micron s Web site:

Lecture-3 MEMORY: Development of Memory:

SDRAM and DRAM Memory Systems Overview

Optimising the resource utilisation in high-speed network intrusion detection systems.

Prototyping ARM Cortex -A Processors using FPGA platforms

Experiment # 9. Clock generator circuits & Counters. Eng. Waleed Y. Mousa

Static-Noise-Margin Analysis of Conventional 6T SRAM Cell at 45nm Technology

CS250 VLSI Systems Design Lecture 8: Memory

Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems

7a. System-on-chip design and prototyping platforms

Configuring Memory on the HP Business Desktop dx5150

User s Manual HOW TO USE DDR SDRAM

Low Power AMD Athlon 64 and AMD Opteron Processors

Table 1 SDR to DDR Quick Reference

EE 42/100 Lecture 24: Latches and Flip Flops. Rev B 4/21/2010 (2:04 PM) Prof. Ali M. Niknejad

Chapter 4 System Unit Components. Discovering Computers Your Interactive Guide to the Digital World

Introduction. Jim Duckworth ECE Department, WPI. VHDL Short Course - Module 1

Computer Performance. Topic 3. Contents. Prerequisite knowledge Before studying this topic you should be able to:

DDR2 SDRAM SODIMM MT16HTF12864H 1GB MT16HTF25664H 2GB

Features. DDR3 SODIMM Product Specification. Rev. 1.7 Feb. 2016

7 Series FPGA Overview

9/14/ :38

Sequential Circuit Design

Contents. Overview Memory Compilers Selection Guide

DDR2 SDRAM SODIMM MT8HTF3264HD 256MB MT8HTF6464HD 512MB MT8HTF12864HD 1GB For component data sheets, refer to Micron s Web site:

A+ Guide to Managing and Maintaining Your PC, 7e. Chapter 1 Introducing Hardware

OpenSPARC T1 Processor

Byte Ordering of Multibyte Data Items

NTE2053 Integrated Circuit 8 Bit MPU Compatible A/D Converter

White Paper Utilizing Leveling Techniques in DDR3 SDRAM Memory Interfaces

Algorithms and Methods for Distributed Storage Networks 3. Solid State Disks Christian Schindelhauer

Layout of Multiple Cells

Lecture 11: Sequential Circuit Design

Lab #5: Design Example: Keypad Scanner and Encoder - Part 1 (120 pts)

ADQYF1A08. DDR2-1066G(CL6) 240-Pin O.C. U-DIMM 1GB (128M x 64-bits)

Digitale Signalverarbeitung mit FPGA (DSF) Soft Core Prozessor NIOS II Stand Mai Jens Onno Krah

Power Reduction Techniques in the SoC Clock Network. Clock Power

Qsys and IP Core Integration

Microprocessor or Microcontroller?

DDR SDRAM SODIMM MT16VDDF6464H 512MB MT16VDDF12864H 1GB

FPGA-based MapReduce Framework for Machine Learning

DS1225Y 64k Nonvolatile SRAM

ETEC 2301 Programmable Logic Devices. Chapter 10 Counters. Shawnee State University Department of Industrial and Engineering Technologies

Transcription:

EECS150 - Digital Design Lecture 17 Memory 1 March 17, 2005 John Wawrzynek Spring 2005 EECS150 - Lec17-mem1 Page 1 Memory Basics Uses: Whenever a large collection of state elements is required. data & program storage general purpose registers buffering table lookups CL implementation Example RAM: Register file from microprocessor clk Types: RAM - random access memory ROM - read only memory EPROM, FLASH - electrically programmable read only memory regid = register identifier (address of word in memory) sizeof(regid) = log2(# of reg) WE = write enable Spring 2005 EECS150 - Lec17-mem1 Page 2 1

Definitions Memory Interfaces for Accessing Data Asynchronous (unclocked): A change in the address results in data appearing Synchronous (clocked): A change in address, followed by an edge on CLK results in data appearing or write operation occurring. A common arrangement is to have synchronous write operations and asynchronous read operations. Volatile: Looses its state when the power goes off. Nonvolatile: Retains it state when power goes off. Spring 2005 EECS150 - Lec17-mem1 Page 3 For read operations, functionally the regfile is equivalent to a 2-D array of flip-flops with tristate outputs on each: Register File Internals Cell with added write logic: These circuits are just functional abstractions of the actual circuits used. How do we go from "regid" to "SEL"? Spring 2005 EECS150 - Lec17-mem1 Page 4 2

Regid (address) Decoding The function of the address decoder is to generate a one-hot code word from the address. The output is use for row selection. Many different circuits exist for this function. A simple one is shown to the right. Spring 2005 EECS150 - Lec17-mem1 Page 5 Standard Internal Memory Organization 2-D arrary of bit cells. Each cell stores one bit of data. Special circuit tricks are used for the cell array to improve storage density. (We will look at these later) RAM/ROM naming convention: examples: 32 X 8, "32 by 8" => 32 8-bit words 1M X 1, "1 meg by 1" => 1M 1-bit words Spring 2005 EECS150 - Lec17-mem1 Page 6 3

Read Only Memory (ROM) Simply for of memory. No write operation needed. Functional Equivalence: Connections to Vdd used to store a logic 1, connections to GND for storing logic 0. address decoder bit-cell array Full tri-state buffers are not needed at each cell point. In practice, single transistors are used to implement zero cells. Logic one s are derived through precharging or bit-line pullup transistor. Spring 2005 EECS150 - Lec17-mem1 Page 7 Column MUX in ROMs and RAMs: Controls physical aspect ratio Important for physical layout and to control delay on wires. In DRAM, allows time-multiplexing of chip address pins Spring 2005 EECS150 - Lec17-mem1 Page 8 4

Cascading Memory Modules (or chips) Example: assemblage of 256 x 8 ROM using 256 x 4 modules: example: 1K x * ROM using 256 x 4 modules: each module has tri-state outputs: Spring 2005 EECS150 - Lec17-mem1 Page 9 Memory Components Types: Volatile: Random Access Memory (RAM): DRAM "dynamic" SRAM "static" Non-volatile: Read Only Memory (ROM): Mask ROM "mask programmable" EPROM "electrically programmable" EEPROM "erasable electrically programmable" FLASH memory - similar to EEPROM with programmer integrated on chip Spring 2005 EECS150 - Lec17-mem1 Page 10 5

Volatile Memory Comparison The primary difference between different memory types is the bit cell. SRAM Cell DRAM Cell word line word line bit line bit line bit line Larger cell lower density, higher cost/bit No refresh required Simple read faster access Standard IC process natural for integration with logic Smaller cell higher density, lower cost/bit Needs periodic refresh, and refresh after read Complex read longer access time Special IC process difficult to integrate with logic circuits Spring 2005 EECS150 - Lec17-mem1 Page 11 Multi-ported Memory Motivation: Consider CPU core register file: 1 read or write per cycle limits processor performance. Complicates pipelining. Difficult for different instructions to simultaneously read or write regfile. Common arrangement in pipelined CPUs is 2 read ports and 1 write port. sel a sel b sel c data b data a Regfile data c Motivation: I/O data buffering: disk/network data buffer CPU dual-porting allows both sides to simultaneously access memory at full bandwidth. addr a data a RW a addr b data b RW b Dual-port Memory Spring 2005 EECS150 - Lec17-mem1 Page 12 6

Dual-ported Memory Internals Add decoder, another set of read/write logic, bits lines, word lines: dec a dec b cell array Example cell: SRAM WL 2 WL 1 b 2 b 1 b 1 b 2 address ports data ports r/w logic r/w logic Repeat everything but crosscoupled inverters. This scheme extends up to a couple more ports, then need to add additional transistors. Spring 2005 EECS150 - Lec17-mem1 Page 13 Memory in Desktop Computer Systems: SRAM (lower density, higher speed) used in CPU register file, on- and off-chip caches. DRAM (higher density, lower speed) used in main memory Closing the GAP: 1. Caches are growing in size. 2. Innovation targeted towards higher bandwidth for memory systems: SDRAM - synchronous DRAM RDRAM - Rambus DRAM EDORAM - extended data out SRAM Three-dimensional RAM hyper-page mode DRAM video RAM multibank DRAM Spring 2005 EECS150 - Lec17-mem1 Page 14 7

Important DRAM Examples: EDO - extended data out (similar to fast-page mode) RAS cycle fetched rows of data from cell array blocks (long access time, around 100ns) Subsequent CAS cycles quickly access data from row buffers if within an address page (page is around 256 Bytes) SDRAM - synchronous DRAM clocked interface uses dual banks internally. Start access in one bank then next, then receive data from first then second. DDR - Double data rate SDRAM Uses both rising (positive edge) and falling (negative) edge of clock for data transfer. (typical 100MHz clock with 200 MHz transfer). RDRAM - Rambus DRAM Entire data blocks are access and transferred out on a high-speed bus-like interface (500 MB/s, 1.6 GB/s) Tricky system level design. More expensive memory chips. Spring 2005 EECS150 - Lec17-mem1 Page 15 Non-volatile Memory Used to hold fixed code (ex. BIOS), tables of data (ex. FSM next state/output logic), slowly changing values (date/time on computer) Mask ROM Used with logic circuits for tables etc. Contents fixed at IC fab time (truly write once!) EPROM (erasable programmable) & FLASH requires special IC process (floating gate technology) writing is slower than RAM. EPROM uses special programming system to provide special voltages and timing. reading can be made fairly fast. rewriting is very slow. erasure is first required, EPROM - UV light exposure, EEPROM electrically erasable Spring 2005 EECS150 - Lec17-mem1 Page 16 8

FLASH Memory Electrically erasable In system programmability and erasability (no special system or voltages needed) On-chip circuitry (FSM) and voltage generators to control erasure and programming (writing) Erasure happens in variable sized "sectors" in a flash (16K - 64K Bytes) See: http://developer.intel.com/design/flash/ for product descriptions, etc. Compact flash cards are based on this type of memory. Spring 2005 EECS150 - Lec17-mem1 Page 17 Memory Specification in Verilog Memory modeled by an array of registers: reg[15:0] memword[0:1023]; // 1,024 registers of 16 bits each //Example Memory Block Specification // Uses enable to control both write and read //----------------------------- //Read and write operations of memory. //Memory size is 64 words of 4 bits each. module memory (Enable,ReadWrite,Address,DataIn,DataOut); input Enable,ReadWrite; input [3:0] DataIn; input [5:0] Address; output [3:0] DataOut; reg [3:0] DataOut; reg [3:0] Mem [0:63]; //64 x 4 memory always @ (Enable or ReadWrite) if (Enable) if (ReadWrite) DataOut = Mem[Address]; //Read else Mem[Address] = DataIn; //Write else DataOut = 4'bz; //High impedance state endmodule Spring 2005 EECS150 - Lec17-mem1 Page 18 9

Memory Blocks in FPGAs LUTs can double as small RAM blocks: 4-LUT is really a 16x1 memory. Normally we think of the contents being written from the configuration bit stream, but Virtex architecture (and others) allow bits of LUT to be written and read from the general interconnect structure. achieves 16x density advantage over using CLB flipflops. Furthermore, the two LUTs within a slice can be combined to create a 16 x 2-bit or 32 x 1-bit synchronous RAM, or a 16x1-bit dual-port synchronous RAM. The Virtex-E LUT can also provide a 16-bit shift register of adjustable length. Newer FPGA families include larger on-chip RAM blocks (usually dual ported): Called block selectrams in Xilinx Virtex series 4k bits each RAM16X1S Spring 2005 EECS150 - Lec17-mem1 Page 19 WE D WCLK D WCLK A0 A1 A2 A3 DPRA0 DPRA1 DPRA2 DPRA3 WE A0 A1 A2 A3 RAM16X1D X4950 SPO DPO X4942 one read port, one write port O Synchronous write, asynchronous read D CE CLK A0 A1 A2 A3 SRL16E_1 X8421 Q Length controlled by A0-A3 Verilog Specification for Virtex LUT RAM module ram16x1(q, a, d, we, clk); output q; input d; input [3:0] a; input clk, we; reg mem [15:0]; always @(posedge clk) begin if(we) mem[a] <= d; end assign q = mem[a]; endmodule Note: synchronous write and asynchronous read. WE D WCLK A0 A1 A2 A3 RAM16X1S X4942 Deeper and/or wider RAMs can be specified and the synthesis tool will do the job of wiring together multiple LUTs. How does the synthesis tool choose to implement your RAM as a collection of LUTs or as block RAMs? O Spring 2005 EECS150 - Lec17-mem1 Page 20 10

Each block SelectRAM (block RAM) is a fully synchronous (synchronous write and read) dual-ported (true dual port) 4096-bit RAM with independent control signals for each port. The data widths of the two ports can be configured independently, providing built-in bus-width conversion. CLKA and CLKB can be independent, providing an easy way to cross clock bounders. Around 160 of these on the 2000E. Multiples can be combined to implement, wider or deeper memories. See chapter 8 of Synplify reference manual on how to write Verilog for implied Block RAMs. Or instead, explicitly instantiate as primitive (project checkpoint will use this method). Virtex Block RAMs Spring 2005 EECS150 - Lec17-mem1 Page 21 Tabl WEA ENA RSTA CLKA ADDRA[#:0] DIA[#:0] WEB ENB RSTB CLKB ADDRB[#:0] DIB[#:0] DOA[#:0] DOB[#:0] 5: Block SelectRAM Port Aspect Ratios Width Depth ADDR Bus Data Bus 1 4096 ADDR<11:0> DATA<0> 2 2048 ADDR<10:0> DATA<1:0> 4 1024 ADDR<9:0> DATA<3:0> 8 512 ADDR<8:0> DATA<7:0> 16 256 ADDR<7:0> DATA<15:0> Relationship between Memory and CL Memory blocks can be (and often are) used to implement combinational logic functions: Examples: LUTs in FPGAs 1Mbit x 8 EPROM can implement 8 independent functions each of log 2 (1M)=20 inputs. The decoder part of a memory block can be considered a minterm generator. The cell array part of a memory block can be considered an OR function over a subset of rows. The combination gives us a way to implement logic functions directly in sum of products form. Several variations on this theme exist in a set of devices called Programmable logic devices (PLDs) Spring 2005 EECS150 - Lec17-mem1 Page 22 11

A ROM as AND/OR Logic Device Spring 2005 EECS150 - Lec17-mem1 Page 23 PLD Summary Spring 2005 EECS150 - Lec17-mem1 Page 24 12

PLA Example Spring 2005 EECS150 - Lec17-mem1 Page 25 PAL Example Spring 2005 EECS150 - Lec17-mem1 Page 26 13