Lecture 11: MOS Memory

Similar documents
Lecture 5: Gate Logic Logic Optimization

Static-Noise-Margin Analysis of Conventional 6T SRAM Cell at 45nm Technology

Semiconductor Memories

Class 18: Memories-DRAMs

Module 7 : I/O PADs Lecture 33 : I/O PADs

Layout of Multiple Cells

CS250 VLSI Systems Design Lecture 8: Memory

Class 11: Transmission Gates, Latches

ECE410 Design Project Spring 2008 Design and Characterization of a CMOS 8-bit Microprocessor Data Path

Chapter 9 Semiconductor Memories. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan

Memory Systems. Static Random Access Memory (SRAM) Cell

Winbond W971GG6JB-25 1 Gbit DDR2 SDRAM 65 nm CMOS DRAM Process

Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems

Memory Basics. SRAM/DRAM Basics

1. Memory technology & Hierarchy

CHAPTER 16 MEMORY CIRCUITS

Chapter 10 Advanced CMOS Circuits

Chapter 5 :: Memory and Logic Arrays

Contents. Overview Memory Compilers Selection Guide

Pass Gate Logic An alternative to implementing complex logic is to realize it using a logic network of pass transistors (switches).

ECE124 Digital Circuits and Systems Page 1

Layout and Cross-section of an inverter. Lecture 5. Layout Design. Electric Handles Objects. Layout & Fabrication. A V i

EE 42/100 Lecture 24: Latches and Flip Flops. Rev B 4/21/2010 (2:04 PM) Prof. Ali M. Niknejad

Lecture 7: Clocking of VLSI Systems

Sequential Circuit Design

RAM & ROM Based Digital Design. ECE 152A Winter 2012

Low Power AMD Athlon 64 and AMD Opteron Processors

Modeling Sequential Elements with Verilog. Prof. Chien-Nan Liu TEL: ext: Sequential Circuit

Computer Architecture

A N. O N Output/Input-output connection

Topics of Chapter 5 Sequential Machines. Memory elements. Memory element terminology. Clock terminology

Interfacing 3V and 5V applications

NTE2053 Integrated Circuit 8 Bit MPU Compatible A/D Converter

Memory. The memory types currently in common usage are:

StarRC Custom: Next-Generation Modeling and Extraction Solution for Custom IC Designs

A New Paradigm for Synchronous State Machine Design in Verilog

GETTING STARTED WITH PROGRAMMABLE LOGIC DEVICES, THE 16V8 AND 20V8

Sequential 4-bit Adder Design Report

Intel Q3GM ES 32 nm CPU (from Core i5 660)

1.1 Silicon on Insulator a brief Introduction

NEW adder cells are useful for designing larger circuits despite increase in transistor count by four per cell.

Module 4 : Propagation Delays in MOS Lecture 22 : Logical Effort Calculation of few Basic Logic Circuits

Evaluating Embedded Non-Volatile Memory for 65nm and Beyond

Memory unit. 2 k words. n bits per word

CHAPTER 11: Flip Flops

MAS.836 HOW TO BIAS AN OP-AMP

Module 2. Embedded Processors and Memory. Version 2 EE IIT, Kharagpur 1

Lecture 10: Latch and Flip-Flop Design. Outline

Introduction to CMOS VLSI Design

INSTITUTE OF AERONAUTICAL ENGINEERING Dundigal, Hyderabad

PLL frequency synthesizer

Here we introduced (1) basic circuit for logic and (2)recent nano-devices, and presented (3) some practical issues on nano-devices.

University of Texas at Dallas. Department of Electrical Engineering. EEDG Application Specific Integrated Circuit Design

Design of Low Power One-Bit Hybrid-CMOS Full Adder Cells

CMOS Binary Full Adder

Design and analysis of flip flops for low power clocking system

Using The PIC I/O Ports

Chapter 2 Logic Gates and Introduction to Computer Architecture

Three-Phase Dual-Rail Pre-Charge Logic

Lecture 39: Intro to Differential Amplifiers. Context

Chapter 7 Memory and Programmable Logic

Leakage Power Reduction Using Sleepy Stack Power Gating Technique

Chapter 19 Operational Amplifiers

Lecture 24. Inductance and Switching Power Supplies (how your solar charger voltage converter works)

The 104 Duke_ACC Machine

Sequential Logic: Clocks, Registers, etc.

Lecture 060 Push-Pull Output Stages (1/11/04) Page ECE Analog Integrated Circuits and Systems II P.E. Allen

3D NAND Technology Implications to Enterprise Storage Applications

Clocking. Figure by MIT OCW Spring /18/05 L06 Clocks 1

This application note is written for a reader that is familiar with Ethernet hardware design.

Lesson 12 Sequential Circuits: Flip-Flops

e.g. τ = 12 ps in 180nm, 40 ps in 0.6 µm Delay has two components where, f = Effort Delay (stage effort)= gh p =Parasitic Delay

(Refer Slide Time: 00:01:16 min)

Lecture 5: Logical Effort

Memory Testing. Memory testing.1

Small Signal Analysis of a PMOS transistor Consider the following PMOS transistor to be in saturation. Then, 1 2

Latch Timing Parameters. Flip-flop Timing Parameters. Typical Clock System. Clocking Overhead

Optimization and Comparison of 4-Stage Inverter, 2-i/p NAND Gate, 2-i/p NOR Gate Driving Standard Load By Using Logical Effort

COMBINATIONAL and SEQUENTIAL LOGIC CIRCUITS Hardware implementation and software design

International Journal of Electronics and Computer Science Engineering 1482

Digital to Analog Converter. Raghu Tumati

Lecture 30: Biasing MOSFET Amplifiers. MOSFET Current Mirrors.

«A 32-bit DSP Ultra Low Power accelerator»

DIGITAL-TO-ANALOGUE AND ANALOGUE-TO-DIGITAL CONVERSION

How To Calculate The Power Gain Of An Opamp

Lecture 10 Sequential Circuit Design Zhuo Feng. Z. Feng MTU EE4800 CMOS Digital IC Design & Analysis 2010

ECE 3401 Lecture 7. Concurrent Statements & Sequential Statements (Process)

- 35mA Standby, mA Speaking pre-defined phrases with up to 1925 total characters.

AUTOMATIC NIGHT LAMP WITH MORNING ALARM USING MICROPROCESSOR

Sequential Circuits. Combinational Circuits Outputs depend on the current inputs

Switch Fabric Implementation Using Shared Memory

A Dual-Mode NAND Flash Memory: 1-Gb Multilevel and High-Performance 512-Mb Single-Level Modes

Lecture 11: Sequential Circuit Design

ECEN 1400, Introduction to Analog and Digital Electronics

VENDING MACHINE. ECE261 Project Proposal Presentaion. Members: ZHANG,Yulin CHEN, Zhe ZHANG,Yanni ZHANG,Yayuan

Chapter 6: From Digital-to-Analog and Back Again

SLC vs MLC NAND and The Impact of Technology Scaling. White paper CTWP010

Application Note, V 2.2, Nov AP32091 TC1766. Design Guideline for TC1766 Microcontroller Board Layout. Microcontrollers. Never stop thinking.

Low Power and Reliable SRAM Memory Cell and Array Design

Transmission Line Terminations It s The End That Counts!

Transcription:

Lecture 11: MOS Memory Mark Horowitz Computer Systems Laboratory Stanford University horowitz@stanford.edu MAH EE271 Lecture 11 1

Memory Reading W&E 8.3.1-8.3.2 - Memory Design Introduction Memories are one of the most useful VLSI building blocks. One reason for their utility is that memory arrays can be extremely dense. This density results from their very regular wiring. Memories come in many different types (RAM, ROM, EEPROM) and there are many different types of cells, but the basic idea and organization is pretty similar. We will look at the most common memory cell that is used today, a 6T sram cell, and then look at the other components needed to build complete memory system. We will also look at other types of memories. MAH EE271 Lecture 11 2

Best Case for Structured Wiring Best example is a memory array: decoder mux It has N 2 elements and only 2N wires. It is an easy way to use 1M transistors. The layout is quite dense. MAH EE271 Lecture 11 3

Basic Memory Cell Could use a basic latch cell: Phi Bus Loadq1_b Load_q1 Read_b Read But the cell needs to be a static latch Needs a large number of control wires to be able to read and write the cell. These wires (and transistors) make the cell pretty large. MAH EE271 Lecture 11 4

Latch Cell Layout The cell is 78 wide by 55 tall, the top and bottom rails can be shared Notice that the wires take all the space MAH EE271 Lecture 11 5

Small Memory Cell Often need to have a large number of bits stored: In some cases more bits are better Willing to take some time to optimize cell Have enough cells it is ok to make peripheral circuits more complex Lead to many innovative cell designs 6T RAM cells 4T RAM cell with poly loads 1T DRAM cell And lots of strange layouts We will look at the 6T RAM, which is the key to all memory cells MAH EE271 Lecture 11 6

Static RAM Uses only six transistors: Bit_b Bit Read and write use the same port. There is one wordline and two bit lines. The bit lines carry the data. The cell is small since it has a small number of wires. MAH EE271 Lecture 11 7

SRAM The key issue in an 6T SRAM is how to distinguish between read and writes. There is only one wordline, so it must be high for both reads and writes. The key is to use the fact there are two bitlines. Read: Both Bit and Bit must start high. A high value on the bitline does not change the value in the cell, so the cell will pulls one of the lines low Write: One (Bit or Bit) is forced low, the other is high This low value overpowers the pmos in the inverter, and this will write the cell. MAH EE271 Lecture 11 8

SRAM Cell Design For the cell to work correctly a zero on the bit line must over power the pmos pull up, but a one on the bit line must not over power the pull down (otherwise reads would not work) For the pull down M3 is passing a zero, so for it to overpower the pmos it must be at least as wide (preferably 1.5x as wide). This gives a 2-3:1 current ratio between the nmos and the pmos. For pull up M3 is passing a one so it is somewhat weaker. Still M3 should be 1.5 to 2x smaller than M1 to make sure a read does not disturb the value of the cell. MAH EE271 Lecture 11 9

SRAM Layout There are many clever SRAM layouts. This is a common one: Vdd Wordline Gnd Bit_b Bit Cell boundary This layout is fairly dense, since the most of the contacts (bitline, Vdd, Gnd) are shared. Also the a clever cross-coupling method is used. MAH EE271 Lecture 11 10

SRAM Layout A conservative cell: - It has substrate and well connects in each cell - It has a wordline poly contact in each cell - pmos transistors are very weak (3:3) - nmos pulldown is 8:2 - All the boundaries are shared - 41 x 28, about 1/4 the size of latch cell A slightly smaller cell - Only nwell contact in cell - pmos transistors are very weak (3:3) - nmos pulldown is 6:2-36 x 28 White box is the repeat box. Cells overlap contacts MAH EE271 Lecture 11 11

SRAM Array This is an array of 3 cells wide and 2 high. Shows how the contacts are shared in both X and Y MAH EE271 Lecture 11 12

Memory Array Lets look at how this cell works in an array: decoder The decoder selects one cell on each set of bit lines. All the other wordlines are low, disconnecting those cells from the bitlines. If you don t need to read all the bits at once, you can add a mux to combine the bitlines into fewer IO lines. MAH EE271 Lecture 11 13

Bitline IO Circuit - Read For reads both bitlines must be high, for write you need to drive the bitlines to the correct value. Bitlines need to be precharged, or use a pseudo nmos load We will use a precharged structure: To avoid a conflict during precharge, make wordline a qualified clock. Φ2 Read_v1 Wordline_q1 bit_b_v1 Read_b_v1 bit_v1 Bitlines are like the outputs of normal precharge gates - _v signals MAH EE271 Lecture 11 14

Bitline IO Circuit - Write Similar to read, but need to drive the bit lines too. Data_s1 *Instead of Φ1 could be ANDed with Φ2 Write_q1* Wordline_q1 bit_b_v1 bit_v1 Φ2 It is safer if the write drivers are complete tristate buffer (had a pullup device too) rather than just pulldowns. This will allow the driver to pull up a bitline that was partially discharged by the cell (if the wordline rises before the write signal) Notice that since the memory cell is a storage element, its enable (the wordline) needs to be a _q signal. That will ensure that the clock falls latching in the data BEFORE the data has a chance to change. The wordline is really the clock to the latch (memory) cell. Need to isolate the write driver so it does not fight with the precharge (power issue), which is why the write signal is qualified. MAH EE271 Lecture 11 15

Write Drivers Also need to worry about the series resistance of the driver The resistance of the 3 series nmos transistors must still be 2x less than the resistance of the pmos in the cell Wordline_q1 Φ2 If the pass device in the cell is 4:2, and the pmos load is 3:4, then each nmos in the driver must be at least 8:2. If the pmos was 3:2, it would be hard to get the cell to write in irsim. MAH EE271 Lecture 11 16

Write Alternative MAH EE271 Lecture 11 17

+ Bitline Muxing The bitline pitch is pretty small (about 28λ) and there is a lot of stuff that is needed for the bitline (read and write circuits). Often many bitlines are muxed together, and one set of IO circuits is used for these bitlines Two basic options: Share mux between read and write circuits + Least amount of logic needed on bit pitch - Adds another series device to write drivers, need to use single write device Use different mux for read and write + No series devices in write path Write mux can be qualified with Write & Clock - Adds some more muxes to bit logic MAH EE271 Lecture 11 18

+ Bitline Mux, Option 1 Should have a precharge on the output of the mux too, since otherwise the output will have a degraded high level. MAH EE271 Lecture 11 19

+ Muxing Example Uses separate mux for read and write Notice that the read mux is precharged You don t need to use pmos devices in the mux Data_s1 A0_s1 WriteOdd_q1 Φ2 Φ2 Write Driver Read Mux A0_s1 Write Mux Cells WriteEven_q1 Bitline Precharge MAH EE271 Lecture 11 20

+ Sense Amplifiers Since Cdiff ~ Cgate, the diffusion contacts of access devices cause large cap on the bitlines, for large arrays Bitline cap becomes an issues around 32 cells/bitline Example: If the diffusion contacts are shared (adjacent cells), 128 cells @4fF/ 2cells = 256fF + wire cap. This would lead to access times of around 3ns. Can take advantage of the differential nature of the bitlines Decrease delay by sensing smaller signals Noise margin is ok, most noise is common mode Build a differential amplifier (sense amp) (Not needed in EE271) MAH EE271 Lecture 11 21

+ Sense Amps Gnd connection goes to sense clk MAH EE271 Lecture 11 22

+ RAM Variations There are many variations to the basic 6T SRAM cell. Some are cells with more functionality, while others are smaller memory cells (that are dynamic, not static). We won t talk about them much in the class, but I will go through them in the notes, so you can see what is possible. Options: Dual read or single write cell True multiported cells (for registerfiles, etc.) Content addressable cells (CAMs) 4T dynamic memory cell 3T 1T DRAM cell MAH EE271 Lecture 11 23

+ Dual Ported Cell Split wordline so there are two wordlines, one for each pass transistor. WL2 WL1 Bit_b Bit Nearly the same size as SRAM, 46 x 28 Can read two different cells in one cycle or perform one write. - Raise WL1 on Register 5, and WL2 on Register 7. Register 5 value will be on bit_b (complemented, of course), and Register 7 will be on bit. Since you need both bit and bit_b to write the cell, you can only do one write per cycle. MAH EE271 Lecture 11 24

+ Multiported Memory Can build true multi-ported memory cell, by adding more bitline pairs and wordlines to a cell. WL1 Bit2_b Bit1_b Bit1 Bit2 WL2 Shown in the figure is a true dual port cell. You can read or write on each port every cycle. Since it has more bitlines than the previous cell, it is much larger in area. MAH EE271 Lecture 11 25

+ Content Addressable Memory (CAM) In some applications it is nice to find out if anything in the memory matches a certain key value. This can be done by a special memory cell called a CAM cell. Each CAM cell contains an XOR gate that compares the cell value with the data on the bitlines. If this bit matches, nothing happens. If it does not match the bitline value, it pulls the match line low. Connecting all the bits in a word to a precharged Match line (the bitline must be low during the match line precharge) means that the Match line will remain high, only if the value in the memory matches the key. One can even have don t cares in the key by driving both bitlines low. Bit_b Bit Vdd WL Match Gnd MAH EE271 Lecture 11 26

+ 4T RAM and 3T RAM Take 6T cell and remove pmos transistors Makes cell smaller (28 x 28) Circuit design is much harder - Internal nodes don t go to Vdd - Cell won t work at low Vdd - High value stored is degraded, Bit_b effective strength of nmos pulldown is reduced Bit Make this circuit single ended, and get 3T cell Need 2 wordlines, Read WL, Write WL Can have 1 or 2 bit lines (Read / Write) Not very small, since it has more wires ReadBit_b WriteBit MAH EE271 Lecture 11 27

+ 1T DRAM WriteBit Continue to remove transistors and wires Get 1 T, one WL, and one BL Pretty weird, no read bus - Read by cap charge sharing When WL goes high, cap shares charge with bitline, changing its value slightly. It is this change in value that is detected by the sense amps - Read are destructive Once the charge sharing occurs, the cell does not have its value any more. After every read, the cell need to be rewritten by driving the bitline high (or low) before the wordline is lowered - Wordlines generally go > Vdd Would like to get as much charge as possible on the cap. Loosing a V th does not leave enough signal on the capacitor for sensing. MAH EE271 Lecture 11 28