CS250 VLSI Systems Design Lecture 8: Memory
|
|
|
- Nicholas Murphy
- 9 years ago
- Views:
Transcription
1 CS250 VLSI Systems esign Lecture 8: Memory John Wawrzynek, Krste Asanovic, with John Lazzaro and Yunsup Lee (TA) UC Berkeley Fall 2010
2 CMOS Bistable 1 0 Flip State 0 1 Cross-coupled inverters used to hold state in CMOS Static storage in powered cell, no refresh needed If a storage node leaks or is pushed slightly away from correct value, non-linear transfer function of high-gain inverter removes noise and recirculates correct value To write new state, have to force nodes to opposite state 2
3 CMOS Transparent Latch Latch transparent (output follows input) when clock is high, holds last value when clock is low Q Optional Input Buffer Optional Output Buffer Transmission gate switch with both pmos and nmos passes both ones and zeros well Schematic Symbols Q 3 Q Transparent on clock low
4 Latch Operation 0 1 Q Q Q Clock High Latch Transparent Clock Low Latch Holding 4
5 Flip-Flop as Two Latches Sample Hold Q Q Schematic Symbols Q This is how standard cell flip-flops are built (usually with extra in/out buffers) 5
6 Small Memories from Stdcell Latches Write Address Write ata Read Address Write Address ecoder ata held in transparent-low latches Read Address ecoder Write by clocking latch Add additional ports by replicating read and write port logic (multiple write ports need mux in front of latch) Expensive to add many ports Combinational logic for read port (synthesized) Optional read output latch 6
7 6-Transistor SRAM (Static RAM) Wordline Bit Bit Large on-chip memories built from arrays of static RAM bitcells, where each bit cell holds a bistable (crosscoupled inverters) and two access transistors. Other clocking and access logic factored out into periphery 7
8 )*%!+,-%."/$%012# Intel s 22nm SRAM cell um 2 SRAM cell for high density applications um 2 SRAM cell for low voltage applications!"!#$%&' $ ()%*+,%)'-..,)*%/012%3,..% 8 (4%5678(49%3(73&(*)%7,:67*,;%*6%;-*, [Bohr, Intel, Sept 2009]
9 General SRAM Structure Bitline Prechargers Address ecode and Wordline river Usually maximum of bits per row or column Address Write Enable ifferential Read Sense Amplifiers ifferential Write rivers Write ata 9 Read ata
10 Unary 1-of-4 encoding Address ecoder Structure Word Line 0 Word Line 1 Word Line 15 2:4 Predecoders A0 A1 A2 Address 10 A3 Clocked Word Line Enable
11 Read Cycle Prechargers From ecoder Storage Cells Bit/Bit Wordline Bitline differential Wordline Clock Sense Bit Bit ata/ata Sense Sense Amp 1) 2) 3) Full-rail swing ata Output Set-Reset Latch ata 1) Precharge bitlines and senseamp 2) Pulse wordlines, develop bitline differential voltage 3) isconnect bitlines from senseamp, activate sense pulldown, develop full-rail data signals Pulses generated by internal self-timed signals, often using replica circuits representing critical paths 11
12 Write Cycle Prechargers Bit/Bit From ecoder Storage Cells Wordline 1) 2) Wordline Clock Bit Bit 1) Precharge bitlines 2) Open wordline, pull down one bitline full rail Write Enable Write ata Write-enable can be controlled on a per-bit level. If bit lines not driven during write, cell retains value (looks like a read to the cell). 12
13 Column-Muxing at Sense Amps From ecoder Wordline Clock Sel0 Sel1 ata Sense Amp ifficult to pitch match sense amp to tight SRAM bit cell spacing so often 2-8 columns share one sense amp. Impacts power dissipation as multiple bitline pairs swing for each bit read. 13 ata
14 Building Larger Memories Bit cells I/O Bit cells Bit cells I/O Bit cells e c e c e c e c Bit cells I/O Bit cells Bit cells I/O Bit cells Bit cells I/O Bit cells Bit cells I/O Bit cells e c e c e c e c Bit cells I/O Bit cells Bit cells I/O Bit cells Large arrays constructed by tiling multiple leaf arrays, sharing decoders and I/O circuitry e.g., sense amp attached to arrays above and below Leaf array limited in size to bits in row/column due to RC delay of wordlines and bitlines Also to reduce power by only activating selected sub-bank In larger memories, delay and energy dominated by I/O wiring 14
15 Adding More Ports WordlineB WordlineA BitB BitB ifferential Read or Write ports BitA BitA Wordline 15 Read Bitline Optional Single-ended Read port
16 Memory Compilers In ASIC flow, memory compilers used to generate layout for SRAM blocks in design Often hundreds of memory instances in a modern SoC Memory generators can also produce built-in self-test (BIST) logic, to speed manufacturing testing, and redundant rows/ columns to improve yield Compiler can be parameterized by number of words, number of bits per word, desired aspect ratio, number of sub banks, degree of column muxing, etc. Area, delay, and energy consumption complex function of design parameters and generation algorithm Worth experimenting with design space Usually only single read or write port SRAM and one read and one write SRAM generators in ASIC library 16
17 Small Memories Compiled SRAM arrays usually have a high overhead due to peripheral circuits, BIST, redundancy. Small memories are usually built from latches and/or flipflops in a stdcell flow Cross-over point is usually around 1K bits of storage Should try design both ways 17
18 Memory esign Patterns 18
19 Multiport Memory esign Patterns Often we require multiple access ports to a common memory True Multiport Memory As describe earlier in lecture, completely independent read and write port circuitry Banked Multiport Memory Interleave lesser-ported banks to provide higher bandwidth Stream-Buffered Multiport Memory Use single wider access port to provide multiple narrower streaming ports Cached Multiport Memory Use large single-port main memory, but add cache to service 19
20 True Multiport Memory Problem: Require simultaneous read and write access by multiple independent agents to a shared common memory. Solution: Provide separate read and write ports to each bit cell for each requester Applicability: Where unpredictable access latency to the shared memory cannot be tolerated. Consequences: High area, energy, and delay cost for large number of ports. Must define behavior when multiple writes on same cycle to same word (e.g., prohibit, provide priority, or combine writes). 20
21 True Multiport Example: Itanium-2 Regfile IEEE JOURNAL OF SOLI-STATE CIRCUITS, VOL. 37, NO. 11, N Intel Itanium-2 [Fetzer et al, IEEE JSSCC 2002] 21
22 Itanium-2 Regfile Timing Fig. 1. IEU state and timing diagram. 22
23 Banked Multiport Memory Problem: Require simultaneous read and write access by multiple independent agents to a large shared common memory. Solution: ivide memory capacity into smaller banks, each of which has fewer ports. Requests are distributed across banks using a fixed hashing scheme. Multiple requesters arbitrate for access to same bank/port. Applicability: Requesters can tolerate variable latency for accesses. Accesses are distributed across address space so as to avoid hotspots. Consequences: Requesters must wait arbitration delay to determine if request will complete. Have to provide interconnect between each requester and each bank/port. Can have greater, equal, or lesser number of banks*ports/bank compared to total number of external access ports. 23
24 Banked Multiport Memory Port A Port B Arbitration and Crossbar Bank 0 Bank 1 Bank 2 Bank 3 24
25 Banked Multiport Memory Example Pentium (P5) 8-way interleaved data cache, with two ports ache, all rocessor that the re band- CPU can that the ith data pipeline. ferences Third, in r design, nd data a unified 8-Kbyte, ual-ported TLB ual-ported cache tags Bank conflict detection Figure 7. ual-access data cache. 25 b 7 7 Singe-ported and interleaved cache data I I I [Alpert et al, IEEE Micro, May 1993] frequency.) For the Pentium microprocessor, with its higher
26 Stream-Buffered Multiport Memory Problem: Require simultaneous read and write access by multiple independent agents to a large shared common memory, where each requester usually makes multiple sequential accesses. Solution: Organize memory to have a single wide port. Provide each requester with an internal stream buffer that holds width of data returned/ consumed by each access. Each requester can access stream buffer without contention, but arbitrates with others to read/write stream buffer. Applicability: Requesters make mostly sequential requests and can tolerate variable latency for accesses. Consequences: Requesters must wait arbitration delay to determine if request will complete. Have to provide stream buffers for each requester. Need sufficient access width to serve aggregate bandwidth demands of all requesters, but wide data access can be wasted if not all used by requester. Have to specify memory consistency model between ports (e.g., provide stream flush operations). 26
27 Stream-Buffered Multiport Memory Stream Buffer A Stream Buffer B Port A Port B Arbitration Wide Memory 27
28 Stream-Buffered Multiport Examples IBM Cell microprocessor local store [Chen et al., IBM, 2005] The SPUs SIM support can perform operations on sixteen 8-bit integers, eight 16-bit integers, four 32-bit integers, or four single-precision floating-point numbers per cycle. At 3.2GHz, each SPU is capable of perfo up to Lecture 51.2 billion 8, Memory 8-bit integer operations or 25.6GFLOPs 28 in single precision. CS250, Figure UC 3 Berkeley, shows Fall the 2010 main func units in an SPU: (1) an SPU floating point unit for single-precision, double-precision, and integer multiplies
29 Cached Multiport Memory Problem: Require simultaneous read and write access by multiple independent agents to a large shared common memory. Solution: Provide each access port with a local cache of recently touched addresses from common memory, and use a cache coherence protocol to keep the cache contents in sync. Applicability: Request streams have significant temporal locality, and limited communication between different ports. Consequences: Requesters will experience variable delay depending on access pattern and operation of cache coherence protocol. Tag overhead in both area, delay, and energy/access. Complexity of cache coherence protocol. 29
30 Cached Multiport Memory Port A Port B Cache A Cache B Arbitration and Interconnect Common Memory 30
31 Replicated State Multiport Memory Problem: Require simultaneous read and write access by multiple independent agents to a small shared common memory. Cannot tolerate variable latency of access. Solution: Replicate storage and divide read ports among replicas. Each replica has enough write ports to keep all replicas in sync. Applicability: Many read ports required, and variable latency cannot be tolerated. Consequences: Potential increase in latency between some writers and some readers. 31
32 Replicated State Multiport Memory Write Port 0 Write Port 1 Copy 0 Copy 1 Read Ports 32 Example: Alpha Regfile clusters
33 Memory Hierarchy esign Patterns Use small fast memory together large slow memory to provide illusion of large fast memory. Explicitly managed local stores Automatically managed cache hierarchies 33
Static-Noise-Margin Analysis of Conventional 6T SRAM Cell at 45nm Technology
Static-Noise-Margin Analysis of Conventional 6T SRAM Cell at 45nm Technology Nahid Rahman Department of electronics and communication FET-MITS (Deemed university), Lakshmangarh, India B. P. Singh Department
CHAPTER 16 MEMORY CIRCUITS
CHPTER 6 MEMORY CIRCUITS Chapter Outline 6. atches and Flip-Flops 6. Semiconductor Memories: Types and rchitectures 6.3 Random-ccess Memory RM Cells 6.4 Sense-mplifier and ddress Decoders 6.5 Read-Only
Memory Basics. SRAM/DRAM Basics
Memory Basics RAM: Random Access Memory historically defined as memory array with individual bit access refers to memory with both Read and Write capabilities ROM: Read Only Memory no capabilities for
Chapter 2 Logic Gates and Introduction to Computer Architecture
Chapter 2 Logic Gates and Introduction to Computer Architecture 2.1 Introduction The basic components of an Integrated Circuit (IC) is logic gates which made of transistors, in digital system there are
Chapter 5 :: Memory and Logic Arrays
Chapter 5 :: Memory and Logic Arrays Digital Design and Computer Architecture David Money Harris and Sarah L. Harris Copyright 2007 Elsevier 5- ROM Storage Copyright 2007 Elsevier 5- ROM Logic Data
Clocking. Figure by MIT OCW. 6.884 - Spring 2005 2/18/05 L06 Clocks 1
ing Figure by MIT OCW. 6.884 - Spring 2005 2/18/05 L06 s 1 Why s and Storage Elements? Inputs Combinational Logic Outputs Want to reuse combinational logic from cycle to cycle 6.884 - Spring 2005 2/18/05
EE 42/100 Lecture 24: Latches and Flip Flops. Rev B 4/21/2010 (2:04 PM) Prof. Ali M. Niknejad
A. M. Niknejad University of California, Berkeley EE 100 / 42 Lecture 24 p. 1/20 EE 42/100 Lecture 24: Latches and Flip Flops ELECTRONICS Rev B 4/21/2010 (2:04 PM) Prof. Ali M. Niknejad University of California,
Computer Systems Structure Main Memory Organization
Computer Systems Structure Main Memory Organization Peripherals Computer Central Processing Unit Main Memory Computer Systems Interconnection Communication lines Input Output Ward 1 Ward 2 Storage/Memory
1. Memory technology & Hierarchy
1. Memory technology & Hierarchy RAM types Advances in Computer Architecture Andy D. Pimentel Memory wall Memory wall = divergence between CPU and RAM speed We can increase bandwidth by introducing concurrency
Low Power AMD Athlon 64 and AMD Opteron Processors
Low Power AMD Athlon 64 and AMD Opteron Processors Hot Chips 2004 Presenter: Marius Evers Block Diagram of AMD Athlon 64 and AMD Opteron Based on AMD s 8 th generation architecture AMD Athlon 64 and AMD
ECE410 Design Project Spring 2008 Design and Characterization of a CMOS 8-bit Microprocessor Data Path
ECE410 Design Project Spring 2008 Design and Characterization of a CMOS 8-bit Microprocessor Data Path Project Summary This project involves the schematic and layout design of an 8-bit microprocessor data
Memory. The memory types currently in common usage are:
ory ory is the third key component of a microprocessor-based system (besides the CPU and I/O devices). More specifically, the primary storage directly addressed by the CPU is referred to as main memory
Power Reduction Techniques in the SoC Clock Network. Clock Power
Power Reduction Techniques in the SoC Network Low Power Design for SoCs ASIC Tutorial SoC.1 Power Why clock power is important/large» Generally the signal with the highest frequency» Typically drives a
Sequential 4-bit Adder Design Report
UNIVERSITY OF WATERLOO Faculty of Engineering E&CE 438: Digital Integrated Circuits Sequential 4-bit Adder Design Report Prepared by: Ian Hung (ixxxxxx), 99XXXXXX Annette Lo (axxxxxx), 99XXXXXX Pamela
Module 2. Embedded Processors and Memory. Version 2 EE IIT, Kharagpur 1
Module 2 Embedded Processors and Memory Version 2 EE IIT, Kharagpur 1 Lesson 5 Memory-I Version 2 EE IIT, Kharagpur 2 Instructional Objectives After going through this lesson the student would Pre-Requisite
Lecture 10: Latch and Flip-Flop Design. Outline
Lecture 1: Latch and Flip-Flop esign Slides orginally from: Vladimir Stojanovic Computer Systems Laboratory Stanford University [email protected] 1 Outline Recent interest in latches and flip-flops
RAM & ROM Based Digital Design. ECE 152A Winter 2012
RAM & ROM Based Digital Design ECE 152A Winter 212 Reading Assignment Brown and Vranesic 1 Digital System Design 1.1 Building Block Circuits 1.1.3 Static Random Access Memory (SRAM) 1.1.4 SRAM Blocks in
8 Gbps CMOS interface for parallel fiber-optic interconnects
8 Gbps CMOS interface for parallel fiberoptic interconnects Barton Sano, Bindu Madhavan and A. F. J. Levi Department of Electrical Engineering University of Southern California Los Angeles, California
Contents. Overview... 5-1 Memory Compilers Selection Guide... 5-2
Memory Compilers 5 Contents Overview... 5-1 Memory Compilers Selection Guide... 5-2 CROM Gen... 5-3 DROM Gen... 5-9 SPSRM Gen... 5-15 SPSRM Gen... 5-22 SPRM Gen... 5-31 DPSRM Gen... 5-38 DPSRM Gen... 5-47
A New Paradigm for Synchronous State Machine Design in Verilog
A New Paradigm for Synchronous State Machine Design in Verilog Randy Nuss Copyright 1999 Idea Consulting Introduction Synchronous State Machines are one of the most common building blocks in modern digital
NEW adder cells are useful for designing larger circuits despite increase in transistor count by four per cell.
CHAPTER 4 THE ADDER The adder is one of the most critical components of a processor, as it is used in the Arithmetic Logic Unit (ALU), in the floating-point unit and for address generation in case of cache
Architectural Level Power Consumption of Network on Chip. Presenter: YUAN Zheng
Architectural Level Power Consumption of Network Presenter: YUAN Zheng Why Architectural Low Power Design? High-speed and large volume communication among different parts on a chip Problem: Power consumption
Efficient Interconnect Design with Novel Repeater Insertion for Low Power Applications
Efficient Interconnect Design with Novel Repeater Insertion for Low Power Applications TRIPTI SHARMA, K. G. SHARMA, B. P. SINGH, NEHA ARORA Electronics & Communication Department MITS Deemed University,
CHAPTER 11: Flip Flops
CHAPTER 11: Flip Flops In this chapter, you will be building the part of the circuit that controls the command sequencing. The required circuit must operate the counter and the memory chip. When the teach
Semiconductor Memories
Semiconductor Memories Semiconductor memories array capable of storing large quantities of digital information are essential to all digital systems Maximum realizable data storage capacity of a single
A New, High-Performance, Low-Power, Floating-Point Embedded Processor for Scientific Computing and DSP Applications
1 A New, High-Performance, Low-Power, Floating-Point Embedded Processor for Scientific Computing and DSP Applications Simon McIntosh-Smith Director of Architecture 2 Multi-Threaded Array Processing Architecture
Naveen Muralimanohar Rajeev Balasubramonian Norman P Jouppi
Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0 Naveen Muralimanohar Rajeev Balasubramonian Norman P Jouppi University of Utah & HP Labs 1 Large Caches Cache hierarchies
Switch Fabric Implementation Using Shared Memory
Order this document by /D Switch Fabric Implementation Using Shared Memory Prepared by: Lakshmi Mandyam and B. Kinney INTRODUCTION Whether it be for the World Wide Web or for an intra office network, today
Lecture 5: Gate Logic Logic Optimization
Lecture 5: Gate Logic Logic Optimization MAH, AEN EE271 Lecture 5 1 Overview Reading McCluskey, Logic Design Principles- or any text in boolean algebra Introduction We could design at the level of irsim
Nexus: An Asynchronous Crossbar Interconnect for Synchronous System-on-Chip Designs
Nexus: An Asynchronous Crossbar Interconnect for Synchronous System-on-Chip Designs Andrew Lines Fulcrum Microsystems 26775 Malibu Hills Road, Calabasas, CA 9131 lines@fulcrummicrocom Abstract Asynchronous
ESP-CV Custom Design Formal Equivalence Checking Based on Symbolic Simulation
Datasheet -CV Custom Design Formal Equivalence Checking Based on Symbolic Simulation Overview -CV is an equivalence checker for full custom designs. It enables efficient comparison of a reference design
Chapter 9 Semiconductor Memories. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan
Chapter 9 Semiconductor Memories Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan Advanced Reliable Systems (ARES) Lab. Jin-Fu Li, EE, NCU 2 Outline Introduction
Sequential Logic: Clocks, Registers, etc.
ENEE 245: igital Circuits & Systems Lab Lab 2 : Clocks, Registers, etc. ENEE 245: igital Circuits and Systems Laboratory Lab 2 Objectives The objectives of this laboratory are the following: To design
Introduction to CMOS VLSI Design
Introduction to CMOS VLSI esign Slides adapted from: N. Weste,. Harris, CMOS VLSI esign, Addison-Wesley, 3/e, 24 Introduction Integrated Circuits: many transistors on one chip Very Large Scale Integration
Topics of Chapter 5 Sequential Machines. Memory elements. Memory element terminology. Clock terminology
Topics of Chapter 5 Sequential Machines Memory elements Memory elements. Basics of sequential machines. Clocking issues. Two-phase clocking. Testing of combinational (Chapter 4) and sequential (Chapter
Alpha CPU and Clock Design Evolution
Alpha CPU and Clock Design Evolution This lecture uses two papers that discuss the evolution of the Alpha CPU and clocking strategy over three CPU generations Gronowski, Paul E., et.al., High Performance
PowerPC Microprocessor Clock Modes
nc. Freescale Semiconductor AN1269 (Freescale Order Number) 1/96 Application Note PowerPC Microprocessor Clock Modes The PowerPC microprocessors offer customers numerous clocking options. An internal phase-lock
So far we have investigated combinational logic for which the output of the logic devices/circuits depends only on the present state of the inputs.
equential Logic o far we have investigated combinational logic for which the output of the logic devices/circuits depends only on the present state of the inputs. In sequential logic the output of the
Sequential Circuits. Combinational Circuits Outputs depend on the current inputs
Principles of VLSI esign Sequential Circuits Sequential Circuits Combinational Circuits Outputs depend on the current inputs Sequential Circuits Outputs depend on current and previous inputs Requires separating
Lecture 11: Sequential Circuit Design
Lecture 11: Sequential Circuit esign Outline Sequencing Sequencing Element esign Max and Min-elay Clock Skew Time Borrowing Two-Phase Clocking 2 Sequencing Combinational logic output depends on current
NTE2053 Integrated Circuit 8 Bit MPU Compatible A/D Converter
NTE2053 Integrated Circuit 8 Bit MPU Compatible A/D Converter Description: The NTE2053 is a CMOS 8 bit successive approximation Analog to Digital converter in a 20 Lead DIP type package which uses a differential
Lecture 7: Clocking of VLSI Systems
Lecture 7: Clocking of VLSI Systems MAH, AEN EE271 Lecture 7 1 Overview Reading Wolf 5.3 Two-Phase Clocking (good description) W&E 5.5.1, 5.5.2, 5.5.3, 5.5.4, 5.5.9, 5.5.10 - Clocking Note: The analysis
S. Venkatesh, Mrs. T. Gowri, Department of ECE, GIT, GITAM University, Vishakhapatnam, India
Power reduction on clock-tree using Energy recovery and clock gating technique S. Venkatesh, Mrs. T. Gowri, Department of ECE, GIT, GITAM University, Vishakhapatnam, India Abstract Power consumption of
EE482: Advanced Computer Organization Lecture #11 Processor Architecture Stanford University Wednesday, 31 May 2000. ILP Execution
EE482: Advanced Computer Organization Lecture #11 Processor Architecture Stanford University Wednesday, 31 May 2000 Lecture #11: Wednesday, 3 May 2000 Lecturer: Ben Serebrin Scribe: Dean Liu ILP Execution
Memory Systems. Static Random Access Memory (SRAM) Cell
Memory Systems This chapter begins the discussion of memory systems from the implementation of a single bit. The architecture of memory chips is then constructed using arrays of bit implementations coupled
What is a System on a Chip?
What is a System on a Chip? Integration of a complete system, that until recently consisted of multiple ICs, onto a single IC. CPU PCI DSP SRAM ROM MPEG SoC DRAM System Chips Why? Characteristics: Complex
Interfacing Analog to Digital Data Converters
Converters In most of the cases, the PIO 8255 is used for interfacing the analog to digital converters with microprocessor. We have already studied 8255 interfacing with 8086 as an I/O port, in previous
A 10,000 Frames/s 0.18 µm CMOS Digital Pixel Sensor with Pixel-Level Memory
Presented at the 2001 International Solid State Circuits Conference February 5, 2001 A 10,000 Frames/s 0.1 µm CMOS Digital Pixel Sensor with Pixel-Level Memory Stuart Kleinfelder, SukHwan Lim, Xinqiao
Lecture 10: Sequential Circuits
Introduction to CMOS VLSI esign Lecture 10: Sequential Circuits avid Harris Harvey Mudd College Spring 2004 Outline q Sequencing q Sequencing Element esign q Max and Min-elay q Clock Skew q Time Borrowing
Modeling Sequential Elements with Verilog. Prof. Chien-Nan Liu TEL: 03-4227151 ext:34534 Email: [email protected]. Sequential Circuit
Modeling Sequential Elements with Verilog Prof. Chien-Nan Liu TEL: 03-4227151 ext:34534 Email: [email protected] 4-1 Sequential Circuit Outputs are functions of inputs and present states of storage elements
Memory Hierarchy. Arquitectura de Computadoras. Centro de Investigación n y de Estudios Avanzados del IPN. [email protected]. MemoryHierarchy- 1
Hierarchy Arturo Díaz D PérezP Centro de Investigación n y de Estudios Avanzados del IPN [email protected] Hierarchy- 1 The Big Picture: Where are We Now? The Five Classic Components of a Computer Processor
Computer Systems Structure Input/Output
Computer Systems Structure Input/Output Peripherals Computer Central Processing Unit Main Memory Computer Systems Interconnection Communication lines Input Output Ward 1 Ward 2 Examples of I/O Devices
International Journal of Electronics and Computer Science Engineering 1482
International Journal of Electronics and Computer Science Engineering 1482 Available Online at www.ijecse.org ISSN- 2277-1956 Behavioral Analysis of Different ALU Architectures G.V.V.S.R.Krishna Assistant
Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems
Harris Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems David Harris Harvey Mudd College [email protected] Based on EE271 developed by Mark Horowitz, Stanford University MAH
Layout of Multiple Cells
Layout of Multiple Cells Beyond the primitive tier primitives add instances of primitives add additional transistors if necessary add substrate/well contacts (plugs) add additional polygons where needed
Memory Testing. Memory testing.1
Memory Testing Introduction Memory Architecture & Fault Models Test Algorithms DC / AC / Dynamic Tests Built-in Self Testing Schemes Built-in Self Repair Schemes Memory testing.1 Memory Market Share in
1. True or False? A voltage level in the range 0 to 2 volts is interpreted as a binary 1.
File: chap04, Chapter 04 1. True or False? A voltage level in the range 0 to 2 volts is interpreted as a binary 1. 2. True or False? A gate is a device that accepts a single input signal and produces one
With respect to the way of data access we can classify memories as:
Memory Classification With respect to the way of data access we can classify memories as: - random access memories (RAM), - sequentially accessible memory (SAM), - direct access memory (DAM), - contents
Digital Integrated Circuit (IC) Layout and Design
Digital Integrated Circuit (IC) Layout and Design! EE 134 Winter 05 " Lecture Tu & Thurs. 9:40 11am ENGR2 142 " 2 Lab sections M 2:10pm 5pm ENGR2 128 F 11:10am 2pm ENGR2 128 " NO LAB THIS WEEK " FIRST
Interfacing 3V and 5V applications
Authors: Tinus van de Wouw (Nijmegen) / Todd Andersen (Albuquerque) 1.0 THE NEED FOR TERFACG BETWEEN 3V AND 5V SYSTEMS Many reasons exist to introduce 3V 1 systems, notably the lower power consumption
Sequential Circuit Design
Sequential Circuit Design Lan-Da Van ( 倫 ), Ph. D. Department of Computer Science National Chiao Tung University Taiwan, R.O.C. Fall, 2009 [email protected] http://www.cs.nctu.edu.tw/~ldvan/ Outlines
Two-Phase Clocking Scheme for Low-Power and High- Speed VLSI
International Journal of Advances in Engineering Science and Technology 225 www.sestindia.org/volume-ijaest/ and www.ijaestonline.com ISSN: 2319-1120 Two-Phase Clocking Scheme for Low-Power and High- Speed
TRUE SINGLE PHASE CLOCKING BASED FLIP-FLOP DESIGN
TRUE SINGLE PHASE CLOCKING BASED FLIP-FLOP DESIGN USING DIFFERENT FOUNDRIES Priyanka Sharma 1 and Rajesh Mehra 2 1 ME student, Department of E.C.E, NITTTR, Chandigarh, India 2 Associate Professor, Department
Lecture-3 MEMORY: Development of Memory:
Lecture-3 MEMORY: It is a storage device. It stores program data and the results. There are two kind of memories; semiconductor memories & magnetic memories. Semiconductor memories are faster, smaller,
The Central Processing Unit:
The Central Processing Unit: What Goes on Inside the Computer Chapter 4 Objectives Identify the components of the central processing unit and how they work together and interact with memory Describe how
SPARC64 VIIIfx: CPU for the K computer
SPARC64 VIIIfx: CPU for the K computer Toshio Yoshida Mikio Hondo Ryuji Kan Go Sugizaki SPARC64 VIIIfx, which was developed as a processor for the K computer, uses Fujitsu Semiconductor Ltd. s 45-nm CMOS
Computer Performance. Topic 3. Contents. Prerequisite knowledge Before studying this topic you should be able to:
55 Topic 3 Computer Performance Contents 3.1 Introduction...................................... 56 3.2 Measuring performance............................... 56 3.2.1 Clock Speed.................................
Computer Architecture
Computer Architecture Slide Sets WS 2013/2014 Prof. Dr. Uwe Brinkschulte M.Sc. Benjamin Betting Part 11 Memory Management Computer Architecture Part 11 page 1 of 44 Prof. Dr. Uwe Brinkschulte, M.Sc. Benjamin
Design and analysis of flip flops for low power clocking system
Design and analysis of flip flops for low power clocking system Gabariyala sabadini.c PG Scholar, VLSI design, Department of ECE,PSNA college of Engg and Tech, Dindigul,India. Jeya priyanka.p PG Scholar,
3D NAND Technology Implications to Enterprise Storage Applications
3D NAND Technology Implications to Enterprise Storage Applications Jung H. Yoon Memory Technology IBM Systems Supply Chain Outline Memory Technology Scaling - Driving Forces Density trends & outlook Bit
L4: Sequential Building Blocks (Flip-flops, Latches and Registers)
L4: Sequential Building Blocks (Flip-flops, Latches and Registers) Acknowledgements: Materials in this lecture are courtesy of the following sources and are used with permission. Prof. Randy Katz (Unified
Implementation Details
LEON3-FT Processor System Scan-I/F FT FT Add-on Add-on 2 2 kbyte kbyte I- I- Cache Cache Scan Scan Test Test UART UART 0 0 UART UART 1 1 Serial 0 Serial 1 EJTAG LEON_3FT LEON_3FT Core Core 8 Reg. Windows
SOLVING HIGH-SPEED MEMORY INTERFACE CHALLENGES WITH LOW-COST FPGAS
SOLVING HIGH-SPEED MEMORY INTERFACE CHALLENGES WITH LOW-COST FPGAS A Lattice Semiconductor White Paper May 2005 Lattice Semiconductor 5555 Northeast Moore Ct. Hillsboro, Oregon 97124 USA Telephone: (503)
Memory unit. 2 k words. n bits per word
9- k address lines Read n data input lines Memory unit 2 k words n bits per word n data output lines 24 Pearson Education, Inc M Morris Mano & Charles R Kime 9-2 Memory address Binary Decimal Memory contents
CACTI 3.0: An Integrated Cache Timing, Power, and Area Model
A U G U S T 0 0 WRL Research Report 0/ CACTI 3.0: An Integrated Cache Timing, Power, and Area Model Premkishore Shivakumar and Norman P. Jouppi Western Research Laboratory 50 University Avenue Palo Alto,
Logical Operations. Control Unit. Contents. Arithmetic Operations. Objectives. The Central Processing Unit: Arithmetic / Logic Unit.
Objectives The Central Processing Unit: What Goes on Inside the Computer Chapter 4 Identify the components of the central processing unit and how they work together and interact with memory Describe how
DESIGN CHALLENGES OF TECHNOLOGY SCALING
DESIGN CHALLENGES OF TECHNOLOGY SCALING IS PROCESS TECHNOLOGY MEETING THE GOALS PREDICTED BY SCALING THEORY? AN ANALYSIS OF MICROPROCESSOR PERFORMANCE, TRANSISTOR DENSITY, AND POWER TRENDS THROUGH SUCCESSIVE
Lizy Kurian John Electrical and Computer Engineering Department, The University of Texas as Austin
BUS ARCHITECTURES Lizy Kurian John Electrical and Computer Engineering Department, The University of Texas as Austin Keywords: Bus standards, PCI bus, ISA bus, Bus protocols, Serial Buses, USB, IEEE 1394
7. Latches and Flip-Flops
Chapter 7 Latches and Flip-Flops Page 1 of 18 7. Latches and Flip-Flops Latches and flip-flops are the basic elements for storing information. One latch or flip-flop can store one bit of information. The
OpenSPARC T1 Processor
OpenSPARC T1 Processor The OpenSPARC T1 processor is the first chip multiprocessor that fully implements the Sun Throughput Computing Initiative. Each of the eight SPARC processor cores has full hardware
University of Texas at Dallas. Department of Electrical Engineering. EEDG 6306 - Application Specific Integrated Circuit Design
University of Texas at Dallas Department of Electrical Engineering EEDG 6306 - Application Specific Integrated Circuit Design Synopsys Tools Tutorial By Zhaori Bi Minghua Li Fall 2014 Table of Contents
Gates, Circuits, and Boolean Algebra
Gates, Circuits, and Boolean Algebra Computers and Electricity A gate is a device that performs a basic operation on electrical signals Gates are combined into circuits to perform more complicated tasks
Latch Timing Parameters. Flip-flop Timing Parameters. Typical Clock System. Clocking Overhead
Clock - key to synchronous systems Topic 7 Clocking Strategies in VLSI Systems Peter Cheung Department of Electrical & Electronic Engineering Imperial College London Clocks help the design of FSM where
Interconnection Network of OTA-based FPAA
Chapter S Interconnection Network of OTA-based FPAA 5.1 Introduction Aside from CAB components, a number of different interconnect structures have been proposed for FPAAs. The choice of an intercmmcclion
Digital Logic Design. Basics Combinational Circuits Sequential Circuits. Pu-Jen Cheng
Digital Logic Design Basics Combinational Circuits Sequential Circuits Pu-Jen Cheng Adapted from the slides prepared by S. Dandamudi for the book, Fundamentals of Computer Organization and Design. Introduction
Memory ICS 233. Computer Architecture and Assembly Language Prof. Muhamed Mudawar
Memory ICS 233 Computer Architecture and Assembly Language Prof. Muhamed Mudawar College of Computer Sciences and Engineering King Fahd University of Petroleum and Minerals Presentation Outline Random
Spacecraft Computer Systems. Colonel John E. Keesee
Spacecraft Computer Systems Colonel John E. Keesee Overview Spacecraft data processing requires microcomputers and interfaces that are functionally similar to desktop systems However, space systems require:
Lecture 10 Sequential Circuit Design Zhuo Feng. Z. Feng MTU EE4800 CMOS Digital IC Design & Analysis 2010
EE4800 CMOS igital IC esign & Analysis Lecture 10 Sequential Circuit esign Zhuo Feng 10.1 Z. Feng MTU EE4800 CMOS igital IC esign & Analysis 2010 Sequencing Outline Sequencing Element esign Max and Min-elay
Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip.
Lecture 11: Multi-Core and GPU Multi-core computers Multithreading GPUs General Purpose GPUs Zebo Peng, IDA, LiTH 1 Multi-Core System Integration of multiple processor cores on a single chip. To provide
Chapter 1 Computer System Overview
Operating Systems: Internals and Design Principles Chapter 1 Computer System Overview Eighth Edition By William Stallings Operating System Exploits the hardware resources of one or more processors Provides
2 : BISTABLES. In this Chapter, you will find out about bistables which are the fundamental building blocks of electronic counting circuits.
2 : BITABLE In this Chapter, you will find out about bistables which are the fundamental building blos of electronic counting circuits. et-reset bistable A bistable circuit, also called a latch, or flip-flop,
Pentium vs. Power PC Computer Architecture and PCI Bus Interface
Pentium vs. Power PC Computer Architecture and PCI Bus Interface CSE 3322 1 Pentium vs. Power PC Computer Architecture and PCI Bus Interface Nowadays, there are two major types of microprocessors in the
Introduction to Cloud Computing
Introduction to Cloud Computing Parallel Processing I 15 319, spring 2010 7 th Lecture, Feb 2 nd Majd F. Sakr Lecture Motivation Concurrency and why? Different flavors of parallel computing Get the basic
Design of Energy Efficient Low Power Full Adder using Supply Voltage Gating
Design of Energy Efficient Low Power Full Adder using Supply Voltage Gating S.Nandhini 1, T.G.Dhaarani 2, P.Kokila 3, P.Premkumar 4 Assistant Professor, Dept. of ECE, Nandha Engineering College, Erode,
IL2225 Physical Design
IL2225 Physical Design Nasim Farahini [email protected] Outline Physical Implementation Styles ASIC physical design Flow Floor and Power planning Placement Clock Tree Synthesis Routing Timing Analysis Verification
Computer Organization and Architecture. Characteristics of Memory Systems. Chapter 4 Cache Memory. Location CPU Registers and control unit memory
Computer Organization and Architecture Chapter 4 Cache Memory Characteristics of Memory Systems Note: Appendix 4A will not be covered in class, but the material is interesting reading and may be used in
Weste07r4.fm Page 183 Monday, January 5, 2004 1:39 AM. 7.1 Introduction
Weste07r4.fm Page 183 Monday, January 5, 2004 1:39 AM 7 7.1 Introduction The previous chapter addressed combinational circuits in which the output is a function of the current inputs. This chapter discusses
Binary search tree with SIMD bandwidth optimization using SSE
Binary search tree with SIMD bandwidth optimization using SSE Bowen Zhang, Xinwei Li 1.ABSTRACT In-memory tree structured index search is a fundamental database operation. Modern processors provide tremendous
Digital to Analog Converter. Raghu Tumati
Digital to Analog Converter Raghu Tumati May 11, 2006 Contents 1) Introduction............................... 3 2) DAC types................................... 4 3) DAC Presented.............................
