{The Non-Volatile Memory Technology Database (NVMDB)}, UCSD-CSE Techreport CS2015-1011



Similar documents
Embedded STT-MRAM for Mobile Applications:

NAND Flash Architecture and Specification Trends

Flash & DRAM Si Scaling Challenges, Emerging Non-Volatile Memory Technology Enablement - Implications to Enterprise Storage and Server Compute systems

Computer Systems Structure Main Memory Organization

Solid State Technology What s New?

1 / 25. CS 137: File Systems. Persistent Solid-State Storage

A PRAM and NAND Flash Hybrid Architecture for High-Performance Embedded Storage Subsystems

Emerging storage and HPC technologies to accelerate big data analytics Jerome Gaysse JG Consulting

Memory unit. 2 k words. n bits per word

State-of-the-Art Flash Memory Technology, Looking into the Future

1. Memory technology & Hierarchy

A Study of Application Performance with Non-Volatile Main Memory

Flash and Storage Class Memories. Technology Overview & Systems Impact. Los Alamos/HECFSIO Conference August 6, 2008

In-Block Level Redundancy Management for Flash Storage System

Price/performance Modern Memory Hierarchy

Destroying Flash Memory-Based Storage Devices (draft v0.9)

Implementation of Buffer Cache Simulator for Hybrid Main Memory and Flash Memory Storages

Computer Architecture

3D NAND Technology Implications to Enterprise Storage Applications

2014 EMERGING NON- VOLATILE MEMORY & STORAGE TECHNOLOGIES AND MANUFACTURING REPORT

Nasir Memon Polytechnic Institute of NYU

Module 2. Embedded Processors and Memory. Version 2 EE IIT, Kharagpur 1

Sistemas Operativos: Input/Output Disks

The Bleak Future of NAND Flash Memory

Crossbar Resistive Memory:

NAND Flash FAQ. Eureka Technology. apn5_87. NAND Flash FAQ

Advantages of e-mmc 4.4 based Embedded Memory Architectures

Storage Class Memory and the data center of the future

Chapter 9 Semiconductor Memories. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan

Chapter Introduction. Storage and Other I/O Topics. p. 570( 頁 585) Fig I/O devices can be characterized by. I/O bus connections

NAND Flash & Storage Media

Trends in NAND Flash Memory Error Correction

A New Chapter for System Designs Using NAND Flash Memory

Scaling from Datacenter to Client

Technical Note. Micron NAND Flash Controller via Xilinx Spartan -3 FPGA. Overview. TN-29-06: NAND Flash Controller on Spartan-3 Overview

Chapter 5 :: Memory and Logic Arrays

MCF54418 NAND Flash Controller

SRAM Scaling Limit: Its Circuit & Architecture Solutions

Memory Basics. SRAM/DRAM Basics

Technical Note DDR2 Offers New Features and Functionality

Semiconductor Memories

Mobile SDRAM. MT48H16M16LF 4 Meg x 16 x 4 banks MT48H8M32LF 2 Meg x 32 x 4 banks

A N. O N Output/Input-output connection

Choosing the Right NAND Flash Memory Technology

Toshiba America Electronic Components, Inc. Flash Memory

1 Gbit, 2 Gbit, 4 Gbit, 3 V SLC NAND Flash For Embedded

FLASH TECHNOLOGY DRAM/EPROM. Flash Year Source: Intel/ICE, "Memory 1996"

Flash Memories. João Pela (52270), João Santos (55295) December 22, 2008 IST

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2014/15

Memory ICS 233. Computer Architecture and Assembly Language Prof. Muhamed Mudawar

The Evolving Role of Flash in Memory Subsystems. Greg Komoto Intel Corporation Flash Memory Group

Table 1 SDR to DDR Quick Reference

OpenSPARC T1 Processor

Input/output (I/O) I/O devices. Performance aspects. CS/COE1541: Intro. to Computer Architecture. Input/output subsystem.

Intel Solid-State Drive 320 Series

Table 1: Address Table

Charge-Trapping (CT) Flash and 3D NAND Flash Hang-Ting Lue

Chapter 7 Memory and Programmable Logic

NAND Flash Architecture and Specification Trends

Non-Volatile Memory and Its Use in Enterprise Applications

Flash s Role in Big Data, Past Present, and Future OBJECTIVE ANALYSIS. Jim Handy

Chapter 9: Peripheral Devices: Magnetic Disks

Nanotechnologies for the Integrated Circuits

An Analysis Of Flash And HDD Technology Trends

W25Q80, W25Q16, W25Q32 8M-BIT, 16M-BIT AND 32M-BIT SERIAL FLASH MEMORY WITH DUAL AND QUAD SPI

Evaluating Embedded Non-Volatile Memory for 65nm and Beyond

Performance Beyond PCI Express: Moving Storage to The Memory Bus A Technical Whitepaper

Computer Architecture

RAM & ROM Based Digital Design. ECE 152A Winter 2012

MICROPROCESSOR. Exclusive for IACE Students iacehyd.blogspot.in Ph: /422 Page 1

Memory. The memory types currently in common usage are:

Non-Volatile Memory. Non-Volatile Memory & its use in Enterprise Applications. Contents

Memory Architecture and Management in a NoC Platform

DDR4 Memory Technology on HP Z Workstations

Memory Hierarchy. Arquitectura de Computadoras. Centro de Investigación n y de Estudios Avanzados del IPN. adiaz@cinvestav.mx. MemoryHierarchy- 1

Power Reduction Techniques in the SoC Clock Network. Clock Power

DDR3 SDRAM UDIMM MT8JTF12864A 1GB MT8JTF25664A 2GB

Semiconductor Device Technology for Implementing System Solutions: Memory Modules

NAND 201: An Update on the Continued Evolution of NAND Flash

ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-17: Memory organisation, and types of memory

Flash Memory Jan Genoe KHLim Universitaire Campus, Gebouw B 3590 Diepenbeek Belgium

Providing Battery-Free, FPGA-Based RAID Cache Solutions

Samsung DDR4 SDRAM DDR4 SDRAM

DESIGN CHALLENGES OF TECHNOLOGY SCALING

Introduction to Microprocessors

Eureka Technology. Understanding SD, SDIO and MMC Interface. by Eureka Technology Inc. May 26th, Copyright (C) All Rights Reserved

Byte Ordering of Multibyte Data Items

Handout 17. by Dr Sheikh Sharif Iqbal. Memory Unit and Read Only Memories

Transcription:

The Non-Volatile Memory Technology Database (NVMDB) UCSD-CSE Techreport CS2015-1011 Kosuke Suzuki Fujitsu Laboratories Ltd. kosuzuki@jp.fujitsu.com Steven Swanson UC San Diego swanson@cs.ucsd.edu Flash, phase-change, spin-torque, and resistive memories are rapidly transforming how system designers think about memory devices, memory hierarchies, processor architectures, storage systems, operating systems, and applications. We have gathered a comprehensive survey of the (at last count) 340 non-volatile memory technology papers published between 2000 and 2014 in International Solid-State Circuits Conference (ISSCC), Symposia on VLSI Technology and Circuits (VLSI Technology, VLSI Circuits), and International Electron Devices Meeting (IEDM). The resulting data set provides a clear picture of how these memory technologies have evolved over time. The links below provide access to a full bibliography for all 340 papers, an Excel spreadsheet summarizing the results, and a technical report describing our methodology and the contents of the spreadsheet. The data is available online at http://nvmdb.ucsd.edu, and a paper describing the database and our methodology is included below as an appendix. It was presented at the 2015 International Memory Workshop. If you use the NVMDB in your research please cite it using the this bibtex entry: @TECHREPORT{NVMDB, AUTHOR = {Kosuke Suzuki and Steven Swanson}, TITLE = {The Non-Volatile Memory Technology Database (NVMDB)}, NUMBER = {CS2015-1011}, INSTITUTION = {Department of Computer Science \& Engineering, University of California, San Diego}, NOTE = {http://nvmdb.ucsd.edu}, MONTH = {May}, YEAR = {2015}, AUTHOR1_URL = {http://nvmdb.ucsd.edu}, AUTHOR1_EMAIL = {swanson@cs.ucsd.edu}, URL = {http://nvmdb.ucsd.edu}, } 1

A Survey of Trends in Non-Volatile Memory Technologies: 2000 2014 Kosuke Suzuki Fujitsu Laboratories Ltd. kosuzuki@jp.fujitsu.com Steven Swanson UC San Diego swanson@cs.ucsd.edu Abstract We present a survey of non-volatile memory technology papers published between 2000 and 2014 in leading journals and conference proceedings in the area of integrated circuit design and semiconductor devices. We present a summary of the data provided in these papers and use that data to model basic aspects of their performance at an architectural level. The full data set and complete bibliography will be published online. Index terms,, PCM,, flash, SCM, Trends survey I. INTRODUCTION Flash, phase-change, spin-torque, and resistive memories are rapidly transforming how system designers think about memory devices, memory hierarchies, processor architectures, storage systems, operating systems, and applications. However, with the exception of flash memory, none of these memories are, as yet, commercially available. This makes fully appreciating their potential impact on these aspects of computer systems very difficult. One source of potential guidance is the vast number of paper describing new cell designs and complete prototype memory devices. These published reports should make it possible to identify trends in memory technology evolution, predict what form a commercially viable device might take, and predict its performance and efficiency characteristics. Collecting collating all this data is a logistical challenge. By our count, there have been over 300 such papers published since 2000, spanning a wide range of venues in several research communities. The papers present a large but messy data set: they present different sets of metrics and speak the different languages of their publication venues. Finally, while these papers provide important technical information about prototype devices, they frequently omit higher-level metrics that would be of use to processor architects, storage engineers, operating system developers, and application programmers. This paper presents a comprehensive survey of non-volatile memory technology papers published between 2000 and 2014 in International Solid-State Circuits Conference (ISSCC), Symposia on VLSI Technology and Circuits (VLSI Technology, VLSI Circuits), and International Electron Devices Meeting (IEDM). Besides, the survey includes the latest papers of ISSCC 2015. The resulting data set provides a clear picture of how these memory technologies have evolved over time. In addition the basic metrics provided by the papers, we also present data for several higher-level metrics (e.g., random access latency and bandwidth) based on a simple memory array model. This analysis should provide guidance to system architects as well as serve as the basis for more detailed studies of how these memory technologies will affect systems when they become commercially viable. This short paper presents a summary of the data we collected. A complete version of the data set with a full bibliography and additional metrics is in preparation for posting online at http://nvmdb.ucsd.edu. The remainder of this paper is organized as follows. Section II describes the methodology we used and describes each of the metrics we collected. Section III presents a sample of the data we have collected. Section IV concludes. Due to space constraints, the references section does not include citations for all the data in the figures. II. METHODOLOGY To collect our data, we surveyed all the papers in ISSCC, VLSI Technology, VLSI Circuits, and IEDM and identified the papers related to flash, phase-change (PCM or ), spin-torque MRAM (), and resistive ram (). This resulted in 340 papers. We also include data from the International Technology Roadmap for Semiconductors (ITRS) [4] projections for 2013. For each paper we recorded or computed two kinds of metrics: technology metrics and architectural metrics. Technology metrics are the basic facts about a devices, including its cell size, cell write time, cell read energy, etc. The architectural metrics are higher level measures of how the memory might perform in system. They include read and write bandwidth for a chip and the overall access latency. Table 1 summarizes the technology and architectural metrics we selected. Most papers provide the majority of the technology metrics either directly or indirectly. For instance, some provide cell size directly (expressed in F 2 ) while others expressed it indirectly by providing the cell size in square microns the technology feature size. Very few of the papers provide architectural metrics, and these metrics require us to make assumptions about parts of the system beyond the cell itself. For instance, to compute sequential read bandwidth or random write energy, we need to know the dimensions of the memory array. Likewise, to compute the random read latency for a technology, we need to know the RAS-CAS delay for the array. III. DATA This section provides a sampling (for space reasons) of the technology and architectural metrics we collected. A complete set of graphs and the full bibliography will be available as a

Average cell size (mm 2 ) Cell size (F 2 ) Tech. Metric Description Process Technology Minimum feature size in nm Cell size Size of a single cell in F 2 Read time Cell read time in ns Write time Cell write time in ns Chip capacity Bits per die (if applicable) Average cell size Average cell size in µm 2, including peripheral logic Endurance Write cycle before the cell become unreliable Bits/cell Bits stored per cell Average bit density Average bit density in GB/cm 2, including peripheral logic Arch. Metric Description Random read BW Sustained bandwidth for 64 bit random reads Random write BW Sustained bandwidth for 64 bit random writes Seq. read BW Sustained bandwidth for sequential reads Seq. write BW Sustained bandwidth for sequential writes Table 1: Technology and Architectural Metrics: Section III provides summary graphs for the metrics in bold. (ITRS '13) (ITRS '13) (ITRS '13) Figure 1: Cell size (ITRS '13) Figure 2: Average cell size tech report and all of the data will be available online. Cell size Cell size is a phisycal size of a single cell. The effective area required to store a bit may be smaller due multilevel cells and/or 3D stacking. Figure 1 shows cell size of each NVM. has already shrunk to 4 F 2. devices use several techniques to switch cell resistance (e.g., bipolar or unipolar, filament or non-filament) and use a range of materials (e.g., metal oxide or Electro-chemical Metalization Bridge) [4]. These difference affect cell geometry, so there is no clear trend. cell designs also differ. This is because some papers use two magnetic tunnel junctions (MTJ) to make read time shorter [7]. Cell size of 1-MTJ cells can be as small as 6 F 2 [8]. Using two MTJs increases the size to over 100 F 2. Cell size of has been around 4 F 2 since 2000. Average cell size Average cell size is similar to cell size, but it includes the peripheral cirucuits. The smallest average cell sizes for and are 0.0069 µm 2 [1] and 0.0076 µm 2 [10] respectively 5 larger than the smallest cell (0.0015 µm 2 [3]). cells are much larger: 0.26 µm 2 [5]. Average bit density Average bit density is capacity (Gbit) devided by die size (cm 2 ). Figure 3 shows average bit density of each NVM. density is doubling roughly every 1.66 years and has reached 13.5 Gbit/cm 2 [1]. density has been doubling every 1.91 years since 2000. Currently the highest density device achieves 185.8 Gbit/cm 2 [3]. devices are denser than. SanDisk and Toshiba have reported a 24.5 Gbit/cm 2 [10] device. STT- MRAM s best reported density to date is 0.36 Gbit/cm 2 [5]. For comparison, modern DRAMs achieve 9.1 Gbit/cm 2 [9]. Read & write time Read and write times (Figures 5 and 6) reflect the time required to read from a single cell, but they do not include the latency of the peripheral circuitry (e.g., the row and column decoders). For devices with different set and reset times, we report the large of the two. For, we report the program time. Read time increases with capacity for all technologies except (e.g., [1, 2]). read time has been decreasing due to the use of dual MTJ cells. Write times for and are rising, and although is denser, its write time is worse. For example, write time of 32 Gbit and 16 Gbit are 23 µs and 10 µs respectively [2, 10]. On the other hand, write time of 8 Gbit is 150 ns [1]. has good write time but capacity is less than 100 Mbit. Random read/write bandwidth To compute random read/write bandwidth we use the cell read and write times and assume a DRAM-like DIMM-based architecture (Figure 4) which has 64 I/O pins for data. The read/write timing diagrams per pin are shown in Figure 9 and 10. We assumed read time and write time to set t RCD and t RP _W RIT E in these figures. We also assumed burst length (BL) of four [6]. As a result, block size of each access is 32 bytes. The other parameters (t RP _READ, t RT P, etc) were also taken from a DRAM datasheet [6]. The best random read bandwidth of,, STT-

32B random read BW (MB/s) 32B random write BW (MB/s) Read time (ns) Write time (ns) Col. Average bit density (Gbit/cm 2 ) DRAM,,,, or 64 pins for data input/output 1E-4 1E-5 Figure 3: Average bit density Row Bank 1 Bank 0 SA I/O gating Bank 7 k 64 MUX 8:1 8 Read driver 1:8 8 DEMUX Write driver 8 DQ MUX: Multiplexer DEMUX: De-multiplexer SA: Sense Amplifier Figure 4: Memory architecture for read/write bandwidth (ITRS '13) (ITRS '13) Figure 5: Read time (ITRS '13) (ITRS '13) (ITRS '13) 1E+8 1E+7 1E+6 1E+7 1E+6 Figure 6: Write time (ITRS '13) (ITRS '13) (BEST) (ITRS '13) (BEST) (BEST) (ITRS '13) (BEST) (ITRS '13) Figure 7: 32B random read BW (BEST) (ITRS '13) (BEST) (BEST) (ITRS '13) (BEST) (ITRS '13) Figure 8: 32B random write BW Rand./seq. read latency trp_read trcd Command PRE ACT READ Address ROW of Bank X Bank X, Col m DQ a a+1 a+n Figure 9: Definition of seq./rand. read latency tcl (tal = 0) n = 3 for rand. read n = Page Size - 1 for seq. read

Sequential write BW (MB/s) Sequential read BW (MB/s) Rand./seq. write latency trcd twr trp_write Command ACT WRITE PRE ACT Address ROW of Bank X Bank X, Col m DQ a a+1 a+n tcwl (tal = 0) n = 3 for rand. write n = Page Size - 1 for seq. write Figure 10: Definition of seq./rand. write latency in the evolution of these technologies and will allow system designers to make better estimates of how these memories will affect the design and performance of future systems. (BEST) (ITRS '13) (BEST) (BEST) (ITRS '13) (BEST) (ITRS '13) Figure 11: Sequential read BW (BEST) (ITRS '13) (BEST) (BEST) (ITRS '13) (BEST) (ITRS '13) Figure 12: Sequential write BW MRAM are 802, 974, 1030 MB/s respectively (Figure 7). On the other hand, the best random write bandwidth of them are 716, 874, 1044 MB/s (Figure 8). The read and write bandwidths for the best of these memories may exceed DRAM s (775 MB/s for read, 632 MB/s for write). Sequential read/write bandwidth We assumed the same memory architecture to compute sequential bandwidths, and we assume that the device reads (or writes) an entire row of the memory array and streams it out over the pins. Papers describing devices provided a page size, but we had to estimate the page size for the other technologies. The main limiter on page size is the need to limit write power. Based on the write energy values given the papers, we use page sizes of 512 B, 1 kb, and 2 kb, respectively, for,, and. Figures 11 and 12 summarize the data. Sequential access is competitive with DRAM. V. REFERENCES [1] Y. Choi, I. Song, M.-H. Park, H. Chung, S. Chang, et al. A 20nm 1.8v 8gb pram with 40mb/s program bandwidth. In ISSCC, pages 46 48, Feb 2012. [2] R. Fackenthal, M. Kitagawa, W. Otsuka, K. Prall, D. Mills, et al. 19.7 a 16gb reram with 200mb/s write and 1gb/s read in 27nm technology. In ISSCC, pages 338 339, Feb 2014. [3] J.-W. Im, W.-P. Jeong, D.-H. Kim, S.-W. Nam, D.-K. Shim, et al. 7.2 a 128gb 3b/cell v-nand flash memory with 1gb/s i/o rate. In ISSCC, pages 130 131, Feb 2015. [4] ITRS 2013 Emerging Research Devices (ERD), 2013. [5] J. Kim, T. Kim, W. Hao, H. Rao, K. Lee, et al. A 45nm 1mb embedded stt-mram with design techniques to minimize read-disturbance. In VLSI Circuits, pages 296 297, June 2011. [6] Micron 4Gb: x4, x8, x16 DDR3L SDRAM Description, 2011. [7] H. Noguchi, K. Ikegami, N. Shimomura, T. Tetsufumi, J. Ito, et al. Highly reliable and low-power nonvolatile cache memory with advanced perpendicular stt-mram for high-performance cpu. In VLSI Circuits, pages 1 2, June 2014. [8] S. Oh, J. Jeong, W. Lim, W. Kim, Y. Kim, et al. On-axis scheme and novel mtj structure for sub-30nm gb density stt-mram. In Electron Devices Meeting (IEDM), 2010 IEEE International, pages 12.6.1 12.6.4, Dec 2010. [9] T.-Y. Oh, H. Chung, Y.-C. Cho, J.-W. Ryu, K. Lee, et al. 25.1 a 3.2gb/s/pin 8gb 1.0v lpddr4 sdram with integrated ecc engine for sub-1v dram core operation. In ISSCC, pages 430 431, Feb 2014. [10] T. yi Liu, T. H. Yan, R. Scheuerlein, Y. Chen, J. Lee, et al. A 130.7mm2 2-layer 32gb reram memory device in 24nm technology. In ISSCC, pages 210 211, Feb 2013. IV. CONCLUSION We have surveyed NVM technologies papers over the last 14 years and presented a range of device-level and architecture-level metrics. The data illuminate several trends