Phase Change Memory for Neuromorphic Systems and Applications




M. Suri 1, O. Bichler 2, D. Querlioz 3, V. Sousa 1, L. Perniola 1, D. Vuillaume 4, C. Gamrat 2, and B. DeSalvo 1 (manan.suri@cea.fr, barbara.desalvo@cea.fr)
1 CEA-LETI, Grenoble; 2 CEA-LIST, Gif-sur-Yvette; 3 IEF, Paris; 4 IEMN, Villeneuve d'Ascq, France

ABSTRACT

In this paper we summarize our recent results in the field of neuromorphic hardware design using synapses based on Phase Change Memory (PCM) devices. We emphasize the emulation of synaptic plasticity effects such as long-term potentiation/depression (LTP/LTD) and of learning rules such as spike-timing-dependent plasticity (STDP). We discuss the implementation of large-scale, low-power spiking neural networks (SNN) with two fundamentally different approaches: (i) multi-level deterministic synapses and (ii) binary stochastic synapses. A prototype application of fully unsupervised learning, complex visual pattern extraction, is also discussed.

Key words: Neuromorphic Hardware, Spiking Neural Networks, PCM, Synapse.

1. Introduction

Interest in the field of bio-inspired computational hardware has increased dramatically over the last few years. Several research groups are actively exploring neuromorphic hardware for its promising advantages: low power consumption, high tolerance to defects and variability, and efficient handling of large-scale non-linear computations. Among the different implementations of neuromorphic hardware, of particular interest are those that exploit hybrid architectures combining CMOS with emerging resistive memory (RRAM). In these hybrid architectures (Fig. 1), the neuronal functionality is implemented with standard CMOS circuits, while the synaptic functionality is implemented through densely integrated arrays or crossbars of RRAM devices. These RRAM technologies include phase change memory (PCM) [1], [2], conductive-bridge memory (CBRAM), also known as programmable metallization cell (PMC) [3], [4], and oxide-based resistive memory (OXRAM) [5], [6]. Owing to their compact size, resistive-memory devices enable a synaptic density that pure CMOS synapse circuits cannot reach, since the latter require a large number of transistors (>10) per synapse [7]. RRAM also provides the non-volatility that neuromorphic systems need.

Here, we focus on hybrid neuromorphic architectures that use chalcogenide-based PCM devices as synapses. PCM devices have been used to mimic synaptic-plasticity effects such as potentiation (LTP-like) [8] and depression (LTD-like) [1], and bio-inspired learning rules such as spike-timing-dependent plasticity (STDP) [1]. In previous works [2], [9] we presented a novel low-power PCM-based neuromorphic architecture called the 2-PCM Synapse. Using this architecture we demonstrated a two-layer spiking neural network (SNN) capable of complex visual pattern extraction. In this paper we summarize our approach and key results for PCM synapses based on GST, GeTe [10], and GST+HfO2 [11] material stacks. We also discuss our Binary-PCM Synapse approach, which uses a stochastic STDP learning rule and helps overcome the impact of PCM resistance drift in neuromorphic systems [12].

2. Experiments and Approach

A. The 2-PCM Synapse

Our main motivation for developing the 2-PCM Synapse architecture (Fig. 2) was to emulate synaptic behavior (i.e. gradual synaptic potentiation and depression) using identical neuron spikes [2].
In this approach, we use two PCM devices to implement a single synapse and connect them in a complementary configuration to the post-synaptic output neuron.
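As a rough behavioral illustration of this complementary scheme (a minimal sketch with made-up conductance values, not the actual programming scheme of [2]), the effective synaptic weight can be modeled as the difference between the two device conductances, each of which only grows under partial crystallization until a refresh re-initializes the pair:

# Minimal behavioral sketch of the 2-PCM Synapse (all values illustrative).
# Effective weight = G_LTP - G_LTD: the LTP device contributes positively
# to the output-neuron integration, the LTD device negatively.

G_MIN, G_MAX = 1e-6, 1e-4   # hypothetical conductance window (siemens)
G_STEP = 5e-6               # hypothetical gain per crystallizing pulse

class TwoPcmSynapse:
    def __init__(self):
        # Both devices start in the high-resistance amorphous state.
        self.g_ltp = G_MIN
        self.g_ltd = G_MIN

    def weight(self):
        # Signed contribution to the post-synaptic integration.
        return self.g_ltp - self.g_ltd

    def potentiate(self):
        # LTP: partially crystallize the LTP device (conductance rises).
        self.g_ltp = min(self.g_ltp + G_STEP, G_MAX)
        self._refresh_if_saturated()

    def depress(self):
        # LTD: partially crystallize the LTD device instead.
        self.g_ltd = min(self.g_ltd + G_STEP, G_MAX)
        self._refresh_if_saturated()

    def _refresh_if_saturated(self):
        # Refresh sequence (cf. [9]): once a device saturates, amorphize
        # (reset) both devices and re-program the equivalent weight.
        if self.g_ltp >= G_MAX or self.g_ltd >= G_MAX:
            w = self.weight()
            self.g_ltp = min(G_MIN + max(w, 0.0), G_MAX)
            self.g_ltd = min(G_MIN + max(-w, 0.0), G_MAX)

Note that in this scheme both potentiation and depression use crystallizing pulses only; the high-current amorphizing reset is confined to the occasional refresh.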

Fig. 1 Schematic showing a pre- and post-neuron connected by a PCM synapse (TEM image).
Fig. 2 The 2-PCM Synapse architecture. The current contribution of the LTP device towards the output-neuron integration is positive, while that of the LTD device is negative.
Fig. 3(a) LTP characteristics of the GST PCM device. Initially the device is amorphized to a high-resistance state; then several partially crystallizing pulses are applied. (b) LTP characteristics of the GeTe PCM device. (c) LTP characteristics of the GST+HfO2 PCM devices.
Fig. 4 Schematic depicting the difference between (a) multi-level deterministic and (b) stochastic binary synapses.

One of the PCM devices implements synaptic potentiation (the LTP device), while the other implements synaptic depression (the LTD device). Both devices are initialized to a high-resistance amorphous state. When synaptic potentiation is required, the LTP device is partially crystallized; when synaptic depression is required, the LTD device is partially crystallized. Figs. 3(a), (b) and (c) show the characteristic resistance-evolution curves for our GST, GeTe, and GST+HfO2 PCM devices, respectively. The conductance window saturates in about 30 partially crystallizing pulses for nucleation-dominated GST, about 10 for growth-dominated GeTe, and about 60 for the interface-engineered GST+HfO2 devices. Slower crystallization yields more intermediate conductance states, i.e. more synaptic weight levels. In [10], [11] we discuss in detail the physical phenomena underlying the differences between these crystallization curves and their impact on the final neuromorphic application. The detailed programming schemes (Read, Write, and Refresh) and the simplified STDP learning rule used are described in [2]. Note that as the neural network learns, the PCM devices become progressively more crystallized and eventually saturate at a minimum resistance value. To enable continuous learning, we defined a refresh sequence, explained in detail in [9]: the saturated PCM devices are amorphized (reset) and the effective weights of the corresponding synapses are re-programmed.

B. Binary-PCM Synapse and Partial Reset

Alternatively, we propose to use PCM devices as binary synapses, but with a stochastic STDP rule similar to the one we developed for bipolar CBRAM devices [3]. The Binary-PCM Synapse approach has several advantages: simpler programming schemes, no requirement for a multilevel-capable technology, low power consumption, and very high tolerance to PCM resistance drift [12]. At the system level, a statistical equivalence exists between deterministic multi-level synapses and binary stochastic synapses (Fig. 4). In this approach there is one PCM device per synapse, and two resistance states (weights) are defined for it. The high-resistance state should be chosen as a partial-reset state (Fig. 5), lying in the regime of negligible or low drift; for GST-based devices, a resistance below about 50 kΩ lies in this regime [13], [14]. Fig. 6(a) shows an example of our optimized stochastic STDP rule; the y-axis is the probability of switching the PCM synapse from set to reset or from reset to set. Fig. 6(b) shows the architecture and programming scheme required to implement stochastic learning with binary PCM synapses. The stochasticity is controlled by an extrinsic pseudo-random number generator (PRNG) circuit, which sets the LTP and LTD probabilities with a 2-bit signal. Initially, the input neurons (A-D) generate small read pulses whenever they encounter a stimulus event. The read current is integrated by the output neurons A and B. When an output neuron reaches its firing threshold, it generates a feedback signal (3) and a post-spike signal; in the example of Fig. 6, output neuron A fires and B does not. Signal (3) activates the gates of all the select transistors on the synaptic line connected to A. If LTP is to be implemented, the input neuron sends a signal (1), as shown for input neuron A in this example; for LTD, the input neuron sends a signal (2), as shown for input neuron D. The LTP/LTD probabilities can be tuned according to the learning rule (Fig. 6(a)).
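The following sketch illustrates this stochastic binary update at a behavioral level; the software random generator stands in for the extrinsic 2-bit PRNG circuit, and the probabilities and resistance values are illustrative, not the tuned rule of Fig. 6(a):

import random

# Behavioral sketch of the Binary-PCM Synapse with stochastic STDP.
# One PCM device per synapse, two resistance states; random.random()
# models the extrinsic PRNG circuit.

P_LTP = 0.10    # illustrative probability of a set pulse on an LTP event
P_LTD = 0.05    # illustrative probability of a partial-reset pulse on LTD

R_ON = 10e3     # low-resistance (set) state, in ohms (illustrative)
R_OFF = 50e3    # high-resistance state, chosen as a *partial reset* in the
                # low-drift regime (< ~50 kOhm for GST devices [13], [14])

class BinaryPcmSynapse:
    def __init__(self):
        self.resistance = R_OFF    # initialize in the partial-reset state

    def on_ltp_event(self):
        # Pre-spike falls in the LTP window of the firing output neuron:
        # apply a crystallizing (set) pulse with probability P_LTP.
        if random.random() < P_LTP:
            self.resistance = R_ON

    def on_ltd_event(self):
        # Otherwise: apply a partial-reset pulse with probability P_LTD.
        if random.random() < P_LTD:
            self.resistance = R_OFF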
3. Results & Discussion

We benchmarked the two architectures described in the previous section by simulating a large-scale, real-world pattern-extraction application, using our custom-built XNET SNN simulator [15]. Our two-layer SNN, with 70 neurons and about 2 million PCM synapses, is shown in Fig. 7. Address-Event Representation (AER) video of cars passing on a freeway, recorded with a dynamic vision sensor (DVS) [16], was used as the input data for the SNN. During learning, the neurons extracted patterns (car shapes) in the different lanes, i.e. at different positions and orientations, in a fully unsupervised manner; the detailed learning mechanism of the network is described in [17]. Fig. 8 shows the final synaptic weight (resistance) distributions at the end of the learning experiment. Note that in the case of the 2-PCM Synapse, about 60% of the synapses are in the low-resistance state after learning.

This happens because the 2-PCM Synapse is predominantly crystallization-based: even LTD is realized by crystallizing. In the case of the Binary-PCM Synapse, by contrast, more than 95% of the synapses are in the high-resistance state after learning. Thus, if the high-resistance state is carefully defined in the low-drift (partial-reset) regime, the loss of synaptic information due to drift can be minimized. Figs. 9 and 10 summarize the SNN performance and detection rates for some of the simulations performed. Note from Fig. 10 that for the 2-PCM Synapse the number of reset events is smaller than the number of set events, because even LTD is realized by crystallization; for the Binary-PCM Synapse, the number of reset events is higher. Overall, LTD events occur more frequently in the learning experiments because our STDP learning rule is predominantly LTD-based. The total energy dissipated for synaptic programming was about 400 µJ for the 2-PCM Synapse and about 50 µJ for the Binary-PCM Synapse, an eightfold reduction: a partial-reset state requires much less programming current than a full hard reset. Fig. 10 also gives the performance statistics for two different Roff levels in the Binary-PCM Synapse architecture; a detailed analysis of the impact of the PCM resistance window and the choice of Roff on system performance is given in [18].

Fig. 5 Standard R-V curve for our GST PCM devices. A partial-reset state is less prone to resistance drift than a fully reset state; we exploit partial-reset states for the stochastic binary synapse implementation.
Fig. 6(a) Optimized stochastic STDP learning rule. (b) Architecture of the Binary-Stochastic PCM Synapse. The stochasticity is implemented extrinsically using pseudo-random number generator (PRNG) circuits. Programming pulse schemes are shown on the right.
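For reference, the resistance drift discussed above is commonly described in the PCM literature (e.g. [13], [14], though the law is not written out in this paper) by the empirical power law R(t) = R0 · (t/t0)^ν, where R0 is the resistance measured at a reference time t0 and ν is the drift exponent. ν is largest for the fully amorphous (hard-reset) state and much smaller for partial-reset and crystalline states, which is why choosing Roff in the partial-reset regime limits the drift-induced loss of synaptic information.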

Fig. 7 Feed-forward spiking neural network. L1 contains 60 LIF neurons and L2 contains 10 LIF neurons. Lateral inhibition is implemented in both layers, and all the neurons are fully interconnected through PCM synapses.
Fig. 8 Final synaptic resistance distributions at the end of the learning experiment for (a) the 2-PCM Synapse, where about 60% of the devices are in the ON state, and (b) the Binary-PCM Synapse, where more than 95% of the synapses are in the OFF state.
Fig. 9 (a) Final neuron sensitivity maps for the different lanes, for the three different materials. (b) Average car-detection rates for the three materials at the end of the learning experiment; the average detection rate is always > 90% for the lanes that were learnt.
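To make the network dynamics of Fig. 7 concrete, the following is a minimal behavioral sketch of one layer of leaky integrate-and-fire (LIF) neurons with winner-take-all lateral inhibition; the parameters are illustrative, and the actual dynamics are those of the XNET simulator [15]:

# Minimal sketch of a LIF layer with lateral inhibition (illustrative).

LEAK = 0.99         # hypothetical multiplicative leak per time step
THRESHOLD = 1.0     # hypothetical firing threshold

class LifLayer:
    def __init__(self, n_neurons, weights):
        self.v = [0.0] * n_neurons    # membrane potentials
        self.w = weights              # w[j][i]: weight from input i to neuron j

    def step(self, input_spikes):
        """input_spikes: indices of inputs that spiked this time step."""
        fired = []
        for j in range(len(self.v)):
            self.v[j] *= LEAK                                     # leaky decay
            self.v[j] += sum(self.w[j][i] for i in input_spikes)  # integrate
            if self.v[j] >= THRESHOLD:
                fired.append(j)
        if fired:
            # Lateral inhibition: a firing neuron resets the whole layer,
            # forcing different neurons to specialize on different patterns.
            self.v = [0.0] * len(self.v)
        return fired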

Fig. 10 Energy and event statistics for the cars experiment, comparing the two PCM-based architectures.

4. Conclusion

We demonstrated that PCM devices can be used as synapses in large-scale, low-power neuromorphic systems to emulate synaptic plasticity and bio-inspired learning rules such as STDP. We developed two fundamentally different methodologies and programming schemes for PCM synapses in unsupervised spiking neural networks: (a) the multi-level deterministic 2-PCM Synapse and (b) the binary stochastic synapse. We benchmarked the two approaches on prototype applications such as complex visual pattern extraction from real-world data.

REFERENCES
[1] D. Kuzum et al., IEDM 2011, 30.3.
[2] M. Suri et al., IEDM 2011, 4.4.
[3] M. Suri et al., IEDM 2012, 10.3.
[4] S. Yu et al., IEDM 2010, 21.1.
[5] S. Yu et al., IEDM 2012, 10.4.
[6] S. Park et al., IEDM 2012, 10.2.
[7] S. Mitra et al., ISCAS 2006, 2777.
[8] M. Suri et al., IJCNN 2011, 619.
[9] O. Bichler et al., IEEE TED, 2012, 59, 8, 2206.
[10] M. Suri et al., JAP, 2012, 112, 5, 054904.
[11] M. Suri et al., IMW, 2012.
[12] M. Suri et al., Nanoarch 2013.
[13] N. Papandreou et al., IMW 2011.
[14] D. Ielmini, IEDM 2007, 939.
[15] O. Bichler et al., IJCNN 2011, 859.
[16] P. Lichtsteiner et al., JSSC, 2008, 43, 2, 566.
[17] O. Bichler et al., Neural Networks, 2012, 32, 339.
[18] D. Garbin et al., IEEE Nano, 2013.

Biographies
Manan Suri is a PhD student in the Advanced Memory Technology Group at CEA-LETI, Grenoble. He obtained his M.Eng. (2010) and B.S. (2009) in Electrical and Computer Engineering from Cornell University. His research interests include emerging non-volatile memory technology, neuromorphic hardware, bio-inspired computing, and nanoscale semiconductor devices. O. Bichler and C. Gamrat are researchers at CEA-LIST. D. Querlioz is a researcher at IEF-Paris. D. Vuillaume is a research director at CNRS-IEMN. V. Sousa, L. Perniola, and B. DeSalvo are researchers in the Advanced Memory Technology Group at CEA-LETI, Grenoble.