
IEEE TRANSACTIONS ON MAGNETICS, VOL. 47, NO. 11, NOVEMBER 2011, 4611

A High-Reliability, Low-Power Magnetic Full Adder

Yi Gang (1,2), Weisheng Zhao (1,2), Jacques-Olivier Klein (1,2), Claude Chappert (1,2), and Pascale Mazoyer (3)

(1) IEF, Univ. Paris-Sud, Orsay 91405, France
(2) CNRS, UMR 8622, Orsay 91405, France
(3) STMicroelectronics, Crolles 38926, France

Recently, ultra-low-power circuits based on the logic-in-magnetic-tunnel-junction (MTJ) memory structure have been studied because of its non-volatility, infinite endurance, high access speed, and easy integration with the CMOS process. However, this type of circuit suffers from low reliability in both the memory cell and the sense amplifier circuits, which greatly limits its practical use for logic computation. In this paper, we present a new magnetic full adder (MFA) design that overcomes this issue, based on the thermally assisted switching (TAS) MTJ cell and the pre-charge sense amplifier (PCSA) circuit. Using a CMOS 65 nm design kit and a precise TAS-MTJ model, mixed simulations have been performed to demonstrate its high reliability while keeping low power and small die area.

Index Terms: Full adder, high reliability, low power, magnetic circuits, magnetic full adder, magnetic tunnel junction, pre-charge sensing amplifier.

I. INTRODUCTION

A MAGNETIC TUNNEL JUNCTION (MTJ) is a vertical nanopillar composed of three thin films (two ferromagnetic layers and one barrier), shown in Fig. 1 [1], [2]. Its resistance depends on the relative spin orientations of the two ferromagnetic layers. In standard applications, the magnetization of one layer (the reference layer) is fixed, while that of the other ferromagnetic layer (the storage layer) can take two opposite orientations: either parallel (low resistance, R_P) or anti-parallel (high resistance, R_AP) to the magnetization of the reference layer.
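To make the two resistance states concrete, the following minimal sketch computes them for a circular MTJ from the standard definition TMR = (R_AP - R_P)/R_P. It uses the parameter values quoted later in Section III-B (R.A = 30, TMR = 150%, diameter 65 nm). The unit of the R.A product is taken to be Ω·µm², which is an assumption because the transcription dropped the unit, and the function name `mtj_resistances` is purely illustrative.

```python
import math


def mtj_resistances(ra_ohm_um2, tmr, diameter_nm):
    """Return (R_P, R_AP) in ohms for a circular MTJ nanopillar.

    ra_ohm_um2  : resistance-area product, ASSUMED to be in ohm*um^2
    tmr         : TMR ratio as a fraction (1.5 means 150%)
    diameter_nm : pillar diameter in nanometres
    """
    radius_um = diameter_nm / 2.0 / 1000.0      # nm -> um
    area_um2 = math.pi * radius_um ** 2         # circular cross-section
    r_p = ra_ohm_um2 / area_um2                 # parallel (low) state
    r_ap = r_p * (1.0 + tmr)                    # from TMR = (R_AP - R_P) / R_P
    return r_p, r_ap


# Values from Section III-B; the R.A unit is an assumption (see above).
r_p, r_ap = mtj_resistances(ra_ohm_um2=30.0, tmr=1.5, diameter_nm=65.0)
print(f"R_P  = {r_p / 1e3:.2f} kOhm")
print(f"R_AP = {r_ap / 1e3:.2f} kOhm")
```

Under these assumptions the two states differ by a factor of 1 + TMR = 2.5, which is the margin the sense amplifier has to resolve.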
The MTJ is considered one of the most promising emerging technologies for overcoming the high leakage power of CMOS circuits (< 90 nm technology node) [3], thanks to its non-volatility, infinite endurance, and fast random access [4], [5]. Moreover, MTJs can easily be implemented in the back-end process above CMOS circuits with only a few additional masks (2-4, depending on the write technology) [7], [8]. This allows hybrid MTJ/CMOS circuits to be implemented for both memory (MRAM) and logic applications (magnetic logic), combining the advantages of both technologies [8]-[11].

Fig. 1. Nanopillar-form MTJ: an MgO oxide barrier and two ferromagnetic layers, the free layer and the pinned layer. The tunnel magnetoresistance TMR = (R_AP - R_P)/R_P [1] characterizes the amplitude of the resistance change. In practical samples used for MRAM, MgO is used for the tunnel barrier [4] and the TMR rises up to 200%. In laboratory devices with special materials and fabrication methods, the TMR can reach up to 604% at room temperature [6].

Recently, a number of innovative hybrid MTJ/CMOS circuits have been proposed, for example the magnetic look-up table (MLUT) and magnetic flip-flop (MFF) for reconfigurable logic circuits [12], and the magnetic full adder (MFA) based on the logic-in-MTJ-memory cell architecture for ultra-low-power, high-density ICs [13]-[15]. Moreover, the MFA could also overcome the data communication bottleneck between separate logic modules and memory blocks.

However, the MFA schematic shown in [15] is based on a dynamic current mode (DCM) sensing circuit [16] (see Fig. 2), which is not suitable for deep submicron technology (< 90 nm) for two reasons. The first is its high sensitivity to process parameter variation, which becomes more and more significant as the fabrication technology node shrinks. The second is related to the capacitor C0, which acts as a virtual ground to limit the amount of charge transferred from the output. As C0 must be large enough to accommodate the charge from the output nodes and the load capacitance, it is difficult to scale down while still ensuring the correct function of DCM circuits [16]. This is consistent with the fact that DCM CMOS circuits are not a current trend in high-performance designs with advanced fabrication technologies.

Fig. 2. Dynamic current mode (DCM) sense amplifier based on six transistors and one capacitor [16].

Reliability is one of the most important performance criteria for advanced logic circuits. Unlike a memory chip, where an error correction code (ECC) block is often attached to each word, computing logic cannot tolerate data errors. The reliability of magnetic logic circuits must therefore be nearly perfect before they can be used in practical applications. Last year, a high-reliability sense amplifier based on the pre-charge principle, the pre-charge sense amplifier (PCSA), was proposed in [17]. It shows a remarkable improvement in reliability compared with other sense amplifiers for MTJ sensing.

In this paper, we present a new MFA design based on the PCSA circuit and the logic-in-memory architecture. Using the STMicroelectronics 65 nm design kit [18] and a precise MTJ model based on the thermally assisted switching (TAS) approach [19], simulations and calculations have been performed to demonstrate its high reliability, low power and small die area.

The rest of the paper is organized as follows. Section II briefly describes the writing and reading schemes used in this MFA design. Section III presents the detailed circuit schematic and the results of the reliability analysis, and then compares the proposed MFA with a conventional CMOS full adder in terms of power consumption and die area. Conclusions are given in the last section.

Manuscript received August 02, 2010; revised November 05, 2010 and April 25, 2011; accepted April 26, 2011. Date of publication May 05, 2011; date of current version October 26, 2011. Corresponding author: W. Zhao (e-mail: weisheng.zhao@u-psud.fr). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TMAG.2011.2150238. 0018-9464/$26.00 © 2011 IEEE

II. WRITING AND READING SCHEME

A. Thermally Assisted MTJ Writing

There are three main methods to switch an MTJ: field-induced magnetization switching (FIMS), thermally assisted switching (TAS) and spin transfer torque (STT). FIMS is used in the first generation of MRAM, which has been commercialized since 2006 [20]. However, it suffers from intrinsic issues such as high power consumption, poor selectivity, and poor scalability due to its high switching currents (10 mA). The TAS and STT approaches promise lower write power and good scalability, and are considered the second generation of MRAM [7], [21].

STT requires only one bidirectional low switching current (150 µA at 65 nm). Its fast write access and good scalability make it the most promising technology for MRAM. A number of STT-MRAM prototypes have been developed recently [5], [8], and a thermal reliability issue was found at very small dimensions due to the in-plane anisotropy of the storage layer, which leads to random state disturbance during sensing and makes it difficult to retain data for 10 years. A new trend in STT-MRAM is to use perpendicular anisotropy for storage [22], which promises to overcome the thermal reliability issue. However, it presents a lower TMR ratio (120%), which limits its interest for logic applications (see Fig. 9).
TAS is based on the temperature-dependent exchange bias effect, which acts as a key that unlocks magnetic switching and thereby significantly improves the thermal stability of the stored magnetization. Fig. 3 shows the TAS-MTJ in nanopillar form. The fixed (reference) layer is pinned by an anti-ferromagnetic (AF) layer with a high blocking temperature to prevent any switching. The free (storage) layer is pinned by an AF layer with a low blocking temperature. To perform TAS switching, a low current (100 µA at 65 nm) is passed through the MTJ to heat it up to the blocking temperature of the free layer, at which point a greatly reduced switching current (4 mA) can change the state of the MTJ [23], [24] (see Fig. 3). Besides the stability improvement, TAS keeps the high write/read speed, unlimited endurance, high TMR ratio, and easy 3-D integration of the classical MTJ (see Table I). Moreover, its fabrication technology is mature enough for practical applications [25]. The TAS approach is therefore used in our MFA design.

Fig. 3. (a) The TAS magnetic tunnel junction (MTJ) is the basic TAS-MRAM cell. It is mainly composed of five layers: two ferromagnetic layers (reference layer and storage layer), two anti-ferromagnetic layers with different blocking temperatures (Tb1: 300 °C and Tb2: 150 °C) and one oxide barrier. In practical applications, the magnetization of the reference layer is fixed and that of the storage layer can be set parallel (P) or anti-parallel (AP), representing the logic values 1 and 0. (b) The shape of the TAS-MTJ is circular, allowing easy fabrication and low process variation. (c) Mechanism to switch the MTJ from the P to the AP state: a heating current is first applied to heat the MTJ device, and a switching current is then activated to align the magnetization of the storage layer. (d) Mechanism to switch the MTJ from the AP to the P state.

TABLE I. COMPARISON BETWEEN TAS AND STT (IN-PLANE)

B. Pre-Charge Sense Amplifier (PCSA)

There are three main types of sense amplifier (SA) suitable for hybrid MTJ/CMOS logic circuits, which require extremely high sensing speed: the SRAM-based SA, the DCM-based SA, and the pre-charge SA (PCSA). Among them, the PCSA provides the best sensing reliability and power efficiency while keeping high-speed performance (200 ps) [17]. As shown in Fig. 4, it consists of a pre-charge sub-circuit (MP2-3), a discharge sub-circuit (MN2) and a pair of inverters (MN0-1 and MP0-1), which act as a current sense amplifier. During the pre-charge phase (SEN low), transistors MP2-3 turn on and the discharge transistor MN2 turns off, so that Qm and Qm_bar are both pulled up to Vdd. Logic evaluation takes place when SEN is set high: MP2-3 turn off and MN2 turns on, allowing both Qm and Qm_bar to discharge.
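The pre-charge/evaluate sequence can be illustrated with a purely behavioral sketch. It is not the transistor-level circuit of Fig. 4: each output node is modeled as a capacitor pre-charged to Vdd that discharges through its branch resistance, and the node that first falls to a trip voltage wins the race and is latched low while the other is restored high. The node capacitance, supply voltage, trip voltage and branch resistances below are hypothetical illustration values, and `pcsa_evaluate` is an invented name.

```python
import math


def pcsa_evaluate(r_branch_qm, r_branch_qm_bar,
                  c_node=1e-15, vdd=1.2, v_trip=0.6):
    """Behavioral model of one PCSA evaluation (SEN high).

    Both nodes start at vdd (pre-charge phase). Each discharges as a simple
    RC circuit; the time to reach v_trip is t = R * C * ln(vdd / v_trip).
    Returns (Qm, Qm_bar, decision_time_in_seconds).
    All component values are hypothetical, not taken from the paper.
    """
    k = c_node * math.log(vdd / v_trip)
    t_qm = r_branch_qm * k
    t_qm_bar = r_branch_qm_bar * k
    if t_qm < t_qm_bar:
        # Qm falls first: it is latched to 0 and Qm_bar is pulled back to 1.
        return 0, 1, t_qm
    # Qm_bar falls first (an exact tie is also resolved this way in this toy).
    return 1, 0, t_qm_bar


# Illustrative branch resistances for a parallel and an anti-parallel MTJ.
R_LOW, R_HIGH = 9.0e3, 22.6e3
print(pcsa_evaluate(R_LOW, R_HIGH))   # low-resistance branch on the Qm side
print(pcsa_evaluate(R_HIGH, R_LOW))   # MTJ pair programmed the other way round
```

The sketch only captures the ordering of the two discharges; the regenerative feedback of the inverter pair and the absolute delays of the real circuit are outside its scope.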

Fig. 4. Pre-charge sense amplifier (PCSA) based on seven transistors. MTJ0 and MTJ1 are always in opposite configurations.

As the pair of MTJs is always in opposite configurations (the states of MTJ1 and MTJ0 are shown in Fig. 4), the discharge currents in the two branches are different. In this case, Qm drops more quickly than Qm_bar. When Qm reaches the switching threshold voltage of MP0 first, Qm_bar is pulled up to Vdd (logic 1) and Qm is pulled further down to Gnd (logic 0). Like the DCM SA, the PCSA does not dissipate any static power, because its pre-charge and discharge transistors never turn on simultaneously.

The PCSA circuit overcomes the two major drawbacks of the DCM SA. First, the output error rate versus mismatch variation is greatly improved by taking the voltage feedback from inverters (MN0-1 and MP0-1) instead of from simple pull-up PMOS transistors (MP0-1) [17]. Second, unlike the DCM SA, which requires a capacitor as a virtual ground to prevent static current from Vdd to Gnd, the PCSA achieves this with the inverters, which are immune to static current by nature. The PCSA was previously also proposed, like DCM, for building low-power CMOS dynamic logic circuits [26].

III. DESIGN OF MAGNETIC FULL ADDER (MFA)

The PCSA generates both true and complementary outputs, which allows a complete logic family to be designed without additional inverters. Combined with a MOS logic tree, all basic logic functions can easily be implemented, such as the 2-input AND and XOR shown in Fig. 5.

A. PCSA Magnetic Full Adder (MFA) Design

Following the same method, we design a magnetic full adder (MFA) consisting of a PCSA, a MOS logic tree and MTJ cells [see Fig. 6(b)].
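The logic function that the MFA has to realize is the standard 1-bit full adder. As a quick reference, the sketch below enumerates all eight input combinations of A, B and Ci and checks the Boolean outputs against ordinary integer addition. It models only the logic function, not the circuit, and the function name `full_adder` is illustrative.

```python
from itertools import product


def full_adder(a, b, ci):
    """Standard 1-bit full adder: returns (SUM, Co) for bits a, b, ci."""
    s = a ^ b ^ ci                          # SUM is the three-input XOR
    co = (a & b) | (a & ci) | (b & ci)      # Co is the majority function
    return s, co


for a, b, ci in product((0, 1), repeat=3):
    s, co = full_adder(a, b, ci)
    # The two output bits together must encode the arithmetic sum.
    assert 2 * co + s == a + b + ci
    print(f"A={a} B={b} Ci={ci} -> SUM={s} Co={co}")
```

In the MFA, the input B is the bit stored in the MTJ pair, so in a real use case it would change far less often than A and Ci.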
The 1-bit full adder function is given by the following equations, where three 1-bit inputs A, B, and carry in (Ci) generate two 1-bit outputs, SUM and carry out (Co):

\mathrm{SUM} = A \oplus B \oplus C_i \qquad (1)

C_o = A B + A C_i + B C_i \qquad (2)

It is important to note that the four NMOS transistors MN16-20 are used to generate the heating currents passing through the MTJs during the write operation. However, the circuit that generates the switching current (see Fig. 3) is not shown in Fig. 6(b), as it is independent of the MTJ sensing/heating circuit and can be shared globally by a number of MTJs to obtain better power and area efficiency [27].

Fig. 5. Logic gates based on PCSA: (a) AND logic, (b) XOR logic. A is volatile logic data and B is non-volatile data.

B. Functional Simulation

The STMicroelectronics CMOS 65 nm low power (LP) design kit [18] and a precise TAS-MTJ electrical model [19] are used for the hybrid MTJ/CMOS circuit simulation. For the MTJ, the resistance-area product (R.A), the TMR ratio and the diameter are set to 30 Ω·µm², 150% and 65 nm, respectively. All transistors in the CMOS adder and MFA circuits have the minimum width of 0.135 µm unless otherwise specified.

Fig. 7 shows the transient hybrid simulation of the PCSA-based MFA. All eight possible combinations of the input signals A, B, and Ci are applied in the simulation. The SUM and Co outputs confirm the function of the MFA, and the sense delay is as low as 78 ps for the carry out (Co) and 136 ps for the SUM function. The sense delay is not the same for different values of the input B, and the maximum delay obtained in our simulation is 170 ps, for Co equal to 1. The write access time of the TAS-MTJ is about 15 ns, composed of a heating phase (10 ns) and a cooling phase (5 ns). The heating phase could be shortened by increasing the heating current, at the cost of a larger transistor width. However, the cooling phase is not scalable, as it is mainly defined by the material composition of the MTJ nanopillar.

C. Reliability Simulation

As the technology node scales down below 90 nm, process and mismatch variations increase significantly.
For IC design, high reliability becomes one of the most crucial performance criteria, especially for analog amplifier circuits, in which MOS transistors operate in the linear region and are extremely sensitive to threshold voltage variation. The Cadence Monte-Carlo statistical analysis tool [28] is used to perform the reliability simulation. Both CMOS (STMicroelectronics 65 nm default parameters) and MTJ (TMR ratio and R.A) parameter variations are taken into account. The purpose of this simulation is to observe the functional error ratio (see Fig. 8) due to mismatch and process variation, and then to find a way to improve the reliability with little degradation of other performance metrics such as die area, speed and power.

Fig. 6. (a) Magnetic full adder (MFA) based on the DCM SA. The MTJ is switched by the spin transfer torque (STT) approach. (b) MFA based on the high-stability, low-power pre-charge sense amplifier (PCSA). The MTJ is switched by the TAS approach. The MTJ corresponding to B stores data 1 or 0 in non-volatile mode when it is in the parallel or anti-parallel configuration, respectively.

Fig. 7. Transient simulation of the MFA. Outputs SUM and Co are pre-charged when Clock is set to 0 and evaluated when Clock changes to 1. To enumerate all input patterns, the input bit B, which is stored in a pair of MTJs, is programmed from 1 to 0 and then back to 1.

Fig. 8. Transient response of the statistical analysis for the DCM MFA. Three outputs in error are found in 10 runs due to mismatch and process variation.

Fig. 9 shows the error percentage of three MFA designs versus the TMR ratio, which varies from 100% to 350%. For the DCM MFA with minimum-size MOS transistors, the error rate can be as high as 45% when the TMR ratio is set to 150%, as shown in curve (a). The PCSA MFA design significantly reduces the error rate, down to 15%, with the same transistor size and MTJ parameters (see curve (b)). When the TMR is set to 350%, the PCSA MFA shows only one error in 100 simulations, while there are still 5 errors for the DCM MFA.
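The idea behind such a statistical analysis can be sketched with a toy Monte-Carlo model. This is not the Cadence transistor-level analysis used in the paper and will not reproduce the numbers of Fig. 9: each sensing branch is reduced to an MTJ resistance in series with a transistor on-resistance, both perturbed by independent Gaussian variation, and a functional error is counted whenever the nominally low-resistance branch ends up with the higher resistance. All resistances and sigma values are hypothetical, as is the function name `error_rate`.

```python
import random


def error_rate(tmr, runs=5000, r_p=9.0e3, r_mos=10.0e3,
               sigma_mtj=0.05, sigma_mos=0.30, seed=1):
    """Toy Monte-Carlo estimate of the sensing error rate.

    tmr       : TMR ratio as a fraction (1.5 means 150%)
    r_p       : nominal parallel MTJ resistance in ohms (hypothetical)
    r_mos     : nominal transistor on-resistance per branch (hypothetical)
    sigma_*   : relative standard deviations of the variations (hypothetical)
    """
    rng = random.Random(seed)            # fixed seed for repeatable results
    r_ap = r_p * (1.0 + tmr)
    errors = 0
    for _ in range(runs):
        low = (r_p * (1.0 + rng.gauss(0.0, sigma_mtj))
               + r_mos * (1.0 + rng.gauss(0.0, sigma_mos)))
        high = (r_ap * (1.0 + rng.gauss(0.0, sigma_mtj))
                + r_mos * (1.0 + rng.gauss(0.0, sigma_mos)))
        if low >= high:                  # branch ordering flipped: wrong output
            errors += 1
    return errors / runs


for tmr in (1.0, 1.5, 2.5, 3.5):
    print(f"TMR = {tmr:.0%}: error rate = {error_rate(tmr):.2%}")
```

Even in this simplified model the qualitative trend is visible: a larger TMR ratio widens the resistance margin between the two branches, so the same amount of variation causes fewer errors.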
On average, the PCSA MFA design is 4 times more reliable than the DCM MFA circuit at the minimum technology node. Unlike the PCSA MFA, which contains only MOS transistors and MTJs, the DCM MFA also contains two capacitors [see Fig. 6(a)], which must be larger than 60 fF to ensure the expected full adder operation according to (3) [16]. In the reliability simulation shown in curve (a) of Fig. 9, the capacitance is set to 60 fF, which occupies about 5 µm² at the 65 nm technology node. These two capacitors take up about 30% of the whole die area of the DCM MFA. The minimum area of the DCM MFA is therefore about 1.7X, where X is the minimum area of the PCSA MFA. Curve (c) shows the error rate for the PCSA MFA with larger transistors, which leads to a 0.6X area overhead and gives nearly the same area as the DCM MFA at the minimum technology node. For this design, the error rate can be reduced below 0.01% (zero functional errors in the simulations performed) when the TMR ratio is set to 350%.

(3)
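The area figures quoted above can be cross-checked with a few lines of arithmetic. The sketch assumes that the layout area of the PCSA MFA reported with Fig. 11 (about 20) and the capacitor area (about 5) are both in µm²; the units were lost in the transcription, so this is an assumption, and the variable names are illustrative.

```python
# Assumed units: um^2 (the transcription dropped them).
pcsa_area = 20.0          # X: minimum area of the PCSA MFA (Fig. 11)
cap_area = 5.0            # area of one 60 fF capacitor at 65 nm
n_caps = 2                # the DCM MFA uses two capacitors

dcm_area = 1.7 * pcsa_area                    # "about 1.7X"
cap_fraction = n_caps * cap_area / dcm_area   # share of DCM area used by capacitors
pcsa_large_area = (1.0 + 0.6) * pcsa_area     # curve (c): 0.6X area overhead

print(f"DCM MFA area           : {dcm_area:.1f}")
print(f"capacitor share of DCM : {cap_fraction:.0%}")
print(f"PCSA MFA, curve (c)    : {pcsa_large_area:.1f}")
```

Under these assumptions the capacitors account for roughly 30% of the DCM MFA, and the enlarged PCSA MFA at 1.6X is indeed close to the 1.7X of the DCM design, so the quoted figures are mutually consistent.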

Fig. 9. Error rate versus TMR ratio for the DCM-based full adder and the proposed design, obtained by Monte-Carlo statistical analysis.

(In (3), […] is the output voltage swing and […] is the load capacitance.)

The Monte-Carlo simulation results shown in Fig. 9 demonstrate the higher reliability of the PCSA MFA compared with the DCM MFA and suggest two efficient ways to reduce the error rate further: a higher TMR ratio and a larger die area. However, from a technological point of view it is very difficult to increase the TMR ratio up to 250% for practical applications, as special techniques and materials are required [6]. A larger MFA die area improves the reliability, but a tradeoff must be made between the area budget and the reliability.

D. Performance Comparison With CMOS Full Adder (FA)

It is well known that, with further scaling of CMOS technology, leakage currents increase exponentially, leading to high static power dissipation. The PCSA MFA design could be a good solution to this issue. In this subsection, we compare the PCSA MFA design with a conventional CMOS FA in terms of die area, computing delay, and static and dynamic power at the 65 nm technology node (see Table II). As mentioned above, DCM FA circuits [15], [16] are not suitable for deep submicron technology (< 90 nm) and are therefore not considered in the comparison.

The conventional CMOS FA, shown in Fig. 10, is taken from the standard cell library of the STMicroelectronics design kit and is based on pass-gate logic. In order to characterize the two designs under equivalent conditions, two full latches are added to the CMOS FA to synchronize its outputs with the clock signal, as the PCSA MFA is naturally synchronized. Minimum-width transistors (0.135 µm at 65 nm) are used for both the CMOS FA and the PCSA MFA, except for the MTJ switching circuits. All the simulations are performed at 500 MHz and […].
Our simulation shows that the PCSA MFA achieves a nearly 3X reduction in dynamic energy compared with the conventional CMOS FA at the 65 nm technology node. However, its speed is lower, and the gain in power-delay product is thus reduced to only 33% (see Table II). A recent theoretical study of the energy-performance tradeoff between the DCM MFA and the CMOS FA [29] confirms our simulation results.

Fig. 10. Schematic of the 1-bit pass-gate-based conventional CMOS full adder with synchronized outputs.

TABLE II. COMPARISON BETWEEN CMOS FA AND PCSA MFA

The most important advantage of the MFA is that it can be completely powered off in standby mode thanks to the non-volatility of the MTJ, and then consumes nearly zero static power. The CMOS FA, by contrast, must always remain active to keep its data and therefore dissipates significant static power due to high leakage currents. As static power plays an increasingly important role in the total power dissipation of logic circuits as fabrication nodes shrink beyond 90 nm [3], the PCSA MFA becomes increasingly attractive for ultra-low-power logic. Note that the write energy for updating the stored inputs of this non-volatile MFA is not compared here, as it depends on a number of factors such as the speed requirement [15], the architecture design and the computing algorithm. For example, the switching current of TAS can be shared by a word (e.g., 32 bits/word) to reduce the write energy per bit [27].

The chip area could also be reduced, as the number of MOS devices is 35% lower. The area of the MTJs is not taken into account because they are integrated in 3-D above the CMOS circuit in the back-end process [7]. We note that the area reduction of this PCSA MFA design is only 13% (see Fig. 11), because the TAS MTJ switching approach requires transistors (MN16-20) with 4X the minimum width to generate the heating currents locally. The heating current can be greatly reduced either by shrinking the MTJ or by improving the R.A parameter.
This allows the die area of the MFA to be reduced further. In addition, a new switching approach, thermally assisted spin transfer torque switching (TAS+STT) [30], [31], has recently been demonstrated experimentally; it could combine the high reliability of TAS with the low power of STT.

Fig. 11. (a) Full layout of the PCSA MFA using the CMOS 65 nm design kit and 65 nm TAS-MTJs. Based on a parameter cell (p-cell) design, our design occupies about 20 µm². (b) Zoom on the structure connecting the MTJ to the CMOS circuits using Metal4 (M4) and Metal5 (M5). (c) The 3-D mixed CMOS/MTJ process allows MTJs to be integrated above the CMOS circuits in the back-end process. As a result, the integration of MTJs does not take additional die area.

IV. CONCLUSION AND PERSPECTIVE

An MFA design based on the PCSA sense amplifier is presented in this paper. It promises high reliability, low power and small die area. Using the STMicroelectronics 65 nm design kit and a precise TAS-MTJ model, Monte-Carlo statistical simulations show a great improvement in reliability and die area compared with the former MFA design. The comparison with a conventional CMOS FA confirms its advantages in power efficiency and die area. Based on this PCSA MFA, more complex logic circuits (e.g., a magnetic CPU) could be built for high-reliability, low-power applications. With the rapid progress in the write/read performance of MTJs [8], [22], [30], [31], we believe that hybrid MTJ/CMOS circuits can become a mainstream approach for building non-volatile, high-speed, ultra-low-power, and compact VLSI.

ACKNOWLEDGMENT

The authors wish to acknowledge support from the French Agence Nationale de la Recherche (ANR) through all partners of the 2006 CILOMAG project, the 2009 NANOINNOV-SPIN project and the Nano2012 project with STMicroelectronics. The views expressed are solely those of the authors, and the other Contractors and/or ANR and/or STMicroelectronics cannot be held liable for any use that may be made of the information contained herein. The authors also thank Guillaume Prenat and Bernard Dieny from the SPINTEC laboratory for decisive input, scientific discussions, and crucial help with the simulation model of TAS-MRAM.

REFERENCES

[1] M. Jullière, "Tunneling between ferromagnetic films," Phys. Lett., vol. 54A, pp. 225-226, 1975.
[2] J. S. Moodera, L. R. Kinder, T. M. Wong, and R. Meservey, "Large magnetoresistance at room temperature in ferromagnetic thin film tunnel junctions," Phys. Rev. Lett., vol. 74, pp. 3273-3276, 1995.
[3] N. S. Kim et al., "Leakage current: Moore's law meets static power," IEEE Computer, vol. 36, no. 12, pp. 68-75, 2003.
[4] B. Engel et al., "A 4-Mb toggle MRAM based on a novel bit and switching method," IEEE Trans. Magn., vol. 41, no. 1, pp. 132-136, Jan. 2005.
[5] T. Kawahara et al., "2 Mb spin-transfer torque RAM (SPRAM) with bit-by-bit bidirectional current write and parallelizing-direction current read," in Proc. IEEE-ISSCC, 2007, pp. 480-481.
[6] S. Ikeda et al., "Tunnel magnetoresistance of 604% at 300 K by suppression of Ta diffusion in CoFeB/MgO/CoFeB pseudo-spin-valves annealed at high temperature," Appl. Phys. Lett., vol. 93, p. 082508, 2008.
[7] M. Hosomi et al., "A novel non-volatile memory with spin torque transfer magnetization switching: Spin-RAM," in Proc. IEEE IEDM, 2005, pp. 473-476.
[8] C. J. Lin et al., "45 nm low power CMOS logic compatible embedded STT MRAM utilizing a reverse-connection 1T/1MTJ cell," in Proc. IEEE IEDM, 2009, pp. 279-282.
[9] J. P. Wang and X. Yao, "Programmable spintronic logic devices for reconfigurable computation and beyond: History and outlook," J. Nanoelectron. Optoelectron., vol. 3, pp. 12-23, 2008.
[10] W. S. Zhao, E. Belhaire, C. Chappert, and P. Mazoyer, "Power and area optimization for run-time reconfiguration system on programmable chip based on magnetic random access memory," IEEE Trans. Magn., vol. 45, no. 2, pp. 776-780, Feb. 2009.
[11] G. Prenat et al., "CMOS/magnetic hybrid architectures," in Proc. IEEE ICECS, Morocco, 2007, pp. 190-193.
[12] W. S. Zhao, E. Belhaire, C. Chappert, and P. Mazoyer, "Spin transfer torque (STT)-MRAM based run time reconfiguration FPGA circuit," ACM Trans. Embedded Comput. Syst., vol. 9, no. 2, Oct. 2009, article 14.
[13] H. Meng, J. Wang, and J. P. Wang, "A spintronics full adder for magnetic CPU," IEEE Electron Device Lett., vol. 26, no. 6, pp. 360-362, 2005.
[14] S. Lee, S. Seo, S. Lee, and H. Shin, "A full adder design using serially connected single-layer magnetic tunnel junction elements," IEEE Trans. Electron Devices, vol. 55, no. 3, pp. 890-895, 2008.
[15] S. Matsunaga et al., "Fabrication of a nonvolatile full adder based on logic-in-memory architecture using magnetic tunnel junctions," Appl. Phys. Express (APEX), vol. 1, pp. 091301-1 to 091301-3, 2008.
[16] M. W. Allam and M. I. Elmasry, "Dynamic current mode logic (DyCML): A new low-power high-performance logic style," IEEE J. Solid-State Circuits, vol. 36, no. 3, pp. 550-558, 2001.
[17] W. S. Zhao, C. Chappert, V. Javerliac, and J.-P. Noizière, "High speed, high stability and low power sensing amplifier for MTJ/CMOS hybrid logic circuits," IEEE Trans. Magn., vol. 45, no. 10, pp. 3784-3787, Oct. 2009.
[18] STMicroelectronics, Manuel of Design Kit for CMOS 65 nm, 2009.
[19] M. Elbaraji et al., "Dynamic compact model of thermally assisted switching magnetic tunnel junctions," J. Appl. Phys., vol. 106, p. 123906, 2010.
[20] Everspin [Online]. Available: http://www.everspin.com
[21] O. Redon et al., "Thermo assisted MRAM for low power applications," in Proc. ICMTD, France, 2005, pp. 113-114.
[22] S. Ikeda et al., "A perpendicular-anisotropy CoFeB-MgO magnetic tunnel junction," Nature Mater., vol. 9, pp. 721-724, 2010.
[23] I. L. Prejbeanu et al., "Thermally assisted MRAM," J. Phys.-Condens. Matter, p. 165218, 2007.
[24] James and G. Deak, "Thermal MRAM," in Proc. IEEE-ICCD, USA, 2005.
[25] Crocus Technology [Online]. Available: http://www.crocus-technology.com
[26] D. Somasekhar and K. Roy, "Differential current switch logic: A low power DCVS logic family," IEEE J. Solid-State Circuits, vol. 31, no. 7, pp. 981-991.
[27] W. S. Zhao et al., "TAS-MRAM based non-volatile FPGA logic circuit," in Proc. ICFPT, 2007, pp. 153-160.
[28] Cadence 5.1.41 Spectre, Virtuoso Analog Design Environment User Guide, 2006.
[29] F. Ren and D. Markovic, "True energy-performance analysis of the MTJ-based logic-in-memory architecture (1-bit full adder)," IEEE Trans. Electron Devices, vol. 57, no. 5, pp. 1023-1028, 2010.
[30] B. Dieny et al., "Spin-transfer effect and its use in spintronic components," Int. J. Nanotechnol., vol. 7, pp. 591-614, 2010.
[31] H. W. Xi et al., "Spin transfer torque memory with thermal assist mechanism: A case study," IEEE Trans. Magn., vol. 46, no. 3, pp. 860-865, Mar. 2010.