

# **Power-aware voltage tuning for STT-MRAM reliability**

Elena Ioana Vatajelu, Rosa Rodríguez-Montañés, Stefano Di Carlo, Marco Indaco, Michel Renovell, Paolo Prinetto, Joan Figueras

### **To cite this version:**

Elena Ioana Vatajelu, Rosa Rodríguez-Montañés, Stefano Di Carlo, Marco Indaco, Michel Renovell, et al.. Power-aware voltage tuning for STT-MRAM reliability. ETS: European Test Symposium, May 2015, Cluj-Napoca, Romania. 10.1109/ETS.2015.7138748. lirmm-01922971

# **HAL Id: lirmm-01922971 <https://hal-lirmm.ccsd.cnrs.fr/lirmm-01922971v1>**

Submitted on 14 Nov 2018

**HAL** is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


# Power-Aware Voltage Tuning for STT-MRAM Reliability

Elena I. Vatajelu<sup>1</sup>, R. Rodriguez-Montañés<sup>2</sup>, S. Di Carlo<sup>1</sup>, M. Indaco<sup>1</sup>, M. Renovell<sup>3</sup>, P. Prinetto<sup>1</sup>, J. Figueras<sup>2</sup>

<sup>2</sup>Dept. of Electronic Engineering, Universitat Politècnica de Catalunya (UPC), Barcelona, Spain

<sup>3</sup>LIRMM, Montpellier, France

*Abstract***—One of the most promising emerging memory technologies is the Spin-Transfer-Torque Magnetic Random Access Memory (STT-MRAM), owing to its high speed, high endurance, low area, low power consumption, and good scaling capability. In this paper we estimate the STT-MRAM cell reliability under fabrication- and aging-induced process variability by evaluating its failure probability. We analyze the effect of control voltage tuning on the fresh and aged cell failure probabilities and, as a result, propose a power- and aging-aware circuit-level variability mitigation technique based on control voltage tuning. We observe that increasing the control voltages reduces the cell failure probability to different extents (depending on which control voltage is varied), but also increases the power consumption. We therefore identify the control voltage with the highest impact on the fresh cell reliability and on the endurance of the cell under study. Subsequently, a power/reliability trade-off analysis determines the appropriate value of this control voltage.**

#### *Keywords— STT-MRAM, Process Variability, Reliability, Endurance, Voltage Tuning, Power-Aware Analysis.*

#### I. INTRODUCTION

With technology scaling, the shortcomings of well-established memory technologies such as SRAM, DRAM, and flash are becoming insurmountable, especially given the ever-increasing need for high-capacity, high-performance memories running at very low power. These issues have brought forth an increased interest in new memory technologies such as Magnetic RAM (MRAM) and Resistive RAM (RRAM). One of the most promising emerging technologies is the MRAM based on the Spin-Transfer-Torque phenomenon (STT-MRAM), owing to its high speed, high endurance, low area, low power consumption, and good scaling capability. It is a promising candidate for next-generation embedded memories [1], offering faster read and write access times and better CMOS integration than other available technologies with similar features. However, STT-MRAM cell fabrication faces a set of challenges that impact performance and reliability. These issues are mainly related to process variations of the MOS and MTJ devices [2][3][4][5] and to thermal fluctuations in MTJ switching [6].

In recent years, considerable effort has been dedicated to the evaluation and improvement of STT-MRAM cell reliability. For instance, [7] proposes a classification of STT-MRAM failures, based on their physical characteristics, into 'soft failures' (due to stochastic switching and limited thermal stability) and 'hard failures' (due to oxide barrier breakdown and oxide thickness variability), on the basis of which weak cells can be identified. A statistical model of the failure events of an STT-MRAM cell under process variability is presented in [2], where the failure mechanisms of the MRAM cell are classified and modeled as read failures (decision failure and disturbance failure) and write failures. Robustness metrics for cell evaluation have been proposed in [8]. These metrics provide a way to estimate the extreme parameter variations causing a cell failure, the current noise margins, and the cell failure probability (when failures are observed).

Several circuit techniques have been proposed to improve the robustness of an STT-MRAM memory, such as multi-terminal structures [9], a new design paradigm that decouples the conflicting design requirements of read stability and writability [2], complementary polarizers in the cell design for self-referencing and improved write current [10], and asymmetrically doped transistors that mitigate the conflict between writability and write power [11]. Circuit-level solutions that enable smaller bit-cell area with improved yield (bit-line voltage boosting, word-line voltage boosting, access transistor body biasing, and an applied external magnetic field) are proposed in [12].

In this paper, we propose a power- and aging-aware circuit-level variability mitigation technique based on control voltage tuning. We estimate the STT-MRAM cell reliability under fabrication- and aging-induced process variability by evaluating its failure probability. This analysis is performed at different control voltages (i.e., supply voltage, word-line voltage, bit-line voltage, source-line voltage, access NMOS body-bias), from here on referred to as *'knobs'*, to analyze the effect of these voltages on the reliability of memory cells affected by variability. Based on this analysis we identify the most efficient reliability-boosting knob and, on the basis of a power/reliability trade-off analysis, its optimum value.

The rest of the paper is organized as follows. Section II describes the basic operating principle of an STT-MRAM cell. Section III contains a failure analysis of the memory cell based on its electrical characteristics, including a discussion of how modifying the control voltages of the cell can affect its functionality margins and reliability. Section IV estimates the cell reliability under process variability and aging effects, assuming nominal and non-nominal values of the control voltages. Section V shows which of the tested control voltages has the dominant effect on the cell reliability and estimates its optimum value by analyzing the reliability/power trade-off. Section VI concludes the paper.

#### II. STT-MRAM OPERATION

The storage element of an STT-MRAM memory cell is the magnetic tunneling junction (MTJ) device. Typically, an MTJ element is manufactured in thin-film technology. It consists of two ferromagnetic layers (FLs), characterized by their magnetic orientation, separated by an oxide barrier (Fig. 1(a)). If the barrier is thin enough (typically 1-3nm), electrons can tunnel between the ferromagnetic layers. The magnetic orientation of one of the magnetic layers is fixed, set at fabrication time. This layer is referred to as the *pinned layer*. The other magnetic layer, referred to as the *free layer*, has a freely rotating magnetic orientation that can be dynamically changed by forcing sufficiently large tunneling currents across the device. The conductance of such a magnetic tunneling junction is defined by the relative magnetic orientations of the two layers. If the magnetizations are in parallel orientations, it is more likely that electrons will tunnel through the thin oxide layer than if the magnetizations are in anti-parallel orientations. This effect is called the *tunneling magnetoresistance* (TMR) effect. Therefore, the MTJ device exhibits high electrical conductance (low electrical resistance, $R_{MTJ} = R_L$) when the magnetization directions of the two FLs are parallel and low conductance (high electrical resistance, $R_{MTJ} = R_H$) when they are anti-parallel. The TMR effect is characterized by means of the *TMR ratio*, defined as the relative resistance change between the two magnetized states: $TMR = (R_H - R_L)/R_L$. In order to change the relative magnetic orientation of the MTJ device, a sufficient current ($I_{MTJ}$) must flow through it for long enough to switch the magnetic orientation of the free layer [13]. The electrical resistance of the MTJ device ($R_{MTJ}$) changes with the voltage drop across the device; the voltage-resistance behavior exhibits a hysteresis characteristic (Fig. 1(c)) from which the electrical properties of the MTJ pillar (its low and high resistance and the threshold switching currents) can be extracted.
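As a small worked example of the TMR ratio definition, the sketch below evaluates it for the nominal zero-bias resistances quoted later in Section IV ($R_L = 2k\Omega$, $R_H = 4k\Omega$). The function name `tmr_ratio` is only illustrative.

```python
def tmr_ratio(r_low: float, r_high: float) -> float:
    """Return TMR = (R_H - R_L) / R_L for the two MTJ resistance states."""
    if r_low <= 0:
        raise ValueError("R_L must be positive")
    return (r_high - r_low) / r_low


# Nominal zero-bias resistances quoted in Section IV: R_L = 2 kOhm, R_H = 4 kOhm.
nominal_tmr = tmr_ratio(2e3, 4e3)
print(f"TMR = {nominal_tmr:.0%}")  # prints "TMR = 100%"
```

With these nominal values the anti-parallel resistance is twice the parallel one, so the TMR ratio is 100%.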

When the MTJ device is used as binary data storage, the parallel state is associated with the logic value '0' and the anti-parallel state is associated with the logic value '1' (Fig. 1(a)).

Several STT-MRAM cell implementations have been proposed. In this work we target the popular 1T-1MTJ structure. In this topology, the memory cell consists of one MTJ device connected in series with one NMOS transistor. The cell is accessed by the corresponding control lines, i.e., Bit Line (BL), Source Line (SL) and Word Line (WL) (Fig. 1(b)).

When a transition from anti-parallel to parallel relative magnetizations, i.e., a write '0' operation (*W0*), is desired, the Word Line (*WL*) and Source Line (*SL*) are connected to the power supply ($V_{DD}$), while the Bit Line (*BL*) is grounded, hence a current $I_{MTJ}$ flows in the MTJ device. Provided that $I_{MTJ} > I_{HL}$, the cell switches to the parallel state, in which case $R_{MTJ}$ becomes equal to $R_L$. When a transition from parallel to anti-parallel relative magnetizations is desired, i.e., a write '1' operation (*W1*), the power supply voltage ($V_{DD}$) is applied to the Word Line (*WL*) and Bit Line (*BL*), while the Source Line (*SL*) is grounded. Provided that the resulting current $I_{MTJ} > I_{LH}$, the cell switches to the anti-parallel state ($R_{MTJ} = R_H$). Here, $I_{HL}$ represents the switching threshold current from the anti-parallel to the parallel state, while $I_{LH}$ represents the switching threshold current from the parallel to the anti-parallel state. The switching conditions are marked in Fig. 1(c) (by blue circles), where $V_{DC}$ is the voltage drop across the MTJ device.



Fig. 1. The STT-RAM Memory cell: a) MTJ configurations; b) Electric circuit of 1T1MTJ structure; c) The  $R_{MTJ} - V_{DC}$  hysteresis characteristic.

During the read operation, a small bias voltage is applied on the control lines, resulting in a current ($I_R$). Based on this current, a decision is made on the stored state by comparing it against a reference value ($I_{REF}$). A read current higher than the reference value ($I_R > I_{REF}$) translates into a read '0' operation, while a read current lower than the reference value ($I_R < I_{REF}$) translates into a read '1' operation.

The value of the current required for switching between magnetization states depends on several factors, including the physical dimensions of the MTJ and the materials used, the temperature of operation, and the duration of the applied control signal [14].

#### III. STT-MRAM PARAMETRIC RELIABILITY ANALYSIS

The major sources of process variations affecting the electrical resistance of the MTJ device include variations in the tunneling oxide thickness and in the cross-section area of the free ferromagnetic layer. Variations in these parameters result in a spread of $R_H$ and $R_L$ values. In addition, the NMOS transistor may also suffer from process parameter variations, which impact its threshold voltage ($V_{TH}$), resulting in variations of the operating current. Under these assumptions, we perform the cell reliability analysis starting from a three-dimensional space of parameter variations $(R_L, R_H, V_{TH})$ and applying the Satisfiability Boundary–Statistical Integration (*SB-SI*) method [17] for failure probability estimation. In this 3D space, the correct/faulty response of the cell has been characterized as explained in [8] and summarized below.

#### *A. STT-MRAM Failure Mechanisms*

An STT-MRAM cell can fail due to an unsuccessful write operation (write failure – WF), a destructive read operation (read disturb – RD), a wrong decision during the read operation (read failure – RF), or a spontaneous magnetic direction flip during data retention (data retention failure – DRF).

In the case of a write operation, the current flowing through the MTJ has to be large enough, and of sufficiently long duration, to allow the switching of the magnetization direction of the free ferromagnetic layer. To allow for a correct write operation (sufficient current), the high and low values of the MTJ electrical resistance must be below $R_{HMAX-W}$ and $R_{LMAX-W}$, respectively. The region in the MTJ resistance space in which the $R_H$ values are higher than $R_{HMAX-W}$ indicates write '0' faulty behavior (*W0F*), while the one in which the $R_L$ values are higher than $R_{LMAX-W}$ indicates write '1' faulty behavior (*W1F*). Furthermore, $(R_L, R_H)$ pairs must guarantee that $R_H > R_L$ ($TMR > 0\%$). All these boundaries are shown in Fig. 2(a) in the two-dimensional space of the MTJ resistance.

During the read operation the current flowing through the cell ($I_R$) is compared with a reference value ($I_{REF}$). The reference current is assumed ideal and equal to the average current flowing through two ideal cells in complementary states, biased for read operation [16]. If $I_R < I_{REF}$, the state is read as '1', i.e., the MTJ is in its anti-parallel state, $R_{MTJ}=R_H$. In this case, $R_H$ must be high enough ($> R_{HMIN-R}$) for the current condition to be satisfied (otherwise a read '1' fault occurs: *R1F*). If $I_R > I_{REF}$, the state is read as '0', i.e., the MTJ is in its parallel state ($R_{MTJ}=R_L$). In this case, $R_L$ must be small enough ($< R_{LMAX-R}$) for the current condition to be satisfied (otherwise a read '0' fault occurs: *R0F*).
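The read decision rule can be illustrated with a deliberately simplified model. The sketch below assumes the cell behaves as the MTJ resistance in series with a fixed access-transistor on-resistance; `V_READ` and `R_ON` are hypothetical values chosen only for illustration, and the paper's own results come from SPICE simulation, not from this model.

```python
def cell_current(v_read: float, r_mtj: float, r_on: float) -> float:
    """Read current of a toy series model: I = V / (R_MTJ + R_on)."""
    return v_read / (r_mtj + r_on)


def reference_current(v_read: float, r_low: float, r_high: float, r_on: float) -> float:
    """Average of the currents through two ideal cells in complementary states."""
    i_low_state = cell_current(v_read, r_low, r_on)
    i_high_state = cell_current(v_read, r_high, r_on)
    return 0.5 * (i_low_state + i_high_state)


def read_bit(i_read: float, i_ref: float) -> int:
    """I_R > I_REF reads as '0'; otherwise '1' (equality is treated as '1' here)."""
    return 0 if i_read > i_ref else 1


V_READ = 0.3          # hypothetical read bias in volts
R_ON = 1e3            # hypothetical access transistor on-resistance in ohms
R_L, R_H = 2e3, 4e3   # nominal MTJ resistances quoted in Section IV

i_ref = reference_current(V_READ, R_L, R_H, R_ON)
nominal_parallel = read_bit(cell_current(V_READ, R_L, R_ON), i_ref)       # reads 0
nominal_antiparallel = read_bit(cell_current(V_READ, R_H, R_ON), i_ref)   # reads 1
# A parallel-state cell whose R_L has drifted up to 3 kOhm is misread as '1' (R0F).
drifted_parallel = read_bit(cell_current(V_READ, 3e3, R_ON), i_ref)
```

In this toy model the boundary $R_{LMAX-R}$ is simply the resistance at which the cell current equals $I_{REF}$; in the actual cell it also depends on the transistor threshold voltage.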

The union of the faulty write and faulty read operation regions (red regions in Fig. 2(a)) represents the overall failure region for the cell under analysis, while the remainder of the parameter space represents the *acceptance region* (green region in Fig. 2(a)), i.e., the region in the MTJ resistance space where the cell operates correctly.

The aforementioned regions are extracted assuming nominal NMOS access transistor. For a more comprehensive characterization of the cell failure mechanisms in the parameter space, a 3rd dimension is added (for the threshold voltage of the NMOS transistor –  $V_{TH}$ ), as depicted in Fig. 2(b). The considerations on the  $R_{MTJ}$  do not change; the acceptance region is bounded by the same constraints.

However, the values of these constraints (i.e., $R_{HMAX-W}$, $R_{LMAX-W}$, $R_{LMAX-R}$ and $R_{HMIN-R}$) depend on the driving capability of the NMOS transistor. A low $V_{TH}$ value means a higher driving capability of the NMOS, which translates into a relaxation of the read and write operation constraints ($R_{HMAX-W}$, $R_{LMAX-W}$, $R_{HMIN-R}$, and $R_{LMAX-R}$). This leads to a larger acceptance region (as shown in the bottom cross-section of Fig. 2(b)). The situation is reversed when the NMOS threshold voltage is large (see the upper cross-section of Fig. 2(b)).

The set of coordinates in the 3D space $(R_L, R_H, V_{TH})$ that bound the acceptance region defines the Satisfiability Boundary [17]. The failure probability of the STT-MRAM cell under read and write failures is evaluated using the SB-SI method in [17]. Here, the failure probability is defined as the probability that the device parameters lie outside the acceptance region, and it is given by:

$$
P_{RF\&WF} = 1 - \int_0^{\min(R_{LMAX-R},\, R_{LMAX-W})} \int_{R_{HMIN-R}}^{R_{HMAX-W}} \int_{V_{TH-min}}^{V_{TH-max}} f(R_L, R_H, V_{TH})\, dR_L\, dR_H\, dV_{TH} \tag{1}
$$

with $R_{LMAX-R}$, $R_{LMAX-W}$, $R_{HMIN-R}$, $R_{HMAX-W}$ as previously defined, $V_{TH-min}$ and $V_{TH-max}$ the extreme values of the NMOS threshold voltage for correct operation, and $f(R_L, R_H, V_{TH})$ the joint probability density function describing the statistical distribution of the three parameters.
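To make the meaning of (1) concrete, the sketch below estimates a failure probability by plain Monte Carlo sampling. This is not the SB-SI method of [17]; it is a substitute chosen for brevity. It assumes independent Gaussian parameters with the means and relative standard deviations quoted in Section IV, and it uses fixed, hypothetical acceptance bounds (`EXAMPLE_BOUNDS`), whereas in the paper the resistance bounds depend on $V_{TH}$ and are extracted from SPICE simulations.

```python
import random

# Hypothetical acceptance-region bounds, for illustration only.
EXAMPLE_BOUNDS = {
    "r_lmax_r": 2750.0, "r_lmax_w": 3000.0,   # ohms
    "r_hmin_r": 2750.0, "r_hmax_w": 5500.0,   # ohms
    "v_th_min": 0.15, "v_th_max": 0.40,       # volts
}


def failure_probability(n_samples: int, bounds: dict, seed: int = 0) -> float:
    """Fraction of sampled (R_L, R_H, V_TH) points outside the acceptance box."""
    rng = random.Random(seed)
    r_l_max = min(bounds["r_lmax_r"], bounds["r_lmax_w"])
    failures = 0
    for _ in range(n_samples):
        # Means and sigma/mu values as quoted in Section IV.
        r_l = rng.gauss(2e3, 0.093 * 2e3)
        r_h = rng.gauss(4e3, 0.104 * 4e3)
        v_th = rng.gauss(0.285, 0.10 * 0.285)
        inside = (
            0.0 < r_l < r_l_max
            and bounds["r_hmin_r"] < r_h < bounds["r_hmax_w"]
            and bounds["v_th_min"] < v_th < bounds["v_th_max"]
        )
        if not inside:
            failures += 1
    return failures / n_samples


p_fail = failure_probability(20_000, EXAMPLE_BOUNDS)
print(f"estimated failure probability: {p_fail:.2e}")
```

Plain sampling needs many more samples than the inverse of the probability being estimated, which is one reason a boundary-based statistical integration is attractive for failure probabilities in the $10^{-5}$ range and below.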



Fig. 2. a) 2D illustration of failure mechanisms constraints during read and write operations of the 1T-1MTJ STT-MRAM. Here *W0F* represents the write '0' failure region, *W1F* represents the write '1' failure region, *R0F* represents the read '0' failure region, *R1F* represents the read '1' failure region, *TMR0* represents the region where TMR<0, and *OK* represents the NO failure region, i.e., the *acceptance region*; b) 3D representation of acceptance region in the  $(R_L, R_H, V_{TH})$  parameter space. Three 'slices' are emphasized: the middle one corresponding to nominal value for  $V_{TH}$ , while the top and bottom ones correspond to  $V_{TH\text{-}MAX}$  and  $V_{TH\text{-}MIN}$ , respectively.

The destructive read operation (RD) and the data retention failure (DRF) are statistical phenomena due to a spontaneous magnetic direction flip during read operation and data retention, respectively. All magnetic nanostructures suffer from thermally activated magnetization reversal. According to Néel-Brown theory, at finite temperature there is a finite probability for the magnetization to flip and reverse its direction. The probability that the magnetization has not flipped after a time *t* is given by the Néel-Brown model [19]:

$$
P(t) = \exp(-t/\tau) \tag{2}
$$

with *τ* the Néel relaxation time, i.e., the mean time between two flips. It is given by the Néel-Arrhenius equation:

$$
\tau = \tau_0 \cdot \exp(\Delta E / k_B T) \tag{3}
$$

with $\tau_0$ the attempt time (the inverse of the particle vibration frequency, typical value for magnetic recording: $10^{-9}$ s [20]), $k_B$ the Boltzmann constant, *T* the device temperature and *ΔE* the height of the energy barrier between the two magnetization states of the free layer. The height of the energy barrier is given by:

$$
\Delta E = K_u \cdot V \tag{4}
$$

where $K_u$ is the uniaxial anisotropy per unit volume and *V* is the volume of the free ferromagnetic layer. Therefore, the probability of data retention failure after a time *t* can be estimated as:

$$
P_{DRF}(t) = 1 - \exp\left[-\frac{t}{\tau_0} \cdot \exp\left(-\frac{\Delta E}{k_B T}\right)\right] \tag{5}
$$

The same thermal effect takes place during read operation as well, therefore there is a probability of thermally activated magnetization reversal. However, in this case the read current  $(I_R)$  flowing through the MTJ device reduces the energy barrier against switching (*ΔE*) and the probability of read disturb after a time *t* is estimated as:

$$
P_{RD}(t) = 1 - \exp\left[-\frac{t}{\tau_0} \cdot \exp\left(-\frac{\Delta E \cdot (1 - I_R / I_{c0})}{k_B T}\right)\right] \tag{6}
$$

with $I_{c0}$ the critical current for switching by spin-transfer torque.

The probability of data retention failure and the probability of read disturb are strongly dependent on the materials used for the MTJ fabrication (more specifically on their uniaxial anisotropy,  $K_u$ ) and on the volume of the free layer  $(V)$ , since they have a direct effect on the height of the energy barrier  $(\Delta E)$  as seen in (4).

In this work, we analyze the behavior of an STT-MRAM cell designed with a perpendicular-anisotropy CoFeB/MgO magnetic tunnel junction [18]. The uniaxial anisotropy per unit volume of the CoFeB alloy used is $K_u = 1.09 \cdot 10^5\ \mathrm{J/m^3}$, and the free layer is designed with a circular surface of diameter *40nm* and thickness *2.2nm*. Under these conditions, the thermal stability of the MTJ device is:

$$
\frac{\Delta E}{k_B T} = 72.78\tag{7}
$$
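The thermal stability factor in (7) can be checked from (4) and the free-layer geometry. The sketch below assumes that $K_u$ is expressed in J/m³ and that the device temperature is 300 K; neither is stated explicitly in the text, so both are assumptions made for this check.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def thermal_stability(k_u: float, diameter: float, thickness: float,
                      temperature: float) -> float:
    """Return Delta E / (k_B T) with Delta E = K_u * V for a cylindrical free layer."""
    volume = math.pi * (diameter / 2.0) ** 2 * thickness
    return k_u * volume / (K_B * temperature)


# K_u = 1.09e5 J/m^3 (assumed unit), d = 40 nm, t = 2.2 nm, T = 300 K (assumed).
delta = thermal_stability(1.09e5, 40e-9, 2.2e-9, 300.0)
print(f"Delta E / (k_B T) = {delta:.2f}")
```

Under these assumptions the result is about 72.75, close to the value in (7); the small difference would be consistent with a slightly different temperature or rounding of the constants.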

With this high thermal stability coefficient, the cell probability of data retention failure (eq. (5)) after a time  $t = 10$ years, is:

$$
P_{DRF}(10 \text{ years}) = 7.7 \cdot 10^{-15} \tag{8}
$$
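The figure in (8) can be reproduced numerically from the survival form $P_{DRF}(t) = 1 - \exp(-t/\tau)$ with $\tau = \tau_0 \exp(\Delta E / k_B T)$. The sketch below uses $\tau_0 = 10^{-9}$ s and the stability factor 72.78 from (7), and assumes a 365-day year. It uses `math.expm1` because the argument is so small that evaluating `1 - exp(-x)` directly would lose precision.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600  # assumes a 365-day year


def p_retention_failure(t_seconds: float, delta: float, tau_0: float = 1e-9) -> float:
    """P_DRF(t) = 1 - exp(-(t / tau_0) * exp(-delta)), evaluated stably."""
    flip_rate = math.exp(-delta) / tau_0       # 1 / tau, in 1/s
    return -math.expm1(-flip_rate * t_seconds)


p_drf_10y = p_retention_failure(10 * SECONDS_PER_YEAR, 72.78)
print(f"P_DRF(10 years) = {p_drf_10y:.2e}")
```

The result is close to $7.8 \cdot 10^{-15}$, in line with (8).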

which represents a very low probability that the data stored in the memory cell is lost after 10 years of data retention. The cell probability of read disturb (eq. (6)) after a time $t = 10$ years, assuming that the cell is under read stress 50% of the time, is:

$$
P_{RD}(10 \text{ years} \text{ @ } 50\%) = 6.2 \cdot 10^{-8} \tag{9}
$$

Even if we assume that the cell is under continuous read stress for 10 years, the failure probability still remains low, on the order of $10^{-7}$. Given these low probabilities for DRF and RD occurrences, the rest of the paper focuses on analyzing the cell reliability under write and read failures (*W0F*, *W1F*, *R0F*, *R1F*) by evaluating (1) under different conditions.
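The continuous-stress claim can be checked against (9) with a short back-of-envelope calculation. The ratio $I_R / I_{c0}$ is not given in the text, so the sketch below back-solves it from (9); the resulting value is an inferred, hypothetical quantity, not a number reported by the authors. A 365-day year and $\tau_0 = 10^{-9}$ s are assumed.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600  # assumes a 365-day year
DELTA = 72.78                        # thermal stability factor from (7)
TAU_0 = 1e-9                         # attempt time in seconds


def p_read_disturb(t_stress: float, current_ratio: float) -> float:
    """P_RD = 1 - exp(-(t / tau_0) * exp(-DELTA * (1 - I_R / I_c0)))."""
    flip_rate = math.exp(-DELTA * (1.0 - current_ratio)) / TAU_0
    return -math.expm1(-flip_rate * t_stress)


def implied_current_ratio(p_target: float, t_stress: float) -> float:
    """Invert p_read_disturb for I_R / I_c0 (illustrative back-solve only)."""
    flip_rate = -math.log1p(-p_target) / t_stress
    return 1.0 + math.log(flip_rate * TAU_0) / DELTA


# 50% duty over 10 years is 5 years of accumulated read stress.
ratio = implied_current_ratio(6.2e-8, 5 * SECONDS_PER_YEAR)
p_continuous = p_read_disturb(10 * SECONDS_PER_YEAR, ratio)
print(f"implied I_R/I_c0 = {ratio:.3f}, P_RD(continuous) = {p_continuous:.2e}")
```

Because the probability is tiny, it scales almost linearly with stress time, so doubling the stress time roughly doubles (9) and gives a value on the order of $10^{-7}$.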

#### *B. Control Voltage Influence on the STT-MRAM Operation*

The STT-MRAM operation is directly affected by the values of the control voltages. In this work we include in the control voltage class the supply voltage  $(V_{DD})$ , the word line voltage  $(V_{WL})$ , the differential voltage between the Bit and the Source Line  $(V_{BL-SL})$  and the body bias of the NMOS access transistor  $(V_{BB})$ .

When boosting the supply voltage  $(V_{DD})$ , both the word line voltage  $(V_{WL})$  and differential voltage drop  $(V_{BL-SL})$ increase at the same rate. This translates into an increased current passing through the MTJ device caused by two concomitant effects: i) larger voltage drop  $(V_{BL-SL})$  on the same resistance; ii) larger gate voltage on the same NMOS transistor. A larger current controlling the MTJ device has the advantage of improving the quality of read and write operations, however it exerts a stress on the tunnel junction, which has a detrimental effect on the cell endurance to write operations. When reducing the supply voltage, the opposite effects are expected.

If the differential voltage drop ($V_{BL-SL}$) and the word line voltage ($V_{WL}$) are controlled independently, there is still variation in the MTJ current, but to a lesser extent than in the previous scenario, since only one of the effects, i) or ii), is present at a time.

Since our design uses a four-terminal NMOS transistor, the body bias voltage (normally grounded) can be controlled independently. This voltage has a direct effect on the transistor threshold voltage, indirectly affecting its ON current. A negative $V_{BB}$ causes a decrease in threshold voltage, so a larger current is allowed to pass through the NMOS transistor, with the advantage of improving the quality of read and write operations and the disadvantage of exerting more stress on the tunnel junction. This translates into an increased reliability of the fresh cell but a lower endurance to write stress.

#### IV. CELL RELIABILITY ESTIMATION

The reliability of a circuit in general, and of the STT-MRAM bit-cell in particular, is affected by fabrication- and aging-induced variability. The statistical distributions of $R_L$, $R_H$ and $V_{TH}$ are assumed Gaussian and defined according to the statistical data available in the literature [18][21]-[23]. Based on these data, the NMOS access transistor is designed with minimum length *L=40nm* (for the 40nm technology node), width *W=270nm* and nominal threshold voltage $V_{TH} = 0.285\,\mathrm{V}$. A Gaussian distribution of $V_{TH}$ is assumed under fabrication-induced variability, with relative standard deviation $\sigma/\mu = 10\%$. The CoFeB/MgO MTJ device is designed with a circular base of diameter *d=40nm*, a free layer of thickness $t_f = 2.2\,\mathrm{nm}$ and a tunnel barrier of thickness $t_{ox} = 0.7\,\mathrm{nm}$. For this design, the electrical resistance of the MTJ device at zero-volt bias is $R_{MTJ} = R_L = 2\,k\Omega$ when the device is in parallel relative magnetization, and $R_{MTJ} = R_H = 4\,k\Omega$ when the device is in anti-parallel relative magnetization. A Gaussian distribution of $R_{MTJ}$ is assumed under fabrication-induced variability, with relative standard deviations $\sigma/\mu(R_L) = 9.3\%$ and $\sigma/\mu(R_H) = 10.4\%$.

The NMOS transistor is subject to aging effects such as hot carrier injection (HCI) and bias temperature instability (BTI), but in scaled technologies, and with the introduction of high-k metal gates, the BTI effect is the predominant one. Positive bias temperature instability (PBTI) is a degradation phenomenon which affects the $V_{TH}$ of NMOS transistors stressed with a positive gate voltage at high temperatures. The threshold voltage deviations induced by BTI effects depend on stress voltage, temperature and stress time. However, the degradation suffered during the stress periods is followed by a recovery during the relaxation periods. The short stress/long relaxation cycles experienced by the NMOS access transistor during the normal operation of an STT-MRAM cell lead to an insignificant variation of its threshold voltage for the current application.

The main degradation mechanism of the MTJ element is the breakdown phenomenon. At each write operation the tunnel barrier is exposed to an electrical stress which might cause an electrical breakdown. The typical breakdown field is on the order of $5 \cdot 10^8\,\mathrm{V/m}$ [24]. During the anti-parallel to parallel write operation, the MTJ is subjected to a larger voltage stress than in the opposite write operation. For this reason, the analysis is focused on the degradation of $R_H$. Using the percolation model to characterize the time-dependent dielectric breakdown [25], the dielectric material is modeled as a large number of parallel conducting paths. We assume the admittance of the parallel conducting paths to be $Y(0)$ for the undamaged oxide and $Y(t_{BD})$ at the end of the breakdown process. The admittance at time *t*, after the MTJ element has been stressed, can be estimated as:

$$
Y(t) = (1 - F(t)) \cdot Y(0) + F(t) \cdot Y(t_{BD}) \tag{10}
$$

where $F(t)$ is the probability that a micro-conducting path has suffered a soft breakdown by time *t*. The effect of stress on the value of the anti-parallel state resistance ($R_H$) as a function of stress time can be expressed as:

$$
R_H(t) = \frac{1}{Y(t)} = \frac{R_H(0)}{1 + F(t) \cdot \left[R_H(0) / R_H(t_{BD}) - 1\right]} \tag{11}
$$
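The step from the admittance model to (11) can be made explicit. The following sketch assumes that $Y(0) = 1/R_H(0)$ and $Y(t_{BD}) = 1/R_H(t_{BD})$, and that the stressed oxide is a mixture of undamaged and broken-down conducting paths weighted by $F(t)$:

$$
\begin{aligned}
Y(t) &= (1 - F(t)) \cdot Y(0) + F(t) \cdot Y(t_{BD}) \\
     &= Y(0) \left[ 1 + F(t) \left( \frac{Y(t_{BD})}{Y(0)} - 1 \right) \right] \\
R_H(t) = \frac{1}{Y(t)} &= \frac{R_H(0)}{1 + F(t) \cdot \left[ R_H(0) / R_H(t_{BD}) - 1 \right]}
\end{aligned}
$$

As a consistency check, $F(t) = 0$ gives $R_H(t) = R_H(0)$, and $F(t) = 1$ gives $R_H(t) = R_H(t_{BD})$.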

where  $R_H(0)$  is the resistance of the anti-parallel state for the fresh MTJ element, while  $R_H(t_{BD})$  is the resistance at the end of the breakdown process. The function *F(t)* follows a *Weibull* distribution [24][25]:

$$
F(t) = 1 - \exp\left(-(t/\lambda)^k\right) \tag{12}
$$

where *k* is the shape parameter ($k = 2.39$ in [24]) and $\lambda$ is the scale parameter ($\lambda = 3.98 \cdot 10^{10}$ in [24]).
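The aging model of (11) and (12) is simple to evaluate numerically. In the sketch below the Weibull parameters are those quoted from [24], while the end-of-breakdown resistance `R_H_BROKEN` is a hypothetical value, since the text does not give $R_H(t_{BD})$. The stress time must be expressed in the same unit as $\lambda$, which this section does not specify.

```python
import math

WEIBULL_K = 2.39          # shape parameter from [24]
WEIBULL_LAMBDA = 3.98e10  # scale parameter from [24] (unit not stated in the text)


def breakdown_fraction(t: float) -> float:
    """Weibull CDF F(t) = 1 - exp(-(t / lambda)^k), evaluated stably."""
    return -math.expm1(-((t / WEIBULL_LAMBDA) ** WEIBULL_K))


def aged_r_high(t: float, r_h_fresh: float, r_h_broken: float) -> float:
    """Anti-parallel resistance after stress time t, following (11)."""
    f = breakdown_fraction(t)
    return r_h_fresh / (1.0 + f * (r_h_fresh / r_h_broken - 1.0))


R_H_FRESH = 4e3     # nominal R_H from this section, in ohms
R_H_BROKEN = 500.0  # hypothetical R_H(t_BD), for illustration only

for t in (0.0, 1e10, 4e10, 1e11):
    print(f"t = {t:.1e}: F = {breakdown_fraction(t):.3f}, "
          f"R_H = {aged_r_high(t, R_H_FRESH, R_H_BROKEN):.0f} ohm")
```

The resistance stays near its fresh value while $t \ll \lambda$ and then drops toward $R_H(t_{BD})$ once $t$ approaches $\lambda$, which matches the abrupt reliability loss described for the aged cell.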

Building on these hypotheses, we have evaluated the failure probability of the STT-MRAM cell under study at fabrication time, i.e., the *fresh* cell ($P_{RF\&WF}(0)$), and under repeated write stress, i.e., the *aged* cell ($P_{RF\&WF}(t)$).

Extensive electrical SPICE simulations have been performed to identify the acceptance region based on the failure conditions (as explained in Section III and depicted in Fig. 2) with the required degree of accuracy. From this, using eq. (1) and (11) the cell failure probabilities are estimated. The results obtained are plotted in Fig. 3. The failure probability of the fresh cell due to incorrect read and write operations is:

$$
P_{RF\&WF}(0) = 6.397 \cdot 10^{-5} \tag{13}
$$

It should be noted that this probability is orders of magnitude larger than the cell probabilities of data retention failure and read disturb (in (8) and (9)). This confirms our initial assumption that, for the cell under test, thermal instability is a minor contributor to cell reliability degradation and can therefore be ignored without a substantial effect on our results.

To demonstrate the effect of the access transistor threshold voltage variation on the reliability of the STT-MRAM cell, the failure probability of the fresh cell due to incorrect read and write operations has been estimated assuming discrete values for $V_{TH}$. The results are shown in Fig. 3(a). As expected, the failure probability ($P_{RF\&WF}(0)$) increases with $V_{TH}$, due to the resulting decrease in driving current. If the threshold voltage is only 50mV larger than its nominal value, the failure probability rises to about $10^{-3}$, which is unacceptable for a cell in a memory array. The reliability worsens further for larger positive deviations of $V_{TH}$.



Fig. 3. STT-MRAM cell reliability estimation under nominal values of the control voltages: a) reliability of the fresh cell affected by random variations in the MTJ resistance values; b) reliability degradation in time due to repetitive write stress, estimated assuming random variability of the MTJ resistance and NMOS threshold voltage.

The reliability of the cell is obtained by statistically integrating the joint *probability density function* of the electrical parameters of the cell after different cumulative stress periods. By evaluating (1) and (11), the reliability curve is obtained and shown in Fig. 3(b). We observe that the cell reliability degradation is almost insignificant over a large number of operation cycles ($\sim 10^{16}$ under our assumptions), after which the reliability falls rapidly to zero. The cell reliability degradation in time is governed by the time-dependent degradation of the MTJ.

#### *Control Voltage Influence on the STT-MRAM Reliability*

In order to change the magnetic state of the STT-MRAM cell, there must be sufficient current ($I_{MTJ}$) flowing through the MTJ element to switch the magnetic orientation of the free layer. During a write operation, the power supply voltage ($V_{DD}$) is applied to the Word Line (*WL*) and sets the voltage drop between Bit Line (*BL*) and Source Line (*SL*).

During the read operation, a small voltage drop is applied between Bit Line (*BL*) and Source Line (*SL*).

$$
\begin{aligned}
V_{BL-SL} &= 0.3 \cdot V_{DD} && \text{for R0 and R1 operations} \\
V_{BL-SL} &= -V_{DD} && \text{for W1 operation} \\
V_{BL-SL} &= V_{DD} && \text{for W0 operation} \\
V_{WL} &= V_{DD}
\end{aligned}
$$
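The bias conditions listed above can be written as a small lookup, which also shows why supply boosting moves both $V_{WL}$ and $V_{BL-SL}$ at once. This is only a sketch: the function name and the nominal supply value `V_DD_NOMINAL` are hypothetical, since the nominal $V_{DD}$ is not stated in this section.

```python
def control_voltages(operation: str, v_dd: float) -> dict:
    """Return V_WL and V_BL-SL for an operation, following the listed bias conditions."""
    v_bl_sl = {
        "R0": 0.3 * v_dd,   # read operations use a reduced differential voltage
        "R1": 0.3 * v_dd,
        "W1": -v_dd,
        "W0": v_dd,
    }
    if operation not in v_bl_sl:
        raise ValueError(f"unknown operation: {operation!r}")
    return {"V_WL": v_dd, "V_BL_SL": v_bl_sl[operation]}


V_DD_NOMINAL = 1.0  # hypothetical nominal supply in volts

nominal_write = control_voltages("W0", V_DD_NOMINAL)
boosted_write = control_voltages("W0", 1.1 * V_DD_NOMINAL)  # 10% supply boost
print(nominal_write, boosted_write)
```

A 10% increase of $V_{DD}$ raises the gate voltage and the differential voltage together, which is the combined effect analyzed in the supply voltage study that follows.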

We estimate the reliability of the STT-MRAM cell under test assuming different values for the control voltages. A first analysis is performed by estimating the cell reliability under supply voltage ($V_{DD}$) variation. This translates into a variation of the MTJ current caused by two joint effects: i) a different voltage drop ($V_{BL-SL}$) on the same resistance; ii) a different gate voltage on the same NMOS transistor. The failure probability of the fresh cell due to incorrect read and write operations has been estimated assuming discrete values for $V_{TH}$ (results shown in Fig. 4(a)). The cell failure probability decreases with increasing supply voltage and is widely spread across different threshold voltages. The effect of threshold voltage variation is more pronounced at lower supply voltages. The same decrease in cell reliability with supply voltage scaling has been observed when the joint effects of resistance and $V_{TH}$ variations are considered. In Fig. 4(b), the cell reliability curves obtained after different cumulative stress periods are shown. For the fresh cell we observe a monotonic decrease of the failure probability when the supply voltage increases. This was to be expected, since a larger $V_{DD}$ means a larger MTJ current, hence better read and write capabilities. However, close to the breakdown point (at and beyond the endurance limit, $t \ge 10^{16}$ write cycles), the failure probability is no longer monotonic with the supply voltage. We observe an increase in reliability up to a certain point, after which the reliability decreases. This reliability decrease is mainly due to the added stress exerted on the tunnel junction, which has a detrimental effect on the cell endurance to write operations.



Fig. 4. STT-MRAM cell reliability estimation under supply voltage  $(V_{DD})$ variation: a) reliability of the fresh cell affected by random variations in the MTJ resistance values; b) reliability degradation in time due to repetitive write stress, estimated assuming random variability of the MTJ resistance and NMOS threshold voltage.

The same analyses have been performed by varying each of the control voltages in turn. When the effect of the Word Line (*WL*) voltage is analyzed, its value is varied by 25% from nominal in each direction, while the voltage drop between Bit Line (*BL*) and Source Line (*SL*) is kept at its nominal value. When the effect of the voltage drop between *BL* and *SL* is analyzed, its value is varied by 25% from nominal in each direction, while the *WL* voltage is kept at its nominal value. To analyze the effect of the NMOS body bias, its voltage is varied by 25% from nominal in each direction, while the *WL* voltage and the voltage drop between *BL* and *SL* are kept at their nominal values. The results are given in Fig. 5 for the fresh cell (a) and for the cell subjected to a cumulative write stress of $10^{16}$ cycles (b). We observe that the voltage drop between *BL* and *SL* and the body bias have less effect on the cell reliability than the Word Line and supply voltages. For instance, a 10% increase of the control voltage reduces the failure probability by six orders of magnitude when $V_{DD}$ or $V_{WL}$ is used as the knob, by two orders of magnitude when $V_{BB}$ is used, and by only one order of magnitude when $V_{BL-SL}$ is used. From these data we conclude that, among the techniques we have analyzed, supply voltage boosting is the most efficient for improving reliability. A close second is word line boosting, which shows almost the same efficiency as the $V_{DD}$ boost.
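The knob comparison above amounts to simple arithmetic on orders of magnitude, sketched below. The reduction figures are the ones quoted in the text for a 10% boost; the names `REDUCTION_DECADES`, `boosted_failure_probability` and `rank_knobs`, and the nominal probability used in the example, are hypothetical.

```python
# Orders of magnitude by which a 10% boost reduces the failure probability,
# as quoted in the text for each control voltage (knob).
REDUCTION_DECADES = {"V_DD": 6, "V_WL": 6, "V_BB": 2, "V_BL-SL": 1}


def boosted_failure_probability(p_nominal: float, knob: str) -> float:
    """Scale a nominal failure probability by the quoted reduction for `knob`."""
    return p_nominal * 10.0 ** (-REDUCTION_DECADES[knob])


def rank_knobs() -> list:
    """Return knob names ordered from largest to smallest quoted reduction."""
    # sorted() is stable, so knobs with equal reductions keep their listed order.
    return sorted(REDUCTION_DECADES, key=lambda k: -REDUCTION_DECADES[k])


# Example with an assumed (placeholder) nominal failure probability.
p_nominal = 1e-3
for knob in rank_knobs():
    print(knob, boosted_failure_probability(p_nominal, knob))
```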



Fig. 5. STT-MRAM cell reliability estimation under control voltage variation: a) reliability of the fresh cell; b) reliability degradation after $t=10^{16}$ write cycles.

#### V. CELL LEVEL RELIABILITY/POWER TRADE-OFF EVALUATION

So far, we have noted that the most efficient variability mitigation techniques are Supply Voltage ($V_{DD}$) and Word Line ($V_{WL}$) boosting. The questions left to answer are: i) what is the maximum reliability increase we can expect from these techniques, and ii) what is the price we have to pay for this increase?

To answer these questions we take a new look at our data. At first glance, the answer to the first question is straightforward; the maximum achievable reliability increase is:

$$
\frac{P_{RF\&WF}(@\,knob)}{P_{RF\&WF}} = 3.1 \cdot 10^{-6} \tag{14}
$$

for a 10% increase in supply voltage (*knob*). Any further increase of the supply voltage degrades reliability because of cumulative stress. Another point becomes relevant to this analysis: read disturb failures. For nominal control voltage values we showed that their contribution to the overall cell failure probability can be ignored, but this is no longer the case under control voltage boosting. The reason is that the read disturb failure (*RD*) probability in (6) increases with the read current, which rises under control voltage boost. Fig. 6 shows that for a high control voltage boost, read disturb failure (*RD*) becomes the predominant cause of cell failure. Therefore, the control voltage boost should remain under 8% of the nominal value, so that we do not trade one type of failure for another. In this case, the maximum achievable reliability increase (employing the two variability mitigation techniques) is:

$$
\frac{(P_{RF\&WF} + P_{RD})(@\Delta V_{DD}/V_{DD} = 8\%)}{P_{RF\&WF}} = 1.2 \cdot 10^{-6} \tag{15}
$$

$$
\frac{(P_{RF\&WF} + P_{RD})(@\Delta V_{WL}/V_{WL} = 8\%)}{P_{RF\&WF}} = 1.03 \cdot 10^{-6} \tag{16}
$$

The results are comparable, so from the point of view of reliability improvement the two techniques are equally efficient and either one can be used. The analysis would change substantially for a less thermally stable cell, but this is outside the scope of the present work.
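To make "comparable" concrete, the ratios in (15) and (16) can be expressed in orders of magnitude (a short worked calculation added for illustration):

$$
\log_{10}(1.2 \cdot 10^{-6}) \approx -5.92, \qquad \log_{10}(1.03 \cdot 10^{-6}) \approx -5.99, \qquad \frac{1.2 \cdot 10^{-6}}{1.03 \cdot 10^{-6}} \approx 1.17
$$

Both techniques thus reduce the failure probability by nearly six orders of magnitude, and they differ from each other by less than 0.07 of a decade.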

The remaining question is the price we have to pay for this reliability improvement. Traditionally, cost is measured in terms of area, speed and power. The physical implementations of the two techniques are similar, hence the area overhead should be the same in both cases. Both techniques improve the operation speed of the memory, owing to the increased current flowing through the device, hence no price is paid in terms of speed. However, the power requirements of the two techniques differ appreciably. The power components considered in this analysis are the power dissipated in the MTJ device and the power required to charge the control lines when the cell is integrated in a high-capacity memory array:

$$
P_{tot} = \sum I_{MTJ}^2 \cdot R_{MTJ} + \sum \frac{1}{2} \cdot V^2 \cdot C_{equiv} \tag{17}
$$

where $I_{MTJ}$ is the current passing through the MTJ device and $R_{MTJ}$ its resistance, $V$ is the control voltage and $C_{equiv}$ is the equivalent capacitance of the line to be charged. The first sum is taken over all cells in the array and the second over all control lines.
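Equation (17) can be evaluated directly once per-cell and per-line quantities are available. The sketch below follows the two sums term by term; the function name `total_power`, its argument names and the numbers in the usage example are hypothetical placeholders. The second term is kept exactly as written in (17); expressing the charging energy $\frac{1}{2} V^2 C_{equiv}$ as an average power would require an access rate, which the text does not state.

```python
def total_power(i_mtj, r_mtj, v_lines, c_lines):
    """Evaluate (17): sum of I^2 * R over cells plus sum of 0.5 * V^2 * C over lines.

    i_mtj, r_mtj: per-cell MTJ currents (A) and resistances (ohm).
    v_lines, c_lines: per-line control voltages (V) and equivalent capacitances (F).
    """
    if len(i_mtj) != len(r_mtj):
        raise ValueError("i_mtj and r_mtj must have one entry per cell")
    if len(v_lines) != len(c_lines):
        raise ValueError("v_lines and c_lines must have one entry per line")
    resistive = sum(i * i * r for i, r in zip(i_mtj, r_mtj))
    capacitive = sum(0.5 * v * v * c for v, c in zip(v_lines, c_lines))
    return resistive + capacitive


# Usage with placeholder values for a 16 x 16 array (all numbers are assumptions).
cells = 16 * 16
lines = 16 + 16
p_tot = total_power(
    i_mtj=[50e-6] * cells,   # assumed MTJ current per cell
    r_mtj=[5e3] * cells,     # assumed MTJ resistance per cell
    v_lines=[1.0] * lines,   # assumed control voltage per line
    c_lines=[10e-15] * lines,  # assumed equivalent capacitance per line
)
print(p_tot)
```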

The power requirements of supply voltage boosting are larger than those of word line voltage boosting (Fig. 6), since both terms in the power equation (17) are larger. Boosting $V_{DD}$ produces a larger MTJ current, which translates into higher resistive power dissipation than when only $V_{WL}$ is boosted: in addition to the increase in MTJ current given by $V_{WL}$ boosting, a further current increase comes from the boosting of $V_{BL-SL}$. The second term in (17) is also larger for $V_{DD}$ boosting than for $V_{WL}$ boosting, since in the latter solution only the capacitance of the word line is charged at the boosted voltage, while in the former the capacitances of all lines have to be charged at the boosted voltage.
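As a worked illustration of the second term in (17), the contribution of a single line scales with the square of its voltage. Assuming $C_{equiv}$ does not change with the boost (an assumption made here, not stated in the text), a relative boost $\delta$ gives:

$$
\frac{\frac{1}{2} \cdot (1+\delta)^2 V^2 \cdot C_{equiv}}{\frac{1}{2} \cdot V^2 \cdot C_{equiv}} = (1+\delta)^2 = 1 + 2\delta + \delta^2, \qquad \delta = 0.08 \;\Rightarrow\; (1.08)^2 = 1.1664
$$

Under this assumption, each line charged at the boosted voltage costs about 16.6% more. With $V_{WL}$ boosting this applies to the word line only, whereas with $V_{DD}$ boosting it applies to every line.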

In conclusion, Word Line boosting is the more power-efficient of the two techniques. For the *knob* value found to be the most efficient in terms of reliability improvement, i.e., an 8% voltage boost, the power consumption increases by 14% over its nominal value when $V_{WL}$ is boosted and by 19% when $V_{DD}$ is boosted. These numbers are estimated for a memory array of 256 bits arranged in 16 words and 16 columns. The difference will be more pronounced in larger memory arrays.
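The selection logic behind this conclusion can be sketched as follows: among knobs whose reliability gains are comparable, choose the one with the lowest power overhead. The failure ratios and power overheads are the figures reported in (15), (16) and the paragraph above; the names `KnobResult`, `REPORTED` and `pick_knob`, and the half-decade tolerance used to define "comparable", are assumptions made for this illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class KnobResult:
    name: str
    failure_ratio: float   # boosted / nominal failure probability at 8% boost
    power_overhead: float  # fractional power increase at 8% boost


# Figures reported in the text for the 256-bit array.
REPORTED = [
    KnobResult("V_DD", 1.2e-6, 0.19),
    KnobResult("V_WL", 1.03e-6, 0.14),
]


def pick_knob(results, tolerance_decades=0.5):
    """Pick the lowest-power knob among those with comparable reliability.

    Two knobs count as comparable when their failure ratios differ by at most
    `tolerance_decades` orders of magnitude (a hypothetical threshold).
    """
    best = min(math.log10(r.failure_ratio) for r in results)
    comparable = [
        r for r in results
        if math.log10(r.failure_ratio) - best <= tolerance_decades
    ]
    return min(comparable, key=lambda r: r.power_overhead)


print(pick_knob(REPORTED).name)  # prints V_WL
```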



Fig. 6. STT-MRAM cell reliability/power trade-off analysis under $V_{DD}$ and $V_{WL}$ variation.

#### VI. CONCLUSIONS

In this paper we provided a methodology for predicting the reliability of an STT-MRAM based memory. The reliability estimation is performed at cell level, accounting for fabrication-induced variability and aging phenomena simultaneously affecting the NMOS and MTJ devices. Several circuit techniques based on control voltage tuning have been tested to identify the best technique for variability and aging mitigation, and their power requirements have been evaluated to identify the more power-efficient one. We have found that word line boosting is the most efficient reliability improvement technique, with reasonably low power requirements.

#### VII. REFERENCES

- [1] M. Hosomi, et al., "A novel nonvolatile memory with spin torque transfer magnetization switching: spin-RAM," in IEEE International Electron Devices Meeting IEDM Technical Digest 2005, pp. 459–462.
- [2] J. Li, P. Ndai, A. Goel, S. Salahuddin, and K. Roy, "Design paradigm for robust spin-torque transfer magnetic RAM (STT-MRAM) from circuit/architecture perspective," *Very Large Scale Integration (VLSI) Systems, IEEE Transactions on*, vol. 18, no. 12, pp. 1710–1723, 2010.

[View publication stats](https://www.researchgate.net/publication/283086586)

- [3] Y. Chen, X. Wang, H. Li, H. Xi, Y. Yan, and W. Zhu, "Design margin exploration of spin-transfer torque RAM (STT-RAM) in scaled technologies," *Very Large Scale Integration (VLSI) Systems, IEEE Transactions on*, vol. 18, no. 12, pp. 1724–1734, 2010.
- [4] W. Zhao and Y. Cao, "New generation of predictive technology model for sub-45 nm early design exploration," *Electron Devices, IEEE Transactions on*, vol. 53, no. 11, pp. 2816–2823, 2006.
- [5] K. Munira, W. Soffa, and A. Ghosh, "Comparative material issues for fast reliable switching in stt-rams," in *11th IEEE Conference on Nanotechnology (IEEE-NANO)*, pp. 1403–1408, 2011.
- [6] A. Nigam, C. Smullen, V. Mohan, E. Chen, S. Gurumurthi, and M. Stan, "Delivering on the promise of universal memory for spin-transfer torque RAM (STT-RAM)," in *International Symposium on Low Power Electronics and Design (ISLPED)*, pp. 121–126, 2011.
- [7] W. Zhao, et al., "Failure and reliability analysis of STT-MRAM," *Microelectronics Reliability*, no. 52, pp. 1848-1852, 2012.
- [8] E.I. Vatajelu, R. Rodriguez-Montanes, M. Indaco, M. Renovell, P. Prinetto, J. Figueras, "Read/write robustness estimation metrics for spin transfer torque (STT) MRAM cell," to appear in *Design, Automation & Test in Europe Conference & Exhibition (DATE)*, 2015.
- [9] F. Xuanyao, K. Roy, "Robust low-power multi-terminal STT-MRAM," *Non-Volatile Memory Technology Symposium (NVMTS)*, pp.1-4, 2013.
- [10] F. Xuanyao, K. Roy, "Low-power robust complementary polarizer STT-MRAM (CPSTT) for on-chip caches," *IEEE International Memory Workshop (IMW)*, pp.88-91, 2013.
- [11] S.H. Choday, S.K. Gupta, K. Roy, "Write-Optimized STT-MRAM Bit-Cells Using Asymmetrically Doped Transistors," *Electron Device Letters, IEEE* , vol.35, no.11, pp.1100-1102, Nov. 2014.
- [12] F. Xuanyao, K. Yusung, S.H. Choday, K Roy, "Failure mitigation techniques for 1T-1MTJ Spin-Transfer Torque MRAM bit-cells," *Very Large Scale Integration (VLSI) Systems, IEEE Transactions on* , vol.22, no.2, pp.384,395, Feb. 2014.
- [13] J. Slonczewski, "Current-driven excitation of magnetic multilayers," *Journal of Magnetism and Magnetic Materials*, vol. 159, no. 12, pp. L1– L7, 1996
- [14] A.V. Khvalkovskiy, et al., "Basic principles of STT-MRAM cell operation in memory arrays," *Journal of Physics D: Applied Physics*, no. 46, 2013.
- [15] C.W. Smullen et al., "Relaxing non-volatility for fast and energyefficient STT-RAM caches," *IEEE International Symposium on High Performance Computer Architecture (HPCA)*, pp. 50-61, 2011.
- [16] M. Durlam, et al., "A 1-Mbit MRAM based on 1T1MTJ bit cell integrated with copper interconnects," *IEEE Journal of Solid-State Circuits*, vol.38, no.5, pp.769-773, 2003.
- [17] E.I. Vatajelu, J. Figueras, "Robustness analysis of 6T SRAMs in memory retention mode under PVT variations," *Design, Automation & Test in Europe Conference & Exhibition (DATE)*, pp.1-6, 2011
- [18] Y. Zhang, W. Zhao, D. Ravelosona, J.-O. Klein, J.-V. Kim, C. Chappert, "A compact model of perpendicular-anisotropy CoFeB/MgO magnetic tunnel junction," *in IEEE Transaction on Electron Device,* vol.59, pp.819-826, 2012.
- [19] W.F. Brown, "Thermal fluctuations of a single-domain particle," *Physical Review*, vol. 130, no. 5, pp. 1677-1686, 1963.
- [20] M.P. Sharrock, "Time dependence of switching fields in magnetic recording media," *Journal of Applied Physics*, vol. 76, no. 10, 1994.
- [21] R. Ubal, J. Sahuquillo, S. Petit, H. Hassan, P. Lopez, "Leakage Current Reduction in Data Caches on Embedded Systems," *Intelligent Pervasive Computing Conference (IPC)*, pp.45-50, 2007.
- [22] L. Jing, L. Haixin, S. Salahuddin, K. Roy, "Variation-tolerant Spin-Torque Transfer (STT) MRAM array for yield enhancement," *IEEE Custom Integrated Circuits Conference (CICC)*, pp.193-196, 2008.
- [23] Sato H, Yamanouchi M., Miura K., Ikeda S., Koizumi R., Matasukura F., and Ohno H., "Junction size effect on switching current and thermal stability in CoFeB/MgO perpendicular magnetic tunnel junctions*," Appl. Phys. Lett*., 99, 042501, 2011.
- [24] S. Amara-Dababi, et al., "Charge trapping-detrapping mechanism of barrier breakdown in MgO magnetic tunnel junctions*," Appl. Phys. Lett*., 99, 083501, 2011.
- [25] G. Panagopoulos, C. Augustine, K. Roy, "Modeling of dielectric breakdown-induced time-dependent STT-MRAM performance degradation,*" Device Research Conference (DRC)*, pp.125-126, 2011.