

## Non-Volatile Latch Designs with Node-Upset Tolerance and Recovery using Magnetic Tunnel Junctions and CMOS

Aibin Yan, Litao Wang, Jie Cui, Zhengfeng Huang, Tianming Ni, Patrick Girard, Xiaoqing Wen

### ▶ To cite this version:

Aibin Yan, Litao Wang, Jie Cui, Zhengfeng Huang, Tianming Ni, et al. Non-Volatile Latch Designs with Node-Upset Tolerance and Recovery using Magnetic Tunnel Junctions and CMOS. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2024, 32 (1), pp.116-127. 10.1109/TVLSI.2023.3323562. lirmm-04239391

## HAL Id: lirmm-04239391 https://hal-lirmm.ccsd.cnrs.fr/lirmm-04239391v1

Submitted on 12 Oct 2023

**HAL** is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

# Non-Volatile Latch Designs with Node-Upset Tolerance and Recovery using Magnetic Tunnel Junctions and CMOS

Aibin Yan, Litao Wang, Jie Cui, Zhengfeng Huang, Tianming Ni, Patrick Girard, Fellow, IEEE, and Xiaoqing Wen, Fellow, IEEE

Abstract—As semiconductor technologies scale down, radiative-particle-induced soft errors and static power consumption are becoming major concerns for digital circuits. Magnetic-tunnel-junctions (MTJs) are widely used to address these concerns. MTJs are non-volatile and compatible with traditional CMOS processes. In this paper, we first propose a double-node-upset (DNU) tolerant and non-volatile latch, i.e., M-TPDICE-V2, providing high reliability. Additionally, we further propose an advanced latch, namely M-8C, that is able to completely recover from single-node-upsets (SNUs) and DNUs. M-8C uses a DNU recovery module and a backup and restore module based on a pair of MTJs. Furthermore, we propose a universal backup and restore module suitable for any latch providing non-volatility. We simulate the proposed latches using the Synopsys HSPICE tool with a 45nm CMOS process model. Simulation results confirm the superior capabilities of our proposed M-TPDICE-V2 and M-8C latches. M-TPDICE-V2 exhibits strong SNU and DNU tolerance and non-volatility, while the M-8C latch provides complete DNU recovery capabilities.

*Index Terms*—MTJ, fault tolerance, radiation hardening, double-node-upset.

#### I. INTRODUCTION

THE continuous scaling of CMOS technology has led to improved integration as well as high performance for circuits and systems. However, as the size of transistors continues to shrink, CMOS devices are becoming increasingly susceptible to radiation-induced soft errors, which can result in data corruption and, in the worst case, system crashes. A radiative particle can cause a single-node-upset (SNU) when it strikes an OFF-state transistor in an integrated circuit, flipping the value of a node. In addition, under the mechanism of charge sharing, a double-node-upset (DNU) can occur when a high-energy particle simultaneously impacts two OFF-state transistors [1].

In recent years, non-volatile (NV) storage cells in spintronic technologies, such as spin orbit torque (SOT) and spin-transfer torque (STT), have emerged as promising alternatives. NV magnetic storage cells have several advanced features, including high density, high endurance, soft error immunity, low access latency, and scalability [2-3]. It is well known that MTJs are crucial for the radiation hardening as well as the non-volatility of spintronic circuits [4]. Figure 1 shows an MTJ device and its states. An MTJ consists of three layers. The top layer is the free layer (FL), which is mainly made of CoFeB [5]. The middle layer is an ultrathin MgO dielectric layer, referred to as the tunnel barrier (TB) [5]. The bottom ferromagnetic layer is called the pinned layer (PL). The magnetization in the FL is either parallel (i.e., P state) or anti-parallel (i.e., AP state) to that of the PL, with the fixed magnetization in the PL serving as a reference. The resistance of an MTJ in the P state is lower than that in the AP state.

Aibin Yan is with School of Computer Science and Technology, Anhui University, and also with School of Microelectronics, Hefei University of Technology, Hefei 230601, China. (E-mail: abyan@mail.ustc.edu.cn)



Fig. 1. Magnetic-tunnel-junction (MTJ) device structure and its states. (a) MTJ device structure. (b) P state. (c) AP state.

Note that, in our previous work [24], a non-volatile magnetic latch, namely M-TPDICE, has been proposed. In M-TPDICE, the backup channel was controlled by the clock (CLK) signal, necessitating backup at every clock cycle, which is impractical. Moreover, we have proposed a backup and restore module in our previous work [24]; however, it requires the latch to be adjusted so as to achieve non-volatility.

In this paper, we propose two non-volatile magnetic latches, namely M-TPDICE-V2 and M-8C, based on the unique features of MTJs as discussed above. M-TPDICE-V2 is designed to tolerate both SNUs and DNUs, while M-8C provides complete recovery from both SNUs and DNUs, ensuring high reliability and non-volatility. In the proposed designs, the transmission gate control is driven by new signals, enabling precise control of backup timing and frequency and eliminating the need for backup at every transparency period. Furthermore, we propose a universal backup and restore module suitable for any latch providing non-volatility. The non-volatility is provided by leveraging MTJs, which enable zero static-power consumption and ensure no data loss in the power-off state. These latches offer promising solutions for non-volatile latch designs in applications where reliability and non-volatility are critical.

The previous version of this paper was published by the 31st IEEE Asian Test Symposium (ATS 2022) [24]. This work was supported by the National Natural Science Foundation of China under Grants 61974001, 62274052 and 62174001, the Open Project of the State Key Laboratory of Computing Institute of Chinese Academy of Sciences under Grant CARCHA202101, the NSFC-JSPS Exchange Program under Grant 62111540164, the Outstanding Young Talent Support Program Key Project of Anhui Provincial Universities under Grant gxyqZD202005, and the Distinguished Young Scholar Fund of Anhui Province under Grant 2022AH020014. *Contact author: Tianming Ni.*

Litao Wang and Jie Cui are with School of Computer Science and Technology, Anhui University, Hefei 230601, China. (E-mail: augustuswlt@qq.com, cuijie@mail.ustc.edu.cn).

Zhengfeng Huang is with School of Microelectronics, Hefei University of Technology, Hefei 230009, China. (E-mail: huangzhengfeng@139.com).

Tianming Ni is with School of Integrated Circuits, Anhui Polytechnic University, Wuhu 241000, China. (E-mail: timyni126@126.com).

Patrick Girard is with LIRMM, University of Montpellier / CNRS, Montpellier 34095, France (E-mail: girard@lirmm.fr)

Xiaoqing Wen is with Graduate School of Computer Science and Systems Engineering, Kyushu Institute of Technology, Fukuoka 820-8502, Japan (E-mail: wen@cse.kyutech.ac.jp).

The organization of this article is as follows. To provide a comprehensive understanding of our proposed latches, Section II describes some background information on spintronics, C-elements (CEs), as well as some previous works. Section III presents a novel DNU-tolerant NV latch, i.e., M-TPDICE-V2. Next, in Section IV, we describe a novel DNU recovery NV magnetic latch (namely M-8C) and explain its design and operational principles. Section V shows comparison results to verify the advantages of the proposed latches. Finally, we summarize our findings and contributions in Section VI.

#### II. BACKGROUNDS

#### A. Spintronic Preliminaries

MTJs are critical components in spintronic devices, and several approaches exist for writing data to them, such as thermally assisted switching (TAS), voltage-controlled magnetic anisotropy (VCMA), spin transfer torque (STT), field-induced magnetization switching (FIMS), and spin-Hall-assisted STT (SHASTT) [4, 6-8]. TAS and FIMS suffer from high power dissipation and instability [4, 6]. SHASTT requires an extra current flow, which increases routing complexity [8]. VCMA requires a high voltage, which shortens the lifetime of MTJs [9]. STT is so far the preferred technique because it needs a lower current and causes less data disturbance than the other approaches [10].

The resistance of an MTJ depends on the thickness of the tunnel barrier (TB) and on the relative direction of magnetization in the free layer (FL) and the pinned layer (PL). The resistance is low if the MTJ is in the parallel (P) state, as shown in Fig. 1(b), and high if it is in the anti-parallel (AP) state, as shown in Fig. 1(c). This phenomenon is called the tunneling magneto-resistance (TMR) effect [11]. The TMR ratio, which indicates the strength of this effect, is defined as $\text{TMR} = (R_{AP} - R_P) / R_P$, where $R_P$ and $R_{AP}$ are the resistance values of the P and AP states, respectively.

An MTJ device uses spin transfer torque (STT) to write values. A spin-polarized current is passed through the MTJ to switch its state between AP and P. If a current exceeding the critical switching current (CSC) passes through the MTJ, the magnetization in the FL switches to the state determined by the direction of the current. Note that the CSC is an important electrical parameter, defined as the current required to change the state of an MTJ device within a given period of time [12].



Fig. 2. Structures of C-elements. (a) 2-input. (b) Clock-gating based 3-input.

#### B. C-element Devices

CEs are widely used in circuit design to improve reliability. A CE works as an inverter when its input values are the same. However, when its input values become different, its output temporarily retains its previous value due to the intrinsic capacitances. To implement the error tolerance and recovery capabilities of the proposed latches, we use the CEs shown in Fig. 2, i.e., a 2-input CE and a clock-controlled 3-input CE. A CE features the following four important properties.

(1) **Recoverability:** If a CE has all correct inputs, it will output the input-reverted correct value no matter whether its output has an error or not.

(2) **Valid-Retention:** If a CE has an erroneous input but its output does not have an error, its output value will not be changed due to the input error, i.e., the input error can be masked.

(3) **Invalid-Retention:** If one input as well as the output simultaneously have errors, the output will not provide the correct value. For this case, the output will be recovered only if the erroneous input is recovered.

(4) **Corruption:** If all inputs of a CE have errors, its output will have a flipped value. For this case, the output will be recovered only if all inputs are recovered.
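The four properties above can be illustrated with a minimal behavioral sketch in Python. It treats a CE as a purely logical element that inverts its inputs when they agree and otherwise holds its previous output; the function name `c_element` and the chosen example values are illustrative assumptions, and analog effects such as charge leakage on the floating output are not modeled.

```python
def c_element(inputs, prev_out):
    """Behavioral model of an inverting C-element (assumed abstraction).

    If all inputs agree, the output is their complement; otherwise the
    output node floats and keeps its previous value.
    """
    first = inputs[0]
    if all(x == first for x in inputs):
        return 1 - first
    return prev_out


# Reference (error-free) situation: both inputs are 0, so the output is 1.
GOOD_IN, GOOD_OUT = (0, 0), 1

# (1) Recoverability: correct inputs repair an erroneous output.
recoverability = c_element(GOOD_IN, prev_out=0)
# (2) Valid-retention: one erroneous input is masked by a correct output.
valid_retention = c_element((1, 0), prev_out=GOOD_OUT)
# (3) Invalid-retention: one erroneous input plus an erroneous output stays wrong...
invalid_retention = c_element((1, 0), prev_out=0)
# ...until the erroneous input is recovered.
after_input_fix = c_element(GOOD_IN, prev_out=invalid_retention)
# (4) Corruption: all inputs erroneous flips the output.
corruption = c_element((1, 1), prev_out=GOOD_OUT)

print(recoverability, valid_retention, invalid_retention, after_input_fix, corruption)
```

The same function also covers the 3-input case, since it only checks whether all supplied inputs agree.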

#### C. Previous Works

Non-volatile (NV) latches are widely used in modern electronics due to their capability to retain stored data even after a loss of power. However, SNUs and DNUs can cause data loss or corruption in NV latches. To mitigate these issues, several NV latch designs have been proposed in the literature.

Figure 3 shows several existing NV latch designs. Figure 3(a) illustrates the latch based on magnetic random-access memory (MRAM) proposed in [13], which employs four modified CEs to provide SNU tolerance. Each modified CE consists of six transistors and is similar to the CEs shown in Fig. 2 [14]. The MRAM latch stores dual copies of the retained values to implement the NV feature with improved robustness.

Figure 3(b) presents the latch proposed in [15], which consists of two parallel CEs, signal-controlled transistors, as well as two complementary magnetic tunnel junctions (MTJs). The MTJs provide the NV feature and the CEs provide improved robustness. However, the design suffers from a large delay due to the direct write operation to the MTJs in its backup operation. To address this issue, the latch proposed in [16] eliminates peripheral circuitry and extra control signals. Nevertheless, it cannot tolerate DNU.



Figure 3(c) shows the design proposed in [16], which reduces the number of CMOS transistors by not using inverters as in [15]. However, this design consumes high power, has a large D-Q delay, and cannot tolerate DNUs. The design proposed in [18], shown in Fig. 3(d), also reduces the number of CMOS transistors by not using inverters, but it still has high power consumption and a large D-Q delay. Note that this latch cannot tolerate DNUs either.

Figure 3(e) depicts the design proposed in [19], which uses seven 3-input CEs to provide DNU tolerance. However, this design uses extra transistors and does not provide a backup function. It is worth noting that values cannot be effectively written into the MTJs in the designs of [13, 19].

#### III. PROPOSED M-TPDICE-V2 LATCH

Figure 4 presents the structure of the proposed latch, namely M-TPDICE-V2, which comprises transmission gates (TGs), a modified TPDICE based on the original version [22], two MTJs, and a clock-gated (CG) 3-input CE (its structure is shown in Fig. 2(b)). N<sub>2</sub>, N<sub>4</sub> and N<sub>6</sub> feed the inputs of the 3-input CE, which outputs the value of the proposed latch. In the latch, Q is the output, D is the input, CLK is the system clock and CLKB is the inverted system clock. Note that signals PRE and RES are used for restore operations.

The M-TPDICE-V2 latch provides normal operations in transparent/hold mode and SNU/DNU tolerance in hold mode, as described below. It should be noted that the transistors controlled by the RES and $\overline{\text{RES}}$ signals are ON so as to construct the feedback loops that provide stability.

#### A. Normal Operations

When CLK is high (CLKB is low), the M-TPDICE-V2 latch operates in transparent mode, turning on all transistors in all TGs, so that  $N_1$ ,  $N_3$ ,  $N_5$ , as well as Q can be pre-charged by D through these TGs. Meanwhile, the resistance of both MTJ1 and MTJ2 can be preset by  $N_1$  through  $N_6$ . Thus, the latch operates correctly in this mode.

When CLK is low (CLKB is high), the M-TPDICE-V2 latch operates in hold mode. All transistors in TGs are OFF, and Q can only be driven by N<sub>2</sub>, N<sub>4</sub>, and N<sub>6</sub>. Note that many feedback loops in the latch can be efficiently formed so as to retain values. Thus, the proposed M-TPDICE-V2 latch can retain/output values correctly.



Fig. 4. Schematic of the proposed radiation-hardened NV latch, i.e., M-TPDICE-V2.

#### B. Fault-Tolerance Principle

In the proposed M-TPDICE-V2 latch, nodes Q as well as N<sub>1</sub> to N<sub>6</sub> are sensitive to SNUs. To illustrate fault tolerance, the node status presented in Fig. 4 is chosen as a representative scenario. First, the SNU-tolerance principle is described. If any single node in TPDICE is impacted by an SNU, the node can recover. For instance, if N<sub>1</sub> is impacted by an SNU, N<sub>1</sub> will temporarily flip to "1". At this moment, the SNU cannot pass to N<sub>6</sub> since the PMOS above N<sub>6</sub> turns off. At the same time, N<sub>1</sub> = 1 will temporarily turn on the NMOS below N<sub>2</sub>, resulting in N<sub>2</sub> outputting a weak "0". Note that N<sub>3</sub> to N<sub>6</sub> cannot be directly impacted by the upset of N<sub>1</sub>, enabling them to maintain their original correct states. Since N<sub>3</sub> remains correct, the PMOS above N<sub>2</sub> remains ON and thus N<sub>2</sub> is driven with a strong correct value "1". N<sub>2</sub>'s strong "1" neutralizes the weak "0", and thus N<sub>2</sub>'s value is still "1". Since N<sub>2</sub> and N<sub>6</sub> are both correct, N<sub>1</sub> can return to its original correct state: N<sub>2</sub> = 1 and thus the PMOS above N<sub>1</sub> is turned off; N<sub>6</sub> = 1 and thus the NMOS below N<sub>1</sub> is turned on. Consequently, N<sub>1</sub> can recover when suffering from an SNU. Similarly, N<sub>2</sub> to N<sub>6</sub> can also recover when any of them suffers from an SNU.

Note that in the case where Q suffers from an SNU, all nodes inside TPDICE still have correct values, meaning that the CE still has correct input values, and thus Q can recover to its original correct value. Hence, every single node of the proposed latch can recover from an SNU, indicating that the latch is entirely SNU-hardened.

Next, we discuss the DNU tolerance for the proposed latch. Because of the symmetrical latch structure, there are only three cases (i.e., Cases 1 to 3) that need to be considered.

In Case 1, the DNU affects the output Q as well as a single node in TPDICE. Clearly, the key node pairs include $\langle N_1, Q \rangle$ and $\langle N_2, Q \rangle$. If $\langle N_1, Q \rangle$ suffers from a DNU, $N_1$ is recovered first because TPDICE provides SNU recovery for each of its nodes. Q can then recover to its original correct state because all inputs of the CG-based CE still have correct states. This means that $\langle N_1, Q \rangle$ can recover from the DNU. In the same manner, $\langle N_2, Q \rangle$ can also recover from a DNU. Hence, the latch provides complete DNU hardening for Case 1.

In Case 2, we consider that Q remains unaffected by a DNU, while only a single input of the CG-based CE is impacted, and the other affected node is a node inside TPDICE. Node pairs $\langle N_1, N_2 \rangle$ and $\langle N_2, N_5 \rangle$ are chosen as representatives for this analysis. In the case where $\langle N_1, N_2 \rangle$ suffers from a DNU, node N<sub>1</sub> upsets to "1" and node N<sub>2</sub> upsets to "0". As a result, the PMOS above N<sub>1</sub> and the NMOS below N<sub>2</sub> are temporarily turned ON, allowing N<sub>1</sub> to have a weak "1" and N<sub>2</sub> to have a weak "0". However, the DNU cannot pass to N<sub>3</sub> and N<sub>6</sub> as the PMOS above N<sub>6</sub> and the NMOS below N<sub>3</sub> are OFF. Therefore, N<sub>3</sub> to N<sub>6</sub> cannot be directly impacted and still maintain their original correct states. The NMOS below N<sub>1</sub> and the PMOS above N<sub>2</sub> are both ON, allowing N<sub>1</sub> to have a strong "0" and N<sub>2</sub> to have a strong "1". N<sub>1</sub>'s strong "0" neutralizes its weak "1", and N<sub>2</sub>'s strong "1" neutralizes its weak "0". Therefore, $\langle N_1, N_2 \rangle$ can self-recover in the case where it suffers from a DNU.

Similarly, in the case where $\langle N_2, N_5 \rangle$ suffers from a DNU, node N<sub>2</sub> upsets to "0" and node N<sub>5</sub> upsets to "1". The NMOS below N<sub>3</sub> and the PMOS above N<sub>4</sub> are turned off, preventing the DNU from propagating to N<sub>3</sub>/N<sub>4</sub>. Note that N<sub>1</sub> can have a weak "1" and N<sub>6</sub> can have a weak "0" as the PMOS above N<sub>1</sub> and the NMOS below N<sub>6</sub> are turned on. N<sub>1</sub>, N<sub>6</sub>, N<sub>3</sub> and N<sub>4</sub> cannot be directly flipped by N<sub>2</sub>/N<sub>5</sub>: N<sub>1</sub> still holds its original correct state "0" and N<sub>6</sub> still holds its original correct state "1". At the same time, the NMOS below N<sub>1</sub> and the PMOS above N<sub>6</sub> are turned on, allowing N<sub>1</sub> to have a strong "0" and N<sub>6</sub> to have a strong "1". The strong "0" neutralizes the weak "1" of N<sub>1</sub>, and the strong "1" neutralizes the weak "0" of N<sub>6</sub>, ensuring that N<sub>1</sub>, N<sub>3</sub>, N<sub>4</sub>, and N<sub>6</sub> maintain their original correct states. As N<sub>1</sub> and N<sub>3</sub> are both correct, N<sub>2</sub> can recover to its original correct state "1": N<sub>1</sub> = 0 and thus the NMOS below N<sub>2</sub> is turned off; N<sub>3</sub> = 0 and thus the PMOS above N<sub>2</sub> is turned on. Similarly, N<sub>5</sub> can also recover to its original correct state because N<sub>4</sub> and N<sub>6</sub> are correct. Thus, $\langle N_2, N_5 \rangle$ can recover from the DNU. In other words, the proposed latch provides complete DNU hardening for Case 2.

In Case 3, we consider a scenario where any two inputs of the CG-based CE suffer from a DNU. In this case, $\langle N_2, N_4 \rangle$ is the key node pair. In the case where $\langle N_2, N_4 \rangle$ suffers from a DNU, nodes N<sub>2</sub> and N<sub>4</sub> both become "0". This turns ON the PMOS above N<sub>3</sub> and turns OFF the NMOS below N<sub>3</sub>, causing N<sub>3</sub> to become "1". However, N<sub>5</sub> retains its previous value of "0" because the PMOS above N<sub>5</sub> is still OFF. At this time, N<sub>4</sub> and N<sub>1</sub> both have uncertain values due to the PMOS above and the NMOS below them being ON. However, N<sub>6</sub> still holds its correct value "1" because the NMOS below N<sub>6</sub> is turned off. The CG-based CE can therefore still output a correct value because its three inputs are not all flipped simultaneously. Therefore, the DNU on $\langle N_2, N_4 \rangle$ can be tolerated. In other words, the proposed latch provides complete DNU hardening for Case 3. Overall, the proposed M-TPDICE-V2 latch provides complete SNU/DNU hardening.
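The masking argument of Case 3 can be checked with a small logical sketch. It assumes that, in hold mode, the CG-based CE behaves as a plain inverting 3-input C-element (clock gating and analog transients are ignored), and it uses the representative state N<sub>2</sub> = N<sub>4</sub> = N<sub>6</sub> = 1 with Q = 0. The names `ce3_hold` and `q_after_upset` are hypothetical helpers introduced only for this illustration.

```python
from itertools import combinations


def ce3_hold(n2, n4, n6, q_prev):
    """Assumed hold-mode behavior of the 3-input CE: invert if all inputs agree, else hold."""
    if n2 == n4 == n6:
        return 1 - n2
    return q_prev


# Representative error-free state used in the text: N2 = N4 = N6 = 1, Q = 0.
correct = {"N2": 1, "N4": 1, "N6": 1}
q_correct = 0


def q_after_upset(flipped):
    """Value of Q after the nodes named in `flipped` are upset."""
    vals = {k: (1 - v if k in flipped else v) for k, v in correct.items()}
    return ce3_hold(vals["N2"], vals["N4"], vals["N6"], q_correct)


# Enumerate every upset of one, two, or three CE inputs.
results = {
    flipped: q_after_upset(flipped)
    for r in (1, 2, 3)
    for flipped in combinations(correct, r)
}

for flipped, q in results.items():
    print(flipped, "-> Q =", q)
```

In this abstraction Q keeps its value for every single or double input upset; only flipping all three inputs, which is outside the DNU scope considered here, changes Q.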

#### C. Non-volatility based on MTJs

The proposed M-TPDICE-V2 latch has two non-volatile operations, i.e., backup and restore. During backup, WR (Write), NWR (inverted Write), RES,  $\overline{\text{RES}}$  and  $\overline{\text{PRE}}$  have the value of "1", "0", "0", "1", and "1", respectively. During restore, WR, NWR, RES,  $\overline{\text{RES}}$  and  $\overline{\text{PRE}}$  have the value of "0", "1", "1", "0", and "0", respectively.

#### (1) Operation Flow of Backup

In backup mode (i.e., WR = 1), the values of internal nodes N<sub>1</sub> to N<sub>6</sub> of the latch are stored in the MTJs by the current flowing through them, completing the backup operation. For instance, if N<sub>1</sub>, N<sub>3</sub>, and N<sub>5</sub> have the value of "0", and N<sub>2</sub>, N<sub>4</sub>, and N<sub>6</sub> have the value of "1", MTJ1 will be in the P state and MTJ2 will be in the AP state due to the current flowing from the free layer (FL) in MTJ2 to the FL in MTJ1. To effectively switch the state of the MTJs, we employ triple nodes (i.e., N<sub>1</sub>, N<sub>3</sub>, and N<sub>5</sub> converging to the node above MTJ1) instead of a single node to induce a higher current [12].

#### (2) Operation Flow of Restore

When VDD is powered off, all transistors are turned off. When it is powered on, the circuit enters the restore operation, where N<sub>1</sub> and N<sub>4</sub> can be re-charged by PMOS when $\overline{\text{RES}} = 0$. It is important to note that during this time, the WR signal should be set to 0 to deactivate the backup channel, thus avoiding any interference from the backup process on the restoration phase. During this operation, RES = 1 and $\overline{\text{RES}} = 0$, meaning that nodes N<sub>1</sub> and N<sub>4</sub> cannot be affected by the other nodes in TPDICE, and the parallel MTJs are grounded. Because the resistance of the MTJ in the P state is smaller than that in the AP state, the node feeding the MTJ in the P state discharges faster than the node feeding the MTJ in the AP state, resulting in different logic values for N<sub>1</sub> and N<sub>4</sub>. For instance, if MTJ1 is in the P state and MTJ2 is in the AP state, N<sub>1</sub> and N<sub>4</sub> will both be charged to 1 at the start of the restore operation. Since the transistors controlled by RES/$\overline{\text{RES}}$ are turned off, nodes N<sub>1</sub> and N<sub>4</sub> will not be influenced by the other nodes in TPDICE. Meanwhile, the NMOS transistors feeding the MTJs are turned on, and N<sub>1</sub> discharges faster than N<sub>4</sub> because MTJ1 has a smaller resistance than MTJ2. Thus, N<sub>1</sub> will become 0 and N<sub>4</sub> will remain 1, while N<sub>5</sub> and N<sub>6</sub> will be 0 and 1, respectively. N<sub>3</sub> will also become 0 due to the PMOS transistor above it being OFF, and N<sub>2</sub> will become 1 due to the PMOS above it being ON and the NMOS below it being OFF. Therefore, the output Q of the CG-based CE will be 0, and the output as well as nodes N<sub>1</sub> to N<sub>6</sub> will reload their original correct states, indicating the completion of the restore operation.
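The discharge race behind the restore operation can be illustrated with a first-order RC sketch. This is only a rough abstraction under stated assumptions: each pre-charged node is modeled as a capacitor discharging through its MTJ alone, the node capacitance (1 fF), the resistance values (4 kΩ and 10 kΩ), and the VDD/2 decision threshold are illustrative placeholders rather than values from the paper, and transistor on-resistances and the cross-coupled feedback of the real circuit are ignored.

```python
import math


def time_to_threshold(r_ohm, c_farad, vdd=1.0, vth=0.5):
    """Time for V(t) = vdd * exp(-t / (R * C)) to fall to vth."""
    return r_ohm * c_farad * math.log(vdd / vth)


def restore(mtj1_state, mtj2_state, r_p=4e3, r_ap=10e3, c_node=1e-15):
    """Resolve (N1, N4) from the complementary MTJ states 'P' / 'AP'.

    All default numeric values are hypothetical. The node whose MTJ has the
    lower resistance reaches the threshold first and resolves to 0, while
    the other node keeps its pre-charged value 1.
    """
    if mtj1_state == mtj2_state:
        raise ValueError("MTJ1 and MTJ2 are expected to hold complementary states")
    resistance = {"P": r_p, "AP": r_ap}
    t_n1 = time_to_threshold(resistance[mtj1_state], c_node)
    t_n4 = time_to_threshold(resistance[mtj2_state], c_node)
    n1 = 0 if t_n1 < t_n4 else 1
    return n1, 1 - n1


# Example from the text: MTJ1 in P state, MTJ2 in AP state.
print(restore("P", "AP"))
```

With MTJ1 in the P state the sketch resolves N<sub>1</sub> to 0 and N<sub>4</sub> to 1, matching the example described above; swapping the MTJ states swaps the restored values.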

#### D. Simulations

The proposed M-TPDICE-V2 latch was implemented using a 45nm bulk CMOS technology model and the MTJ model from [21], with a supply voltage of 1.0V at room temperature. Table I lists the parameters of the STT-MTJ device used in simulations. Synopsys HSPICE was used for the simulations as in [1, 17].

TABLE I
PARAMETERS OF THE STT-MTJ DEVICE IN SIMULATIONS

| Parameter       | Description                           | Default Value                                        |
|-----------------|---------------------------------------|------------------------------------------------------|
| Area            | MTJ surface                           | $40\,\text{nm} \times 40\,\text{nm} \times \pi / 4$  |
| TMR(0)          | TMR ratio with zero $V_{\text{bias}}$ | 150%                                                 |
| t<sub>f</sub>   | Free layer height                     | 0.90nm                                               |
| t<sub>ox</sub>  | Oxide barrier thickness               | 0.90nm                                               |
| V               | Volume of free layer                  | Area × $t_f$                                         |
| RA              | Resistance-area product of MTJ        | $5\,\Omega\cdot\mu\text{m}^2$                        |
| M<sub>s</sub>   | Saturation magnetization              | $3.25\times 10^5\,\text{A/m}$                        |
| H<sub>K</sub>   | Anisotropy field                      | $4.00\times 10^5\,\text{A/m}$                        |
| P<sub>0</sub>   | Polarization factor                   | 0.56                                                 |
| α               | Damping factor                        | 0.01                                                 |
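As a worked example, the nominal P-state and AP-state resistances implied by Table I can be estimated as follows. The sketch assumes that the zero-bias P-state resistance is simply the resistance-area product divided by the MTJ surface, and that the AP-state resistance follows from the TMR ratio definition $\text{TMR} = (R_{AP} - R_P)/R_P$; bias and temperature dependence, which a full MTJ compact model would include, are ignored. Function and constant names are illustrative.

```python
import math

# Values copied from Table I.
DIAMETER_NM = 40.0   # MTJ surface is 40nm x 40nm x pi/4
RA_OHM_UM2 = 5.0     # resistance-area product, in ohm * um^2
TMR0 = 1.50          # TMR ratio at zero bias (150%)


def mtj_area_um2(diameter_nm):
    """Circular MTJ surface in um^2 (1 nm^2 = 1e-6 um^2)."""
    area_nm2 = diameter_nm * diameter_nm * math.pi / 4.0
    return area_nm2 * 1e-6


def mtj_resistances(ra_ohm_um2, area_um2, tmr):
    """Zero-bias estimate: R_P = RA / Area, R_AP = R_P * (1 + TMR)."""
    r_p = ra_ohm_um2 / area_um2
    r_ap = r_p * (1.0 + tmr)
    return r_p, r_ap


area = mtj_area_um2(DIAMETER_NM)
r_p, r_ap = mtj_resistances(RA_OHM_UM2, area, TMR0)
print(f"Area = {area:.4e} um^2, R_P = {r_p:.0f} ohm, R_AP = {r_ap:.0f} ohm")
```

Under these assumptions the P-state resistance is roughly 4 kΩ and the AP-state resistance roughly 10 kΩ, which is the gap the restore operation relies on.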

#### (1) DNU Simulations

Figure 5 presents the simulations for DNU injections for the proposed M-TPDICE-V2 latch, which indicates that the DNUs injected at 4ns and 5ns only caused narrow pulses, with these node-pairs returning to their original correct values. DNUs injected at 17ns, 29ns, and 41ns had no effect on Q, indicating the proposed latch's DNU tolerance.



Fig. 5. DNU-injection simulation results for the proposed latch design.

#### (2) Normal/Restore Operations

Figure 6 shows the simulations of the proposed M-TPDICE-V2 latch's normal/restore operations. During the normal operation with VDD, WR, PRE and RES having the values of "1", "0", "0" and "0" respectively, the latch worked in transparent mode at 39ns and then switched to hold mode at 41ns. At 41ns, Q remained at the original state pre-charged by D working in transparent mode. At 81ns, CLK = 1, WR = 1, D initialized Q, and MTJs backed up a copy of D value (note that MTJ1 and MTJ2 were in P state and AP state, respectively). Furthermore, at 120ns, the proposed latch was powered off, and at this time, the output was 0. Clearly, the output was restored correctly from MTJs to TPDICE after the power-on at 200ns. In summary, the simulations demonstrate the proposed latch's correct operations.



Fig. 6. Simulation results of the proposed M-TPDICE-V2 latch during normal and restore operations. Note that, between 80ns and 120ns, the copy of D value was stored into MTJs to complete the backup.

#### IV. PROPOSED M-8C LATCH

Figure 7 shows the proposed DNU recovery non-volatile magnetic latch, namely M-8C. The latch mainly comprises four TGs (in the left bottom part of Fig. 7), a DNU recovery module based on eight C-elements (in the top part of Fig. 7), and a backup and restore module based on a pair of MTJ cells (in the lower part of Fig. 7). In this latch, N<sub>0</sub> to N<sub>7</sub> are the internal nodes, where N<sub>1</sub> also acts as output Q. D is the input, which provides input to N<sub>1</sub>(Q), N<sub>3</sub>, N<sub>5</sub> and N<sub>7</sub> through the transmission gate during the transparent mode. The advantages of the proposed M-8C latch include complete DNU-recovery as well as non-volatility that will be introduced as follows.

The normal operations, the SNU/DNU-recovery principles and simulations, and further discussions of the proposed M-8C latch are provided as follows.

#### A. Normal Operations

The latch operates in transparent mode when CLK = 1 and CLKB = 0. At this time, all transistors in all TGs are turned on, and thus $N_1(Q)$, $N_3$, $N_5$, and $N_7$ can be pre-charged by D through these TGs. Note that the resistances of MTJ1 and MTJ2 can be preset by $N_1$, $N_3$, $N_5$, $N_0$, $N_2$ and $N_4$. Clearly, the proposed M-8C latch can operate properly in this mode.

The latch operates in hold mode when CLK = 0 and CLKB = 1. At this time, the eight CEs feed back to each other, i.e., output node $N_i$ ($0 \le i \le 7$) is fed by nodes $N_{(i+1) \bmod 8}$ and $N_{(i+3) \bmod 8}$ through $C_i$, where $N_{(i+1) \bmod 8}$ and $N_{(i+3) \bmod 8}$ are the output nodes of $C_{(i+1) \bmod 8}$ and $C_{(i+3) \bmod 8}$, respectively. Clearly, the proposed M-8C latch can store and output the correct states.
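The feedback rule above can be sketched as a small logical model to confirm that the stored state is stable in hold mode. The sketch assumes an idealized inverting 2-input CE and a synchronous update of all eight nodes, and it assumes that odd-indexed nodes hold the stored value while even-indexed nodes hold its complement, as described for this latch; the helper names are illustrative.

```python
def ce(a, b, prev):
    """Idealized inverting 2-input C-element: invert if inputs agree, else hold."""
    return 1 - a if a == b else prev


def fan_in(i):
    """Indices of the two nodes feeding N_i through C_i."""
    return (i + 1) % 8, (i + 3) % 8


def step(state):
    """One synchronous update of all eight nodes N_0..N_7."""
    return [ce(state[fan_in(i)[0]], state[fan_in(i)[1]], state[i]) for i in range(8)]


def hold_state(q):
    """Odd nodes store q (N_1 is Q); even nodes store the complement."""
    return [q if i % 2 == 1 else 1 - q for i in range(8)]


for q in (0, 1):
    state = hold_state(q)
    print("Q =", q, "state =", state, "stable =", step(state) == state)
```

Because every CE sees two agreeing inputs whose complement equals its current output, both stored values are fixed points of the update in this model.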

#### B. Error Recovery Principle

The error recovery principle of the proposed M-8C latch is mainly based on CE redundancy. If the original module suffers from a DNU, its two nodes will change to wrong states.



Fig. 7. Schematic of the proposed DNU recovery non-volatile magnetic latch.

However, the redundant module can intercept the error's propagation so that the module keeps its original correct states and recovers the invalid states to their original correct states. This is achieved by forming error-interceptive paths between the original/redundant modules. The DNU resilience behavior for the redundant module is similar to the original.

The proposed M-8C latch uses eight CEs to build a robust structure. It can be seen from Fig. 7 that the value of $N_i$ ($0 \le i \le 7$) is fed by $N_{(i+1) \bmod 8}$ and $N_{(i+3) \bmod 8}$. Nodes $N_1$, $N_3$, $N_5$, and $N_7$ store the original value, and nodes $N_0$, $N_2$, $N_4$, and $N_6$ store the redundant/complementary values. Note that, if the original values are impacted by a DNU, the redundant values, which retain the original correct value, will correct the wrong values through the error-interceptive paths composed of the eight CEs.

SNU/DNU recovery behaviors for the M-8C are discussed as follows.

First, the SNU-recovery behaviors of the M-8C are analyzed. Since N<sub>1</sub> stands for Q, the fault behavior analysis of node $N_i$ ($0 \le i \le 7$) can represent all SNU cases. If a high-energy radiative particle hits the M-8C latch, the value of $N_i$ will be temporarily flipped. $N_i$ is an input of $C_{(i-1) \bmod 8}$ and $C_{(i-3) \bmod 8}$; since each of these CEs still has one correct input, the outputs of $C_{(i-1) \bmod 8}$ and $C_{(i-3) \bmod 8}$ retain their original correct states. After the SNU disappears, the correct $N_{(i+1) \bmod 8}$ and $N_{(i+3) \bmod 8}$ return $N_i$ to its previous correct state through $C_i$.

Next, the DNU-resilience behaviors of the M-8C are analyzed. All DNUs caused by a single particle strike can be classified into the following three possible cases:

**Case 1**: Two inputs of a CE are affected by a DNU (i.e., inputs  $N_{(i+1) \mod 8}$  and  $N_{(i+3) \mod 8}$  of CE  $C_{i(0 \le i \le 7)}$  are simultaneously flipped).

**Case 2**: A single input and the output of a CE are impacted by a DNU (i.e., node  $N_{(i+1) \mod 8}$  (or  $N_{(i+3) \mod 8}$ ) as well as the output  $N_{i(0 \le i \le 7)}$  of CE  $C_{i(0 \le i \le 7)}$  are simultaneously flipped).

**Case 3**: The two nodes having an identical value while driving different CEs are impacted by a DNU (i.e., node  $N_{i(0 \le i \le 7)}$  as well as  $N_{(i+4) \mod 8}$  are simultaneously flipped).

For Case 1, we suppose i = 0. A DNU simultaneously flips inputs N<sub>1</sub> and N<sub>3</sub> of C<sub>0</sub>, and N<sub>0</sub> is flipped accordingly. The corrupted nodes N<sub>1</sub> and N<sub>3</sub> are the inputs of C<sub>6</sub> and C<sub>2</sub>, respectively, and meanwhile N<sub>0</sub> is the input of C<sub>5</sub> and C<sub>7</sub>. In this way, the four CEs, i.e., C<sub>2</sub>, C<sub>6</sub>, C<sub>5</sub>, and C<sub>7</sub>, enter high-impedance states, and thus the values of N<sub>2</sub>, N<sub>6</sub>, N<sub>5</sub>, and N<sub>7</sub> remain unchanged. In addition, the DNU does not affect N<sub>4</sub>. After the transient fault disappears, N<sub>2</sub> and N<sub>4</sub> recover N<sub>1</sub> through C<sub>1</sub>. Similarly, N<sub>4</sub> and N<sub>6</sub> recover N<sub>3</sub> via C<sub>3</sub>. Finally, with N<sub>1</sub> and N<sub>3</sub> back at their original states, N<sub>0</sub> is restored to its correct value through C<sub>0</sub>. Therefore, the latch recovers from the DNU in Case 1.

For Case 2, we suppose i = 7. A DNU simultaneously flips the input N<sub>0</sub> and the output N<sub>7</sub> of C<sub>7</sub>. The DNU does not affect N<sub>1</sub> and N<sub>3</sub>. Thus, N<sub>1</sub> and N<sub>3</sub> restore N<sub>0</sub> to the correct value through C<sub>0</sub>. Then, N<sub>0</sub> and N<sub>2</sub> recover N<sub>7</sub> through C<sub>7</sub>. Therefore, the latch also recovers from the DNU in Case 2.

For Case 3, we suppose i = 2. Nodes N<sub>2</sub> and N<sub>6</sub> are simultaneously upset by a DNU. The DNU does not affect N<sub>0</sub> and N<sub>4</sub>. Hence, N<sub>1</sub> and N<sub>5</sub> enter the high-impedance state. Then, N<sub>2</sub> returns to its correct state via N<sub>3</sub> and N<sub>5</sub> through C<sub>2</sub>. Similarly, N<sub>6</sub> is restored to the correct state by N<sub>7</sub> and N<sub>1</sub> through C<sub>6</sub>. Therefore, the latch is also able to recover from the DNU in Case 3.

In summary, after the transient fault disappears, each node pair impacted by a DNU in the proposed M-8C latch can be restored to its correct value. Therefore, the proposed M-8C latch provides complete DNU recovery.
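The three-case classification can be checked exhaustively in a behavioral sketch. The model below assumes ideal inverting C-elements with synchronous updates, which is a simplification of the actual circuit; the names `step`, `golden`, `dnu_case`, and `steps_to_recover` are hypothetical. A node pair is assigned to Case 1, 2, or 3 by the circular distance between its indices (2, 1 or 3, and 4, respectively), which follows from the feed rule  $N_i \leftarrow (N_{(i+1) \mod 8}, N_{(i+3) \mod 8})$ .

```python
from itertools import combinations

N = 8

def step(nodes):
    # Assumed ideal inverting C-elements, updated synchronously.
    out = []
    for i in range(N):
        a, b = nodes[(i + 1) % N], nodes[(i + 3) % N]
        out.append(1 - a if a == b else nodes[i])
    return out

def golden(q):
    # Odd nodes hold Q, even nodes hold the complement.
    return [q if i % 2 else 1 - q for i in range(N)]

def dnu_case(a, b):
    # Circular index distance 2 -> two inputs of one CE (Case 1),
    # 1 or 3 -> one input plus the output of a CE (Case 2),
    # 4 -> same-valued nodes driving different CEs (Case 3).
    d = min((a - b) % N, (b - a) % N)
    return {2: 1, 1: 2, 3: 2, 4: 3}[d]

def steps_to_recover(q, pair, max_steps=8):
    good = golden(q)
    nodes = list(good)
    for n in pair:
        nodes[n] ^= 1             # inject the double-node upset
    for k in range(1, max_steps + 1):
        nodes = step(nodes)
        if nodes == good:
            return k
    return None                   # did not recover within max_steps

pairs = list(combinations(range(N), 2))
results = {(q, p): steps_to_recover(q, p) for q in (0, 1) for p in pairs}
case_counts = {c: sum(1 for p in pairs if dnu_case(*p) == c) for c in (1, 2, 3)}

print("pairs per case:", case_counts)
print("all DNUs recovered:", all(v is not None for v in results.values()))
```

Under these assumptions, every one of the 28 node pairs falls into exactly one of the three cases and returns to the stored state, which mirrors the argument made case by case above.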

When considering errors on the control signal RES, it is important to note that such signals are typically activated only during the recovery time and are not continuously used throughout the latch's operation. During this brief recovery window, these signals are directly provided by the signal source, resulting in a relatively low probability of errors. On the other hand, internal signals such as those used in latches and other components require more attention due to their frequent usage and the complexity of their internal design. Hence, in most cases, it is reasonable to neglect the possibility of errors on RES and similar control signals. However, if a control signal such as RES does suffer from an error, the proposed latch cannot tolerate it; we leave this issue for future work.

#### C. Non-volatility based on MTJs

For the proposed M-8C latch, its non-volatility is based on the part of the circuit constructed with a pair of MTJs. In this part (i.e., in the lower part in Fig. 7), the resistance value of MTJ is adjusted by the voltage level of the internal node during the backup so that the internal node can obtain the correct value through MTJ when the power is resupplied for the restore operation. Due to the high modularity, this part of the restore circuit can be used as a separate unit combined with other latches, thus providing them with non-volatility. Therefore, this part of the circuit is universal.

#### (1) Operation Flow of Backup

To perform a backup operation, the proposed M-8C latch sets WR to 1, RES to 0, and PRE to 1. Because performing a backup in every clock cycle is impractical, the transmission gates that initiate the backup are controlled by the WR and NWR signals instead of the CLK signal. This enables precise control over backup timing and frequency and removes the need to perform a backup during each transparent phase.

By flexibly controlling the WR and NWR signals, one can decide when to execute a backup operation and when to skip it. Selectively activating the WR signal allows backup operations to commence at specific moments. This synchronous control mechanism ensures that all latch circuits generate their backups (checkpoints) simultaneously, guaranteeing data consistency across all of them.

When WR = 1, RES = 0, and PRE = 1, the backup module turns on the six transmission gates connected to internal nodes  $N_0 - N_5$ . The resulting current flow stores the values of the internal nodes in the MTJs, thereby completing the backup operation. For instance, when  $N_0 = N_2 = N_4 = 1$  (while  $N_1 = N_3 = N_5 = 0$ ), with the transmission gates controlled by WR conducting and the NMOS transistors controlled by RES turned off, the current can only flow from the FL of MTJ2 to the FL of MTJ1. This results in MTJ1 being in the AP state, while MTJ2 is in the P state.

In order to facilitate effective MTJ state switching, the utilization of three nodes is retained (such as nodes  $N_1$ ,  $N_3$ , and  $N_5$  converging above MTJ1), rather than relying solely on a single node. This configuration generates a greater flow of current, ensuring the efficient execution of MTJ state transitions.

#### (2) Operation Flow of Restore

Powering off VDD turns off all transistors. When VDD is powered on again, the proposed latch starts the restore operation. If PRE = 0, all internal nodes  $N_0$  to  $N_7$  are charged by the PMOS transistors so that  $N_0 = N_1 = N_2 = N_3 = N_4 = N_5 = N_6 = N_7 = 1$ . Then, with WR = 0 and RES = 1, the backup path between the MTJs and the internal nodes is deactivated, and the recovery path between the MTJs and the internal nodes is activated. Concurrently, the PLs of MTJ1 and MTJ2 are grounded. Because an MTJ in the P state has a smaller resistance than one in the AP state, the nodes connected to the MTJ in the P state discharge faster than those connected to the MTJ in the AP state.
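As a rough illustration of why the lower-resistance branch wins the discharge race, the following sketch treats each precharged node as a capacitor discharging through a single resistance, so that the time to fall from VDD to a switching threshold is  $t = RC \ln(V_{DD}/V_{sw})$ . This first-order RC picture is an assumption and ignores the access transistors and the interlocking module. The resistance, capacitance, and threshold values are hypothetical placeholders, not the parameters of Table I; only VDD = 1 V is taken from the simulation setup.

```python
import math

def discharge_time(r_ohm, c_farad, vdd=1.0, v_switch=0.5):
    """Time for a node precharged to vdd to fall to v_switch through r_ohm."""
    return r_ohm * c_farad * math.log(vdd / v_switch)

# Hypothetical illustration values (NOT taken from the paper).
R_P = 5e3        # assumed resistance of an MTJ in the P state, in ohms
R_AP = 10e3      # assumed resistance of an MTJ in the AP state, in ohms
C_NODE = 1e-15   # assumed capacitance of the precharged node group, in farads

t_p = discharge_time(R_P, C_NODE)
t_ap = discharge_time(R_AP, C_NODE)
winner = "P-side" if t_p < t_ap else "AP-side"

print(f"P-side reaches the threshold after {t_p:.3e} s")
print(f"AP-side reaches the threshold after {t_ap:.3e} s")
print("nodes that discharge first:", winner)
```

In this simplified picture, the ratio of the two discharge times equals the ratio of the two MTJ resistances, so a larger gap between the AP and P resistances gives a wider timing margin for the restore operation.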

The logical values of  $N_1$ ,  $N_3$ ,  $N_5$ ,  $N_7$  and those of  $N_0$ ,  $N_2$ ,  $N_4$ ,  $N_6$  are complementary. For instance, suppose that MTJ1 is in the AP state and MTJ2 is in the P state, so that the resistance of MTJ2 is lower than that of MTJ1. When the proposed latch performs the restore operation,  $N_0$  to  $N_7$  initially hold no valid value, and PRE = 0 charges all the internal nodes  $N_0$  to  $N_7$  to 1. Since RES = 1, the NMOS transistors controlled by RES and connected to the MTJs are all conducting, and the PL of each MTJ is grounded. Consequently,  $N_1$ ,  $N_3$ ,  $N_5$ , and  $N_7$  discharge faster than  $N_0$ ,  $N_2$ ,  $N_4$ , and  $N_6$ , since the resistance of MTJ2 is lower than that of MTJ1. Hence,  $N_1 = N_3 = N_5 = N_7 = 0$  ( $N_0 = N_2 = N_4 = N_6 = 1$ ). At this point, all the internal nodes have obtained the correct values. That is, once the restore operation is completed, the state of  $N_0$  to  $N_7$  is reloaded as the original state and maintained until the next valid clock pulse.
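The backup and restore flows can be summarized as a logic-level round trip. The sketch below is an assumption-laden abstraction: it encodes the worked example from the text (odd nodes at 0 give MTJ1 = AP and MTJ2 = P, and that MTJ state restores the odd nodes to 0) and assumes the complementary case by symmetry. The function names `backup` and `restore` are hypothetical.

```python
def backup(q):
    """MTJ states (MTJ1, MTJ2) written during backup for Q = N1.

    Q = 0 -> MTJ1 = AP, MTJ2 = P (example from the text);
    Q = 1 -> the opposite assignment (assumed here by symmetry).
    """
    return ("AP", "P") if q == 0 else ("P", "AP")

def restore(mtj1, mtj2):
    """Node values [N0..N7] after precharging to 1 and the discharge race."""
    if (mtj1, mtj2) == ("AP", "P"):
        odd = 0   # odd nodes discharge first (case described in the text)
    elif (mtj1, mtj2) == ("P", "AP"):
        odd = 1   # even nodes discharge first (assumed by symmetry)
    else:
        raise ValueError("the two MTJs must hold complementary states")
    # Odd nodes carry Q, even nodes carry the complement.
    return [odd if i % 2 else 1 - odd for i in range(8)]

for q in (0, 1):
    nodes = restore(*backup(q))
    print(f"Q={q}: MTJs={backup(q)}, restored N0..N7={nodes}")
```

The round trip returns the stored value on N1(Q) for both values of Q, which is the property the restore operation is meant to guarantee.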

#### D. Simulations

The proposed M-8C latch was designed and implemented in a 45nm bulk CMOS process using the MTJ model proposed in [21], and the simulations were performed using Synopsys HSPICE. The critical parameters of the STT-MTJ employed in the simulations are listed in Table I. Under the standard power supply voltage of 1V and at room temperature, the PMOS transistors within the latch have an aspect ratio of W/L = 2, while the NMOS transistors have an aspect ratio of W/L = 1.

#### (1) Error Recovery

Figure 8 shows the effect of SNU on internal nodes  $N_0$  to  $N_7$  in hold mode with CLK = 0. As can be seen, a particle striking creates a positive or negative error that changes the value of the impacted nodes temporarily. Our proposed latch can subsequently restore the affected node to the correct value. Therefore, the simulated waveforms verify the full resilience of the proposed latch to SNUs.



Fig. 8. Simulation waveform of the injected SNU for the proposed latch.

Figure 9 presents the simulation waveform of injected DNUs for the proposed M-8C latch. For completeness, both positive and negative errors were simulated at the impacted nodes. Figure 9(a) shows the simulation results for the first case (see Section IV.B), where a DNU simultaneously impacts nodes  $N_{(i+1) \mod 8}$  and  $N_{(i+3) \mod 8}$  of CE  $C_{i(0 \le i \le 7)}$ , using i = 0 as well as i = 1 as examples. Clearly, the DNUs impact node pairs  $\langle N_1(Q), N_3 \rangle$  and  $\langle N_2, N_4 \rangle$ , respectively, inducing voltage glitches. However, the impacted nodes recover to their original correct values after the glitches die down.



Fig. 9. Simulation waveform of injected DNU for the proposed latch.

Figure 9(b) shows the simulation waveform for Case 2 (see Section IV.B), taking i = 5 as well as i = 7 as examples. DNUs were injected to  $\langle N_0, N_7 \rangle$  and  $\langle N_5, N_6 \rangle$ , respectively, and the values of  $\langle N_0, N_7 \rangle$  and  $\langle N_5, N_6 \rangle$  are temporarily upset, so that the output  $N_1$ (Q) is also temporarily flipped. Then, the impacted nodes eventually return to their original correct states. Figure 9(c) shows the simulation waveform for Case 3 (see Section IV.B), taking i = 1 as well as i = 2 as examples. Clearly, DNUs cause the logical values of  $\langle N_1(Q), N_5 \rangle$  and  $\langle N_2, N_6 \rangle$  to change temporarily. However, the flipped nodes are also restored to their original correct values. Therefore, the simulation results of the above three cases show that the proposed M-8C latch can indeed recover from all DNUs.



Fig. 10. DNU effects on the proposed latch when the clock is gated.

Figure 10 shows the simulation waveforms when DNUs affect the proposed M-8C latch while the clock is gated. Clearly, when a DNU affects node pairs  $\langle N_0, N_7 \rangle$ ,  $\langle N_1(Q), N_3 \rangle$ ,  $\langle N_2, N_4 \rangle$ ,  $\langle N_5, N_6 \rangle$ ,  $\langle N_1(Q), N_5 \rangle$ , and  $\langle N_4, N_6 \rangle$ , respectively, the affected nodes are flipped temporarily but quickly return to their original correct values. Therefore, all the above simulation results confirm that the proposed M-8C latch indeed has the capability to fully self-recover from DNUs.

#### (2) Normal/Restore Operations

Figure 11 illustrates the simulation waveforms of the proposed M-8C latch across three operational phases: normal, backup, and restore. Under a standard power supply voltage of VDD = 1V and with WR = 0, RES = 0, and PRE = 1, the latch is in its normal operational state. Prior to CLK falling to 0 at 39ns, the proposed latch operates in transparent mode:  $N_1(Q)$  follows D. Subsequently, at 41ns, after CLK has returned to 0, the latch is in hold mode:  $N_1(Q)$  keeps the value set by D during the preceding transparent mode.



Fig. 11. Simulation results of the proposed latch during normal and restore operations.

Between 80ns and 135ns, when WR = 1, the M-8C operates in the backup phase: the states of MTJ1 and MTJ2 evolve in accordance with  $N_1(Q)$ 's value. Specifically, when  $N_1(Q) = 0$ , MTJ1 assumes an AP state, while MTJ2 adopts a P state. Conversely, when  $N_1(Q) = 1$ , MTJ1 transitions to a P state, and MTJ2 to an AP state, thereby concluding the backup operation. However, upon setting WR = 0, the backup path is disengaged, leading to the observation that the states of MTJs no longer vary with N<sub>1</sub>(Q)'s value.

Moreover, at 135ns, the proposed latch was powered off (i.e., VDD = 0), and thus the output was 0. The output cannot regain its original correct value until VDD is powered on at 200ns. Clearly, after the restore signals are applied, the correct data is reloaded from the MTJs to the DNU recovery module, and thus the values of all internal nodes in the proposed latch become correct once again. In summary, the simulation results demonstrate the correct operation of the proposed M-8C latch in all phases.

#### E. Discussion

In this section, we focus on the universality of the backup and restore module in Fig. 7. We proposed a similar module in our previous work [24]; however, it requires the latch to be adjusted so as to achieve non-volatility. To make the module universal, we propose a new module, whose structure is shown in Fig. 12.

According to the number of node pairs in different latches, fine-tuning the backup and restore module can bind the module to different latches, thus providing non-volatility for these latches, respectively. In this way, we do not need to adjust the designed latch to realize its non-volatility, which significantly improves the design efficiency and practical application value of the non-volatile latch design.



Fig. 12. Structure of the proposed backup and restore module.

In Fig. 12, the proposed universal module consists of two parts. One is the interlocking module composed of internal nodes through NMOS transistors (this is in the top half part of Fig. 12). The other is the MTJ resistance adjustment module composed of the internal node and a pair of MTJs plus six transmission gates (this is in the bottom half part of Fig. 12).

Let us now discuss how to adjust our proposed backup and restore module for different latches. First, the internal nodes of the latch are divided into two classes whose logical values are opposite to each other at any moment, and node pairs are then formed by taking one node from each of the two classes. Next, we select three node pairs as the signal sources for adjusting the MTJ resistance. Three pairs are selected because they are already sufficient to adjust the resistance of the MTJs; adding further node pairs would only increase power consumption unnecessarily. For example, in the proposed latch, we choose  $\langle N_0, N_1 \rangle$ ,  $\langle N_2, N_3 \rangle$  and  $\langle N_4, N_5 \rangle$  to ensure that the current flows from MTJ1 to MTJ2 or vice versa, so as to obtain different MTJ resistances according to the logic values of the internal nodes. In this way, the resistance adjustment module is adapted to the latch.

Let us now discuss the adjustment of the interlocking module in Fig. 12 to fit a new latch. It can be observed from Fig. 12 that we add a pair of cross-controlled NMOS transistors to each node pair to form an interlocking module. For example, since the proposed latch has four node pairs, there are four pairs of cross-controlled NMOS transistors in the interlocking module of Fig. 7. If there are five pairs of nodes, one more pair of cross-controlled NMOS transistors is added on this basis; similarly, if there are three pairs of nodes, one pair of cross-controlled NMOS transistors is removed. Therefore, after adjusting the interlocking module according to the actual number of internal nodes of a latch, it is only necessary to connect the internal nodes of the latch with the nodes of the backup and restore module one by one to provide non-volatility for the latch.
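The adaptation procedure can be summarized as a small sizing helper. This is a minimal sketch under stated assumptions: the function name `module_config` and the dictionary keys are hypothetical, the lower bound of three node pairs is an assumption derived from the three pairs used as MTJ signal sources, and the count of two transmission gates per selected node pair is inferred from the six gates used for three node pairs.

```python
def module_config(node_pairs, mtj_source_pairs=3):
    """Component counts of the backup and restore module for a given latch.

    node_pairs: number of complementary node pairs in the target latch.
    mtj_source_pairs: node pairs used to adjust the MTJ resistance
                      (three in the text).
    """
    if node_pairs < mtj_source_pairs:
        # Assumption: the latch must offer at least the node pairs that
        # are used as signal sources for the MTJs.
        raise ValueError("not enough node pairs to drive the MTJs")
    return {
        # One pair of cross-controlled NMOS transistors per node pair.
        "cross_controlled_nmos_pairs": node_pairs,
        "mtj_source_node_pairs": mtj_source_pairs,
        # Inferred: one transmission gate per node of each selected pair.
        "transmission_gates": 2 * mtj_source_pairs,
        "mtjs": 2,
    }

for pairs in (3, 4, 5):
    print(pairs, module_config(pairs))
```

For the M-8C latch with four node pairs, this gives four pairs of cross-controlled NMOS transistors, six transmission gates, and two MTJs; only the interlocking part scales with the number of node pairs.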

#### V. COMPARATIVE RESULTS

In order to facilitate an equitable assessment, the latch configuration detailed in Table II and the implementation parameters for both the proposed M-TPDICE-V2 and M-8C latches remain consistent: all designs underwent simulation utilizing the 45nm CMOS bulk technology, ensuring fairness in data comparison. Moreover, owing to the inherent diversity of the respective designs, the proportions and configurations as outlined in their original publications were faithfully retained. Furthermore, all designs adhere to the MTJ model parameters presented in Table I of this manuscript and were simulated under uniform conditions of 1V supply voltage and room temperature.

TABLE II. RELIABILITY COMPARISONS AMONG THE SNU AND/OR DNU RECOVERY NV MAGNETIC LATCHES

| Designs        | SNU<br>Tol.  | SNU<br>Rec.  | DNU<br>Tol.  | DNU<br>Rec.  | Backup<br>Ability | Restore<br>Ability |
|----------------|--------------|--------------|--------------|--------------|-------------------|--------------------|
| Design in [15] | $\checkmark$ | ×            | ×            | ×            | $\checkmark$      | $\checkmark$       |
| Design in [16] | $\checkmark$ | ×            | ×            | ×            | $\checkmark$      | $\checkmark$       |
| Design in [18] | $\checkmark$ | ×            | ×            | ×            | $\checkmark$      | $\checkmark$       |
| Design in [13] | $\checkmark$ | ×            | ×            | ×            | ×                 | $\checkmark$       |
| Design in [19] | $\checkmark$ | ×            | $\checkmark$ | ×            | ×                 | $\checkmark$       |
| Design in [23] | $\checkmark$ | ×            | ×            | ×            | $\checkmark$      | $\checkmark$       |
| M-TPDICE-V2    | $\checkmark$ | ×            | $\checkmark$ | ×            | $\checkmark$      | $\checkmark$       |
| M-8C           | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$      | $\checkmark$       |

Table II shows the reliability comparisons among the SNU and/or DNU recovery NV magnetic latches. In Table II, "Tol." stands for tolerance, indicating the capacity for fault tolerance, and "Rec." stands for recovery, signifying the ability to recover from node upsets. "Backup Ability" refers to a latch's capacity to retain the D value within MTJs, while "Restore Ability" refers to its capacity to transmit stored values from the latch's MTJs. Reliability is an important consideration for radiation-hardened latches. In Table II, the proposed DNU tolerance/recovery non-volatile magnetic latches offer the best reliability among the compared latches. M-TPDICE-V2 only has DNU tolerance ability, whereas M-8C additionally has DNU recovery ability. Moreover, both M-TPDICE-V2 and M-8C can store a copy of the D value in MTJs, enabling non-volatility.

TABLE III. OVERHEAD COMPARISONS AMONG THE SNU AND/OR DNU RECOVERY NV MAGNETIC LATCHES

| Designs        | D-Q Delay (ps) | CLK-Q Delay (ps) | Setup Time (ps) | 10<sup>-4</sup>×CMOS Area (nm<sup>2</sup>) | MTJ Counts | Normal Power (μW) | Backup Power\* (μW) | Restore Power (μW) |
|----------------|----------------|------------------|-----------------|--------------------------------------------|------------|-------------------|---------------------|--------------------|
| Design in [15] | 55.03          | 70.74            | 56.35           | 10.13                                      | 2          | 13.42             | 19.37               | 29.82              |
| Design in [16] | 35.35          | 45.07            | 5.72            | 9.52                                       | 2          | 0.85              | 16.25               | 15.42              |
| Design in [18] | 44.65          | 43.77            | 6.10            | 8.30                                       | 2          | 0.97              | 15.72               | 16.38              |
| Design in [13] | 50.82          | 43.03            | 12.56           | 6.89                                       | 4          | 14.45             | -                   | 17.32              |
| Design in [19] | 101.74         | 56.23            | 38.25           | 15.39                                      | 2          | 17.18             | -                   | 13.28              |
| Design in [23] | 7.00           | 3.96             | 14.60           | 7.49                                       | 2          | 0.24              | 8.56                | 1.15               |
| M-TPDICE-V2    | 2.02           | 5.74             | 20.98           | 14.99                                      | 2          | 0.02              | 22.12               | 18.53              |
| M-8C           | 10.09          | 8.95             | 29.30           | 12.75                                      | 2          | 0.19              | 15.65               | 35.25              |

\*Note that the "-" values for "Backup Power" in the designs from [13] and [19] indicate that these two designs do not have backup capabilities.

Furthermore, Table III shows the overhead comparisons among the SNU and/or DNU recovery NV magnetic latches. The term "D-Q Delay" denotes the time taken for the transition from D to Q (averaging rise/fall delays from D to Q), while "CLK-Q" refers to the delay from a change in CLK level to a change in Q (average propagation time for both rising and falling edges of D). "Setup Time" means the minimum amount of time during which the input is held steady before a CLK event, and "CMOS Area" signifies the silicon area, measured as described in [20]. To facilitate a more precise power comparison, we categorized the power consumption into three stages: normal operation power (average of static and dynamic power during regular operation), backup power (storing values in MTJs as non-volatile storage), and restore power (recovering values from MTJs). "MTJ Counts" signifies the quantity of MTJs utilized in each design.

Regarding overhead, in terms of the D-Q delay, the design presented in [15] introduces additional components between D and Q to facilitate fault-tolerance capabilities, which results in a large D-Q delay (55.03 ps). Similarly, the design in [19], due to its incorporation of extra elements between D and Q, exhibits the maximum D-Q delay (101.74 ps). Additionally, its utilization of a larger number of transistors contributes to the highest silicon area requirement. Moreover, the modified design outlined in [13] stands out by employing four MTJs for value retention, leading to a comparatively larger MTJ count.

In terms of setup time, insights can be gleaned from Table III. Specifically, the designs featured in [16] and [18] showcase relatively shorter input stabilization times prior to the CLK event, resulting in smaller setup times. In contrast, the designs in [15] and [19] necessitate extended periods of input stabilization, leading to larger setup times. Notably, the proposed M-TPDICE-V2 and M-8C designs exhibit a moderate setup time, reflecting a balance between input preparation and subsequent operation.

In terms of CLK-Q delay, a discernible pattern emerges from Table III. The design in [23], M-TPDICE-V2, and M-8C stand out with the shortest CLK-Q delays. This advantage can be attributed to their use of effective high-speed transmission pathways. During the transparent mode, the D signal is seamlessly conveyed to Q through CLK-controlled transmission gates. In contrast, other designs fail to fully exploit this high-speed transmission path, leading to relatively larger CLK-Q delays. Notably, the design in [15] exhibits the highest CLK-Q delay, mainly due to its more intricate circuit involving an increased number of transistors and the introduction of an additional inverter at the output.

In terms of power consumption, the proposed M-TPDICE-V2 and M-8C latches perform well. During normal operation, they exhibit the lowest power consumption (0.02 μW and 0.19 μW, respectively). In the backup phase, M-8C consumes a moderate 15.65 μW, whereas M-TPDICE-V2 consumes 22.12 μW, the highest value in Table III. In the restore phase, the M-8C latch consumes the most power (35.25 μW). This can be attributed to the comprehensive restore mechanism implemented by the M-8C, which restores all internal nodes of the latch; the additional power is the cost of the more comprehensive and reliable recovery it offers.
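The rankings discussed in this section can be cross-checked directly against Table III. The sketch below transcribes the table values into a dictionary and sorts the designs by a chosen metric; the column keys and the helper `ranked` are hypothetical names introduced for illustration, and `None` marks the designs without backup capability.

```python
# Values transcribed from Table III; None marks "-" (no backup capability).
COLUMNS = ("dq_ps", "clkq_ps", "setup_ps", "area", "mtj",
           "normal_uw", "backup_uw", "restore_uw")
ROWS = {
    "[15]":        (55.03, 70.74, 56.35, 10.13, 2, 13.42, 19.37, 29.82),
    "[16]":        (35.35, 45.07, 5.72, 9.52, 2, 0.85, 16.25, 15.42),
    "[18]":        (44.65, 43.77, 6.10, 8.30, 2, 0.97, 15.72, 16.38),
    "[13]":        (50.82, 43.03, 12.56, 6.89, 4, 14.45, None, 17.32),
    "[19]":        (101.74, 56.23, 38.25, 15.39, 2, 17.18, None, 13.28),
    "[23]":        (7.00, 3.96, 14.60, 7.49, 2, 0.24, 8.56, 1.15),
    "M-TPDICE-V2": (2.02, 5.74, 20.98, 14.99, 2, 0.02, 22.12, 18.53),
    "M-8C":        (10.09, 8.95, 29.30, 12.75, 2, 0.19, 15.65, 35.25),
}
TABLE_III = {name: dict(zip(COLUMNS, vals)) for name, vals in ROWS.items()}

def ranked(metric, reverse=False):
    """Design names sorted by a metric, skipping designs without a value."""
    rows = [(v[metric], k) for k, v in TABLE_III.items()
            if v[metric] is not None]
    return [k for _, k in sorted(rows, reverse=reverse)]

print("largest D-Q delay:", ranked("dq_ps", reverse=True)[0])
print("three smallest CLK-Q delays:", ranked("clkq_ps")[:3])
print("lowest normal power:", ranked("normal_uw")[:2])
print("highest restore power:", ranked("restore_uw", reverse=True)[0])
```

Sorting by other columns (for example `"setup_ps"` or `"area"`) gives the corresponding orderings for the remaining overhead metrics.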

In a comparative context with other radiation-hardened latch architectures, the proposed M-TPDICE-V2 and M-8C latches present an appealing proposition. These designs showcase the ability to deliver radiation resilience and non-volatility functionality with a reasonable level of overhead. Moreover, the M-8C latch distinguishes itself by offering the unique advantages of DNU recovery and comprehensive restoration of all internal nodes following a power loss. This distinguishing feature renders the M-8C latch particularly well-suited for applications characterized by stringent reliability requirements.

In summary, the proposed M-TPDICE-V2 and M-8C latches have competitive advantages (in terms of reliability and nonvolatility) and moderate overhead. Compared with other hardened latches, the proposed latches provide a more balanced trade-off between reliability and overhead, making them suitable for applications where reliability and non-volatility are required.

#### VI. CONCLUSION

In this paper, we have proposed two novel non-volatile latch designs for robust computing in radiation environments. M-TPDICE-V2 is a DNU-tolerant non-volatile magnetic latch that provides robustness against radiation-induced DNUs and non-volatility based on MTJs. M-8C is a DNU recovery non-volatile magnetic latch that provides complete DNU recovery capability and non-volatility by utilizing MTJs. In addition, the proposed MTJ-based backup and restore module in M-8C is universal: it can be easily integrated into any latch to provide non-volatility. The simulation results have shown that both designs exhibit extremely high reliability, non-volatility, low power consumption, moderate delay, and a compact CMOS area. Therefore, the proposed latches outperform other latches in terms of their comprehensive metrics, making them suitable for practical application in radiation environments.

#### REFERENCES

- [1] A. Yan, Y. Chen, Z. Xu, et al, "Design of Double-Upset Recoverable and Transient-Pulse Filterable Latches for Low Power and Low-Orbit Aerospace Applications," *IEEE Transactions on Aerospace and Electronic Systems*, vol. 56, no. 5, pp. 3931-3940, 2020.
- [2] P. Girard, Y. Cheng, A. Virazel, et al, "A Survey of Test and Reliability Solutions for Magnetic Random Access Memories", *Proceedings of the IEEE*, vol. 109, no. 2, pp. 149-169, 2021.
- [3] S. Ikeda, J. Hayakawa, Y. Ashizawa, et al, "Tunnel magnetoresistance of 604% at 300 K by suppression of Ta diffusion in CoFeB/MgO/CoFeB pseudo-spin-valves annealed at high temperature," *Applied Physics Letters*, vol. 93, no. 8, pp. 1-3, 2008.
- [4] F. Razi, M. H. Moaiyeri, R. Rajaei, et al, "A Variation Aware Ternary Spin-Hall Assisted STT-RAM Based on Hybrid MTJ/GAA-CNTFET Logic," *IEEE Transactions on Nanotechnology*, vol. 18, pp. 598-605, 2019.
- [5] G. Kar, W. Kim, T. Tahmasebi, et al. "Co/Ni based p-MTJ stack for sub-20nm high density stand alone and high performance embedded memory application" *IEEE International Electron Devices Meeting*, pp. 1-4, 2014.
- [6] A. Amirany, M. H. Moaiyeri, and K. Jafari, "Process-in-Memory Using a Magnetic-Tunnel-Junction Synapse and a Neuron Based on a Carbon Nanotube Field-Effect Transistor," *IEEE Magnetics Letters*, pp. 1-1, 2019.
- [7] A. Amirany and R. Rajaei, "Fully Nonvolatile and Low Power Full Adder Based on Spin Transfer Torque Magnetic Tunnel Junction with Spin-Hall Effect Assistance," *IEEE Transactions on Magnetics*, vol. 54, no. 12, pp. 1-7, 2018.
- [8] W. Kang, L. Chang, Y. Zhang, and W. Zhao, "Voltage-controlled MRAM for working memory: Perspectives and challenges," *the Design, Automation & Test in Europe Conference & Exhibition*, pp. 542-547, 2017.
- [9] A. Amirany, M. H. Moaiyeri, and K. Jafari, "Bio-Inspired Nonvolatile and Low-Cost Spin-Based 2-Bit per Cell Memory," *the 25th International Computer Conference*, pp.1-7, 2020.
- [10] T. A. Gosavi, S. Manipatruni, S. V. Aradhya, et al, "Experimental demonstration of efficient spin-orbit torque switching of an MTJ with sub-100 ns pulses," *IEEE Transactions on Magnetics*, vol. 53, no. 9, pp. 1-7, 2017.
- [11] H. Jin and T. Miyazaki, "Tunnel magnetoresistance effect," *The Physics of Ferromagnetism*, Springer, pp. 403-432, 2012.
- [12] A. V. Khvalkovskiy, D. Apalkov, S. Watts, et al, "Basic principles of STT-MRAM cell operation in memory arrays," *Journal of Physics D: Applied Physics*, vol. 46, no. 7, pp. 1-35, 2013.
- [13] D. Zhang, W. Kang, Y. Cheng, et al, "A novel SEU-tolerant MRAM latch circuit based on C-element," *IEEE International Conference on Solid-State and Integrated Circuit Technology*, pp. 1-3, 2014.
- [14] A. Yan, Y. Hu, J. Cui, et al, "Information Assurance through Redundant Design: A Novel TNU Error-Resilient Latch for Harsh Radiation Environment," *IEEE Transactions on Computers*, vol. 69, no. 6, pp. 789-799, 2020.

- [15] A. Amirany, F. Marvi, K. Jafari et al, "Nonvolatile Spin-Based Radiation Hardened Retention Latch and Flip-Flop," *IEEE Transactions on Nanotechnology*, vol. 18, pp. 1089-1096, 2019.
- [16] A. Amirany, K. Jafari and M. H. Moaiyeri, "High-Performance and Soft Error Immune Spintronic Retention Latch for Highly Reliable Processors," *Iranian Conference on Electrical Engineering*, pp. 1-5, 2020
- [17] A. Yan, Z. Xu, X. Feng, et al, "Novel Quadruple-Node-Upset-Tolerant Latch Designs With Optimized Overhead for Reliable Computing in Harsh Radiation Environments," *IEEE Transactions on Emerging Topics in Computing*, vol. 10, no. 1, pp. 404-413, 2022.
- [18] A. Amirany, K. Jafari and M. Moaiyeri, "High-Performance Radiation-Hardened Spintronic Retention Latch and Flip-Flop for Highly Reliable Processors," *IEEE Transactions on Device and Materials Reliability*, vol. 21, no. 2, pp. 215-223, 2021.
- [19] D. Zhang, X. Wang, K. Zhang, et al, "Fully Single Event Double Node Upset Tolerant Design for Magnetic Random Access Memory," *IEEE International Symposium on Circuits and Systems*, pp. 1-5, 2021.
- [20] A. Yan, X. Feng, Y. Hu, et al, "Design of a Triple-Node-Upset Self-Recoverable Latch for Aerospace Applications in Harsh Radiation Environments," *IEEE Transactions on Aerospace and Electronic Systems*, vol. 56, no. 2, pp. 1163-1171, 2020.
- [21] J. D. Harms, F. Ebrahimi, X. Ya, et al, "SPICE Macro model of Spin-Torque-Transfer-Operated Magnetic Tunnel Junctions," *IEEE Transactions on Electron Devices*, vol. 57, no. 6, pp. 1425-1430, 2010.
- [22] D. R. Blum and J. G. Delgado-Frias, "Schemes for eliminating transientwidth clock overhead from SET-tolerant memory-based systems," *IEEE Transactions on Nuclear Science*, vol. 53, no. 3, pp. 1564-1573, 2006.
- [23] F. Razi, M. H. Moaiyeri and R. Rajaei, "Design of an Energy-Efficient Radiation-Hardened Non-Volatile Magnetic Latch," *IEEE Transactions on Magnetics*, vol. 57, no. 1, pp. 1-10, Jan. 2021.
- [24] A. Yan et al., "A Radiation-Hardened Non-Volatile Magnetic Latch with High Reliability and Persistent Storage," 2022 IEEE 31st Asian Test Symposium (ATS), Taichung City, Taiwan, pp. 1-6, 2022.