Effects of Thermal Neutron Irradiation on a Self-Refresh DRAM
Lucas Matana Luza, Daniel Soderstrom, Helmut Puchner, Ruben Garcia Alia,
Manon Letiche, Alberto Bosio, Luigi Dilillo

HAL Id: lirmm-03025721
https://hal-lirmm.ccsd.cnrs.fr/lirmm-03025721
Submitted on 28 Sep 2021


Lucas Matana Luza¹, Daniel Söderström², Helmut Puchner³, Rubén García Alía⁴, Manon Letiche⁵, Alberto Bosio⁶ and Luigi Dilillo¹

¹ LIRMM, University of Montpellier, Montpellier, France, *{lucas.matana-luza, dilillo}@lirmm.fr
² Department of Physics, University of Jyväskylä, Jyväskylä, Finland, *daniel.p.soderstrom@jyu.fi
³ Cypress Semiconductor, San Jose, USA, *helmut.puchner@cypress.com
⁴ Engineering Department, CERN, Geneva, Switzerland, *ruben.garcia.alia@cern.ch
⁵ Institut Laue–Langevin, Grenoble, France, *letiche@ill.fr
⁶ Lyon Institute of Nanotechnology, École Centrale de Lyon, France, *alberto.bosio@ec-lyon.fr

Abstract

In this study, static and dynamic test methods were used to characterize the response of a self-refresh DRAM under thermal neutron irradiation. The neutron-induced failures were investigated through event cross-sections, soft-error rates, and bitmap evaluations, leading to the identification of permanent and temporarily stuck cells, block errors, and single-bit upsets.

Index Terms

neutron, irradiation, Self-Refresh, DRAM, SEE, HyperRAM

I. INTRODUCTION

Thermal neutrons are generated by reducing the kinetic energy of energetic neutrons through a moderator, reaching an average energy of 25 meV at room temperature. The reaction of thermal neutrons with boron-10 (¹⁰B) generates byproducts (including an alpha particle and a lithium-7 nucleus) that can cause single event upsets (SEUs). These effects were a concern for static random access memories (SRAMs) and dynamic random access memories (DRAMs) fabricated with borophosphosilicate glass (BPSG) in the 1990s. Nowadays, BPSG is no longer present in these devices [1]–[4].

However, several studies on sub-micron SRAM devices show that, even without the BPSG layer, advanced Si technologies have a high possibility of contamination during the fabrication process [5]–[9], and that the impact of thermal neutrons should not be ignored [10]–[12]. In [7], the authors investigate the thermal neutron sensitivity of a 40 nm SRAM and present a residual source of ¹⁰B from doping in silicon. A characterization of SEUs in the Xilinx 20 nm UltraScale Kintex FPGA is presented in [13], which compares the cross-sections of devices with and without a technique to mitigate the effects of thermal neutron irradiation. Also, as presented in [14], the thermal-neutron-induced soft-error rate (SER) in 16-nm bulk FinFET flip-flops can be comparable with the high-energy-neutron-induced SER, with variations between flip-flop designs owing to different ¹⁰B contamination and different critical charge values.

The proposed work represents the first study on the effects of thermal neutron irradiation on a self-refresh DRAM, a novel type of memory device. The rest of the paper is structured as follows: Section II presents the Device Under Test (DUT), the test facility and the experimental setup; Section III describes the applied test modes; Section IV presents and analyzes the results from the thermal neutron irradiation; Section V concludes the work.

II. TEST SETUP

A. Device Under Test

The DUT is the S27KS0642GABHI020, a 64 Mib HyperRAM™ self-refresh DRAM manufactured by Cypress Semiconductor. It is a high-speed CMOS device with a HyperBus™ interface, which uses Double Data Rate (DDR) signaling to reach a data throughput of up to 400 MBps at a maximum clock rate of 200 MHz. The memory is implemented in a 38 nm technology, and the cell array is composed of 8192 rows, each containing 512 word addresses of 16 bits.

This study has been achieved thanks to the financial support of the VAN ALLEN Foundation (Contract No. UM 181387) and the Region Occitanie (Contract No. UM 181386).

This study has received funding from the European Union’s Horizon 2020 research and innovation programme under the MSC grant agreement no. 721624.

The experiment(s) on D50 (INDU-178) at ILL have been performed within the “Characterisation Program” of the IRT Nanoelec, co-funded by the French government in the frame of the “Programme d’Investissements d’Avenir” under the reference ANR-10-AIRT-05.
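As a quick check of the array organization given in Section II-A, the sketch below recomputes the capacity and splits a logical word address into row and word-in-row indices. The names are hypothetical, and the assumption that rows are filled consecutively (row index in the upper address bits) is illustrative only; the physical address mapping of the device is not described here.

```python
from typing import Tuple

ROWS = 8192            # rows in the cell array
WORDS_PER_ROW = 512    # word addresses per row
WORD_BITS = 16         # bits per word

TOTAL_WORDS = ROWS * WORDS_PER_ROW
TOTAL_BITS = TOTAL_WORDS * WORD_BITS


def split_address(word_addr: int) -> Tuple[int, int]:
    """Split a logical word address into (row, word-in-row).

    Assumption: rows are filled consecutively, so the row index is the
    upper part of the address. The real device mapping may differ.
    """
    if not 0 <= word_addr < TOTAL_WORDS:
        raise ValueError("address outside the word address space")
    return divmod(word_addr, WORDS_PER_ROW)


print(TOTAL_BITS // 2**20, "Mib")   # capacity in Mib -> 64 Mib
print(hex(TOTAL_WORDS - 1))         # highest word address -> 0x3fffff
```

The product of the three figures gives exactly 64 Mib, and the word address space spans 000000h to 3FFFFFh.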

B. Test Facility

The tests were carried out at the Platform for Advanced Characterisation (PAC-G) facility, hosted by the Institut Laue–Langevin (ILL) in Grenoble, France, using the D50 instrument. This instrument provides thermal neutrons moderated by liquid deuterium at 20 K. The capture flux (i.e., the equivalent flux of 25 meV neutrons) is adjustable from 0 to $10^{10}$ particles/cm$^2$/s and is monitored by a $^3$He detector and periodic gold foil measurements [15]. Fig. 1 presents the energy and wavelength spectrum of the beam provided by the facility.

Fig. 1. Energy and wavelength beam profile.

C. Test Setup

The test setup is composed of a control board based on the Xilinx Zynq-7000 SoC and a daughter board carrying the DUT; both boards are shown in Fig. 2. The controller uses the ARM Cortex™-A9 processor of the System-on-Chip (SoC) to run the test algorithms on the DUT. The processor communicates with the DUT through the HyperBus™ controller, an Intellectual Property (IP) core provided by Cypress and implemented in the SoC’s programmable logic.

During the tests, the power supply was monitored in order to identify single event latch-up (SEL). All tests were logged with the logical address, bit error data, and operation status. Functional tests were performed between the runs to verify the full functionality of the device. The DUT was tested at room temperature and nominal supply voltage, using a 25 meV thermal neutron equivalent flux of $10^9$ particles/cm$^2$/s with a $30 \times 30$ mm$^2$ beam. The control board was positioned out of the beam and, to ensure system reliability, was also shielded with boron carbide [16].

III. TEST MODES

In this study, static and dynamic memory tests were applied to the DUT to evaluate the memory response during irradiation. Dynamic tests continuously access the memory with read and write operations in order to emulate real applications and identify functional faults [17], [18]. In a static test, the memory is written with a known data pattern (i.e., solid ‘0’, solid ‘1’, or checkerboard), then irradiated for a time interval, and finally read back to identify the corrupted bits.
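The static test flow can be sketched as follows. This is a minimal illustration, assuming a hypothetical memory modeled as a Python list of 16-bit words. The checkerboard encoding (alternating 0x5555 and 0xAAAA words) and the `disturb` hook that stands in for radiation-induced corruption are assumptions made for the example, not details of the actual test software.

```python
from typing import Callable, List, Optional, Tuple

WORD_MASK = 0xFFFF  # 16-bit words


def pattern_word(name: str, addr: int) -> int:
    """Expected word at `addr` for a named data background."""
    if name == "solid0":
        return 0x0000
    if name == "solid1":
        return 0xFFFF
    if name == "checkerboard":
        # Assumed encoding: alternate 0x5555 / 0xAAAA on consecutive addresses.
        return 0x5555 if addr % 2 == 0 else 0xAAAA
    raise ValueError(f"unknown pattern: {name}")


def static_test(
    mem: List[int],
    name: str,
    disturb: Optional[Callable[[List[int]], None]] = None,
) -> List[Tuple[int, int]]:
    """Write the pattern, let the memory sit, then read it back.

    `disturb` is a hypothetical hook that models what happens during the
    irradiation interval. Returns (address, error_mask) for every word
    with at least one corrupted bit.
    """
    for addr in range(len(mem)):
        mem[addr] = pattern_word(name, addr)
    if disturb is not None:
        disturb(mem)  # stands in for the irradiation interval
    errors = []
    for addr in range(len(mem)):
        diff = (mem[addr] ^ pattern_word(name, addr)) & WORD_MASK
        if diff:
            errors.append((addr, diff))
    return errors


def flip_one_bit(mem: List[int]) -> None:
    mem[5] ^= 0x0004  # inject a single-bit upset at address 5, bit 2


print(static_test([0] * 16, "checkerboard", disturb=flip_one_bit))
```

The XOR between the readback and the expected word yields an error mask per address, which is the information needed to build the bitmaps discussed later.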

For dynamic tests, four different algorithms were used: March C-, Dynamic Stress, Dynamic Classic, and mMats+. These algorithms have previously been used to evaluate the radiation response of, for example, SRAM [21], FRAM [22], and MRAM [23] devices. Figs. 3 to 6 depict the four algorithms. The arrow indicates the addressing order (‘↑’ up or ‘↓’ down), ‘w’ (write) and ‘r’ (read) indicate the operation, and the Boolean digit that follows indicates the data background. Each algorithm is composed of elements, written as an arrow followed by the operations within parentheses. Each element is applied to the entire address space before proceeding to the next one, and a complete dynamic test algorithm is delimited by a pair of braces [24]. For March C-, Dynamic Stress, and mMats+, the first element (an up write operation) is performed only once to initialize the memory.

\[ \uparrow (w_0); \]
\[ \{ \uparrow (r_0, w_1); \uparrow (r_1, w_0); \downarrow (r_0, w_1); \downarrow (r_1, w_0); \uparrow (r_0) \} \]

Fig. 3. Scheme of dynamic March C- test algorithm.

\[ \uparrow (w_1); \]
\[ \{ \uparrow (r_1, w_0, r_0, r_0, r_0, r_0, r_0); \]
\[ \uparrow (r_0, w_1, r_1, r_1, r_1, r_1, r_1); \]
\[ \uparrow (r_1, w_0, r_0, r_0, r_0, r_0, r_0); \]
\[ \downarrow (r_0, w_1, r_1, r_1, r_1, r_1, r_1); \]
\[ \downarrow (r_1, w_0, r_0, r_0, r_0, r_0, r_0); \]
\[ \uparrow (r_0, w_1, r_1, r_1, r_1, r_1, r_1) \} \]

Fig. 4. Scheme of Dynamic Stress test algorithm.

\[ \{ \uparrow (w_0); \uparrow (r_0); \uparrow (w_1); \downarrow (r_1) \} \]

Fig. 5. Scheme of Dynamic Classic test algorithm.
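The element notation used in these figures maps naturally onto a small interpreter. The sketch below runs March C- (as given in Fig. 3) on a simulated memory with one injected stuck bit. The `FaultyMemory` class, the function names, and the use of all-zeros and all-ones words as data backgrounds are assumptions made for illustration; this is not the test software used in the campaign, and the simple stuck-at model ignores any timing dependence.

```python
from typing import Dict, List, Tuple

WORD_MASK = 0xFFFF
Element = Tuple[str, List[str]]  # (addressing order, operations)

MARCH_C_MINUS_INIT: Element = ("up", ["w0"])
MARCH_C_MINUS: List[Element] = [
    ("up", ["r0", "w1"]),
    ("up", ["r1", "w0"]),
    ("down", ["r0", "w1"]),
    ("down", ["r1", "w0"]),
    ("up", ["r0"]),
]


class FaultyMemory:
    """Word-addressable memory model with optional stuck bits."""

    def __init__(self, n_words: int) -> None:
        self.cells = [0] * n_words
        # addr -> (mask of stuck bits, value those bits are stuck at)
        self.stuck: Dict[int, Tuple[int, int]] = {}

    def write(self, addr: int, value: int) -> None:
        self.cells[addr] = value & WORD_MASK

    def read(self, addr: int) -> int:
        mask, value = self.stuck.get(addr, (0, 0))
        return (self.cells[addr] & ~mask) | (value & mask)


def run_element(mem: FaultyMemory, element: Element, log: list) -> None:
    """Apply one element to the whole address space, logging read errors."""
    order, ops = element
    n = len(mem.cells)
    addrs = range(n) if order == "up" else range(n - 1, -1, -1)
    for addr in addrs:
        for op in ops:
            data = WORD_MASK if op[1] == "1" else 0
            if op[0] == "w":
                mem.write(addr, data)
            else:
                diff = mem.read(addr) ^ data
                if diff:
                    log.append((addr, op, diff))


def run_march(mem: FaultyMemory, init: Element, body: List[Element],
              cycles: int = 1) -> list:
    log: list = []
    run_element(mem, init, log)      # initialization is executed once
    for _ in range(cycles):
        for element in body:
            run_element(mem, element, log)
    return log


mem = FaultyMemory(8)
mem.stuck[3] = (0x0004, 0x0004)      # bit 2 of word 3 stuck at '1'
print(run_march(mem, MARCH_C_MINUS_INIT, MARCH_C_MINUS))
```

With a bit stuck at ‘1’, only the three `r0` elements report an error at that address, which is how the logged operation helps distinguish stuck-at-‘0’ from stuck-at-‘1’ cells.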

IV. RESULTS

Static and dynamic tests were applied to the DUT in several runs of five minutes each, so that each run delivered a fluence of $3 \times 10^{11}$ particles/cm$^2$.
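As a consistency check, this fluence follows from the flux and beam settings given in Section II-C, assuming the flux stayed at its nominal value for the whole five-minute run. The symbols $\phi$ (flux) and $t$ (run duration) are introduced here only for this check:

\[
F_{\text{run}} = \phi \times t = 10^{9}\ \text{particles/cm}^2/\text{s} \times 300\ \text{s} = 3 \times 10^{11}\ \text{particles/cm}^2
\]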

The analysis of the test logs led to the identification of four different failure types. The simplest observed failure mode is the single-bit upset (SBU), which was observed 18 times. A write operation was sufficient to clear these SBUs, and none of them recurred at the same bit address.

Stuck bits were observed in two forms: permanent and temporary. A stuck bit is defined as a bit that holds a fixed value (‘0’ or ‘1’) independently of the value that was written. In this study, permanent stuck bits are those for which, after the first appearance, the error occurs in every subsequent read access to the faulty address. For temporary stuck bits, the error recurs only during a certain time window.

The number of permanent stuck bits as a function of cumulative fluence is presented in Fig. 7, which shows the number of stuck cells growing with the cumulative fluence. Each point in the figure represents the number of stuck bits at their first appearance. During a Dynamic Stress test, none of the stuck cells returned an error in the five sequential readbacks performed just after a write operation; the error appeared only in the first read operation of the next element of the algorithm. This behavior can be explained by a particle-induced reduction of the retention time of the cell’s storage capacitor. The logic value of the stuck cells was either ‘0’ or ‘1’, showing that each logic value is represented by a charged or a discharged capacitor depending on the memory region. Of all the observed stuck cells (permanent and temporary), 44.1 % were stuck at ‘0’ and 55.9 % at ‘1’.
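The retention-time explanation can be illustrated with a toy timing model. This is a sketch under strong assumptions, not a description of the device: the time per operation, the retention value, and the function name are hypothetical, and the model assumes the damaged cell's retention time is shorter than the internal refresh period, so that refresh does not rescue the stored value. It only shows why reads issued right after a write can pass while the first read of the following element fails.

```python
from typing import List, Tuple


def weak_cell_reads(
    retention_s: float,
    n_words: int,
    t_op_s: float = 100e-9,
    reads_after_write: int = 5,
) -> Tuple[List[bool], bool]:
    """Return (errors in the immediate reads, error in the next element).

    A read is modeled as failing when the time elapsed since the last
    write to the cell exceeds its (reduced) retention time.
    """
    # Read k (k = 1..5) happens k operations after the write to this word.
    immediate_errors = [
        k * t_op_s > retention_s for k in range(1, reads_after_write + 1)
    ]
    # Each Dynamic Stress element performs 1 read + 1 write + 5 reads per
    # word, so the next element returns to this word roughly one full
    # sweep of the address space later.
    ops_per_word = 2 + reads_after_write
    sweep_s = n_words * ops_per_word * t_op_s
    return immediate_errors, sweep_s > retention_s


immediate, next_element = weak_cell_reads(retention_s=1e-3, n_words=8192 * 512)
print(immediate, next_element)
```

With these hypothetical numbers, the five immediate reads occur within less than a microsecond of the write and pass, whereas the next element reaches the same word seconds later, after the cell has lost its charge.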

Temporary stuck bits showed the same behavior as the permanent ones. The only difference is that the failure is not permanent and occurred only during a number of consecutive write and read operations performed within the dynamic and static test modes. Temporary stuck bits also showed different levels of damage, i.e., different retention capabilities of the cell. The duration of these temporary errors differed from one test run to another.

Besides the failures described above, block errors with vertical and horizontal shapes were observed in the memory bitmaps. To evaluate these events, we generated logical bitmaps by dividing the memory array into two parts, placing the even rows on one side and the odd rows on the other. This procedure generated 16 384 columns. In a bitmap, each pixel represents a bit cell.
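The bitmap construction can be sketched as a mapping from a logical bit position to a pixel. The sketch below is only one possible layout: it assumes that even memory rows are drawn in the left half and odd rows in the right half, that rows are filled consecutively in the word address space, and that bit 0 of each word is drawn first. The function name is hypothetical.

```python
from typing import Tuple

WORDS_PER_ROW = 512
WORD_BITS = 16
ROW_BITS = WORDS_PER_ROW * WORD_BITS   # 8192 bit cells per memory row
BITMAP_COLUMNS = 2 * ROW_BITS          # two memory rows side by side


def bit_to_pixel(word_addr: int, bit: int) -> Tuple[int, int]:
    """Map (word address, bit index) to (bitmap_row, bitmap_column).

    Assumptions: even memory rows go to the left half, odd rows to the
    right half, and bit 0 of each word is the leftmost bit of that word.
    """
    row, word = divmod(word_addr, WORDS_PER_ROW)
    column = word * WORD_BITS + bit
    if row % 2 == 1:
        column += ROW_BITS             # odd rows are shifted to the right half
    return row // 2, column


print(BITMAP_COLUMNS)                  # 16384 columns
print(bit_to_pixel(512, 0))            # first bit of memory row 1
```

Placing two 8192-bit rows side by side reproduces the 16 384 columns mentioned above and yields a bitmap that is 4096 pixels tall for the 8192-row array.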

An example of a horizontal block error can be seen in Fig. 8, which shows the bitmap resulting from a static test with a checkerboard data background. In the figure, two square zones are magnified to increase visibility. These events are characterized by errors in all 512 word addresses of two consecutive even or odd rows, with most of the bits within each word in error. An exception to this behavior is shown in the left magnified square of Fig. 8, where, within the same address range, the bitmap shows a horizontal band of errors in which most of the bits are not faulty, resulting in events with fewer than the expected 1024 word errors.

Block errors were also observed with a vertical shape, in which the same column is affected in subsequent even or odd rows. Fig. 9 shows such a block error, identified during the second cycle of a Dynamic Stress test at the first “r1” operation of the fourth element of the algorithm. Notably, in all vertical lines of errors, the addresses with errors span the same range, returning a maximum of 2 048 words with errors.

For both vertical and horizontal errors, a write operation was able to restore access to the cells without the need for a power cycle. This type of error is not due to a problem in the affected cells, but rather to the control logic. In particular, a temporary malfunction of the sense amplifier or register that serves the affected column may lead to this type of behavior.

Besides the events discussed above, two blocks of errors spanning a different range of addresses occurred during the test campaign. The first event is depicted in Fig. 10, which shows a bitmap obtained during a Dynamic Stress test. The red arrows mark the six error lines that were present in the five “r1” operations performed in the last element of the Dynamic Stress algorithm. In this case, twelve address ranges were identified in three fixed columns, in both even and odd rows. In contrast to the first vertical event, all the affected addresses returned all bits in error.

The second type of vertical line failure was observed during March C- test execution, with increasing addressing order, and resulted in a sequence of more than 100 words with errors. The affected addresses depended on the execution order, ranging from “000000h” to “00006Ah” for an increasing order (↑) and from “3FFFFFh” down to “3FFF8Dh” for a decreasing order (↓). The effect persisted during several cycles of dynamic tests. After a dynamic execution, static write and read operations were performed: the block error was recovered after two static writes but reappeared during the next dynamic test. This event occurred during five runs using March C-, Dynamic Classic, and mMats+, with a sequence of static tests between the irradiation runs, and it was fully recovered only after a power cycle.
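The sizes of the two affected ranges can be checked directly from the quoted addresses. The sketch assumes that the decreasing-order range starts at the top of the word address space, 3FFFFFh, which is the highest address of an array of 8192 rows with 512 words each. The helper name is hypothetical.

```python
def words_in_range(first: int, last: int) -> int:
    """Number of word addresses between two endpoints, inclusive."""
    return abs(last - first) + 1


TOP_ADDRESS = 8192 * 512 - 1   # 0x3FFFFF, highest word address of the array

up_count = words_in_range(0x000000, 0x00006A)
down_count = words_in_range(TOP_ADDRESS, 0x3FFF8D)
print(up_count, down_count)    # 107 115
```

Both ranges contain slightly more than 100 words, and each starts at the first address visited in the corresponding addressing order.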

To evaluate the event cross sections of this memory, the failure types were divided into SBUs, temporary stuck bits, permanent stuck bits, and block errors. The estimated event cross section (σ) is defined as

\[
\sigma = \frac{N}{F \times M} \tag{1}
\]

where $N$ is the number of events, $F$ is the beam fluence in particles/cm$^2$, and $M$ is the number of bits [25].

From the calculated event cross sections, we derive the SER, which is expressed in FIT/Mb. One FIT/Mb corresponds to one failure per billion working hours per Mb [26], [27]. The SER is given by

\[
SER = \sigma \times (1024 \times 1024) \times 10^9 \times j \tag{2}
\]

where $1024 \times 1024$ (bits) is the Mb coefficient, $10^9$ comes from the FIT definition, and $j$ (6.5 particles/cm$^2$/h) is the thermal-neutron (< 400 meV) flux at New York (sea level) outdoors for mean solar activity, as defined in JEDEC JESD89B [1], [27], [28].

Fig. 9. Bitmap obtained during a Dynamic Stress test after the first ‘r1’ of the fourth line of the algorithm. Each pixel represents a bit; bits that were identified with errors appear in black. The gray lines are used to limit the region. Zoom-ins are added to increase the visibility of the horizontal block events.

Table I presents the estimated cross sections and SER.

<table>
<thead>
<tr>
<th>Failure type</th>
<th>Cross Section ($\sigma$) (cm$^2$/bit)</th>
<th>SER (FIT/Mb)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Single bit upset</td>
<td>$3.43 \times 10^{-20}$</td>
<td>$2.3 \times 10^{-4}$</td>
</tr>
<tr>
<td>Permanent stuck bit</td>
<td>$2.67 \times 10^{-20}$</td>
<td>$1.7 \times 10^{-4}$</td>
</tr>
<tr>
<td>Temporary stuck bit</td>
<td>$4.01 \times 10^{-20}$</td>
<td>$2.7 \times 10^{-4}$</td>
</tr>
<tr>
<td>Block errors</td>
<td>$2.67 \times 10^{-20}$</td>
<td>$1.7 \times 10^{-4}$</td>
</tr>
</tbody>
</table>
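Equations (1) and (2) can be checked numerically against the SBU row of Table I. The sketch below is illustrative: the function names are hypothetical, and the cumulative fluence it prints is not stated in the text but back-computed from the reported SBU count (18 events) and the tabulated cross section.

```python
N_BITS = 8192 * 512 * 16     # M: number of bits in the 64 Mib array
MB_BITS = 1024 * 1024        # Mb coefficient of Eq. (2)
FIT_HOURS = 1e9              # FIT definition: failures per 10^9 hours
THERMAL_FLUX = 6.5           # j: particles/cm^2/h, New York reference value


def cross_section(n_events: float, fluence: float, n_bits: int = N_BITS) -> float:
    """Eq. (1): event cross section in cm^2/bit."""
    return n_events / (fluence * n_bits)


def ser_fit_per_mb(sigma: float, flux: float = THERMAL_FLUX) -> float:
    """Eq. (2): soft-error rate in FIT/Mb."""
    return sigma * MB_BITS * FIT_HOURS * flux


def implied_fluence(n_events: float, sigma: float, n_bits: int = N_BITS) -> float:
    """Invert Eq. (1) to recover the fluence implied by N and sigma."""
    return n_events / (sigma * n_bits)


sigma_sbu = 3.43e-20                              # cm^2/bit, from Table I
print(f"{ser_fit_per_mb(sigma_sbu):.2e}")         # about 2.3e-04 FIT/Mb
print(f"{implied_fluence(18, sigma_sbu):.2e}")    # about 7.8e+12 particles/cm^2
```

The implied cumulative fluence of roughly $7.8 \times 10^{12}$ particles/cm$^2$ corresponds to about 26 runs at $3 \times 10^{11}$ particles/cm$^2$ each, which is an inference from the reported numbers rather than a figure given in the text.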

V. CONCLUSION

The effects of thermal neutron irradiation on a self-refresh DRAM were described. The static and dynamic tests carried out during the test campaign revealed different kinds of failures. Besides SBUs, the tests showed permanent and temporary stuck bits, which have already been reported in several studies with different fault mechanisms; the most probable cause is the impact of irradiation on the variable retention time phenomenon [29], [30].

Furthermore, block errors were observed in four different patterns: intermittent word errors in vertical and in horizontal sequences of logical addresses, divided vertical lines in which all bits within a word are in error, and a sequential error that depends on the addressing order.

The cross sections for the different kinds of failures were estimated to be on the order of $10^{-20}$ cm$^2$/bit, showing that the memory is not very sensitive to thermal neutrons. However, vertical and horizontal block errors produce a significant number of word errors within a single event, which, from a user’s point of view, is relevant in critical applications.

Fig. 10. Bitmap obtained during a Dynamic Stress test after the fifth ‘r0’ of the sixth line of the algorithm. Each pixel represents a bit; bits that were identified with errors appear in black. The gray lines are used to limit the region. Zoom-ins are added to increase the visibility of the horizontal block events. Red arrows indicate the six vertical lines.

REFERENCES


