

## Energy-Performance Assessment of Oscillatory Neural Networks based on VO2 Devices for Future Edge AI Computing

Corentin Delacour, Stefania Carapezzi, Madeleine Abernot, Aida Todri-Sanial

### ▶ To cite this version:

Corentin Delacour, Stefania Carapezzi, Madeleine Abernot, Aida Todri-Sanial. Energy-Performance Assessment of Oscillatory Neural Networks based on VO2 Devices for Future Edge AI Computing. 2022. lirmm-03591176

## HAL Id: lirmm-03591176 https://hal-lirmm.ccsd.cnrs.fr/lirmm-03591176

Preprint submitted on 28 Feb 2022


# Energy-Performance Assessment of Oscillatory Neural Networks based on VO<sub>2</sub> Devices for Future Edge AI Computing

Corentin Delacour, Stefania Carapezzi, Madeleine Abernot, Aida Todri-Sanial

Abstract—The Oscillatory Neural Network (ONN) is an emerging neuromorphic architecture composed of oscillators that implement neurons and are coupled by synapses. ONNs exhibit rich dynamics and associative properties, which can be used to solve problems in the analog domain according to the paradigm *let physics compute*. For example, compact oscillators made of VO<sub>2</sub> material are good candidates for building low-power ONN architectures dedicated to AI applications at the edge, such as pattern recognition. However, little is known about ONN scalability and performance when implemented in hardware. Before deploying an ONN, it is necessary to assess its computation time, energy consumption, performance and accuracy for a given application. Here, we consider a VO<sub>2</sub>-oscillator as an ONN building block and we perform circuit-level simulations to evaluate the ONN performance at the architecture level. Notably, we investigate how ONN computation time, energy and memory capacity scale with the number of oscillators. We show that ONN energy grows linearly when scaling up the network, making it suitable for large-scale integration at the edge. Furthermore, we investigate the design knobs for minimizing the ONN energy. Assisted by TCAD simulations, we report on scaling of VO<sub>2</sub> devices in crossbar geometry to decrease the oscillator voltage and energy. We benchmark ONN versus state-of-the-art architectures and observe that the ONN paradigm is a competitive, energy-efficient solution for scaled VO<sub>2</sub> devices. Finally, we present how ONN can efficiently detect edges in images captured on low-power edge devices.

*Index Terms*—Oscillatory Neural Network, Vanadium Dioxide (VO<sub>2</sub>), Edge AI, Hopfield Neural Network, Image Edge Detection

#### I. INTRODUCTION

The number of mobile devices connected to the internet has increased considerably over the past few years and is estimated to reach 75 billion by 2025 [1]. The Internet of Things (IoT) paradigm is driven by continuous machine learning and AI progress, allowing mobile devices to predict and decide in interaction with their environment. IoT devices are connected and regularly exchange data on the internet, but depending on the workload, such connectivity may suffer from latency issues, bandwidth problems, and even confidentiality issues for applications that send sensitive data, such as healthcare devices. For these reasons, IoT devices require some local processing capability instead of transferring data to the cloud or data centers [2]. However, with the sophistication of AI algorithms, computation at the edge becomes challenging for devices with limited resources [1]. Current algorithms depend on large neural networks with thousands of synapses, and data propagate through several layers of neurons via successive matrix multiplications between data and synaptic weights. Such algorithms implemented on a Von Neumann architecture (such as edge CPUs) suffer from high power consumption and a data-transfer bottleneck between memory and processing unit, also known as the Von Neumann bottleneck [3].

To overcome the limitations of the Von Neumann bottleneck, alternative brain-inspired computing paradigms are explored. Inspired by biological neural networks, in-memory computing aims to merge memory and processing functions, where device physical properties can store the network's weights while efficiently performing matrix products [4]. For instance, Ohm's and Kirchhoff's laws naturally describe multiplication and summation in the analog domain, allowing fast and efficient computations. Crossbar architectures, for example, have been shown to perform energy-efficient inference [4].

Based on the in-memory computing paradigm and inspired by the olfactory system in biological neural networks, Oscillatory Neural Networks (ONNs) compute by harnessing the rich dynamics of coupled oscillators for parallel processing. In ONNs, neurons are oscillators that are physically connected by electrical components (synapses). Exploiting the nonlinear oscillator dynamics allows computing in phase [5] or frequency [6]. In ONNs, the memory is locally stored in synaptic elements, and it is distributed among oscillators that act as processing units interacting in parallel, in contrast with the Von Neumann architecture. Mathematicians have studied ONNs for decades and have proved their collective computational capability [5].

In hardware, ONNs have been implemented with various technologies such as CMOS ASICs [7], [8], field-programmable gate arrays [9], spintronic oscillators [10], and micro-electromechanical systems [11], for solving tasks ranging from image processing [12], [13], [14], [15] to combinatorial optimization problems [16], [17], [18], [19], [20], and to implement reservoir computers [10], [21], [22], [23]. Insulator-to-metal phase transition (IMT) devices such as vanadium dioxide (VO<sub>2</sub>) are promising candidates to design compact nano-oscillators as they only require an additional load to produce oscillations at room temperature and are CMOS-compatible [14], [24]. It is believed that scaled VO<sub>2</sub> devices would provide fast and energy-efficient oscillations, and this has

This work was supported by the European Union's Horizon 2020 research and innovation program, EU H2020 NEURONN (www.neuronn.eu) project under Grant 871501.

C. Delacour, S. Carapezzi, M. Abernot and A. Todri-Sanial are with the Microelectronics Department, LIRMM, University of Montpellier, CNRS, Montpellier, France, e-mail: corentin.delacour@lirmm.fr.



Fig. 1. ONN inputs and outputs are phase differences among oscillators and the reference oscillator (first one). As $\Delta \Phi_i \in [0^\circ, 180^\circ]$, we represent phases by black and white pixels where a pixel corresponds to a single oscillator.

been validated experimentally up to eight coupled oscillators [18]. However, little is known about ONN energy when scaling up the network size, even though energy and power are among the most important specifications for edge devices. Likewise, it is still unknown how the computation time evolves for a large ONN used as an associative memory.

Prior experimental work using VO<sub>2</sub> oscillators has reported on ONN performance for fewer than ten oscillators, but 1) VO<sub>2</sub> device scaling and 2) ONN architecture scaling are yet to be explored. For example, for image processing applications, Shukla et al. reported the power consumption of six coupled VO<sub>2</sub>-oscillators [24], [25] but do not mention the energy and delay for larger networks. At the device level, however, a power projection motivates scaling down the VO<sub>2</sub> channel length in planar geometry. For spoken vowel detection, Dutta et al. [6] propose to use four coupled planar VO<sub>2</sub> oscillators that consume 6 µW each, but scalability and computation time are not discussed. Corti et al. [14] describe how four and nine coupled VO<sub>2</sub> oscillators can be used as input filters in convolutional neural networks and make a projection of the ONN energy-delay for scaled VO<sub>2</sub> devices in crossbar geometry. However, the estimation remains empirical as VO<sub>2</sub> device physics and coupling element parameters are not considered.

In this work, we investigate VO<sub>2</sub>-ONN scaling at device, circuit, and architecture levels. We model coupling elements by resistances to study ONN architecture and performance. The contributions of this work are as follows:

- we show that the memory capacity of a fully-coupled ONN scales linearly with the number of oscillators, similarly to Hopfield Neural Networks (HNN).
- we analytically derive and express the trade-off between the ONN size, the oscillating frequency and the Signal to Noise Ratio (SNR).
- we determine the ONN linear energy scaling and constant computation time with the number of oscillators.
- assisted by Technology Computer-Aided Design (TCAD) simulations, we demonstrate how to minimize the oscillating energy for crossbar VO<sub>2</sub> devices.
- we benchmark the VO<sub>2</sub>-based ONN energy and delay with respect to state-of-the-art neural accelerators and neuromorphic chips. We highlight that ONN can be



Fig. 2. ONN as an associative memory like a Hopfield Neural Network (HNN), using 60 VO<sub>2</sub>-oscillators. A noisy input image initializes the ONN. The ONN retrieves the training image after a settling time. The conceptual Hopfield energy [26] decreases until it reaches a local minimum.

a competitive computing paradigm for high oscillating frequencies.

- finally, we showcase a VO<sub>2</sub>-based ONN benchmark for image edge detection and compare it with the state-of-the-art CMOS ASICs.

#### **II. ONN DESCRIPTION**

#### A. ONN as an Associative Memory

In ONNs, the information is encoded in phase differences among oscillators and a reference oscillator (the first one) [5], [13]. ONN inputs are the phase initialization $\Delta \Phi_i^{in} \in [0^\circ, 180^\circ]$, and outputs are the phase differences measured once ONN stabilizes. ONN output phases lock to binary values $\Phi_i^{out} = 0^\circ$ and $\Phi_i^{out} = 180^\circ$, which in the case of image processing can be represented by white and black pixels, respectively. Hence, we represent the ONN phase state by black and white images, where every oscillator corresponds to a single pixel (Fig.1).

In this work, we investigate ONN for associative memory applications like a Hopfield Neural Network (HNN) [26]. HNNs have been used for various applications such as solving optimization problems [27], [28] and image processing and encryption [29]. To study the ONN memory capacity, we perform pattern recognition as with an HNN, where the network is fully connected, meaning every oscillator is connected to all the others. We train the network using the Hebbian learning rule [26] and we map the synaptic weights to coupling resistances [30] to store training images in the ONN. When ONN settles to a stable phase state (corresponding to a training image), its dynamics can be interpreted as the minimization of an energy function defined in [26]. An example of ONN computation with 60 VO<sub>2</sub>-oscillators is shown in Fig.2, where ONN retrieves the noiseless digit '1' after a few oscillation cycles. Next, we present how the dynamics of coupled VO<sub>2</sub>-oscillators lead to associative properties, similarly to HNN.
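As a purely software analogy (a discrete Hopfield update, not the coupled-oscillator circuit studied here), the associative-memory behavior can be sketched as follows. The two hand-picked patterns, the 1/N normalization of the Hebbian weights, and the synchronous sign update are illustrative assumptions; the weight-to-resistance mapping of [30] is not reproduced.

```python
# Discrete Hopfield analogue of ONN associative memory (illustrative only).
N = 36  # number of neurons (pixels), chosen arbitrarily for this sketch

# Two hand-picked orthogonal patterns with as many +1 (white) as -1 (black) pixels.
pattern_a = [1] * (N // 2) + [-1] * (N // 2)
pattern_b = [1 if i % 2 == 0 else -1 for i in range(N)]
patterns = [pattern_a, pattern_b]


def hebbian_weights(patterns):
    """Hebbian rule: W[i][j] = (1/N) * sum over patterns of xi_i * xi_j, no self-coupling."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                weights[i][j] = sum(p[i] * p[j] for p in patterns) / n
    return weights


def hopfield_energy(weights, state):
    """Hopfield energy E = -1/2 * sum_ij W_ij * s_i * s_j."""
    n = len(state)
    return -0.5 * sum(weights[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n))


def update(weights, state):
    """One synchronous update: each neuron takes the sign of its weighted input."""
    n = len(state)
    fields = [sum(weights[i][j] * state[j] for j in range(n)) for i in range(n)]
    return [1 if h >= 0 else -1 for h in fields]


weights = hebbian_weights(patterns)

# Corrupt pattern A by flipping its first three pixels.
noisy = pattern_a.copy()
for i in range(3):
    noisy[i] = -noisy[i]

recalled = update(weights, noisy)
print("recovered pattern A:", recalled == pattern_a)
print("energy before / after:",
      hopfield_energy(weights, noisy), hopfield_energy(weights, recalled))
```

In this toy case a single update restores the stored pattern and lowers the energy, which mirrors the qualitative behavior of Fig.2; the ONN reaches its stable phase state through continuous oscillator dynamics rather than discrete updates.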

#### B. VO<sub>2</sub> Oscillator

VO<sub>2</sub> is an IMT material that can switch between two different resistive states depending on the voltage V across it [13]. It transitions from an insulating state to a metallic state when V rises above a threshold $V_H$, and back to the insulating state when V falls to the threshold $V_L$. Hence, VO<sub>2</sub> presents a hysteresis in its I-V plane (Fig.3b).



Fig. 3. a) VO<sub>2</sub>-oscillator with  $R_S$  as biasing load. b) VO<sub>2</sub> I - V curve and load line  $I_L$ .  $C_P$  charges when VO<sub>2</sub> is in metallic state, and discharges in insulating state.

We use this property to design a relaxation oscillator. We bias the VO<sub>2</sub> device with a load resistance  $R_S$  in series and we connect a capacitor  $C_P$  in parallel with the output node  $V_{out}$ to adjust the oscillation frequency (Fig.3a). The oscillator's dynamics are described as  $C_P \frac{dV}{dt} = I - I_L$ , where I is the VO<sub>2</sub> current capturing its hysteresis behavior. To produce oscillations, the line  $I_L$  must intercept I in the VO<sub>2</sub> negative differential resistance region (NDR) (Fig.3b). To emulate VO<sub>2</sub> behavior in circuit simulations, we use the compact model from Maffezzoni *et al.* [31] with circuit parameters listed in Table I.

 TABLE I

 List of parameters used for simulations in this work.

| Parameter            | Value    |
|----------------------|----------|
| $V_{DD}$             | 2.5 V    |
| $R_S$                | 20 kΩ    |
| $C_P$                | 500 pF   |
| $R_{ins}$            | 100.2 kΩ |
| $R_{met}$            | 0.99 kΩ  |
| $V_L$                | 1 V      |
| $V_H$                | 1.99 V   |
| $\alpha$             | 200      |
| $\tau_0$             | 10 ns    |
| $V^+ = V_{DD} - V_L$ | 1.5 V    |
| $V^- = V_{DD} - V_H$ | 0.501 V  |
| $T_{osc}$            | 21.6 µs  |
| Simulation time step | 1 ns     |
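The period in Table I can be cross-checked with a back-of-the-envelope calculation. The sketch below is not the compact model of [31]: it assumes an ideal VO<sub>2</sub> that switches abruptly between $R_{ins}$ and $R_{met}$ when $V_{out}$ reaches $V^-$ or $V^+$, and it assumes the VO<sub>2</sub> sits between $V_{DD}$ and $V_{out}$ with $R_S$ and $C_P$ to ground (a topology inferred from the definitions of $V^+$ and $V^-$). Each half-cycle is then a first-order RC transient with a closed-form duration.

```python
import math

# Table I values
V_DD = 2.5        # supply voltage (V)
R_S = 20e3        # load resistance (ohm)
C_P = 500e-12     # parallel capacitance (F)
R_INS = 100.2e3   # VO2 insulating resistance (ohm)
R_MET = 0.99e3    # VO2 metallic resistance (ohm)
V_PLUS = 1.5      # upper V_out bound, V+ = V_DD - V_L (V)
V_MINUS = 0.501   # lower V_out bound, V- = V_DD - V_H (V)


def segment_time(r_vo2, v_start, v_end):
    """Time for V_out to go from v_start to v_end while the VO2 resistance is r_vo2.

    Assumes an ideal two-state VO2: V_out follows a first-order RC response
    toward the resistive-divider voltage set by r_vo2 and R_S.
    """
    tau = C_P * (R_S * r_vo2) / (R_S + r_vo2)   # C_P sees R_S in parallel with the VO2
    v_inf = V_DD * R_S / (R_S + r_vo2)          # asymptotic divider voltage
    return tau * math.log((v_inf - v_start) / (v_inf - v_end))


t_charge = segment_time(R_MET, V_MINUS, V_PLUS)      # metallic state: V_out rises
t_discharge = segment_time(R_INS, V_PLUS, V_MINUS)   # insulating state: V_out falls
t_osc = t_charge + t_discharge
print(f"T_osc = {t_osc * 1e6:.2f} us")
```

Under these assumptions the result lands close to the 21.6 µs listed in Table I, with almost all of the period spent in the slow insulating-state discharge.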

#### C. Two-Coupled Oscillators

ONN initialization assigns oscillators initial phases that correspond to the input image. Here, we initialize input phases by delaying oscillators' starting time with respect to the reference oscillator, as  $\Delta \Phi_i^{in} = \frac{\Delta t_i}{T_{osc}} 2\pi$ , with  $T_{osc}$  the natural oscillation period [30]. Oscillators' dynamics evolve by exchanging current through the coupling resistor  $R_{C_2}$ . Fig.4a shows the simplest configuration with two VO<sub>2</sub>-oscillators. Coupling switches allow precise ONN initialization [30] and are closed once all oscillators are turned on. Analogous to a synaptic weight in HNN,  $R_{C_2}$  determines the coupling strength between two oscillators, and hence the final phase state. Fig.5a



Fig. 4. a) Two VO<sub>2</sub> oscillators coupled by a resistor $R_{C_2}$. Coupling switches are opened during initialization, and are closed once all oscillators are turned on. b) The delay on $V_{DD2}$ with respect to $V_{DD1}$ sets the input phase.



Fig. 5. a) Output phase as a function of $R_{C_2}$ and input phase. A large $R_{C_2}$ emulates a negative synaptic weight as the output phase is 180°. A small $R_{C_2}$ implements a positive synaptic weight. b) and c) For $R_{C_2} \approx R_{C_2}^0$, the two oscillators retrieve the two phase states 0° and 180° for $\Delta \Phi^{in}$=80° and $\Delta \Phi^{in}$=100°, respectively.

shows the relationship between 1) the input phase, 2) the coupling resistance $R_{C_2}$, and 3) the output phase once the oscillators settle. When $R_{C_2} \ll R_{C_2}^0$, oscillators are in-phase and $R_{C_2}$ implements a positive synaptic weight, whereas for $R_{C_2} \gg R_{C_2}^0$, oscillators are out-of-phase and $R_{C_2}$ emulates a



Fig. 6. ONN recognition accuracy for 36 oscillators. M is the number of stored patterns.

negative weight [30]. $\Delta \Phi^{in}$ allows the retrieval of one of the two states. Fig.5b and c show examples with $R_{C_2} = R_{C_2}^0$ where the two oscillators settle to in-phase and out-of-phase states, respectively. This two-coupled oscillator case represents the smallest ONN used as an associative memory. In the next section, we investigate larger ONNs with N oscillators coupled by resistances $R_{C_N}$, and we report on their performance.

#### III. ONN SCALING

#### A. Simulation set-up

We have developed an ONN circuit simulation platform in MATLAB that includes VO<sub>2</sub> device parameters (compact model [31]) and coupling parameters to allow transient simulation of ONNs of different sizes. We consider each oscillator as a pixel of an image. To avoid biased results due to specific training sets, we generate random training sets composed of *M* random black and white patterns $\xi^{\mu}$, $\mu \in \{1, 2, ..., M\}$ with $\xi^{\mu}_{i} = -1$ if pixel *i* is black, or $\xi^{\mu}_{i} = +1$ if white. We impose the same proportion of black and white pixels in the training patterns to avoid any effect emerging from unbalanced patterns. Next, we apply the Hebbian learning rule [5] to the training set and we obtain a matrix of synaptic coefficients *H*. We compute the corresponding coupling resistances $R_{C_N}$ using the mapping function described in [30].

To generate test sets, we take the training patterns and apply random uniform noise with values between -1 and +1 to a subset of pixels (noisy pixels are gray), as in the example of Fig.1a. We vary the number of noisy pixels up to 50%. For inference, we initialize the ONN with a test image by delaying the starting time of oscillators. Then, we solve ONN dynamics using circuit equations along with the VO<sub>2</sub> model [31]. As the VO<sub>2</sub> state equation is nonlinear, we solve it numerically at each time step using the Newton-Raphson method.
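The data-preparation steps above can be sketched in a few lines. This is a minimal illustration rather than the authors' MATLAB platform: the function names are hypothetical, and the linear map from gray level to input phase (white = 0°, black = 180°, gray in between) is an assumption. The delay itself follows the relation $\Delta \Phi_i^{in} = \frac{\Delta t_i}{T_{osc}} 2\pi$ given in Section II.C.

```python
import random

T_OSC = 21.6e-6  # natural oscillation period from Table I (s)


def random_balanced_pattern(n, rng):
    """Random pattern with n/2 white (+1) and n/2 black (-1) pixels; n must be even."""
    pattern = [1] * (n // 2) + [-1] * (n // 2)
    rng.shuffle(pattern)
    return pattern


def add_noise(pattern, noisy_fraction, rng):
    """Replace a fraction of the pixels by gray values drawn uniformly in [-1, +1]."""
    noisy = [float(p) for p in pattern]
    n_noisy = round(noisy_fraction * len(pattern))
    for i in rng.sample(range(len(pattern)), n_noisy):
        noisy[i] = rng.uniform(-1.0, 1.0)
    return noisy


def pixel_to_delay(pixel):
    """Map a pixel in [-1, +1] to a start delay relative to the reference oscillator.

    Assumed linear gray-to-phase map: white (+1) -> 0 deg, black (-1) -> 180 deg,
    then delay = phase / 360 deg * T_osc.
    """
    phase_deg = 180.0 * (1.0 - pixel) / 2.0
    return phase_deg / 360.0 * T_OSC


rng = random.Random(0)  # fixed seed so the sketch is reproducible
training_set = [random_balanced_pattern(36, rng) for _ in range(2)]  # N=36, M=2
test_image = add_noise(training_set[0], 0.10, rng)                   # 10% noisy pixels
delays = [pixel_to_delay(p) for p in test_image]
print("max start delay (us):", max(delays) * 1e6)
```

With this mapping no oscillator starts later than half a period after the reference, consistent with input phases restricted to [0°, 180°].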

#### B. ONN Recognition Accuracy

We study how the ONN recognition accuracy varies with M and the number of noisy pixels in the test images. We set N=36 and we consider that the ONN recognition fails



Fig. 7. ONN recognition accuracy with respect to the number of stored patterns *M*. Test images have 10% of noise. Each data point is the recognition average computed over 20 different trials.

if at least one pixel differs from the corresponding training image. Fig.6 shows the simulation results. When M = 2, the ONN recognition accuracy is larger than 70% for test images having up to 30% of noise. However, the accuracy drops sharply for more stored patterns. For instance, for M = 4, the ONN recognition accuracy is around 30% for the same level of input noise. To predict the ONN memory capacity for N oscillators, we perform multiple simulations in the following subsection. To the best of our knowledge, this is the first systematic study to derive the ONN memory capacity versus ONN size and accuracy.

#### C. ONN Memory Capacity

We report on ONN memory capacity when trained using the Hebbian learning rule. For different ONN sizes from N=8 up to N=100, we vary the size M of the training set. Fig.7 shows the ONN recognition accuracy for test images having 10% of noise. As expected, larger networks can store more patterns. An ONN with N=100 stores M=16 patterns with a recognition accuracy larger than 50%, whereas 16 oscillators are limited to M=6 patterns for a similar accuracy. This trend is in accordance with Hopfield's results [26]. Based on Fig.7, we extract the ONN memory capacity when recognition accuracy reaches 50%. Results are shown in Fig.8. We derive that the ONN memory capacity grows linearly with a fitted slope of 0.146, in accordance with the scaling factor of 0.15 derived by Hopfield [26].

To increase the ONN memory capacity, one could think of having large networks but the number of synapses scales quadratically as N(N - 1)/2 and would make the physical implementation of large designs very challenging. Moreover, it is worthwhile to mention that noise would also limit the ONN scaling as the thermal noise increases with the number of synaptic resistors  $R_{C_N}$ . Next, we derive a first-order estimation of the maximum fully-connected ONN size when the synaptic thermal noise is the predominant noise source.



Fig. 8. ONN capacity extracted for a 50% recognition accuracy. ONN capacity scales linearly. The slope is close to the 0.15 value theoretically obtained with HNN using the Hebbian rule [26].

#### D. ONN Size and Noise Limitation

In a fully-connected ONN of size N, each oscillator sees N - 1 noisy synaptic resistors $R_{C_N}$ (no self coupling), and its equivalent mean-square noise voltage, expressed in $[V^2]$, is:

$$\overline{v_n^2} = (N-1)\frac{4k_B T R_{C_N} f_{osc}}{2} \tag{1}$$

where $k_B$ is the Boltzmann constant, T the temperature, and $f_{osc} = 1/T_{osc}$ is the oscillation frequency. We assume that the intrinsic oscillator noise is negligible with respect to the synaptic thermal noise when N is large. As a first-order approximation, we only consider thermal noise (1) because we are interested in scaling up $f_{osc}$ for high frequency ONN operation.

In a previous work [32], Csaba and Porod highlighted the ONN robustness to electronic noise and showed that ONN can tolerate a smaller SNR compared to amplitude-based computing systems achieving the same functionality. We express the oscillator SNR as the ratio of the peak-to-peak voltage amplitude to the thermal noise standard deviation:

$$SNR = \frac{\Delta V_{max}}{\sqrt{\overline{v_n^2}}} \tag{2}$$

When increasing N, we scale  $R_{C_N}$  as  $R_{C_N} = (N - 1)R_{C_2}$ , where  $R_{C_2}$  is the coupling resistance for an ONN composed of two oscillators only [30]. We then express the maximum fully-connected ONN size  $N_{max}$  combining (1) and (2):

$$N_{max} = 1 + \frac{\Delta V_{max}}{\sqrt{4k_B T R_{C_2} f_{osc}}} \frac{\sqrt{2}}{SNR} \tag{3}$$

Fig.9 shows the maximum fully-connected ONN size $N_{max}$ with respect to the oscillation frequency for various amplitudes. We consider SNR=3.5 as the minimum achievable SNR, as reported in [32] for two coupled ring oscillators. With $\Delta V_{max}$=21 mV, we observe that the synaptic thermal noise limits the number of oscillators to $N_{max}$=300 for $f_{osc}$=10 MHz. For applications that need large ONNs, the



Fig. 9. Maximum number of fully-connected oscillators with respect to the oscillator frequency with  $R_{C_2}$ =5k $\Omega$ , T=300K and  $SNR_{min}$ =3.5. The various oscillating amplitudes are obtained via TCAD simulations described in section IV.D. With very small amplitude  $\Delta V_{max}$ =21mV thermal noise limits  $N_{max}$ =300 for f=10MHz. As low ONN energy requires high frequency and low voltage (15), there is a trade-off between energy consumption and maximum ONN size.

oscillation frequency has to be reduced and the amplitude should increase. It is worthwhile to note that the minimum SNR might also depend on the ONN size, as Csaba and Porod [32] reported correct functionality for 100 coupled oscillators even for SNR<1. In the literature, coupling oscillators has been shown to reduce phase noise efficiently [33], yet little is known about the impact of noise on oscillator synchronization and on the scaling of phase-based computing systems.
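Equation (3) is straightforward to evaluate numerically. The short sketch below plugs in the values quoted for Fig.9 ($R_{C_2}$=5 kΩ, T=300 K, SNR=3.5); the function name and default arguments are illustrative choices, not part of the original simulation platform.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant (J/K)


def n_max(delta_v_max, f_osc, r_c2=5e3, temperature=300.0, snr_min=3.5):
    """Maximum fully-connected ONN size from Eq. (3), thermal noise only."""
    noise_rms_unit = math.sqrt(4.0 * K_B * temperature * r_c2 * f_osc)
    return 1.0 + (delta_v_max / noise_rms_unit) * math.sqrt(2.0) / snr_min


# Smallest amplitude considered in Fig.9, at 10 MHz.
print(round(n_max(21e-3, 10e6)))
```

For $\Delta V_{max}$=21 mV and $f_{osc}$=10 MHz this gives a value close to the $N_{max}$=300 read from Fig.9. Because $N_{max} - 1$ scales as $\Delta V_{max}/\sqrt{f_{osc}}$, quadrupling the frequency halves the admissible network size at fixed amplitude.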

#### IV. ONN ENERGY SCALING

Here, we study the ONN energy scaling using a bottom-up approach, i.e., starting from device and circuit level before scaling up to the architecture level. We show analytically and by circuit simulations that the ONN energy scales linearly with the number of oscillators.

#### A. Single Oscillator Energy Footprint

From circuit equations and Fig.3a, we derive the instantaneous power consumption of a single oscillator as:

$$P(t) = V_{DD}\left(\frac{V_{out}}{R_S} + C_P \frac{dV_{out}}{dt}\right) \tag{4}$$

As $V_{out}$ is a $T_{osc}$-periodic signal, the capacitive term integrates to zero over one period, and the oscillator energy loss for one oscillation is given by

$$E_{osc} = \frac{V_{DD}}{R_S} \int_0^{T_{osc}} V_{out} \, dt \tag{5}$$

Then, we introduce the output mean voltage $\overline{V_{out}} = \frac{1}{T_{osc}} \int_{0}^{T_{osc}} V_{out} \, dt$ to reformulate the last expression as:

$$E_{osc} = \frac{V_{DD}}{R_S} \overline{V_{out}} T_{osc} \tag{6}$$

As  $T_{osc} \propto R_S C_P$  [30], we obtain a similar expression to the dynamic energy loss due to the charge and discharge of load



Fig. 10. (a) ONN neuron model (top) and analog implementation using coupling resistors and a VO<sub>2</sub>-oscillator. In this work, synaptic operations occur via current flow through coupling resistors and the input summation naturally happens in current mode. (b) When two coupled oscillators are out-of-phase, a synaptic current flows through the coupling resistor and energy is lost by the Joule effect. The case where the two oscillators are out-of-phase during  $t_{settle}$  corresponds to the maximum SOP energy loss. (c) The other extreme case occurs when two coupled oscillators are in phase during  $t_{settle}$ ; the current flow is null and  $E_{SOP} = 0$ .

capacitors in digital circuits  $E_{dyn} = C_P V_{DD}^2$ . However, in our case, the oscillator energy loss (6) is modulated by the DC output voltage operating point  $\overline{V_{out}}$ . Note that closed-form expressions for  $\overline{V_{out}}$  and  $T_{osc}$  are established in [30] but are not listed here for clarity. We observe that the two key knobs to obtain low energy ONN are low operating voltages and low parasitics. Next, we derive the oscillator energy when coupled to N - 1 other oscillators.
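A small time-stepping experiment can illustrate the relation between (4) and (6). The sketch below is an idealization and not the compact model of [31]: it assumes a VO<sub>2</sub> that switches abruptly between $R_{ins}$ and $R_{met}$ at the $V_{out}$ bounds $V^-$ and $V^+$ of Table I, integrates one period with forward Euler at the 1 ns step of Table I, and compares the energy obtained by integrating $P(t)$ with the mean-voltage expression.

```python
# Table I values
V_DD, R_S, C_P = 2.5, 20e3, 500e-12   # supply (V), load (ohm), capacitor (F)
R_INS, R_MET = 100.2e3, 0.99e3        # VO2 insulating / metallic resistance (ohm)
V_PLUS, V_MINUS = 1.5, 0.501          # V_out switching bounds (V)
DT = 1e-9                             # simulation time step (s)


def simulate_one_period():
    """Forward-Euler simulation of one oscillation of the idealized oscillator.

    Returns (period, mean V_out, energy obtained by integrating P(t) of Eq. (4)).
    """
    v = V_MINUS        # start right after the insulating-to-metallic transition
    metallic = True
    t = 0.0
    v_integral = 0.0   # running integral of V_out
    energy = 0.0       # running integral of the supply power, Eq. (4)
    while True:
        r_vo2 = R_MET if metallic else R_INS
        dv_dt = ((V_DD - v) / r_vo2 - v / R_S) / C_P
        energy += V_DD * (v / R_S + C_P * dv_dt) * DT
        v_integral += v * DT
        v += dv_dt * DT
        t += DT
        if metallic and v >= V_PLUS:
            metallic = False          # VO2 voltage dropped to V_L: back to insulating
        elif not metallic and v <= V_MINUS:
            break                     # VO2 voltage reached V_H: one period completed
    return t, v_integral / t, energy


t_osc, v_mean, e_from_power = simulate_one_period()
e_from_mean = V_DD / R_S * v_mean * t_osc   # Eq. (6)
print(f"T_osc = {t_osc * 1e6:.2f} us, mean V_out = {v_mean:.3f} V")
print(f"E from Eq. (4): {e_from_power * 1e9:.3f} nJ, from Eq. (6): {e_from_mean * 1e9:.3f} nJ")
```

The two energy estimates agree because the capacitor current averages to zero over a full period, and with the Table I parameters the idealized oscillator dissipates roughly 2 nJ per cycle.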

#### B. ONN Synaptic Operations

We first define the intrinsic synaptic operation between coupled oscillators. For oscillator *i*, we conceptually express its synaptic input weighted sum  $h_i(t)$  as:

$$h_{i}(t) = \sum_{j=1}^{N-1} W_{ij} \,\Delta\phi_{j}(t) \tag{7}$$

where  $W_{ij}$  are the synaptic weights and  $\Delta \phi_j(t)$  are the phases of other oscillators (Fig.10a). Then, the role of the oscillating neuron is to produce an output phase by applying a non-linear activation function *a* to its input:

$$\Delta\phi_i(t) = a(h_i(t)) \tag{8}$$

We define a synaptic operation (SOP) in ONN as the evaluation of the quantity  $W_{ij}\Delta\phi_j(t)$ . Note that up to now, we have not considered any hardware, and SOP could be implemented in various manners such as with digital circuits [9] or using the analog Ohm's law. Using these definitions, we express the neuron energy as the sum of two contributions:

$$E_{neuron} = E_{input} + E_{activation} \tag{9}$$

 $E_{input}$  is the loss related to the evaluation of the input weighted sum, whereas  $E_{activation}$  is the energy needed to produce an output, *i.e.*, determine the phase difference. Again, (9) is general enough so it can capture any type of implementation



Fig. 11. ONN settling time and energy for different values of N. Settling time remains approximately constant when scaling up N. Oscillators truly act as parallel processing units. Energy to settle scales linearly with N. Medians, first and third quartiles of simulation results are represented.

and computing (sequential or parallel). In the interesting case where neurons process information in parallel, we can then express the neuron energy as:

$$E_{neuron} = \left((N-1)E_{SOP} + E_{osc}\right)N_{cycles} \tag{10}$$

where $N_{cycles}$ is the number of oscillating cycles before settling to a stable output phase state, and $E_{osc}$ is the energy of a single oscillation. One interesting aspect of analog ONN is that an SOP can sometimes be energy-free. For instance, when two coupled oscillators are in-phase, the synaptic current is null and $E_{SOP}=0$ (see Fig.10c). The worst-case SOP energy occurs when two oscillators are out-of-phase: the maximum amplitude across the synaptic resistor reaches $\Delta V_{max} = V_H - V_L$ and induces Joule losses (see Fig.10b). As the analytical expression of the SOP energy depends on the oscillating waveform, we evaluate the worst case here for simplicity and consider that a DC voltage $\Delta V_{max}$ is applied to every coupling resistor $R_{C_N}$ during the entire oscillating period:

$$E_{SOP} = \frac{\Delta V_{max}^2}{R_{C_N}} T_{osc} \tag{11}$$

To assess how the ONN energy scales with N, we must first evaluate the ONN computation time,  $N_{cycles}$ . Next, we perform circuit simulations of various ONN sizes dedicated to pattern recognition to estimate  $N_{cycles}$ .

#### C. ONN Settling Time and Energy Scaling

We define the *ONN settling time* as the time  $t_{settle}$  required for ONN signals to be periodically stable:

$$t_{settle} = N_{cycles}T_{osc} \tag{12}$$

For  $t \ge t_{settle}$ , ONN phases can be measured as they are stable. For example in Fig.2, ONN stabilizes to a stable pattern after  $N_{cycles} = 1.75$  cycles. To derive the ONN settling time, we perform simulations for different ONN sizes by varying 1) the number *M* of stored patterns and 2) the number of noisy pixels

in test images from 10% to 50%. Fig.11 shows the simulation results. Interestingly, the ONN settling time is approximately constant and is smaller than 5 cycles in most cases. Hence, ONN parallel computation allows computing in constant time even for large networks. This result corroborates what has been observed with oscillator-based Ising machines [16], *i.e.*, coupled oscillators converge to a solution (not necessarily the optimal one) in constant time.

Moreover, we derive that the ONN energy scales linearly (see Fig.11) when ONN satisfies the two following properties:

- 1) Parallelism: the computation time t<sub>settle</sub> remains quasi-constant.
- 2) Downscaling of synaptic energy: we scale the coupling resistors $R_{C_N}$ as:

$$R_{C_N} = (N-1)R_{C_2} \tag{13}$$

where  $R_{C_2}$  is the coupling resistance between two coupled oscillators [30]. The synaptic loss  $E_{SOP}$  becomes:

$$E_{SOP} = \frac{\Delta V_{max}^2}{(N-1)R_{C_2}} T_{osc} \tag{14}$$

Therefore, even though the number of synapses grows quadratically, the ONN energy grows only linearly with the number of oscillators. This can be verified using our previous definitions (6), (10), and (14):

$$\begin{aligned} E_{ONN}^{analog} &= N E_{neuron} \\ &= N\left((N-1) E_{SOP} + E_{osc}\right) N_{cycles} \\ &= N\left(\frac{\Delta V_{max}^2}{R_{C_2}} + \frac{V_{DD}}{R_S} \overline{V_{out}}\right) N_{cycles} T_{osc} \end{aligned} \tag{15}$$

Note that we have not yet considered any peripheral circuits that could change the ONN energy scaling law when implemented in real hardware. For instance, even though the energy of the analog ONN computing core grows linearly (15), we would still have a quadratic number of synapses that would need to be programmed. But in terms of computing, the analog ONN is promising when compared to digital architectures. In the latter case, the energy of a synaptic operation (such as multiply and accumulate (MAC)) remains constant and cannot be scaled down. If we consider a fully-digital ONN computing with MACs rather than analog currents, we would then have a total energy that grows quadratically:

$$E_{ONN}^{digital} = N\left((N-1)E_{MAC} + E_{osc}\right)N_{cycles} \tag{16}$$
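The difference between the two scaling laws (15) and (16) can be made concrete with a short numerical comparison. The energy values below are placeholders chosen only to illustrate the trend (they are not results from this work), and the function names are hypothetical.

```python
def onn_energy_analog(n, e_osc, e_sop_two, n_cycles):
    """Eq. (15): E_SOP shrinks as 1/(N-1) because R_CN = (N-1) * R_C2.

    e_sop_two is the worst-case SOP energy of a two-oscillator ONN.
    """
    e_sop = e_sop_two / (n - 1)
    return n * ((n - 1) * e_sop + e_osc) * n_cycles


def onn_energy_digital(n, e_osc, e_mac, n_cycles):
    """Eq. (16): every synaptic operation costs a fixed E_MAC."""
    return n * ((n - 1) * e_mac + e_osc) * n_cycles


# Placeholder values, for illustration only.
E_OSC = 2e-9       # oscillator energy per cycle (J)
E_SOP_TWO = 1e-9   # worst-case SOP energy for N = 2 (J)
E_MAC = 1e-12      # energy of one digital MAC (J)
N_CYCLES = 5       # cycles needed to settle

for n in (10, 100, 1000):
    analog = onn_energy_analog(n, E_OSC, E_SOP_TWO, N_CYCLES) / n
    digital = onn_energy_digital(n, E_OSC, E_MAC, N_CYCLES) / n
    print(f"N={n}: analog {analog:.3e} J/oscillator, digital {digital:.3e} J/oscillator")
```

The analog energy per oscillator stays constant as N grows, whereas the digital energy per oscillator increases with N because the MAC cost cannot be scaled down with the network size.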

In our simulations, we considered ONNs with a large supply voltage $V_{DD} = 2.5$ V, leading to a substantial energy consumption of 2 nJ/oscillator/cycle. In comparison, Jackson *et al.* [8] designed an ONN consuming 1.21 pJ/oscillator using a hybrid design (analog synapses and digital neurons) in 28 nm CMOS technology. Next, we study how to scale VO<sub>2</sub> devices to achieve competitive performance with respect to state-of-the-art solutions.

#### D. Oscillator Energy Minimization using Scaled VO<sub>2</sub> Devices

Here, we study how to minimize the energy of a VO<sub>2</sub>-based oscillator using the formulation (6) and assisted by



Fig. 12. (a) Structure of the VO<sub>2</sub> crossbar (CB) device. Top and bottom electrodes are in cross-like configuration. They have the same contact width of 250 nm. The VO<sub>2</sub> layer of 80 nm thickness is sandwiched between them. The color map overlapped to the geometrical structure accounts for the temperature distribution across the device at the highest simulated voltage. (b) Device I - V obtained through electrothermal TCAD simulation of CB=4- $\mu$ m (red solid line) and CB=2- $\mu$ m (blue solid line) devices. The simulations have been performed in voltage-controlled mode, by applying the voltage to a circuit composed of the VO<sub>2</sub> device connected in series to an external resistor of  $R_S = 1 \text{ k}\Omega$ . The dashed dotted lines represent the associated load lines.

TCAD simulations. The TCAD modeling and simulation flow is further described in recent work [34], [35]. We consider  $VO_2$  devices in crossbar (CB) geometry [14] as a potentially scalable geometry to lower the oscillator energy consumption (Fig.12a). By reducing the  $VO_2$  CB size, the overall  $VO_2$  thermal dissipation decreases and the  $VO_2$  device can transition to a metallic state with less power [35]. The applied voltage can then be reduced for given insulator and metallic states that are set by material properties and contact area (Fig.12b).

As our model predicts that the oscillator energy scales quadratically with voltages (6), it is of interest to scale down VO<sub>2</sub> CB dimensions. Fig.13 shows results of TCAD simulations for various CB sizes (500 nm, 1 $\mu$m, 1.5 $\mu$m, 2 $\mu$m, 3 $\mu$m and 4 $\mu$m) and biasing parameters. We see from Fig.13a that the VO<sub>2</sub> threshold voltages V<sub>H</sub> and V<sub>L</sub> are approximately proportional to CB and allow a linear V<sub>DD</sub> scaling. With reduced CB, the oscillating voltage amplitude can be decreased (Fig.13b) for low power operation (Fig.13c). As we kept the same material, contact area, and load capacitor for all CB sizes, the oscillating period does not vary significantly and the minimum energy is obtained for CB = 500 nm (Fig.13d).
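The scaling argument above can be sketched numerically. The Python snippet below is a first-order illustration only: it assumes that all voltages scale exactly in proportion to CB and that the energy scales exactly as the square of the voltage at a fixed period, whereas the text only states approximate proportionality. The function name and the 4 $\mu$m reference are choices made for this example.

```python
def relative_energy(cb_nm, cb_ref_nm=4000.0):
    """First-order ratio E(cb) / E(cb_ref), assuming V proportional to CB and E proportional to V^2."""
    return (cb_nm / cb_ref_nm) ** 2


# CB sizes considered in the TCAD study, in nm.
for cb in (4000, 3000, 2000, 1500, 1000, 500):
    print(f"CB={cb:4d} nm  E/E(4 um) ~ {relative_energy(cb):.4f}")
```

Under these idealized assumptions, going from CB = 4 $\mu$m to CB = 500 nm would reduce the energy by a factor of 64. Thermal effects and voltage offsets are ignored here, so the TCAD values are expected to deviate from this estimate.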

Fig. 13. TCAD simulation results for the same crossbar (CB) geometry and $C_P$ = 5 nF. (a) Oscillator parameters with respect to the VO<sub>2</sub> CB size. By scaling down the VO<sub>2</sub> CB, the thermal dissipation decreases and the device needs less power to transition from one state to the other. Therefore, the VO<sub>2</sub> thresholds $V_H$ and $V_L$ decrease with CB. We scale down $V_{DD}$ approximately linearly with $V_H$ and $V_L$. The load resistor $R_S$ is adapted in each case to place the load line in the NDR region and obtain oscillation. (b) Transient voltage across VO<sub>2</sub> devices for different CB. (c) Instantaneous power. (d) Oscillator energy vs. period for various CB.

Fig.14 shows the comparison between our analytical model (6) and the mean power and energy computed with TCAD for different CB sizes. We observe a good match for the mean power but some deviation when evaluating the energy. We believe this is mainly due to non-linearities induced by thermal effects, which result in a larger oscillating period and thus a higher energy consumption [34]. This aspect is not captured by our analytical formalism as it only considers electrical variables (Fig.14b). Nevertheless, the scaling trend of our model is in agreement with TCAD simulations and we use it for benchmarking ONN with state-of-the-art chips.

#### V. ONN BENCHMARKING

#### A. Neuron Energy-Delay Benchmark

Benchmarking ONN with other architectures is not trivial as ONN is a phase-based system and does not perform conventional MAC operations. However, the concept of synaptic operation (SOP) is shared among all kinds of neural inference chips and can serve as common ground for benchmarking. In Artificial Neural Networks (ANN), an SOP is defined as the multiplication between the input and the synaptic weight. It can therefore be naturally implemented in digital hardware by a MAC operator, in which case there is the equivalence 1 SOP $\approx$ 1 MAC [36]. Here, we use real-chip SOP metrics to benchmark the time and energy required for a neuron to produce an output. Nikonov and Young [36] recently proposed a chip-level benchmark between neuromorphic hardware and digital neural accelerators based on the neuron energy and delay.

Fig. 14. Comparison between TCAD and analytical model for (a) oscillator mean power, (b) oscillating period and (c) oscillator energy with respect to $V_{DD}$. Our analytical model does not include thermal effects, which slow down oscillations and increase the energy. Nevertheless, our model (6) captures well the quadratic $V_{DD}$ scaling law. (d) Both TCAD and our model predict a linear energy scaling law with respect to the oscillator load capacitance. CB = 1.5 $\mu$m is considered here.

Similarly, we benchmark neuron energy-delay metrics from various chips, with the neuron energy defined as $E_{neuron} = N_S P_{ch}/T_{ch}$ where $N_S$, $P_{ch}$ and $T_{ch}$ are the number of synapses per neuron, the chip's power consumption and throughput (number of SOP/s), respectively. In digital neural accelerators performing MACs, the neuron delay is estimated as $delay_{acc} = N_S N/T_{ch}$. For Spiking Neural Network (SNN) neuromorphic hardware, neurons do not need to wait for all input SOPs to occur to produce an output spike, and this depends on the type of information encoding [38]. On average, we approximate the neuron delay as $delay_{neurom} = N/T_{ch}$ [36].
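The three metrics above can be written as small Python helpers. This is a minimal sketch: the chip parameters below are hypothetical and do not correspond to any chip in the benchmark, and `n` simply stands for the symbol $N$ as it appears in the delay formulas.

```python
def neuron_energy(n_syn, chip_power_w, throughput_sops):
    """E_neuron = N_S * P_ch / T_ch, in joules."""
    return n_syn * chip_power_w / throughput_sops


def delay_accelerator(n_syn, n, throughput_sops):
    """delay_acc = N_S * N / T_ch, in seconds (digital neural accelerators)."""
    return n_syn * n / throughput_sops


def delay_neuromorphic(n, throughput_sops):
    """delay_neurom = N / T_ch, in seconds (average approximation for SNN chips)."""
    return n / throughput_sops


# Hypothetical chip parameters, for illustration only.
N_S = 256     # synapses per neuron
N = 256       # N as used in the delay formulas
P_CH = 0.1    # chip power consumption in W
T_CH = 1e12   # chip throughput in SOP/s

print(f"E_neuron     = {neuron_energy(N_S, P_CH, T_CH):.3e} J")
print(f"delay_acc    = {delay_accelerator(N_S, N, T_CH):.3e} s")
print(f"delay_neurom = {delay_neuromorphic(N, T_CH):.3e} s")
```

For the same $N$ and throughput, the two delay estimates differ by exactly the factor $N_S$, which reflects that the accelerator neuron waits for all its input SOPs.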

To benchmark ONN, we consider VO<sub>2</sub> devices with CB = 500 nm, $\Delta V_{max}$ = 21 mV, and various load capacitors. TCAD simulations were initially carried out to fit experimental oscillations where circuits employ nanofarad load capacitors [13]. However, faster VO<sub>2</sub> oscillations up to 9 MHz have been reported in the literature [39] and we believe crossbar VO<sub>2</sub> could reach a similar speed with lower load capacitors. Thus, we project the oscillator energy and delay for lower capacitances down to 500 fF using our analytical model (6) and (12).

To obtain a more precise energy assessment, we include the power consumed by peripheral circuits, *i.e.*, the phase initialization and measurement circuits. To set the oscillator input phase, we would use in the worst case one digital-to-time converter (DTC) per oscillator. As an example, we consider a 9-bit DTC consuming 31 $\mu$W at 40 MHz in 28 nm CMOS technology [40], suitable for low power edge applications. For the phase measurement, we take the example of the circuit described in [41] that consumes 20.5 $\mu$W in 28 nm CMOS technology. Overall, we consider $P_{periph}=60 \ \mu W$ of peripheral circuits per oscillator clocked at 30 MHz, which gives 2 pJ per cycle. As a first-order estimation, we consider that $P_{periph}$ is proportional to the neuron oscillating frequency and we obtain a constant peripheral energy loss $E_{periph} = P_{periph} T_{osc} N_{cycles}$. We use $N_{cycles} \approx 5$ derived in section IV.C and we obtain $E_{periph}$ = 10 pJ. Note that our estimation remains optimistic as we use a bottom-up type of energy-delay assessment, whereas state-of-the-art data correspond to real chip measurements.
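The arithmetic behind this peripheral estimate can be reproduced with a few lines of Python. The sketch below uses only the numbers quoted above (60 $\mu$W at 30 MHz, 5 cycles); the function name and the choice of test frequencies are made for this example.

```python
P_PERIPH_REF = 60e-6   # W per oscillator at the reference clock
F_REF = 30e6           # Hz, clock at which P_PERIPH_REF is quoted
N_CYCLES = 5           # oscillation cycles needed to settle


def peripheral_energy(f_osc_hz):
    """E_periph = P_periph * T_osc * N_cycles, with P_periph proportional to f_osc."""
    p_periph = P_PERIPH_REF * f_osc_hz / F_REF   # first-order frequency scaling
    t_osc = 1.0 / f_osc_hz
    return p_periph * t_osc * N_CYCLES


for f in (3e6, 30e6, 300e6):
    e_pj = peripheral_energy(f) * 1e12
    print(f"f_osc={f / 1e6:5.0f} MHz  E_periph={e_pj:.1f} pJ")
```

Because the peripheral power is assumed proportional to the frequency while the period is inversely proportional to it, the result stays at 10 pJ regardless of the oscillating frequency.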

Fig.15 shows the neuron energy-delay for various SNN neuromorphic chips (blue circled dots), digital neural accelerators (red squared points) considered in previous work [36], [37] and VO<sub>2</sub> oscillators with different load capacitances.



Fig. 15. Neuron energy and delay to produce an output for various chips. We used data from [37], [17], [7]. Red squared markers are digital neural accelerators optimized for efficient matrix-vector product (MAC operations). Blue circled points are neuromorphic chips that implement SNNs. Green diamond markers are VO<sub>2</sub> oscillators with CB=500nm and various load capacitances including peripheral circuits. Orange star markers are ONN neurons standalone without any peripheral circuits. Purple triangular points are ONNs designed with CMOS technology.

When the oscillator load capacitance increases, the oscillator slows down and its energy to produce a stable output phase increases. Similarly, neuromorphic SNN chips lie on the right-hand side of the plot as they generally produce spikes at a lower frequency than digital neural accelerators [36]. From the neuron energy point of view, it appears that VO<sub>2</sub>-based ONN can compete with state-of-the-art SNN neuromorphic chips for a similar neuron delay. With real chip measurements which would include all peripheral energies, we expect the ONN region to shift up and to lie in the SNN neuromorphic region in the worst case.

The VO<sub>2</sub> oscillator could compete with neural accelerators in terms of energy but would be orders of magnitude slower with load capacitances larger than 500 fF. For instance, a neuron from the PuDianNao accelerator [42] produces an output after 242 ps whereas it would take 16 ns to phase lock for a scaled VO<sub>2</sub> oscillator with $C_P$ = 500 fF. We notice that peripheral circuits set the minimum achievable neuron energy for load capacitances smaller than 50 pF (green diamond points), whereas the energy of the standalone ONN neuron can be below the picojoule range (orange star points). From our first-order estimation, we conclude that the energy-delay of a VO<sub>2</sub>-ONN can be very competitive under the two following conditions:

1) the oscillating frequency is in the GHz range, *i.e.*, the load capacitance $C_P < 50$ fF, assuming that the VO<sub>2</sub> thermal time constant remains negligible [35];

2) the peripheral circuits are carefully designed to take full advantage of the ONN phase computing paradigm.
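The link between load capacitance, oscillating frequency and phase-locking time can be sketched as follows. The snippet assumes that the frequency is inversely proportional to $C_P$, anchored on the 310 MHz value reported for $C_P$ = 500 fF in Table II, and that thermal effects are negligible. Both the $1/C_P$ law and the function names are assumptions made for this example.

```python
F_REF_HZ = 310e6    # oscillating frequency quoted for C_P = 500 fF (Table II)
C_REF_F = 500e-15   # reference load capacitance in farads
N_CYCLES = 5        # oscillation cycles needed to phase lock


def osc_frequency(c_load_f):
    """Assumed 1/C scaling of the oscillating frequency (thermal effects ignored)."""
    return F_REF_HZ * C_REF_F / c_load_f


def settling_time(c_load_f):
    """Time to phase lock: N_CYCLES oscillation periods."""
    return N_CYCLES / osc_frequency(c_load_f)


for c in (5e-12, 500e-15, 50e-15):
    f_mhz = osc_frequency(c) / 1e6
    t_ns = settling_time(c) * 1e9
    print(f"C_P={c * 1e15:6.0f} fF  f_osc={f_mhz:7.1f} MHz  t_lock={t_ns:6.2f} ns")
```

Under this assumption, $C_P$ = 500 fF gives about 16 ns to phase lock, and $C_P$ = 50 fF corresponds to an oscillating frequency of about 3.1 GHz, in line with the first condition.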

As an alternative to VO<sub>2</sub> oscillators, CMOS ONNs (purple triangular points) are currently very competitive as they use scaled transistors from a mature CMOS technology. For instance, the first phase-based ONN chip ever reported for pattern recognition is the hybrid analog-digital ONN designed by Jackson et al. [8] with 100 neurons and 10,000 synapses using a 28 nm CMOS technology. Their results are promising as they measured a 1.21 pJ neuron energy and 4 ns delay. For fast convolution inference, Nikonov et al. [7] recently reported an ONN chip fabricated in a 22 nm FinFET CMOS process that computes in less than 10 ns and consumes 2 pJ/oscillator. In the field of oscillator-based Ising machines (OIM) [16], Ahmed et al. [17] demonstrated an OIM composed of 560 ring oscillators in 65 nm CMOS technology that consumes 1.74 pJ/oscillator for $N_{cycles}$=5 and $f_{osc}$=118 MHz. These recent examples further highlight the ONN potential to perform various tasks at high speed and low energy.

Finally, we would like to stress that benchmarking different architectures at the neuron level only gives a limited view of the chips' potential as they are ultimately used to solve practical problems. For example, Nikonov's ONN and the neural accelerator DianNao [43] have almost the same energy-delay when used to compute convolutions [7]. Next, we choose to benchmark a VO<sub>2</sub>-ONN in the case of image edge detection, which is a widely used task in image processing.

Fig. 16. a) 10 fully-connected oscillators trained to detect vertical, horizontal and diagonal edges in images. b) Mapping of Hebbian coefficients to coupling resistors. c) ONN state that detects the image background.

#### B. Edge Detection Benchmark with ONN

Here, we aim to benchmark VO<sub>2</sub>-ONNs with other works on a specific image edge detection application. Similar to edge detection algorithms that employ 3x3 or 5x5 convolution kernels [44], we scan an input image with a 3x3 ONN to extract edges. Analogous phase-based edge detection algorithms have already been proposed in the literature [45], [46], but we focus instead on the analog hardware implementation to assess how a VO<sub>2</sub>-ONN benchmarks with state-of-the-art edge detection hardware.

We consider a fully-coupled ONN composed of 10 oscillators and 45 coupling resistors where 9 oscillators scan the input image with a padding of 1, and the 10th oscillator makes the final decision (Fig.16a). Using the Hebbian learning rule [5], we train the ONN to detect edges in the vertical, horizontal and diagonal directions and we map the Hebbian coefficients to coupling resistors using the mapping function defined in [30] (Fig.16b). To detect the background, we bias the $VO_2$ oscillators such that the 0° phase state is more likely to occur (we set $R_S = 6k\Omega$ instead of $20k\Omega$), as further explained in [30]. As shown in Fig.16c and Fig.17b, oscillators converge in-phase when initialized with similar input phases and the ONN detects the background. Fig.17a shows an example where the ONN detects a vertical edge. Note that the 10th output oscillator is always initialized with an input phase of 90° so as not to favor any particular output state. As already highlighted in section IV.C, the ONN makes the decision after only a few oscillation cycles (between 3 and 5).
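The training step can be illustrated with a minimal Python sketch of the outer-product Hebbian rule for 10 oscillators. The two training patterns, their ±1 encoding and the 1/N normalization are assumptions made for this example and are not the patterns used in this work. The mapping from coefficients to coupling resistors [30] is not reproduced here.

```python
from itertools import combinations


def hebbian_weights(patterns):
    """Outer-product Hebbian rule: w_ij = (1/N) * sum_k xi_i^k * xi_j^k, with w_ii = 0."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        w_ij = sum(p[i] * p[j] for p in patterns) / n
        weights[i][j] = w_ij
        weights[j][i] = w_ij  # couplings are symmetric
    return weights


# Hypothetical patterns: nine pixel oscillators (3x3 window, row by row)
# followed by the output oscillator. +1 = in-phase, -1 = out-of-phase.
VERTICAL_EDGE = [+1, +1, -1, +1, +1, -1, +1, +1, -1, -1]
HORIZONTAL_EDGE = [+1, +1, +1, +1, +1, +1, -1, -1, -1, -1]

weights = hebbian_weights([VERTICAL_EDGE, HORIZONTAL_EDGE])

# One coupling element per unordered pair of oscillators: N(N-1)/2.
n_couplings = len(list(combinations(range(len(weights)), 2)))
print("oscillators:", len(weights), "coupling elements:", n_couplings)
```

The count of unordered oscillator pairs, N(N-1)/2 = 45 for N = 10, matches the number of coupling resistors stated above.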

Fig. 17. From left to right: 3x3 portion of an input image, oscillators' waveforms, output ONN state in the case of a) vertical edge and b) uniform background.

Fig. 18. a) Cameraman binary 512x512 image. b), c) and d) are the output images using Sobel, Canny and ONN edge detection methods, respectively.

We compare our ONN image edge detection with the state-of-the-art Sobel and Canny edge detection methods [44], [47] that we test in Matlab using built-in functions. The results from Fig.18 show that Sobel, Canny, and ONN edge detections are qualitatively similar for a binary input image. A more interesting case consists in detecting edges in a gray-scale image as shown in Fig.19 with the 8-bit 64x64 gray-scale example. We observe that ONN detects more edges than Sobel and therefore evaluates well the image gradient. However, our ONN edge detection seems more sensitive to noise than Canny, which initially smooths the input image with a 5x5 Gaussian kernel. We believe that larger ONN kernels such as 5x5 or 7x7 could produce a similar denoising property, but this is beyond the scope of this paper.

Table II shows the performances of edge detection ASICs implemented in 65 nm [48] and 45 nm [49] CMOS technologies. Both accelerators are optimized to run the Canny algorithm and are suitable for edge applications thanks to their low power consumption. We consider a VO<sub>2</sub>-ONN with a crossbar size of 500 nm to achieve low power operation and we vary the load capacitance to set the oscillating frequency. A single ONN running at 31 MHz would process a 512x512 image in 42 ms and would be 100x slower than Soares's ASIC [49]. By reducing the load capacitance to 500 fF and parallelizing at least 10 ONNs, the ONN could compete with the state of the art and achieve 0.42 ms/image.
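The processing times quoted above can be checked with a short Python sketch. It assumes one ONN evaluation per pixel (consistent with a scan using a padding of 1) and 5 oscillation cycles per evaluation. The function and parameter names are chosen for this example.

```python
N_CYCLES = 5   # oscillation cycles per ONN decision (upper value quoted in the text)


def time_per_image(width, height, f_osc_hz, n_parallel=1):
    """Seconds to scan an image, assuming one ONN evaluation of N_CYCLES periods per pixel."""
    n_pixels = width * height
    return n_pixels * N_CYCLES / (f_osc_hz * n_parallel)


cases = [
    ("1 ONN at 31 MHz", 31e6, 1),
    ("1 ONN at 310 MHz", 310e6, 1),
    ("10 ONNs at 310 MHz", 310e6, 10),
]
for label, f_osc, n_onn in cases:
    t_ms = time_per_image(512, 512, f_osc, n_onn) * 1e3
    print(f"{label:20s} {t_ms:6.2f} ms/image")
```

Under these assumptions, the sketch gives about 42 ms, 4.2 ms and 0.42 ms per 512x512 image, respectively.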

| Design           | Hardware                                  | Frequency | Mean power               | Image size | Time/image | Energy/pixel              |
|------------------|-------------------------------------------|-----------|--------------------------|------------|------------|---------------------------|
| Lee 2018 [48]    | ASIC (65 nm)                              | 500 MHz   | 5.48 mW                  | 1280x720   | 2.2 ms     | 13.2 pJ                   |
| Soares 2020 [49] | ASIC (45 nm)                              | 350 MHz   | 6.7 mW                   | 512x512    | 0.42 ms    | 10.7 pJ                   |
| ONN1             | 10 VO<sub>2</sub>-oscillators, C = 5 pF   | 31 MHz    | 13 μW + 330 μW (periph.) | 512x512    | 42 ms      | 2.1 pJ + 53 pJ (periph.)  |
| ONN2             | 10 VO<sub>2</sub>-oscillators, C = 500 fF | 310 MHz   | 13 μW + 3.3 mW (periph.) | 512x512    | 4.2 ms     | 0.21 pJ + 53 pJ (periph.) |

TABLE II. Edge detection benchmark

Fig. 19. a) 64x64 8-bit gray-scale image []. b), c) and d) are the output images using Sobel, Canny and ONN edge detection methods, respectively.

Again, the peripheral circuits' energy could become dominant for scaled VO<sub>2</sub>-oscillators and would be 253x larger than the oscillator energy in this first-order estimation. This points out that a VO<sub>2</sub>-ONN requires specific and optimized peripheral circuits to take full advantage of the ONN paradigm. We also believe there is room for improvement in terms of power management as the ONN only needs the initialization and phase measurement circuits during the first and last oscillating cycles, respectively.
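The energy-per-pixel entries of Table II follow from the mean power and the time per image. The Python sketch below recomputes them for the 512x512 rows, taking power and time directly from the table; the helper name is chosen for this example.

```python
def energy_per_pixel(mean_power_w, time_per_image_s, width, height):
    """Energy per pixel in joules: mean power times frame time, divided by pixel count."""
    return mean_power_w * time_per_image_s / (width * height)


# (label, mean power in W, time per image in s), 512x512 rows of Table II.
ROWS = [
    ("Soares 2020", 6.7e-3, 0.42e-3),
    ("ONN1 core", 13e-6, 42e-3),
    ("ONN1 periph.", 330e-6, 42e-3),
    ("ONN2 core", 13e-6, 4.2e-3),
    ("ONN2 periph.", 3.3e-3, 4.2e-3),
]

for label, power, frame_time in ROWS:
    e_pj = energy_per_pixel(power, frame_time, 512, 512) * 1e12
    print(f"{label:13s} {e_pj:6.2f} pJ/pixel")

# Ratio between peripheral and oscillator power for ONN2.
print(f"periph/core (ONN2): {3.3e-3 / 13e-6:.1f}")
```

The recomputed values agree with the table to within rounding, and the ONN2 ratio reproduces the factor of about 253 between peripheral and oscillator energy.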

#### VI. DISCUSSION

ONN is an alternative computing paradigm that operates in a parallel, fast, and energy-efficient manner. Despite the recent surge of interest in ONNs, we believe some theoretical and practical points still remain unexplored. For instance, how can one take advantage of ONN associative properties in modern neural networks? Recent works have suggested using ONNs in image processing tasks [15], [50] or as filters in CNNs [14], but a more general use of ONNs in deep neural networks has not yet been demonstrated. Also, how can we efficiently implement peripheral circuits and programmable synapses for ONNs? These challenges need to be addressed to have competitive phase-based ONNs in edge devices.

Finally, scaled VO<sub>2</sub> devices are promising to implement energy-efficient oscillators with low supply voltage and small load capacitances. However, our TCAD simulations reveal that the  $VO_2$  thermal behavior has an impact on the oscillator energy and delay as it can slow down oscillations. As shown in this work, ONN becomes really competitive for high frequencies beyond hundreds of MHz. Hence, we believe that more  $VO_2$  electrothermal studies are required to ensure that the  $VO_2$  thermal time constant would not limit the frequency scaling.

#### VII. CONCLUSION

In this work, we derived the performance scaling laws of $VO_2$-ONNs at device, circuit, and architecture levels. We first studied ONNs used as associative memories and we derived that the ONN memory capacity scales as 0.15*N* when trained with the Hebbian learning rule, similarly to Hopfield Neural Networks. Next, we presented the trade-off between the ONN size, SNR and frequency due to the thermal noise produced by the coupling resistors. We also showed that the constant ONN settling time leads to a favorable linear energy scaling when increasing the coupling resistance values. Assisted by TCAD simulations, we then proposed some design guidelines at device and circuit levels to build competitive $VO_2$-ONNs with respect to state-of-the-art chips. Finally, we applied our methods to an image edge detection application using a scaled $VO_2$-ONN that is suitable for low-power edge devices.

#### Acknowledgment

This work is supported by the European Union's Horizon 2020 research and innovation program, EU H2020 NEURONN (www.neuronn.eu) project under Grant No. 871501.

#### DATA AVAILABILITY

The Matlab source code will be made available by the authors upon acceptance of the manuscript.

#### References

- [1] M. Merenda, C. Porcaro, and D. Iero, "Edge machine learning for ai-enabled iot devices: A review," *Sensors*, vol. 20, no. 9, 2020. [Online]. Available: https://www.mdpi.com/1424-8220/20/9/2533
- [2] A. V. Dastjerdi and R. Buyya, "Fog computing: Helping the internet of things realize its potential," *Computer*, vol. 49, no. 8, pp. 112–116, 2016.
- [3] T. Yang, Y. Chen, J. Emer, and V. Sze, "A method to estimate the energy consumption of deep neural networks," in 2017 51st Asilomar Conference on Signals, Systems, and Computers, 2017, pp. 1916–1920.
- [4] J. D. Kendall and S. Kumar, "The building blocks of a brain-inspired computer," *Applied Physics Reviews*, vol. 7, no. 1, p. 011305, 2020. [Online]. Available: https://doi.org/10.1063/1.5129306
- [5] F. C. Hoppensteadt and E. M. Izhikevich, "Pattern recognition via synchronization in phase-locked loop neural networks," *IEEE Transactions on Neural Networks*, vol. 11, no. 3, pp. 734–738, 2000.

- [6] S. Dutta, A. Khanna, W. Chakraborty, J. Gomez, S. Joshi, and S. Datta, "Spoken vowel classification using synchronization of phase transition nano-oscillators," in 2019 Symposium on VLSI Circuits, 2019, pp. T128–T129.
- [7] D. Nikonov, P. Kurahashi, J. Ayers, H. Li, T. Kamgaing, G. Dogiamis, H.-J. Lee, Y. Fan, and I. Young, "Convolution inference via synchronization of a coupled cmos oscillator array," *IEEE Journal on Exploratory Solid-State Computational Devices and Circuits*, vol. PP, pp. 1–1, 12 2020.
- [8] T. Jackson, S. Pagliarini, and L. Pileggi, "An oscillatory neural network with programmable resistive synapses in 28 nm cmos," in 2018 IEEE International Conference on Rebooting Computing (ICRC), 2018, pp. 1–7.
- [9] M. Abernot, T. Gil, M. Jiménez, J. Núñez, M. J. Avellido, B. Linares-Barranco, T. Gonos, T. Hardelin, and A. Todri-Sanial, "Digital implementation of oscillatory neural network for image recognition applications," *Frontiers in Neuroscience*, vol. 15, p. 1095, 2021. [Online]. Available: https://www.frontiersin.org/article/10.3389/fnins.2021.713054
- [10] J. Torrejon, M. Riou, F. A. Araujo, S. Tsunegi, G. Khalsa, D. Querlioz, P. Bortolotti, V. Cros, K. Yakushiji, A. Fukushima, H. Kubota, S. Yuasa, M. D. Stiles, and J. Grollier, "Neuromorphic computing with nanoscale spintronic oscillators," *Nature*, vol. 547, no. 7664, pp. 428–431, Jul 2017. [Online]. Available: https://doi.org/10.1038/nature23011
- [11] P. Maffezzoni, B. Bahr, Z. Zhang, and L. Daniel, "Analysis and design of boolean associative memories made of resonant oscillator arrays," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 63, no. 11, pp. 1964–1973, 2016.
- [12] A. Todri-Sanial, S. Carapezzi, C. Delacour, M. Abernot, T. Gil, E. Corti, S. F. Karg, J. Núñez, M. Jiménez, M. J. Avedillo, and B. Linares-Barranco, "How frequency injection locking can train oscillatory neural networks to compute in phase," *IEEE Transactions on Neural Networks and Learning Systems*, pp. 1–14, 2021.
- [13] E. Corti, A. Khanna, K. Niang, J. Robertson, K. E. Moselund, B. Gotsmann, S. Datta, and S. Karg, "Time-delay encoded image recognition in a network of resistively coupled vo2 on si oscillators," *IEEE Electron Device Letters*, vol. 41, no. 4, pp. 629–632, 2020.
- [14] E. Corti, J. A. Cornejo Jimenez, K. M. Niang, J. Robertson, K. E. Moselund, B. Gotsmann, A. M. Ionescu, and S. Karg, "Coupled vo2 oscillators circuit as analog first layer filter in convolutional neural networks," *Frontiers in Neuroscience*, vol. 15, p. 19, 2021. [Online]. Available: https://www.frontiersin.org/article/10.3389/fnins.2021.628254
- [15] A. Raychowdhury, A. Parihar, G. H. Smith, V. Narayanan, G. Csaba, M. Jerry, W. Porod, and S. Datta, "Computing with networks of oscillatory dynamical systems," *Proceedings of the IEEE*, vol. 107, no. 1, pp. 73–89, 2019.
- [16] T. Wang and J. Roychowdhury, "Oim: Oscillator-based ising machines for solving combinatorial optimisation problems," in *Unconventional Computation and Natural Computation*, I. McQuillan and S. Seki, Eds. Cham: Springer International Publishing, 2019, pp. 232–256.
- [17] I. Ahmed, P.-W. Chiu, W. Moy, and C. H. Kim, "A probabilistic compute fabric based on coupled ring oscillators for solving combinatorial optimization problems," *IEEE Journal of Solid-State Circuits*, vol. 56, no. 9, pp. 2870–2880, 2021.
- [18] S. Dutta, A. Khanna, A. S. Assoa, H. Paik, D. G. Schlom, Z. Toroczkai, A. Raychowdhury, and S. Datta, "An ising hamiltonian solver based on coupled stochastic phase-transition nano-oscillators," *Nature Electronics*, vol. 4, no. 7, pp. 502–512, Jul 2021. [Online]. Available: https://doi.org/10.1038/s41928-021-00616-7
- [19] A. Mallick, M. K. Bashar, D. S. Truesdell, B. H. Calhoun, S. Joshi, and N. Shukla, "Using synchronized oscillators to compute the maximum independent set," *Nature Communications*, vol. 11, no. 1, p. 4689, Sep 2020. [Online]. Available: https://doi.org/10.1038/s41467-020-18445-1
- [20] A. Parihar, N. Shukla, M. Jerry, S. Datta, and A. Raychowdhury, "Vertex coloring of graphs via phase dynamics of coupled oscillatory networks," *Scientific Reports*, vol. 7, no. 1, p. 911, Apr 2017. [Online]. Available: https://doi.org/10.1038/s41598-017-00825-1
- [21] J. Coulombe, M. York, and J. Sylvestre, "Computing with networks of nonlinear mechanical oscillators," *PLoS ONE*, vol. 12, 06 2017.
- [22] A. A. Velichko, D. V. Ryabokon, S. D. Khanin, A. V. Sidorenko, and A. G. Rikkiev, "Reservoir computing using high order synchronization of coupled oscillators," *IOP Conference Series: Materials Science and Engineering*, vol. 862, no. 5, p. 052062, may 2020. [Online]. Available: https://doi.org/10.1088/1757-899x/862/5/052062
- [23] J. Nokkala, R. Martínez-Peña, R. Zambrini, and M. C. Soriano, "High-performance reservoir computing with fluctuations in linear networks," *IEEE Transactions on Neural Networks and Learning Systems*, pp. 1–12, 2021.

- [24] N. Shukla, W.-Y. Tsai, M. Jerry, M. Barth, V. Narayanan, and S. Datta, "Ultra low power coupled oscillator arrays for computer vision applications," in 2016 IEEE Symposium on VLSI Technology, 2016, pp. 1–2.
- [25] N. Shukla, A. Parihar, M. Cotter, M. Barth, X. Li, N. Chandramoorthy, H. Paik, D. G. Schlom, V. Narayanan, A. Raychowdhury, and S. Datta, "Pairwise coupled hybrid vanadium dioxide-mosfet (hvfet) oscillators for non-boolean associative computing," in 2014 IEEE International Electron Devices Meeting, 2014, pp. 28.7.1–28.7.4.
- [26] J. J. Hopfield, "Neural networks and physical systems with emergent collective computational abilities," *Proceedings of the National Academy of Sciences*, vol. 79, no. 8, pp. 2554–2558, 1982. [Online]. Available: https://www.pnas.org/content/79/8/2554
- [27] Z. Fahimi, M. R. Mahmoodi, H. Nili, V. Polishchuk, and D. B. Strukov, "Combinatorial optimization by weight annealing in memristive hopfield networks," *Scientific Reports*, vol. 11, no. 1, p. 16383, Aug 2021. [Online]. Available: https://doi.org/10.1038/s41598-020-78944-5
- [28] J. Wang, J. Wang, and Q.-L. Han, "Multivehicle task assignment based on collaborative neurodynamic optimization with discrete hopfield networks," *IEEE Transactions on Neural Networks and Learning Systems*, vol. 32, no. 12, pp. 5274–5286, 2021.
- [29] Q. Lai, Z. Wan, H. Zhang, and G. Chen, "Design and analysis of multiscroll memristive hopfield neural network with adjustable memductance and application to image encryption," *IEEE Transactions on Neural Networks and Learning Systems*, pp. 1–14, 2022.
- [30] C. Delacour and A. Todri-Sanial, "Mapping hebbian learning rules to coupling resistances for oscillatory neural networks," *Frontiers in Neuroscience*, vol. 15, p. 1489, 2021. [Online]. Available: https://www.frontiersin.org/article/10.3389/fnins.2021.694549
- [31] P. Maffezzoni, L. Daniel, N. Shukla, S. Datta, and A. Raychowdhury, "Modeling and simulation of vanadium dioxide relaxation oscillators," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 62, no. 9, pp. 2207–2215, 2015.
- [32] G. Csaba and W. Porod, "Noise immunity of oscillatory computing devices," *IEEE Journal on Exploratory Solid-State Computational Devices and Circuits*, vol. 6, no. 2, pp. 164–169, 2020.
- [33] H.-C. Chang, X. Cao, U. Mishra, and R. York, "Phase noise in coupled oscillators: theory and experiment," *IEEE Transactions on Microwave Theory and Techniques*, vol. 45, no. 5, pp. 604–615, 1997.
- [34] S. Carapezzi, C. Delacour, G. Boschetto, E. Corti, M. Abernot, A. Nejim, T. Gil, S. Karg, and A. Todri-Sanial, "Multi-scale modeling and simulation flow for oscillatory neural networks for edge computing," in 2021 19th IEEE International New Circuits and Systems Conference (NEWCAS), 2021, pp. 1–5.
- [35] S. Carapezzi, G. Boschetto, C. Delacour, E. Corti, A. Plews, A. Nejim, S. Karg, and A. Todri-Sanial, "Advanced design methods from materials and devices to circuits for brain-inspired oscillatory neural networks for edge computing," *IEEE Journal on Emerging and Selected Topics in Circuits and Systems*, vol. 11, no. 4, pp. 586–596, 2021.
- [36] D. E. Nikonov and I. A. Young, "Benchmarking delay and energy of neural inference circuits," *IEEE Journal on Exploratory Solid-State Computational Devices and Circuits*, vol. 5, no. 2, pp. 75–84, 2019.
- [37] D. E. Nikonov and I. A. Young, "Supplementary materials for benchmarking delay and energy of neural inference circuits," *IEEE Journal on Exploratory Solid-State Computational Devices and Circuits*, vol. 5, no. 2, pp. 75–84, 2019.
- [38] W. Guo, M. E. Fouda, A. M. Eltawil, and K. N. Salama, "Neural coding in spiking neural networks: A comparative study for robust neuromorphic systems," *Frontiers in Neuroscience*, vol. 15, p. 212, 2021. [Online]. Available: https://www.frontiersin.org/article/10.3389/fnins.2021.638474
- [39] M. S. Mian, K. Okimura, and J. Sakai, "Self-oscillation up to 9mhz based on voltage triggered switching in vo2/tin point contact junctions," *Journal of Applied Physics*, vol. 117, no. 21, p. 215305, 2015. [Online]. Available: https://doi.org/10.1063/1.4922122
- [40] P. Chen, F. Zhang, Z. Zong, S. Hu, T. Siriburanon, and R. B. Staszewski, "A 31-μW, 148-fs step, 9-bit capacitor-dac-based constant-slope digital-to-time converter in 28-nm cmos," *IEEE Journal of Solid-State Circuits*, vol. 54, no. 11, pp. 3075–3085, 2019.
- [41] S. Dutta, A. Khanna, A. S. Assoa, H. Paik, D. G. Schlom, Z. Toroczkai, A. Raychowdhury, and S. Datta, "An ising hamiltonian solver based on coupled stochastic phase-transition nano-oscillators - supplementary material," *Nature Electronics*, vol. 4, no. 7, pp. 502–512, Jul 2021. [Online]. Available: https://doi.org/10.1038/s41928-021-00616-7
- [42] D. Liu, T. Chen, S. Liu, J. Zhou, S. Zhou, O. Teman, X. Feng, X. Zhou, and Y. Chen, "Pudiannao: A polyvalent machine learning accelerator," vol. 43, no. 1, 2015. [Online]. Available: https://doi.org/10.1145/2786763.2694358

- [43] T. Chen, Z. Du, N. Sun, J. Wang, C. Wu, Y. Chen, and O. Temam, "Diannao: A small-footprint high-throughput accelerator for ubiquitous machine-learning," vol. 49, 02 2014, pp. 269–284.
- [44] H. Spontón and J. Cardelino, "A Review of Classic Edge Detectors," *Image Processing On Line*, vol. 5, pp. 90–123, 2015, https://doi.org/10.5201/ipol.2015.35.
- [45] M. J. Cotter, Y. Fang, S. P. Levitan, D. M. Chiarulli, and V. Narayanan, "Computational architectures based on coupled oscillators," in 2014 IEEE Computer Society Annual Symposium on VLSI, 2014, pp. 130–135.
- [46] M. Abernot, T. Gil, and A. Todri-Sanial, "Oscillatory Neural Network as Hetero-Associative Memory for Image Edge Detection," in *NICE* 2022 - 9th Neuro-Inspired Computational Elements Workshop. New York (Virtual), United States: ACM, Mar. 2022, p. In press. [Online]. Available: https://hal-lirmm.ccsd.cnrs.fr/lirmm-03586865
- [47] J. Canny, "A computational approach to edge detection," *IEEE Transactions on Pattern Analysis and Machine Intelligence*, vol. PAMI-8, no. 6, pp. 679–698, 1986.
- [48] J. Lee, H. Tang, and J. Park, "Energy efficient canny edge detector for advanced mobile vision applications," *IEEE Transactions on Circuits and Systems for Video Technology*, vol. 28, no. 4, pp. 1037–1046, 2018.
- [49] L. B. Soares, J. Oliveira, E. A. C. da Costa, and S. Bampi, "An energy-efficient and approximate accelerator design for real-time canny edge detection," *Circuits, Systems, and Signal Processing*, vol. 39, no. 12, pp. 6098–6120, Dec 2020. [Online]. Available: https://doi.org/10.1007/s00034-020-01448-0
- [50] N. Shukla, A. Parihar, M. Cotter, M. Barth, X. Li, N. Chandramoorthy, H. Paik, D. G. Schlom, V. Narayanan, A. Raychowdhury, and S. Datta, "Pairwise coupled hybrid vanadium dioxide-mosfet (hvfet) oscillators for non-boolean associative computing," in 2014 IEEE International Electron Devices Meeting, 2014, pp. 28.7.1–28.7.4.