Highly-Adaptive Mixed-Precision MAC Unit for Smart and Low-Power Edge Computing
Abstract
Machine learning algorithms are compute- and memory-intensive, which makes their execution at the edge on resource-constrained embedded systems challenging. Data quantization, i.e. data bit-width reduction, helps reduce the memory bandwidth requirement. To best exploit this bit-width reduction, a prevailing approach is to use tailored hardware accelerators. Another approach relies on general-purpose compute units with Single Instruction Multiple Data (SIMD) support for reduced bit-width data, as in ARM Cortex-M [1] or RISC-V based RI5CY [2] processors. However, such processors handle only a few predefined bit-widths, e.g. only 8-bit and 16-bit for the ARM SIMD. This paper proposes a flexible Multiply-and-Accumulate (MAC) unit architecture that allows asymmetric multiplication for operand sizes in powers of 2, up to 32 bits. Synthesis of this architecture in 28 nm FD-SOI technology shows reductions of 10% in area and 25% in dynamic power compared to the RI5CY MAC unit. In terms of energy efficiency, improvements of up to 50% are achieved.
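To make the notion of asymmetric, mixed-precision multiply-and-accumulate more concrete, the following is a minimal behavioural sketch in Python. It is not the architecture proposed in the paper. The function names, the lowest-field-first packing order, the set of supported widths, the lane-pairing rule (only as many lanes as the wider operand provides) and the unbounded accumulator are all assumptions made for illustration.

```python
WORD_BITS = 32
# Assumption: this sketch covers power-of-2 widths from 2 to 32 bits.
ALLOWED_WIDTHS = (2, 4, 8, 16, 32)


def pack_signed(values, bits):
    """Pack signed integers into one word, lowest field first (hypothetical layout)."""
    mask = (1 << bits) - 1
    word = 0
    for i, value in enumerate(values):
        word |= (value & mask) << (i * bits)
    return word


def unpack_signed(word, bits):
    """Split a 32-bit word into signed two's-complement fields, lowest field first."""
    mask = (1 << bits) - 1
    fields = []
    for i in range(WORD_BITS // bits):
        raw = (word >> (i * bits)) & mask
        if raw >= 1 << (bits - 1):  # sign-extend negative fields
            raw -= 1 << bits
        fields.append(raw)
    return fields


def mixed_precision_mac(acc, a_word, b_word, a_bits, b_bits):
    """Accumulate lane-wise products of two packed words with independent widths.

    Assumption: lanes are paired from the lowest field upward, and only as many
    lanes as the wider operand provides are used. The accumulator is a Python
    int, so overflow behaviour of real hardware is not modelled.
    """
    if a_bits not in ALLOWED_WIDTHS or b_bits not in ALLOWED_WIDTHS:
        raise ValueError("operand widths must be powers of 2 between 2 and 32")
    a_fields = unpack_signed(a_word, a_bits)
    b_fields = unpack_signed(b_word, b_bits)
    # zip stops at the shorter list, i.e. at the lane count of the wider operand.
    return acc + sum(a * b for a, b in zip(a_fields, b_fields))


# Usage: four 8-bit activations multiplied by 4-bit weights.
activations = pack_signed([1, -2, 3, -4], 8)
weights = pack_signed([1, 2, 3, 4, 5, 6, 7, -8], 4)
result = mixed_precision_mac(100, activations, weights, 8, 4)
```

In this example only the four lowest 4-bit weights are paired with the four 8-bit activations. A real design could instead consume the remaining weights in a further cycle; the abstract does not specify this, so the sketch leaves them unused.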
Main file
NEWCAS_2021_Highly-Adaptive Mixed-Precision MAC Unit for Smart and Low-Power Edge Computing.pdf (395.34 KB)
Origin: Files produced by the author(s)