A Hybrid Fault-Tolerant Architecture for Highly Reliable Processing Cores
Abstract
Increasing vulnerability of transistors and interconnects due to scaling is continuously challenging the reliability of future microprocessors. Lifetime reliability is gaining attention over performance as a design factor even for lower-end commodity applications. In this work we present a low-power hybrid fault tolerant architecture for reliability improvement of pipelined microprocessors by protecting their combinational logic parts. The architecture can handle a broad spectrum of faults with little impact on performance by combining different types of redundancies. Moreover, it addresses the problem of error propagation in nonlinear pipelines and error detection in pipeline stages with memory interfaces. Our case-study implementation of a fault tolerant MIPS microprocessor highlights four main advantages of the proposed solution. It offers (i) 11.6 % power saving, (ii) improved transient error detection capability, (iii) lifetime reliability improvement, and (iv) more effective fault accumulation effect handling, in comparison with TMR architectures. We also present a gate-level fault-injection framework that offers high fidelity to model physical defects and transient faults.