Conference paper, 2022

A Heuristic Exploration of Retraining-free Weight-Sharing for CNN Compression

Abstract

The computational workload involved in Convolutional Neural Networks (CNNs) is typically out of reach for low-power embedded devices. The scientific literature provides a large number of approximation techniques to address this problem. Among them, the Weight-Sharing (WS) technique gives promising results, but it requires carefully determining the shared values for each layer of a given CNN. As the number of possible solutions grows exponentially with the number of layers, the WS Design Space Exploration (DSE) time can easily explode for state-of-the-art CNNs. In this paper, we propose a new heuristic approach to drastically reduce the exploration time without sacrificing the quality of the output. Experiments on recent CNNs (GoogleNet [1], ResNet50V2 [2], MobileNetV2 [3], InceptionV3 [4], and EfficientNet [5]), trained on the ImageNet [6] dataset, show over 5× memory compression at an acceptable accuracy loss (complying with the MLPerf [7] quality target), without any retraining step and in less than 10 hours. Our code is publicly available on GitHub [8].
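To make the technique concrete: weight sharing replaces a layer's full-precision weights with a small codebook of shared values plus per-weight indices. If each of the L layers can choose among K candidate codebook sizes, the design space contains K^L configurations, which is why the DSE time grows exponentially with depth. The sketch below is a minimal, hypothetical illustration of per-layer weight sharing via k-means clustering; the function name and the use of scikit-learn are assumptions for illustration, not the paper's actual heuristic.

```python
import numpy as np
from sklearn.cluster import KMeans

def share_weights(weights: np.ndarray, n_shared: int) -> np.ndarray:
    """Hypothetical helper: replace each weight with the nearest of
    n_shared k-means centroids (the layer's codebook), so the layer
    can be stored as a small codebook plus low-bit indices."""
    flat = weights.reshape(-1, 1)
    km = KMeans(n_clusters=n_shared, n_init=10, random_state=0).fit(flat)
    # Map every weight to its centroid and restore the original shape.
    return km.cluster_centers_[km.labels_].reshape(weights.shape)

# Toy usage: quantize a 3x3x64x64 conv kernel to 16 shared values,
# so each weight index fits in 4 bits.
rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3, 64, 64)).astype(np.float32)
compressed = share_weights(kernel, n_shared=16)
```

Storing the kernel's 36,864 weights as 4-bit indices plus a 16-entry codebook, instead of 32-bit floats, already yields roughly 8× memory compression for that layer; the paper's contribution is a heuristic for choosing such per-layer configurations quickly, without retraining.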
Main file: ASP_DAC_2022.pdf (433.53 KB)
Origin: Files produced by the author(s)

Dates and versions

lirmm-03767100, version 1 (01-09-2022)

Identifiers

Cite

Etienne Dupuis, David Novo, Ian O'Connor, Alberto Bosio. A Heuristic Exploration of Retraining-free Weight-Sharing for CNN Compression. ASP-DAC 2022 - 27th Asia and South Pacific Design Automation Conference, Jan 2022, Taipei, Taiwan. pp. 134-139, ⟨10.1109/ASP-DAC52403.2022.9712487⟩. ⟨lirmm-03767100⟩