Discovering Program Topoi via Hierarchical Agglomerative Clustering - LIRMM - Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
Journal article, IEEE Transactions on Reliability, Year: 2018

Discovering Program Topoi via Hierarchical Agglomerative Clustering

Abstract

In long-lifespan software systems, specification documents can be outdated or even missing. Developing new software releases or checking whether some user requirements are still valid becomes challenging in this context. This challenge can be addressed by extracting the high-level observable capabilities of a system by mining its source code and the available source-level documentation. This paper presents feature extraction and traceability (FEAT), an approach that automatically extracts topoi, which are summaries of the main capabilities of a program, given in the form of collections of code functions along with an index. FEAT acts in two steps. First, clustering: by mining the available source code, possibly augmented with code-level comments, hierarchical agglomerative clustering groups similar code functions. In addition, this process gathers an index for each function. Second, entry point selection: functions within a cluster are then ranked and presented to validation engineers as topoi candidates. We implemented FEAT on top of a general-purpose test management and optimization platform and performed an experimental study over 15 open-source software projects amounting to more than 1 M lines of code, proving that automatically discovering topoi is feasible and meaningful on realistic projects.
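
The clustering step described in the abstract can be illustrated with a minimal sketch. It is not FEAT's implementation: the Jaccard distance over token sets, the average linkage, the stopping threshold, and the four function names with their hand-written tokens are all assumptions made for illustration, since the abstract does not specify which features, distance, or linkage FEAT uses.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |intersection| / |union| for two token sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(c1, c2, tokens):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(jaccard_distance(tokens[f], tokens[g]) for f in c1 for g in c2)
    return total / (len(c1) * len(c2))


def agglomerate(tokens, threshold):
    """Bottom-up clustering: start with one cluster per function and
    repeatedly merge the two closest clusters until the closest pair
    is farther apart than `threshold` (an assumed stopping rule)."""
    clusters = [frozenset([name]) for name in sorted(tokens)]
    while len(clusters) > 1:
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], tokens),
        )
        if average_linkage(c1, c2, tokens) > threshold:
            break
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters


# Hypothetical functions, each described by tokens that could be mined
# from identifiers and code-level comments.
functions = {
    "open_file": {"file", "open", "path"},
    "close_file": {"file", "close", "path"},
    "send_packet": {"socket", "send", "packet"},
    "recv_packet": {"socket", "recv", "packet"},
}

clusters = agglomerate(functions, threshold=0.6)
for cluster in clusters:
    print(sorted(cluster))
```

With these toy inputs the file-handling functions and the packet-handling functions end up in two separate clusters, because functions within each pair share two of four tokens while functions across pairs share none. Ranking the functions inside each cluster (the entry point selection step) is left out, as the abstract does not describe the ranking criterion.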
Main file
ieee-rel-jrnl-draft.pdf (931.45 KB)
Origin: Files produced by the author(s)

Dates and versions

lirmm-02088786, version 1 (04-04-2019)

Cite

Carlo Ieva, Arnaud Gotlieb, Souhila Kaci, Nadjib Lazaar. Discovering Program Topoi via Hierarchical Agglomerative Clustering. IEEE Transactions on Reliability, 2018, 67 (3), pp.758-770. ⟨10.1109/TR.2018.2828135⟩. ⟨lirmm-02088786⟩