Discovering Program Topoi Through Clustering

Carlo Ieva 1, Arnaud Gotlieb 1, Souhila Kaci 2, Nadjib Lazaar 3
2 SMILE - Système Multi-agent, Interaction, Langage, Evolution
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
3 COCONUT - Agents, Apprentissage, Contraintes
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract: Understanding the source code of large open-source software projects is very challenging when little documentation is available. New developers face the task of classifying a huge number of files and functions without any help. This paper presents a novel approach to this problem, called FEAT, which automatically extracts topoi from source code by using hierarchical agglomerative clustering. Program topoi summarize the main capabilities of a software system by presenting developers with clustered lists of functions together with an index of their relevant words. The clustering method used in FEAT relies on a new hybrid distance that combines textual and structural elements automatically extracted from source code and comments. The experimental evaluation of FEAT shows that this approach is suitable for understanding open-source software projects of size approaching 2,000 functions and 150 files, which opens the door to its deployment in the open-source community.
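The idea summarized in the abstract (agglomerative clustering of functions under a distance that mixes textual and structural information, followed by an index of relevant words per cluster) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from FEAT: the abstract does not specify the distance, so the sketch uses a weighted sum of two Jaccard distances (over word sets and over callee sets), average linkage, a fixed cut-off threshold, and a toy data set with hypothetical function names.

```python
from collections import Counter
from itertools import combinations

# Toy input (hypothetical): for each function, the words found in its
# identifiers/comments and the names of the functions it calls.
FUNCTIONS = {
    "read_config":  {"words": {"config", "file", "read"},  "calls": {"open_file", "parse"}},
    "write_config": {"words": {"config", "file", "write"}, "calls": {"open_file", "serialize"}},
    "draw_line":    {"words": {"draw", "line", "canvas"},  "calls": {"set_pixel"}},
    "draw_rect":    {"words": {"draw", "rect", "canvas"},  "calls": {"set_pixel"}},
}


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def hybrid_distance(f, g, alpha=0.5):
    """Assumed hybrid: alpha * textual distance + (1 - alpha) * structural distance."""
    textual = jaccard_distance(f["words"], g["words"])
    structural = jaccard_distance(f["calls"], g["calls"])
    return alpha * textual + (1.0 - alpha) * structural


def agglomerative_clusters(functions, threshold, alpha=0.5):
    """Average-linkage agglomerative clustering, stopped at a distance threshold."""
    clusters = [[name] for name in functions]

    def linkage(c1, c2):
        # Average pairwise hybrid distance between the members of two clusters.
        dists = [hybrid_distance(functions[a], functions[b], alpha)
                 for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        (i, j), best = min(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best > threshold:
            break  # remaining clusters are too far apart to merge
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [sorted(c) for c in clusters]


def relevant_words(functions, cluster, top=2):
    """Index a cluster by its most frequent words (ties broken alphabetically)."""
    counts = Counter(w for name in cluster for w in functions[name]["words"])
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top]]
```

Calling `agglomerative_clusters(FUNCTIONS, threshold=0.7)` on the toy data groups the two configuration functions together and the two drawing functions together, and `relevant_words` then labels each group with its most shared words. The weight `alpha` and the threshold are free parameters of this sketch; how FEAT balances textual and structural evidence is described in the paper itself, not here.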
Document type:
Conference paper
IAAI: Innovative Applications of Artificial Intelligence, Feb 2018, New Orleans, Louisiana, United States. 13th Annual Conference on Innovative Applications of Artificial Intelligence collocated with the 32nd Conference on Artificial Intelligence, 2018, 〈https://aaai.org/Conferences/AAAI-18/iaai-18/〉

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01790874
Contributor: Joël Quinqueton
Submitted on: Monday, October 15, 2018 - 10:31:09
Last modified on: Sunday, December 16, 2018 - 10:42:02
Document(s) archived on: Wednesday, January 16, 2019 - 13:52:36

File

16045-77150-1-PB.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id: lirmm-01790874, version 1

Citation

Carlo Ieva, Arnaud Gotlieb, Souhila Kaci, Nadjib Lazaar. Discovering Program Topoi Through Clustering. IAAI: Innovative Applications of Artificial Intelligence, Feb 2018, New Orleans, Louisiana, United States. 13th Annual Conference on Innovative Applications of Artificial Intelligence collocated with the 32nd Conference on Artificial Intelligence, 2018, 〈https://aaai.org/Conferences/AAAI-18/iaai-18/〉. 〈lirmm-01790874〉

Metrics

Record views

202

File downloads

11