From Terminology Extraction to Terminology Validation: An Approach Adapted to Log Files

Abstract: Log files generated by computational systems contain relevant and essential information. In some application areas, such as the design of integrated circuits, log files generated by design tools contain information that can be used in management information systems to evaluate the final products. However, the complexity of such textual data raises challenges for extracting information from log files. Log files are usually multi-source and multi-format, and have a heterogeneous and evolving structure. Moreover, they usually do not respect natural language grammar and structures even though they are written in English. Classical methods of information extraction, such as terminology extraction methods, are therefore poorly suited to this context. In this paper, we introduce our approach, Exterlog, to extract terminology from log files. We detail how it deals with the specific features of such textual data. Performance is improved by favoring the most relevant terms of the domain using a scoring function based on a Web and context-based measure. The experiments show that Exterlog is a well-adapted approach for terminology extraction from log files.
Document type:
Journal article
Journal of Universal Computer Science, Graz University of Technology, Institut für Informationssysteme und Computer Medien, 2015, 21 (4), pp.604-636. 〈http://www.jucs.org/jucs_21_4/from_terminology_extraction_to/jucs_21_04_0604_0635_saneifar.pdf〉

Cited literature: 38 references

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01184551
Contributor: Mathieu Roche
Submitted on: Sunday, 16 August 2015 - 03:47:20
Last modified on: Thursday, 11 January 2018 - 06:27:21
Document(s) archived on: Wednesday, 26 April 2017 - 09:56:22

File

jucs_21_04_0604_0635_saneifar....
Publisher files allowed on an open archive

Identifiers

  • HAL Id: lirmm-01184551, version 1

Citation

Hassan Saneifar, Stéphane Bonniol, Pascal Poncelet, Mathieu Roche. From Terminology Extraction to Terminology Validation: An Approach Adapted to Log Files. Journal of Universal Computer Science, Graz University of Technology, Institut für Informationssysteme und Computer Medien, 2015, 21 (4), pp.604-636. 〈http://www.jucs.org/jucs_21_4/from_terminology_extraction_to/jucs_21_04_0604_0635_saneifar.pdf〉. 〈lirmm-01184551〉
