Terminology Extraction from Log Files - LIRMM - Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
Report. Year: 2009

Abstract

In many domains, the log files generated by digital systems contain important information about the conditions and configurations of those systems. Extracting information from these log files is an essential step in the information systems that manage the production line. In Integrated Circuit design, the log files generated by design tools are not exhaustively exploited. Although these log files are written in English, they usually do not respect the grammar and structures of natural language. Moreover, such logs have a heterogeneous and evolving structure. Given these characteristics, applying classical information extraction methods is not an easy task, particularly for terminology extraction. In this paper, we therefore introduce our approach, Exterlog, to extract terminology from such log files. We also investigate whether POS tagging of such log files is a relevant approach for terminology extraction.
Main file
RR09010.pdf (120.58 KB)
Origin: Files produced by the author(s)

Dates and versions

lirmm-00383046 , version 1 (04-05-2012)

Identifiers

  • HAL Id : lirmm-00383046 , version 1

Cite

Hassan Saneifar, Stéphane Bonniol, Anne Laurent, Pascal Poncelet, Mathieu Roche. Terminology Extraction from Log Files. RR-09010, 2009, pp.16. ⟨lirmm-00383046⟩
