Terminology Extraction from Log Files - LIRMM - Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
Reports, Year: 2009

Terminology Extraction from Log Files

Abstract

In many domains, the log files generated by digital systems contain important information about the conditions and configurations of those systems. Extracting information from these log files is an essential step in the information systems that manage a production line. In Integrated Circuit design, the log files generated by design tools are not exhaustively exploited. Although these log files are written in English, they usually do not respect the grammar and structures of natural language. Moreover, such logs have a heterogeneous and evolving structure. Given these features, applying classical information extraction methods is not an easy task, particularly for terminology extraction. In this paper, we therefore introduce our approach, Exterlog, to extract terminology from such log files. We also investigate whether POS tagging of such log files is a relevant approach for terminology extraction.
Main file
RR09010.pdf (120.58 KB) Download file
Origin: Files produced by the author(s)

Dates and versions

lirmm-00383046, version 1 (04-05-2012)

Identifiers

  • HAL Id: lirmm-00383046, version 1

Cite

Hassan Saneifar, Stéphane Bonniol, Anne Laurent, Pascal Poncelet, Mathieu Roche. Terminology Extraction from Log Files. RR-09010, 2009, pp.16. ⟨lirmm-00383046⟩
