Managing the Acronym/Expansion Identification Process for Text-Mining Applications

Mathieu Roche (1), Violaine Prince (1)
(1) TEXTE - Exploration et exploitation de données textuelles
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract: This paper presents an approach for extracting acronym/definition pairs from textual data (corpora) and for disambiguating these definitions (or expansions). Both steps of our global process for acquiring and managing acronyms are described in detail. The first step uses markers such as brackets to identify expansion candidates; aligning the letters of the acronym with the candidate expansion then makes it possible to select the acronym/definition pairs. The second step determines the relevant expansion of an acronym in a given context. Our method is based on statistical measures (Mutual Information, Cubic Mutual Information, Dice Measure) and on the results provided by search engines. The paper evaluates the global process on real data from general and specialized domains.
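The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the regular expression, the initials-only alignment rule, and the function names (`extract_pairs`, `dice`, `mutual_information`, `cubic_mutual_information`) are assumptions made for illustration. The three measures are written in their common count-based forms, which may differ from the exact formulation used in the article, and the counts are assumed to be co-occurrence or hit counts supplied by the caller.

```python
import math
import re

# Assumed marker pattern: up to 8 words followed by an uppercase acronym in parentheses.
CANDIDATE = re.compile(r"\b((?:[\w'-]+\s+){1,8})\(\s*([A-Z]{2,8})\s*\)")


def extract_pairs(text):
    """Return (acronym, expansion) pairs found as 'expansion (ACRONYM)'.

    Simplified alignment: the last len(acronym) words before the bracket
    must start with the successive letters of the acronym.
    """
    pairs = []
    for match in CANDIDATE.finditer(text):
        words = match.group(1).split()
        acronym = match.group(2)
        n = len(acronym)
        if len(words) < n:
            continue
        candidate = words[-n:]
        if all(w[0].upper() == c for w, c in zip(candidate, acronym)):
            pairs.append((acronym, " ".join(candidate)))
    return pairs


# Count-based association measures. n_x and n_y are the counts of the two
# items taken separately, n_xy the count of their co-occurrence.
# All counts are assumed to be strictly positive.

def dice(n_x, n_y, n_xy):
    return 2 * n_xy / (n_x + n_y)


def mutual_information(n_x, n_y, n_xy):
    # Corpus-size normalization is omitted; it does not change the ranking.
    return math.log2(n_xy / (n_x * n_y))


def cubic_mutual_information(n_x, n_y, n_xy):
    # Cubing the joint count gives more weight to frequent co-occurrences.
    return math.log2(n_xy ** 3 / (n_x * n_y))
```

For example, `extract_pairs("We train a support vector machine (SVM).")` yields the pair `("SVM", "support vector machine")`. To disambiguate, one would score each candidate expansion against the words of the surrounding context with one of the measures and keep the highest-scoring expansion.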
Document type:
Journal article
International Journal of Software and Informatics (IJSI), ISCAS, 2008, Special issue on Data Mining, 2 (2), pp.163-179

https://hal-lirmm.ccsd.cnrs.fr/lirmm-00349235
Contributor: Mathieu Roche
Submitted on: Thursday, 25 December 2008 - 20:58:09
Last modified on: Thursday, 11 January 2018 - 02:06:42

Identifiers

  • HAL Id: lirmm-00349235, version 1

Citation

Mathieu Roche, Violaine Prince. Managing the Acronym/Expansion Identification Process for Text-Mining Applications. International Journal of Software and Informatics (IJSI), ISCAS, 2008, Special issue on Data Mining, 2 (2), pp.163-179. 〈lirmm-00349235〉
