Yet Another Ranking Function for Automatic Multiword Term Extraction

Juan Antonio Lossio-Ventura (1, *), Clement Jonquet (2, 3), Mathieu Roche (4, 1), Maguelonne Teisseire (4, 1)
* Corresponding author
1 ADVANSE - ADVanced Analytics for data SciencE
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
2 SMILE - Système Multi-agent, Interaction, Langage, Evolution
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract: Term extraction is an essential task in domain knowledge acquisition. We propose two new measures to extract multiword terms from domain-specific text. The first measure is both linguistically and statistically based. The second measure is graph-based and assesses the importance of a multiword term within a domain. Existing measures often address some, but not all, of the problems related to term extraction, e.g., noise, silence, low frequency, large corpora, and the complexity of the multiword term extraction process. Instead, we focus on managing the entire set of problems, e.g., detecting rare terms and overcoming the low-frequency issue. By comparing them with state-of-the-art reference measures, we show that the two proposed measures outperform the precision results previously reported for automatic multiword term extraction.
Document type: Conference paper
A. Przepiórkowski; M. Ogrodniczuk. PolTAL: Natural Language Processing, Sep 2014, Warsaw, Poland. Springer, 9th International Conference on Natural Language Processing, 17–19 September 2014, Warsaw, Poland, LNCS (8686), pp. 52-64, 2014, Advances in Natural Language Processing. 〈http://poltal.ipipan.waw.pl/〉. 〈10.1007/978-3-319-10888-9_6〉
Cited literature: 25 references

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01068556
Contributor: Juan Antonio Lossio Ventura
Submitted on: Thursday, 25 September 2014 - 18:14:42
Last modified on: Monday, 22 October 2018 - 09:54:03
Document(s) archived on: Friday, 26 December 2014 - 11:21:17

File

PolTAL2014.pdf
Files produced by the author(s)

Citation

Juan Antonio Lossio-Ventura, Clement Jonquet, Mathieu Roche, Maguelonne Teisseire. Yet Another Ranking Function for Automatic Multiword Term Extraction. A. Przepiórkowski; M. Ogrodniczuk. PolTAL: Natural Language Processing, Sep 2014, Warsaw, Poland. Springer, 9th International Conference on Natural Language Processing, 17–19 September 2014, Warsaw, Poland, LNCS (8686), pp. 52-64, 2014, Advances in Natural Language Processing. 〈http://poltal.ipipan.waw.pl/〉. 〈10.1007/978-3-319-10888-9_6〉. 〈lirmm-01068556〉

Metrics

Record views

508

File downloads

613