Combining C-value and Keyword Extraction Methods for Biomedical Terms Extraction

Juan Antonio Lossio-Ventura (1, *), Clement Jonquet (2, 3), Mathieu Roche (4, 1), Maguelonne Teisseire (4, 1)
* Corresponding author
1 ADVANSE - ADVanced Analytics for data SciencE
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
2 SMILE - Système Multi-agent, Interaction, Langage, Evolution
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract: The objective of this work is to extract and rank biomedical terms from free text. We present new extraction methods that use linguistic patterns specialized for the biomedical field together with term extraction measures, such as C-value, and keyword extraction measures, such as Okapi BM25 and TF-IDF. We propose several combinations of these measures to improve the extraction and ranking process. Our experiments show that an appropriate harmonic mean of C-value and keyword extraction measures offers better precision than the measures used alone, for the extraction of both single-word and multi-word terms. We illustrate our results on the extraction of English and French biomedical terms from a corpus of laboratory tests. The results are validated using UMLS (for English) and only MeSH (for French) as reference dictionaries.
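The combination described in the abstract can be illustrated with a minimal sketch. This record does not give the exact formula, so the code below assumes a plain harmonic mean of two scores that have first been rescaled to [0, 1] with min-max normalization. The function names, the normalization step, and the sample scores are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def harmonic_mean(a: float, b: float) -> float:
    """Harmonic mean of two non-negative scores; 0.0 when both are zero."""
    if a < 0 or b < 0:
        raise ValueError("scores must be non-negative")
    total = a + b
    return 0.0 if total == 0 else 2.0 * a * b / total


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] (an assumption, so the two measures are comparable)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {term: 0.0 for term in scores}
    return {term: (s - lo) / (hi - lo) for term, s in scores.items()}


def rank_terms(c_value: Dict[str, float],
               keyword_score: Dict[str, float]) -> List[Tuple[str, float]]:
    """Rank candidate terms by the harmonic mean of their two normalized scores."""
    c_norm = min_max_normalize(c_value)
    k_norm = min_max_normalize(keyword_score)
    shared = c_norm.keys() & k_norm.keys()
    combined = {t: harmonic_mean(c_norm[t], k_norm[t]) for t in shared}
    # Highest combined score first; ties are broken alphabetically.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical scores, for illustration only (not results from the paper).
c_value_scores = {"blood glucose": 4.0, "glucose": 2.0, "test": 0.0}
tfidf_scores = {"blood glucose": 0.5, "glucose": 1.0, "test": 0.0}
ranked = rank_terms(c_value_scores, tfidf_scores)
```

A harmonic mean rewards candidates that score well on both measures: a term with a high C-value but a near-zero keyword score (or the reverse) ends up near the bottom of the ranking.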
Document type:
Conference paper
LBM: Languages in Biology and Medicine, Dec 2013, Tokyo, Japan. 5th International Symposium on Languages in Biology and Medicine, 2013, 〈http://lbm2013.biopathway.org/〉

Cited literature [21 references]

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01019991
Contributor: Juan Antonio Lossio Ventura
Submitted on: Monday, 7 July 2014 - 15:43:30
Last modified on: Thursday, 24 May 2018 - 15:59:25
Document(s) archived on: Monday, 12 October 2015 - 11:35:39

File

LBM2013.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: lirmm-01019991, version 1

Citation

Juan Antonio Lossio-Ventura, Clement Jonquet, Mathieu Roche, Maguelonne Teisseire. Combining C-value and Keyword Extraction Methods for Biomedical Terms Extraction. LBM: Languages in Biology and Medicine, Dec 2013, Tokyo, Japan. 5th International Symposium on Languages in Biology and Medicine, 2013, 〈http://lbm2013.biopathway.org/〉. 〈lirmm-01019991〉
