GenDesc: A Partial Generalization of Linguistic Features For Text Classification

Abstract: This paper presents an application of automatic classification of textual data by supervised learning algorithms. The aim is to study how a better representation of textual data can improve classification quality. Considering that the meaning of a word depends on its context, we propose to use features that give important information about word contexts. We present a method named GenDesc, which generalizes (with POS tags) the least relevant words for the classification task.
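
The idea stated in the abstract, replacing the least relevant words by their POS tags while keeping the relevant ones, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration only: the function name `generalize`, the relevance scores, the threshold, and the hand-tagged sentence are hypothetical, and the abstract does not say how GenDesc actually measures relevance.

```python
from typing import Dict, List, Tuple


def generalize(tagged_tokens: List[Tuple[str, str]],
               relevance: Dict[str, float],
               threshold: float) -> List[str]:
    """Keep words scoring at or above `threshold`; replace the others by their POS tag.

    `relevance` is a hypothetical word -> score mapping; words missing
    from it are treated as having a score of 0.0.
    """
    features = []
    for word, pos in tagged_tokens:
        token = word.lower()
        score = relevance.get(token, 0.0)
        features.append(token if score >= threshold else pos)
    return features


# Hand-tagged example sentence (tags and scores are illustrative assumptions).
tagged = [("The", "DET"), ("film", "NOUN"), ("was", "VERB"),
          ("really", "ADV"), ("excellent", "ADJ")]
relevance = {"excellent": 0.9, "film": 0.2, "really": 0.4}

features = generalize(tagged, relevance, threshold=0.5)
print(features)  # ['DET', 'NOUN', 'VERB', 'ADV', 'excellent']
```

In this sketch only "excellent" is kept as a word, while the other tokens are reduced to their grammatical category. The resulting mixed sequence of words and tags could then be passed to any supervised classifier as its feature representation.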
Document type:
Conference paper
NLDB'2013: International Conference on Applications of Natural Language to Information Systems, Jun 2013, United Kingdom. pp.6, 2013, 〈nldb.org〉

https://hal-lirmm.ccsd.cnrs.fr/lirmm-00823476
Contributor: Guillaume Tisserant
Submitted on: Friday, 17 May 2013 - 09:58:26
Last modified on: Thursday, 24 May 2018 - 15:59:23

Identifiers

  • HAL Id: lirmm-00823476, version 1

Citation

Guillaume Tisserant, Violaine Prince, Mathieu Roche. GenDesc: A Partial Generalization of Linguistic Features For Text Classification. NLDB'2013: International Conference on Applications of Natural Language to Information Systems, Jun 2013, United Kingdom. pp.6, 2013, 〈nldb.org〉. 〈lirmm-00823476〉
