GenDesc: A Partial Generalization of Linguistic Features For Text Classification - LIRMM - Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
Conference paper, Year: 2013

Abstract

This paper presents an application in the field of automatic classification of textual data with supervised learning algorithms. The aim is to study how a better representation of textual data can improve classification quality. Considering that a word's meaning depends on its context, we propose using features that carry important information about word contexts. We present a method named GenDesc, which generalizes (with POS tags) the words that are least relevant to the classification task.
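The idea of partial generalization can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how word relevance is measured, so the `class_skew` score, the `generalize` helper, the threshold value, and the toy pre-tagged documents (with Penn Treebank style tags) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def class_skew(docs):
    """Hypothetical relevance score (an assumption, not from the paper):
    the share of a word's occurrences that fall in its most frequent class.
    1.0 means the word was seen in a single class only."""
    per_word = defaultdict(Counter)
    for tagged_tokens, label in docs:
        for word, _pos in tagged_tokens:
            per_word[word.lower()][label] += 1
    return {w: max(c.values()) / sum(c.values()) for w, c in per_word.items()}


def generalize(tagged_tokens, relevance, threshold):
    """Keep words whose score reaches the threshold;
    replace every other word with its POS tag."""
    return [
        word.lower() if relevance.get(word.lower(), 0.0) >= threshold else pos
        for word, pos in tagged_tokens
    ]


# Toy corpus: tokens are already POS-tagged, each document has a class label.
docs = [
    ([("The", "DT"), ("movie", "NN"), ("was", "VBD"), ("great", "JJ")], "pos"),
    ([("The", "DT"), ("plot", "NN"), ("was", "VBD"), ("awful", "JJ")], "neg"),
]

relevance = class_skew(docs)
features = [generalize(tokens, relevance, threshold=0.75) for tokens, _ in docs]
print(features)
```

In this sketch, words shared evenly by both classes ("the", "was") are replaced with their tags, while class-specific words are kept as features. The resulting mixed sequences of words and tags could then be passed to any supervised classifier.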

Dates and versions

lirmm-00823476 , version 1 (17-05-2013)

Cite

Guillaume Tisserant, Violaine Prince, Mathieu Roche. GenDesc: A Partial Generalization of Linguistic Features For Text Classification. NLDB: Natural Language Processing and Information Systems, Jun 2013, Salford, United Kingdom. pp.343-348, ⟨10.1007/978-3-642-38824-8_35⟩. ⟨lirmm-00823476⟩