GenDesc: A Partial Generalization of Linguistic Features For Text Classification

Abstract : This paper addresses the automatic classification of textual data with supervised learning algorithms. The aim is to study how a better representation of textual data can improve classification quality. Since the meaning of a word depends on its context, we propose using features that carry important information about word contexts. We present a method named GenDesc, which generalizes (with POS tags) the words that are least relevant to the classification task.
Document type : Conference papers
https://hal-lirmm.ccsd.cnrs.fr/lirmm-00823476
Contributor : Guillaume Tisserant
Submitted on : Friday, May 17, 2013 - 9:58:26 AM
Last modification on : Friday, February 8, 2019 - 10:42:20 AM

Citation

Guillaume Tisserant, Violaine Prince, Mathieu Roche. GenDesc: A Partial Generalization of Linguistic Features For Text Classification. NLDB: Natural Language Processing and Information Systems, Jun 2013, Salford, United Kingdom. pp.343-348, ⟨10.1007/978-3-642-38824-8_35⟩. ⟨lirmm-00823476⟩
