How can catchy titles be generated without loss of informativeness?

Cédric Lopez; Violaine Prince; Mathieu Roche

doi:10.1016/j.eswa.2013.07.102

Journal article, Expert Systems with Applications, Year: 2014

Cédric Lopez

Role: Author
PersonId: 960390
ORCID: 0000-0002-4933-5720
IdRef: 164704922

VISEO

Violaine Prince

Role: Author
PersonId: 942907
ORCID: 0000-0002-5997-9677

Exploration et exploitation de données textuelles

Mathieu Roche

Role: Author
PersonId: 4967
IdHAL: mathieu-roche
ORCID: 0000-0003-3272-8568
IdRef: 09042087X

ADVanced Analytics for data SciencE

Territoires, Environnement, Télédétection et Information Spatiale

Abstract

Automatic titling of text documents is an essential task for several applications (automatic heading of e-mails, summarization, and so forth). This paper describes a system facilitating information retrieval in a set of textual documents by tackling the automatic titling and subtitling issue. Automatic titling here involves providing both informative and catchy titles. We thus propose two different approaches based on NLP, text mining, and Web Mining techniques. The first one (POSTIT) consists of extracting relevant noun phrases from texts as candidate titles. An original approach combining statistical criteria and noun phrase positions in the text helps in collecting informative titles and subtitles. The second approach (NOMIT) is based on various assumptions made on POSTIT and aims to generate both informative and catchy titles. Both approaches are applied to a corpus of news articles, then evaluated according to two criteria, i.e. informativeness and catchiness.
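
The abstract describes POSTIT only at a high level: candidate noun phrases are ranked by combining statistical criteria with their position in the text. The sketch below illustrates that general idea and is not the authors' implementation. Everything in it is an assumption made for illustration: the function name `rank_candidates`, the use of mean relative word frequency as the statistical criterion, the linear "earliness" score as the positional criterion, the `position_weight` parameter, and the choice to pass in candidate phrases that were extracted beforehand (the noun-phrase extraction step is not shown).

```python
import re
from collections import Counter


def rank_candidates(text, candidates, position_weight=0.5):
    """Rank candidate title phrases by word frequency and by how early they appear.

    Hypothetical scoring, not the POSTIT formula from the paper.
    """
    lowered = text.lower()
    words = re.findall(r"\w+", lowered)
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty text

    scored = []
    for phrase in candidates:
        tokens = re.findall(r"\w+", phrase.lower())
        if not tokens:
            continue
        # Statistical criterion (assumed): mean relative frequency of the phrase's words.
        freq = sum(counts[t] for t in tokens) / (len(tokens) * total)
        # Positional criterion (assumed): 1.0 at the start of the text,
        # decreasing toward 0.0 at the end; 0.0 if the phrase does not occur.
        idx = lowered.find(phrase.lower())
        earliness = 1.0 - idx / len(lowered) if idx >= 0 else 0.0
        score = (1 - position_weight) * freq + position_weight * earliness
        scored.append((score, phrase))

    # Highest score first; sorted() is stable, so ties keep their input order.
    return [phrase for _, phrase in sorted(scored, key=lambda s: s[0], reverse=True)]


article = (
    "The city council approved the new budget. "
    "The budget funds schools and roads. "
    "Critics questioned the vote."
)
print(rank_candidates(article, ["Critics", "schools and roads", "the new budget"]))
```

With these assumed weights, a phrase that appears early and contains frequent words ranks above one that appears only near the end of the text. A catchiness criterion, which the abstract attributes to NOMIT, is outside the scope of this sketch.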

Domains

Web; Information retrieval [cs.IR]; Artificial intelligence [cs.AI]

Contributor: Mathieu Roche

https://hal-lirmm.ccsd.cnrs.fr/lirmm-00856372

Submitted on: Monday, 9 September 2013, 15:59:54

Last modified on: Tuesday, 10 October 2023, 16:38:10

Dates and versions

lirmm-00856372, version 1 (09-09-2013)

Identifiers

HAL Id: lirmm-00856372, version 1
DOI: 10.1016/j.eswa.2013.07.102

Cite

Cédric Lopez, Violaine Prince, Mathieu Roche. How can catchy titles be generated without loss of informativeness? Expert Systems with Applications, 2014, 41 (4), pp.1051-1062. ⟨10.1016/j.eswa.2013.07.102⟩. ⟨lirmm-00856372⟩

Collections

CIRAD AGROPARISTECH CNRS IRSTEA ADVANSE TEXTE LIRMM AGROPOLIS TETIS MIPS UNIV-MONTPELLIER INRAE INRAEOCCITANIEMONTPELLIER MATHNUM

187 views

0 downloads
