Identifying Commented Passages of Documents Using Implicit Hyperlinks

Jean-Yves Delort 1
1 LIRMM/HE - Hors Équipe
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract: This paper addresses the issue of automatically selecting passages of blog posts using readers' comments. The problem is difficult because: (i) the textual content of blogs is often noisy, (ii) comments do not always target passages of the posts, and (iii) comments are not equally useful for identifying important passages. We have developed a system for selecting commented passages which takes as input blog posts and their comments and delivers, for each post, the sentences of the post which are the most commented and/or the most discussed. Our approach combines three steps to identify commented passages of a post. The first step reduces the complexity of processing the contents of posts and comments using heuristics adapted to the language of the blog. The second step finds useful comments and assigns them a degree of relevance using a model automatically built and validated by an expert. The third step identifies important passages using the relevant comments. We conducted two experiments to evaluate the usefulness and the effectiveness of our approach. The first study shows that, in only 50% of the posts, the most commented sentence selected by our approach corresponds to the post extract generated using generic summarization. In the second study, human participants confirmed that, in practice, selected passages are frequently commented passages.
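The three steps described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the method of the paper: the stopword list stands in for the language-adapted heuristics, the best word overlap between a comment and any sentence stands in for the expert-validated relevance model, and the names `tokenize`, `split_sentences`, `overlap`, `rank_commented_sentences` and the `min_relevance` threshold are hypothetical.

```python
import re

# Hypothetical stopword list; a stand-in for the language-adapted
# heuristics of step 1, which the abstract does not detail.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "it", "this", "that", "i"}


def tokenize(text):
    """Step 1 (stand-in): lowercase, keep word tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def split_sentences(post):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", post) if s.strip()]


def overlap(a_tokens, b_tokens):
    """Jaccard overlap between two token lists (0.0 if either is empty)."""
    a, b = set(a_tokens), set(b_tokens)
    return len(a & b) / len(a | b) if a and b else 0.0


def rank_commented_sentences(post, comments, min_relevance=0.1):
    """Return (sentence, score) pairs, most commented sentence first."""
    sentences = split_sentences(post)
    sent_tokens = [tokenize(s) for s in sentences]
    scores = [0.0] * len(sentences)
    for comment in comments:
        c_tokens = tokenize(comment)
        sims = [overlap(c_tokens, st) for st in sent_tokens]
        # Step 2 (stand-in): relevance of a comment is its best overlap
        # with any sentence; comments below the threshold are ignored.
        relevance = max(sims, default=0.0)
        if relevance < min_relevance:
            continue
        # Step 3: each relevant comment credits the sentences it overlaps,
        # weighted by the comment's relevance.
        for i, sim in enumerate(sims):
            scores[i] += relevance * sim
    order = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [(sentences[i], scores[i]) for i in order]
```

Calling `rank_commented_sentences(post, comments)` with a post string and a list of comment strings returns the post's sentences ordered by accumulated comment weight. Off-topic comments such as "Nice blog!" share no content words with the post and therefore contribute nothing, which mirrors the observation that not all comments target passages of the post.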
Document type:
Conference paper
Seventeenth ACM International Conference on Hypertext and Hypermedia, Aug 2006, Odense, Denmark. pp.N/A, 2006, 〈http://www.ht06.org〉

https://hal-lirmm.ccsd.cnrs.fr/lirmm-00143047
Contributor: J.Y. Delort
Submitted on: Tuesday, 24 April 2007 - 10:47:33
Last modified on: Thursday, 24 May 2018 - 15:59:21
Document(s) archived on: Monday, 27 June 2011 - 15:45:40

Identifiers

  • HAL Id : lirmm-00143047, version 1

Citation

Jean-Yves Delort. Identifying Commented Passages of Documents Using Implicit Hyperlinks. Seventeenth ACM International Conference on Hypertext and Hypermedia, Aug 2006, Odense, Denmark. pp.N/A, 2006, 〈http://www.ht06.org〉. 〈lirmm-00143047〉
