Text Segmentation based on Document Understanding for Information Retrieval - LIRMM - Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
Conference Papers, Year: 2007

Text Segmentation based on Document Understanding for Information Retrieval

Abstract

Information retrieval must match relevant texts to a given query. When documents are long and only some portions interest the user, selecting the appropriate parts is useful. In this paper, we describe a method that makes extensive use of natural language techniques for text segmentation based on topic change detection. The method requires an NLP parser and a semantic representation in Roget-based vectors. We ran the experiment on French documents, for which we have the appropriate tools, but the method could be transposed to any other language that meets the same requirements. The article gives an overview of the functionalities of the NL understanding environment and of the algorithms related to our text segmentation method. An experiment in text segmentation is also presented, and its result in an information retrieval task is shown.
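The abstract does not spell out the segmentation algorithm, so the following is only a generic illustration of topic change detection over sentence vectors, not the authors' method. It assumes each sentence has already been mapped to a numeric vector (the toy three-dimensional vectors below stand in for Roget-based semantic vectors, which are much larger), and it places a boundary wherever the cosine similarity between adjacent sentences drops below a hypothetical threshold. The names `cosine`, `segment`, and `threshold` are illustrative choices.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is null)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def segment(vectors, threshold=0.5):
    """Return the indices of sentences that start a new segment.

    A boundary is placed before sentence i when its similarity to
    sentence i - 1 falls below `threshold` (an assumed, untuned value).
    """
    boundaries = []
    for i in range(1, len(vectors)):
        if cosine(vectors[i - 1], vectors[i]) < threshold:
            boundaries.append(i)
    return boundaries


# Toy data: two sentences on one topic, then two on another.
toy_vectors = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 0.1, 1.0],
    [0.1, 0.0, 0.9],
]

print(segment(toy_vectors))
```

In a retrieval setting, the resulting segments could then be indexed and matched against a query individually instead of the whole document, which is the use case the abstract motivates.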
Main file: nldb07.pdf (102.68 KB)

Dates and versions

lirmm-00161996, version 1 (12-07-2007)

Cite

Violaine Prince, Alexandre Labadié. Text Segmentation based on Document Understanding for Information Retrieval. NLDB: Natural Language Processing and Information Systems, Jun 2007, Paris, France. pp.295-304, ⟨10.1007/978-3-540-73351-5_26⟩. ⟨lirmm-00161996⟩