Finding text boundaries and finding topic boundaries: two different tasks?
Abstract
The goal of this paper is to demonstrate that the usual evaluation methods for text segmentation are not suited to every task linked to text segmentation. To do so, we differentiated the task of finding text boundaries in a corpus of concatenated texts from the task of finding transitions between topics inside the same text. We worked on a corpus of twenty-two French political discourses, trying to find the boundaries between them when they are concatenated and to find topic boundaries inside them when they are not. We compared the results of our distance-based method to those of the well-known c99 algorithm.
Domains
Document and Text Processing

Origin: Files produced by the author(s)