In our approach, we built a set of descriptors, each of which represents one characteristic of the logical units. We proposed an automatic method based on extracting a new type of gram, namely generalized vs-grams. The results show that generalized vs-grams can be used for … This article presents a method for segmenting log files (texts with complex logical units).

Given the characteristics of our domain, we chose the segmentation method called "discourse passages", which is based on identifying the "logical units of documents". We therefore propose a method to characterize the complex logical units found in this type of document according to their characteristics. A supervised learning process is then used to recognize these logical units.