Mining sequential patterns, Eleventh International Conference on Data Engineering, pp.3-14, 1995.

Information theory and an extension of the maximum likelihood principle, Second International Symposium on Information Theory, pp.267-281, 1973.

Information Theory. Interscience Publishers, 1965.

Outliers in Statistical Data, 1994.

The Pfam protein families database, Nucleic Acids Res, vol.28, pp.263-266, 2000.

URL : https://hal.archives-ouvertes.fr/hal-01294685

Modeling protein families using probabilistic suffix trees, Proceedings of the 3rd Annual International Conference on Computational Molecular Biology (RECOMB), pp.15-24, 1999.

Probability inequalities for the sum of independent random variables, Journal of the American Statistical Association, vol.57, pp.33-45, 1962.

Model Selection and Inference: A Practical Information-Theoretic Approach, 1998.

Combinatorial methods in density estimation, 2001.

Chapter VI: Deterministic motif mining in protein databases, Successes and New Directions in Data Mining, 2007.

Bio3d: An R package for the comparative analysis of protein structures, Bioinformatics, vol.22, pp.2695-2696, 2006.

Identification of Outliers, 1980.

Regression and time series model selection in small samples, Biometrika, vol.76, issue.2, pp.297-307, 1989.

Algorithms for mining distance-based outliers in large datasets, Proc. 24th Int. Conf. Very Large Data Bases, VLDB, pp.24-27, 1998.

The power of amnesia: Learning probabilistic automata with variable memory length, Machine Learning, vol.25, pp.117-149, 1996.

A mathematical theory of communication, Bell System Technical Journal, vol.27, pp.379-423, 1948.

Further analysis of the data by Akaike's information criterion and the finite corrections, Communications in Statistics: Theory and Methods, vol.7, pp.13-26, 1978.

Mining for outliers in sequential databases, SDM, 2006.

R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, 2006.