Modeling protein families using probabilistic suffix trees, Proc. 3rd Ann. Conf. Computational Molecular Biology (RECOMB), pp.15-24, 1999.

Probability inequalities for the sum of independent random variables, Journal of the American Statistical Association, vol.57, pp.33-45, 1962.

Model Selection and Inference: A Practical Information-Theoretic Approach, 1998.

Chapter VI: Deterministic motif mining in protein databases, Successes and New Directions in Data Mining, 2007.

Bio3d: An R package for the comparative analysis of protein structures, Bioinformatics, vol.22, pp.2695-2696, 2006.

Identification of Outliers, 1980.

Regression and time series model selection in small samples, Biometrika, vol.76, issue.2, pp.297-307, 1989.

Algorithms for mining distance-based outliers in large datasets, Proc. 24th Int. Conf. Very Large Data Bases (VLDB), pp.392-403, 1998.

The power of amnesia: Learning probabilistic automata with variable memory length, Machine Learning, vol.25, issue.2-3, pp.117-149, 1996.

Estimating the dimension of a model, Annals of Statistics, vol.6, issue.2, pp.461-464, 1978.

A mathematical theory of communication, Bell System Technical Journal, vol.27, pp.379-423, 1948.

Further analysis of the data by Akaike's information criterion and the finite corrections, Communications in Statistics: Theory and Methods, vol.7, pp.13-26, 1978.

Mining for outliers in sequential databases, Proc. 6th SIAM Int. Conf. Data Mining, pp.94-105, 2006.

R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, 2006.