Algorithmic statistics revisited
Résumé
The mission of statistics is to provide adequate statistical hypotheses (models) for observed data. But what is an "adequate" model? To answer this question, one needs to use the notions of algorithmic information theory. It turns out that for every data string $x$ one can naturally define "stochasticity profile", a curve that represents a trade-off between complexity of a model and its adequacy. This curve has four different equivalent definitions in terms of (1)~randomness deficiency, (2)~minimal description length, (3)~position in the lists of simple strings and (4)~Kolmogorov complexity with decompression time bounded by busy beaver function. We present a survey of the corresponding definitions and results relating them to each other.