Kendall's rank correlation on quantized data: An interval-valued approach
Abstract
Kendall's rank correlation coefficient, also called Kendall's τ, is an efficient and robust method for identifying monotone relationships between two data sequences. However, when it is applied to digital data, quantization produces a large number of ties, which leads to inconsistent results. Here, we propose an extension of Kendall's τ that takes an epistemic view of a sequence of quantized data: each sample is assumed to be the quantized version of an original real-valued sample. We define an imprecise τ as the interval containing all τ values that could have been computed on the sequences of original values before quantization. We propose a simple algorithm to compute this interval-valued τ, prove that its bounds are exact, and present an experiment that illustrates the need for such an extension.
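To make the idea of an interval-valued τ concrete, the following minimal Python sketch counts concordant, discordant and tied pairs, then treats every tied pair as potentially concordant (upper bound) or potentially discordant (lower bound). It is an illustration under stated assumptions, not necessarily the algorithm proposed in the paper: the function name `interval_tau` is hypothetical, quantization is assumed to preserve order, and the original real values are assumed to contain no ties.

```python
from itertools import combinations


def interval_tau(x, y):
    """Return (tau_low, tau_high) for two quantized sequences.

    Assumptions (illustrative, not taken from the paper):
    - quantization is order-preserving, so untied pairs keep their order;
    - the original real values have no ties, so each tied pair was
      either concordant or discordant before quantization.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")

    concordant = discordant = tied = 0
    for i, j in combinations(range(n), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
        else:
            # Tie in x, in y, or in both: the original order is unknown.
            tied += 1

    n_pairs = n * (n - 1) // 2
    base = concordant - discordant
    # Lower bound: every tied pair counted as discordant.
    # Upper bound: every tied pair counted as concordant.
    return (base - tied) / n_pairs, (base + tied) / n_pairs


if __name__ == "__main__":
    # Two tied pairs out of six widen tau into an interval.
    print(interval_tau([1, 1, 2, 3], [1, 2, 2, 3]))
```

When the sequences contain no ties, the two bounds coincide with the classical τ; the more ties quantization introduces, the wider the interval becomes. This pairwise loop runs in O(n²) time and is meant only for clarity.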