Gradual Rules: A Heuristic Based Method and Application to Outlier Extraction
Abstract
Nowadays, despite increasingly efficient data mining tools, databases that contain discrete values or that hold a value for each item, such as gene expression data, remain challenging. On such data, existing approaches either transform the data into classical binary attributes or rely on discretisation, including fuzzy partitioning. However, binary mapping of such databases leads to a loss of information, and the extracted knowledge is not exploitable by end users. Powerful tools designed for this kind of data are therefore needed. Moreover, existing fuzzy approaches hardly take gradual notions into account, or are not scalable enough to tackle the problem. In this paper, we propose a heuristic to extract tendencies in the form of gradual association rules. A gradual rule can be read as "The more X and the less Y, then the more V and the less W". Instead of using fuzzy sets, we apply our method directly to valued data and propose an efficient heuristic, thus reducing combinatorial complexity and improving scalability. Experiments on synthetic datasets show the interest of our method. In addition, we propose to use our method in an outlier extraction process. Experiments conducted on a real dataset show the efficiency of our method.
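To make the reading "the more X and the less Y, then the more V" concrete, the sketch below scores a gradual pattern on a small table of valued data. It is only an illustration under stated assumptions: the abstract does not describe the paper's heuristic or its support measure, so the pairwise-concordance score, the function name `gradual_support`, and the sample rows are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def gradual_support(rows, pattern):
    """Fraction of row pairs that follow every (attribute, direction) in pattern.

    direction is '+' ("the more") or '-' ("the less").
    Assumption: pairwise concordance is used as the score; this is an
    illustrative choice, not the measure defined in the paper.
    """
    def respects(a, b):
        # True if going from row a to row b follows all requested variations.
        return all(
            (b[attr] > a[attr]) if direction == "+" else (b[attr] < a[attr])
            for attr, direction in pattern
        )

    pairs = list(combinations(rows, 2))
    if not pairs:
        return 0.0
    # A pair counts if it is concordant in one of its two orientations.
    concordant = sum(1 for a, b in pairs if respects(a, b) or respects(b, a))
    return concordant / len(pairs)


# Hypothetical valued data (for example, expression levels), one dict per row.
rows = [
    {"X": 1, "Y": 9, "V": 2},
    {"X": 2, "Y": 7, "V": 4},
    {"X": 3, "Y": 5, "V": 3},
    {"X": 4, "Y": 1, "V": 8},
]

# "The more X and the less Y, the more V"
support = gradual_support(rows, [("X", "+"), ("Y", "-"), ("V", "+")])
print(round(support, 3))
```

Here five of the six row pairs follow the pattern; the pair formed by the second and third rows breaks it, because V decreases while X increases. Comparing rows directly on their values, rather than on fuzzy membership degrees, mirrors the abstract's point about working on valued data, but enumerating all pairs is quadratic in the number of rows and is exactly the kind of cost a heuristic would aim to avoid.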
Domains
Databases [cs.DB]
Origin: Publisher files allowed on an open archive