Handling Fuzzy Gaps in Sequential Patterns: Application to Health
Abstract
Mining novel knowledge from numerical data is a nontrivial task that has received much attention in recent years. However, such data remain difficult to handle, especially when large volumes of values must be analyzed. In our work, we focus on biological data from DNA chips, which biologists study in order to discover new gene correlations that could help in understanding diseases such as breast cancer. In this framework, we consider the values from DNA microarrays, which convey the behavior of some genes, and we want to discover how these behaviors are correlated. These data are considered ordered, since numerical values can be sorted. In previous work, sequential patterns like (1 5)(2) have been discovered, meaning that genes 1 and 5 have the same expression level, followed by gene 2, which has a higher expression value. However, such data are very noisy, and treating close values as strictly ordered is often wrong. We thus consider here fuzzy rankings based on a fuzzy partition provided by the experts. Rules can then better characterize how genes are correlated.
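To illustrate the idea of a fuzzy partition over expression levels, here is a minimal Python sketch. It is not the authors' method: the labels "low", "medium" and "high", the breakpoints, the normalization of values to [0, 1], and the helper names `trapezoid` and `fuzzy_levels` are all assumptions made for the example. In the paper, the partition is provided by the experts.

```python
def trapezoid(x, a, b, c, d):
    """Membership degree of x in a trapezoidal fuzzy set with a <= b <= c <= d."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)  # rising edge
    return (d - x) / (d - c)      # falling edge


# Hypothetical partition of normalized expression values in [0, 1].
# Adjacent sets overlap so that memberships sum to 1 at every point.
PARTITION = {
    "low": (0.0, 0.0, 0.3, 0.5),
    "medium": (0.3, 0.5, 0.5, 0.7),
    "high": (0.5, 0.7, 1.0, 1.0),
}


def fuzzy_levels(value):
    """Return the membership degree of an expression value in each fuzzy level."""
    return {label: trapezoid(value, *params) for label, params in PARTITION.items()}


if __name__ == "__main__":
    # Two close, noisy values: a crisp ordering would rank them strictly,
    # while the fuzzy partition shows that both are mostly "medium".
    for v in (0.48, 0.52):
        print(v, fuzzy_levels(v))
```

Under these assumed breakpoints, the values 0.48 and 0.52 both belong to "medium" with a high degree, which is the kind of situation where a strict ordering of noisy measurements would be misleading.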
Domains
Databases [cs.DB]
Origin: Files produced by the author(s)