Why Fuzzy Sequential Patterns can Help Data Summarization: an Application to the INPI Trademark Database
Abstract
Mining fuzzy rules is one of the best ways to summarize large databases while keeping the information as clear and understandable as possible for the end user. Several approaches have been proposed to mine such rules, in particular fuzzy association rules. However, we argue that it is important to mine rules that also convey information about order. For instance, it is of particular interest to capture the passage of time in rules, which is what fuzzy sequential patterns do. In this paper, we therefore focus on fuzzy sequential patterns. We show that mining such rules requires managing a large amount of information, and we propose algorithms that remain efficient in both memory use and computation time. Our proposal is assessed experimentally. In particular, we apply our algorithms to the INPI database, which stores almost 2 million trademarks.