Fuzzy Tree Mining: Go Soft on Your Nodes
Abstract
Tree mining consists of discovering the frequent subtrees in a forest of trees. This problem has many application areas. For instance, a huge volume of data available on the Internet is now described by trees (e.g. XML). However, documents dealing with the same topic do not always share the same description, so a common structure must be mined in order to query them. Biology is another field where data may be described by trees. Tree mining has been studied for several years, leading to well-known algorithms. However, these algorithms can hardly handle real data in a soft manner: they consider a subtree to be included in a super-tree only if it is fully included, meaning that all of its nodes must appear. In this paper, we extend this definition to fuzzy inclusion, based on the idea that a tree is included within another one to a certain degree, this fuzzy degree being correlated to the number of matching nodes.
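To make the idea of an inclusion degree concrete, here is a minimal Python sketch. It is not the algorithm of the paper: the `Node` class, the function names, and the specific degree (matched pattern nodes divided by pattern size) are assumptions chosen for illustration, and the matching assumes that sibling labels are distinct and that only direct parent-child links are matched.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A labeled tree node (hypothetical representation, not from the paper)."""
    label: str
    children: list = field(default_factory=list)


def size(tree):
    """Number of nodes in the tree."""
    return 1 + sum(size(child) for child in tree.children)


def all_nodes(tree):
    """Yield every node of the tree, root first."""
    yield tree
    for child in tree.children:
        yield from all_nodes(child)


def matched_nodes(pattern, target):
    """Count pattern nodes matched when the pattern root is anchored on target.

    Assumption: sibling labels are distinct, so each pattern child has at
    most one candidate among the children of the target node.
    """
    if pattern.label != target.label:
        return 0
    by_label = {child.label: child for child in target.children}
    total = 1  # the anchored root itself matches
    for child in pattern.children:
        if child.label in by_label:
            total += matched_nodes(child, by_label[child.label])
    return total


def inclusion_degree(pattern, target):
    """Illustrative fuzzy degree in [0, 1]: best fraction of matched nodes."""
    best = max(matched_nodes(pattern, node) for node in all_nodes(target))
    return best / size(pattern)


# A pattern with four nodes, one of which ("year") is missing from the target.
pattern = Node("book", [Node("title"), Node("author"), Node("year")])
target = Node("library", [
    Node("book", [Node("title"), Node("author"), Node("publisher")]),
])

print(inclusion_degree(pattern, target))  # 0.75
```

Under crisp inclusion the pattern above would simply be rejected because `year` is absent. With a degree of 0.75 it can still contribute to a frequency count, for example by keeping matches whose degree exceeds a user-chosen threshold.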
Domains
Databases [cs.DB]
Origin: Files produced by the author(s)