FMU: Fast Mining of Probabilistic Frequent Itemsets in Uncertain Data Streams
Abstract
Discovering Probabilistic Frequent Itemsets (PFI) in uncertain data is very challenging, since algorithms designed for deterministic data are not applicable in this context. The problem is even more difficult for uncertain data streams, where massive, frequent updates must be taken into account while respecting data stream constraints. In this paper, we propose FMU (Fast Mining of Uncertain data streams), the first solution for exact PFI mining in data streams with sliding windows. FMU allows updating the frequentness probability of an itemset whenever a transaction is added to or removed from the observation window. Using these update operations, we are able to extract PFI in sliding windows with very low response times. Furthermore, our method is exact, meaning that we are able to discover the exact probabilistic frequentness distribution function for any monitored itemset, at any time. We implemented FMU and conducted an extensive experimental evaluation over synthetic and real-world data sets; the results illustrate its efficiency.
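To make the idea of updating a frequentness probability under window insertions and deletions concrete, here is a minimal Python sketch. It is not taken from the paper and may differ from FMU's actual update rules. It assumes that each transaction in the window contains the monitored itemset independently with a known probability `p`, so that the support count follows a Poisson binomial distribution. The function names (`add_transaction`, `remove_transaction`, `frequentness_probability`) and the `min_sup` parameter are hypothetical, introduced only for illustration.

```python
from typing import List


def add_transaction(pmf: List[float], p: float) -> List[float]:
    """Return the support distribution after a transaction enters the window.

    pmf[k] is the probability that exactly k transactions in the window
    contain the itemset; p is the probability that the new one does.
    (Assumed model: independent transactions.)
    """
    out = [0.0] * (len(pmf) + 1)
    for k, mass in enumerate(pmf):
        out[k] += mass * (1.0 - p)   # new transaction does not contain the itemset
        out[k + 1] += mass * p       # new transaction contains the itemset
    return out


def remove_transaction(pmf: List[float], p: float) -> List[float]:
    """Undo add_transaction for a transaction with probability p.

    The recurrence is inverted forward when p <= 0.5 and backward
    otherwise, so the divisor is never smaller than 0.5.
    """
    n = len(pmf) - 1  # window size after the removal
    if n < 0:
        raise ValueError("cannot remove a transaction from an empty window")
    out = [0.0] * (n + 1)
    if n == 0:
        out[0] = 1.0  # empty window: support is 0 with certainty
        return out
    out = [0.0] * n
    if p <= 0.5:
        prev = 0.0  # stands for old[k - 1]
        for k in range(n):
            out[k] = (pmf[k] - p * prev) / (1.0 - p)
            prev = out[k]
    else:
        nxt = 0.0  # stands for old[k + 1]
        for k in range(n - 1, -1, -1):
            out[k] = (pmf[k + 1] - (1.0 - p) * nxt) / p
            nxt = out[k]
    return out


def frequentness_probability(pmf: List[float], min_sup: int) -> float:
    """Probability that the support count is at least min_sup."""
    return sum(pmf[min_sup:])
```

Starting from `[1.0]` (an empty window), calling `add_transaction` once per arriving transaction and `remove_transaction` once per expiring one keeps the full support distribution available, so the frequentness probability for any threshold can be read off at any time. Each update costs time linear in the window size under these assumptions.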
Domains
Databases [cs.DB]
Main file
BDA_2012_-_Fast_Mining_of_Probabilistic_Frequent_Itemsets_in_Uncertain_Data_Streams.pdf (388.46 KB)
Origin | Files produced by the author(s) |
---|