FMU: Fast Mining of Probabilistic Frequent Itemsets in Uncertain Data Streams

Reza Akbarinia 1, Florent Masseglia 1
1 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract: Discovering Probabilistic Frequent Itemsets (PFI) in uncertain data is very challenging, since algorithms designed for deterministic data are not applicable in this context. The problem is even more difficult for uncertain data streams, where massive, frequent updates need to be taken into account while respecting data stream constraints. In this paper, we propose FMU (Fast Mining of Uncertain data streams), the first solution for exact PFI mining in data streams with sliding windows. FMU allows updating the frequentness probability of an itemset whenever a transaction is added to or removed from the observation window. Using these update operations, we are able to extract PFI in sliding windows with very low response times. Furthermore, our method is exact, meaning that we are able to discover the exact probabilistic frequentness distribution function for any monitored itemset, at any time. We implemented FMU and conducted an extensive experimental evaluation over synthetic and real-world data sets; the results illustrate its efficiency.
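To make the idea of updating a frequentness probability concrete, here is a minimal Python sketch. It is not the FMU algorithm from the paper. It assumes that transactions are independent and that each one contains the monitored itemset with a known probability p, so that the itemset's support follows a Poisson binomial distribution. The updates use the standard convolution and deconvolution recurrences for that distribution. The function names (add_transaction, remove_transaction, frequentness_probability) are hypothetical and chosen for illustration only.

```python
from typing import List


def add_transaction(dist: List[float], p: float) -> List[float]:
    """Support distribution after a transaction enters the window.

    dist[i] is the probability that the itemset's support equals i;
    p is the (assumed known) probability that the new transaction
    contains the itemset.
    """
    new = [0.0] * (len(dist) + 1)
    for i, mass in enumerate(dist):
        new[i] += mass * (1.0 - p)  # itemset absent from the new transaction
        new[i + 1] += mass * p      # itemset present in the new transaction
    return new


def remove_transaction(dist: List[float], p: float) -> List[float]:
    """Support distribution after a transaction leaves the window.

    Inverts add_transaction for the same p. The recurrence direction is
    chosen so that the divisor is never smaller than 0.5.
    """
    n = len(dist) - 1
    old = [0.0] * n
    if p < 0.5:
        # dist[i] = old[i] * (1 - p) + old[i - 1] * p, solved for old[i]
        prev = 0.0
        for i in range(n):
            old[i] = (dist[i] - prev * p) / (1.0 - p)
            prev = old[i]
    else:
        # dist[i + 1] = old[i] * p + old[i + 1] * (1 - p), solved for old[i]
        nxt = 0.0
        for i in range(n - 1, -1, -1):
            old[i] = (dist[i + 1] - nxt * (1.0 - p)) / p
            nxt = old[i]
    return old


def frequentness_probability(dist: List[float], min_sup: int) -> float:
    """Probability that the support is at least min_sup."""
    return sum(dist[min_sup:])


if __name__ == "__main__":
    window = [1.0]  # empty window: support is 0 with probability 1
    for prob in (0.5, 0.8, 0.3):
        window = add_transaction(window, prob)
    print(frequentness_probability(window, 2))
    window = remove_transaction(window, 0.5)  # oldest transaction expires
    print(frequentness_probability(window, 2))
```

Each update costs time linear in the window size for one itemset, which is why incremental maintenance is attractive compared with recomputing the whole distribution every time the window slides.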
Document type:
Conference paper
BDA: Bases de Données Avancées, 2012, Clermont-Ferrand, France. 28e journées Bases de Donnees Avancées, 2012

https://hal-lirmm.ccsd.cnrs.fr/lirmm-00748605
Contributor: Reza Akbarinia
Submitted on: Monday, 5 November 2012 - 15:55:36
Last modified on: Wednesday, 21 November 2018 - 19:48:03
Document(s) archived on: Saturday, 17 December 2016 - 07:41:30

File

BDA_2012_-_Fast_Mining_of_Prob...
Files produced by the author(s)

Identifiers

  • HAL Id: lirmm-00748605, version 1

Citation

Reza Akbarinia, Florent Masseglia. FMU: Fast Mining of Probabilistic Frequent Itemsets in Uncertain Data Streams. BDA: Bases de Données Avancées, 2012, Clermont-Ferrand, France. 28e journées Bases de Donnees Avancées, 2012. 〈lirmm-00748605〉
