A Prime Number Based Approach for Closed Frequent Itemset Mining in Big Data

Mehdi Zitouni (1), Reza Akbarinia (1), Sadok Ben Yahia (2, 3), Florent Masseglia (1)
1 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract: Mining big datasets poses a number of challenges that are not easily addressed by traditional mining methods, since both memory and computational requirements are hard to satisfy. One solution for dealing with such requirements is to take advantage of parallel frameworks, such as MapReduce, that make it possible to build powerful computing and storage units on top of ordinary machines. In this paper, we address the issue of mining closed frequent itemsets (CFI) from big datasets in such environments. We introduce a new parallel algorithm, called CloPN, for CFI mining. One of the noteworthy features of CloPN is that it uses a prime number based approach to transform the data into numerical form, and then mines closed frequent itemsets using only multiplication and division operations. We carried out extensive experiments over big real-world datasets to assess the performance of CloPN. The obtained results highlight that our algorithm is very efficient in CFI mining from large real-world datasets with up to 53 million articles.
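To make the prime number idea in the abstract concrete, here is a minimal single-machine sketch. It is not the CloPN algorithm and does not reproduce its MapReduce parallelization; it only illustrates the general encoding under stated assumptions: each item is mapped to a distinct prime, a transaction becomes the product of its items' primes, and containment is tested by divisibility. The function names (`first_primes`, `encode`, `closed_frequent_itemsets`) and the toy data are hypothetical, and candidates are enumerated by brute force, which would not scale to big datasets.

```python
from itertools import combinations


def first_primes(n):
    """Return the first n primes by trial division (fine for a small demo)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def encode(transactions):
    """Map each item to a distinct prime and each transaction to a product."""
    items = sorted({item for t in transactions for item in t})
    prime_of = dict(zip(items, first_primes(len(items))))
    codes = []
    for t in transactions:
        code = 1
        for item in set(t):  # set() so a repeated item is multiplied once
            code *= prime_of[item]
        codes.append(code)
    return prime_of, codes


def closed_frequent_itemsets(transactions, min_support):
    """Brute-force illustration: return {closed itemset: support}."""
    prime_of, codes = encode(transactions)
    items = sorted(prime_of)
    closed = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cand_code = 1
            for item in candidate:
                cand_code *= prime_of[item]
            # A transaction contains the candidate iff its code is divisible.
            supporting = [c for c in codes if c % cand_code == 0]
            if len(supporting) < min_support:
                continue
            # Closure: every item whose prime divides all supporting codes.
            closure = frozenset(
                i for i in items
                if all(c % prime_of[i] == 0 for c in supporting)
            )
            closed[closure] = len(supporting)
    return closed


if __name__ == "__main__":
    toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}]
    for itemset, support in closed_frequent_itemsets(toy, 2).items():
        print(sorted(itemset), support)
```

In this sketch the closure of a candidate is the set of items shared by all transactions that contain it, so two candidates with the same closure collapse into one entry of the result. Python's arbitrary-precision integers keep the products exact; a fixed-width integer type would need a guard against overflow.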
Document type:
Conference paper
DEXA: Database and Expert Systems Applications, Sep 2015, Valencia, Spain. 26th International Conference on Database and Expert Systems Applications, LNCS (9261), pp.509-516, 2015, 〈http://www.dexa.org〉. 〈10.1007/978-3-319-22849-5_35〉

Cited literature: 11 references

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01169606
Contributor: Florent Masseglia
Submitted on: Monday, 29 June 2015 - 18:37:07
Last modified on: Thursday, 24 May 2018 - 15:59:21
Document(s) archived on: Wednesday, 16 September 2015 - 06:27:26

File

DexaCameraPaper.pdf
Files produced by the author(s)


Citation

Mehdi Zitouni, Reza Akbarinia, Sadok Ben Yahia, Florent Masseglia. A Prime Number Based Approach for Closed Frequent Itemset Mining in Big Data. DEXA: Database and Expert Systems Applications, Sep 2015, Valencia, Spain. 26th International Conference on Database and Expert Systems Applications, LNCS (9261), pp.509-516, 2015, 〈http://www.dexa.org〉. 〈10.1007/978-3-319-22849-5_35〉. 〈lirmm-01169606〉


Metrics

Record views: 711

File downloads: 1506