A Semi-Supervised Approach to the Detection and Characterization of Outliers in Categorical Data

Abstract: In this paper, we introduce a new approach to semi-supervised anomaly detection for categorical data. Given a training set of instances, all belonging to the normal class, we analyze the relationships among features to extract a discriminative characterization of the anomalous instances. Our key idea is to build a model that characterizes the features of the normal instances and then to use a set of distance-based techniques to discriminate between normal and anomalous instances. We compare our approach with state-of-the-art methods for semi-supervised anomaly detection. We show empirically that a technique designed specifically for categorical data outperforms general-purpose approaches. We also show that, in contrast with other approaches, which are opaque because their decisions cannot be easily understood, our proposal produces a discriminative model that can be easily interpreted and used for data exploration.
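The general setting described in the abstract (train on normal instances only, then score new instances by distance) can be illustrated with a minimal sketch. This is not the authors' method: it is a generic nearest-neighbour baseline that uses the Hamming (mismatch) distance, and every name in it (`hamming`, `knn_score`, `fit_threshold`, `is_anomalous`), the toy data, and the leave-one-out threshold rule are assumptions made purely for illustration.

```python
from typing import List, Sequence

Instance = Sequence[str]


def hamming(a: Instance, b: Instance) -> float:
    """Fraction of features on which two categorical instances disagree."""
    if len(a) != len(b):
        raise ValueError("instances must have the same number of features")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def knn_score(x: Instance, normals: List[Instance], k: int = 1) -> float:
    """Mean Hamming distance from x to its k nearest normal instances."""
    if not normals:
        raise ValueError("at least one normal instance is required")
    dists = sorted(hamming(x, n) for n in normals)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)


def fit_threshold(normals: List[Instance], k: int = 1) -> float:
    """Largest leave-one-out score among the normal training instances.

    Assumed rule for this sketch only: anything farther from the training
    set than the most isolated normal instance is flagged.
    """
    if len(normals) < 2:
        raise ValueError("at least two normal instances are required")
    return max(
        knn_score(x, normals[:i] + normals[i + 1:], k)
        for i, x in enumerate(normals)
    )


def is_anomalous(x: Instance, normals: List[Instance],
                 threshold: float, k: int = 1) -> bool:
    """Flag x when its score exceeds the threshold learned from normals."""
    return knn_score(x, normals, k) > threshold


# Toy training set: every instance belongs to the normal class.
normals = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("green", "small", "round"),
    ("red", "medium", "round"),
]
threshold = fit_threshold(normals)

print(is_anomalous(("blue", "large", "square"), normals, threshold))
print(is_anomalous(("red", "small", "round"), normals, threshold))
```

The Hamming distance treats all mismatches as equally severe and ignores relationships among features, which is precisely the kind of general-purpose simplification the abstract argues against; the sketch only shows the train-on-normal, score-by-distance workflow.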
Document type:
Journal article
IEEE Transactions on Neural Networks and Learning Systems, IEEE, 2016, 〈10.1109/TNNLS.2016.2526063〉

Cited literature: 33 references

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01275509
Contributor: Dino Ienco
Submitted on: Wednesday, February 17, 2016 - 15:36:43
Last modified on: Thursday, January 11, 2018 - 06:27:21
Document(s) archived on: Wednesday, May 18, 2016 - 13:08:49

File

tnnls.pdf
Files produced by the author(s)

Citation

Dino Ienco, Ruggero Pensa, Rosa Meo. A Semi-Supervised Approach to the Detection and Characterization of Outliers in Categorical Data. IEEE Transactions on Neural Networks and Learning Systems, IEEE, 2016, 〈10.1109/TNNLS.2016.2526063〉. 〈lirmm-01275509〉
