A Semi-Supervised Approach to the Detection and Characterization of Outliers in Categorical Data - LIRMM - Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
Journal article, IEEE Transactions on Neural Networks and Learning Systems, Year: 2017

A Semi-Supervised Approach to the Detection and Characterization of Outliers in Categorical Data

Abstract

In this paper, we introduce a new approach to semi-supervised anomaly detection for categorical data. Given a training set of instances, all belonging to the normal class, we analyze the relationships among features to extract a discriminative characterization of the anomalous instances. Our key idea is to build a model characterizing the features of the normal instances and then use a set of distance-based techniques to discriminate between normal and anomalous instances. We compare our approach with state-of-the-art methods for semi-supervised anomaly detection. We show empirically that a technique designed specifically for categorical data outperforms general-purpose approaches. We also show that, in contrast with other approaches that are opaque because their decisions cannot be easily understood, our proposal produces a discriminative model that can be easily interpreted and used to explore the data.
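To make the setting concrete, the following is a minimal sketch of semi-supervised, distance-based anomaly scoring on categorical data. It is not the method of the paper: it assumes a plain Hamming (mismatch) distance and a nearest-neighbour score as a generic stand-in, and the names `hamming`, `anomaly_score`, and `normal_rows` are hypothetical, chosen only for illustration.

```python
from typing import Sequence

Row = Sequence[str]


def hamming(a: Row, b: Row) -> int:
    """Count the attributes on which two categorical rows disagree."""
    return sum(x != y for x, y in zip(a, b))


def anomaly_score(row: Row, normal: Sequence[Row], k: int = 1) -> float:
    """Mean Hamming distance from `row` to its k nearest normal rows.

    Only normal instances are used for training (the semi-supervised
    setting); a higher score means the row looks less like normal data.
    """
    dists = sorted(hamming(row, n) for n in normal)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)


# Toy training set: every row is assumed to belong to the normal class.
normal_rows = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("green", "small", "round"),
]

print(anomaly_score(("red", "small", "round"), normal_rows))    # 0.0
print(anomaly_score(("blue", "large", "square"), normal_rows))  # 3.0
```

In this sketch a test instance is flagged when its score exceeds a user-chosen threshold. The mismatch count treats all attribute values as equally different, which is the kind of general-purpose simplification the abstract contrasts with a technique designed specifically for categorical data.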
Main file
tnnls.pdf (537.04 KB)
Origin: Files produced by the author(s)

Dates and versions

lirmm-01275509, version 1 (17-02-2016)

Identifiers

HAL Id: lirmm-01275509
DOI: 10.1109/TNNLS.2016.2526063

Cite

Dino Ienco, Ruggero Pensa, Rosa Meo. A Semi-Supervised Approach to the Detection and Characterization of Outliers in Categorical Data. IEEE Transactions on Neural Networks and Learning Systems, 2017, 28 (5), pp.1017-1029. ⟨10.1109/TNNLS.2016.2526063⟩. ⟨lirmm-01275509⟩