A Semi-Supervised Approach to the Detection and Characterization of Outliers in Categorical Data - LIRMM - Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
Journal article, IEEE Transactions on Neural Networks and Learning Systems, 2017

A Semi-Supervised Approach to the Detection and Characterization of Outliers in Categorical Data

Abstract

In this paper, we introduce a new approach to semi-supervised anomaly detection for categorical data. Given a training set of instances, all belonging to the normal class, we analyze the relationships among features to extract a discriminative characterization of the anomalous instances. Our key idea is to build a model that characterizes the features of the normal instances and then to use a set of distance-based techniques to discriminate between normal and anomalous instances. We compare our approach with state-of-the-art methods for semi-supervised anomaly detection and show empirically that a technique designed specifically for categorical data outperforms general-purpose approaches. We also show that, in contrast with other approaches that are opaque because their decisions cannot be easily understood, our proposal produces a discriminative model that can be easily interpreted and used to explore the data.
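The general setting described in the abstract (train on normal instances only, then separate normal from anomalous instances by distance) can be illustrated with a minimal sketch. This is not the authors' method: it swaps in a plain normalized Hamming distance and a k-nearest-neighbor score, and every name in it (`hamming`, `knn_score`, `fit_threshold`, `is_anomalous`), the toy data, and the choice of threshold are assumptions made for illustration.

```python
from typing import List, Sequence

Instance = Sequence[str]


def hamming(a: Instance, b: Instance) -> float:
    """Fraction of features on which two categorical instances disagree."""
    if len(a) != len(b):
        raise ValueError("instances must have the same number of features")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def knn_score(x: Instance, normal: List[Instance], k: int = 2) -> float:
    """Mean distance from x to its k nearest normal training instances."""
    dists = sorted(hamming(x, n) for n in normal)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)


def fit_threshold(normal: List[Instance], k: int = 2) -> float:
    """Assumed rule: the largest leave-one-out score seen on the normal set."""
    scores = []
    for i, x in enumerate(normal):
        rest = normal[:i] + normal[i + 1:]
        scores.append(knn_score(x, rest, k))
    return max(scores)


def is_anomalous(x: Instance, normal: List[Instance],
                 threshold: float, k: int = 2) -> bool:
    """Flag x when it lies farther from the normal set than any training point."""
    return knn_score(x, normal, k) > threshold


# Toy training set: every instance is assumed to belong to the normal class.
normal = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("red", "medium", "round"),
    ("green", "small", "round"),
]
threshold = fit_threshold(normal)

print(is_anomalous(("blue", "large", "square"), normal, threshold))
print(is_anomalous(("red", "small", "round"), normal, threshold))
```

A Hamming distance treats all categorical values as equally dissimilar and ignores the relationships among features that the abstract emphasizes, so this sketch shows only the semi-supervised, distance-based framing and not the characterization the paper proposes.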
Main file
tnnls.pdf (537.04 KB)
Origin: Files produced by the author(s)

Dates and versions

lirmm-01275509, version 1 (17-02-2016)

Identifiers

HAL Id: lirmm-01275509
DOI: 10.1109/TNNLS.2016.2526063

Cite

Dino Ienco, Ruggero Pensa, Rosa Meo. A Semi-Supervised Approach to the Detection and Characterization of Outliers in Categorical Data. IEEE Transactions on Neural Networks and Learning Systems, 2017, 28 (5), pp.1017-1029. ⟨10.1109/TNNLS.2016.2526063⟩. ⟨lirmm-01275509⟩