Using Locally Learnt Word Representations for better Textual Anomaly Detection

Alicia Breidenstein; Matthieu Labeau

doi:10.18653/v1/2024.insights-1.11

Conference paper, Year: 2024



Alicia Breidenstein

Role: Author
PersonId: 1609298
IdHAL: alicia-breidenstein

IP Paris - Institut Polytechnique de Paris

IDS - Département Images, Données, Signal

S2A - Signal, Statistique et Apprentissage

Matthieu Labeau

Role: Author
PersonId: 182144
IdHAL: matthieu-labeau
IdRef: 230828426

LTCI - Laboratoire Traitement et Communication de l'Information

S2A - Signal, Statistique et Apprentissage

Abstract

The literature on general-purpose textual anomaly detection is quite sparse, as most textual anomaly detection methods are implemented as out-of-domain detection in the context of pre-established classification tasks. Notably, in a field where pre-trained representations and models are commonly used, the impact of the pre-training data on a task that lacks supervision has not been studied. In this paper, we use the simple setting of k-classes-out anomaly detection and search for the best pairing of representation and classifier. We show that well-chosen embeddings allow a simple anomaly detection baseline such as OC-SVM to achieve results similar to, and even outperform, deep state-of-the-art models.

Domains

Computation and Language [cs.CL]

Main file

2024.insights-1.11.pdf (342.57 KB)

Origin: Publisher files allowed on an open archive
Licence: CC BY 4.0 - Attribution


https://telecom-paris.hal.science/hal-05526854

Submitted on: Wednesday, 25 February 2026, 11:33:59

Last modified on: Saturday, 11 April 2026, 03:21:53

Dates and versions

hal-05526854, version 1 (25-02-2026)

Licence

CC BY 4.0 - Attribution

Identifiers

HAL Id: hal-05526854, version 1
DOI: 10.18653/v1/2024.insights-1.11

Cite

Alicia Breidenstein, Matthieu Labeau. Using Locally Learnt Word Representations for better Textual Anomaly Detection. Proceedings of the Fifth Workshop on Insights from Negative Results in NLP, Jun 2024, Mexico City, Mexico. pp.82-91, ⟨10.18653/v1/2024.insights-1.11⟩. ⟨hal-05526854⟩
