Approaches of anonymisation of an SMS corpus - LIRMM - Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
Conference Papers, Year: 2013

Approaches of anonymisation of an SMS corpus

Abstract

This paper presents two methods for anonymising an SMS corpus. The first, Seek&Hide, is an unsupervised approach: the implemented system uses several dictionaries and rules to predict whether an SMS needs anonymisation. The second is a supervised approach based on machine learning techniques. We evaluate both approaches and propose a way to combine them: an SMS is checked by a human expert only when the two methods disagree on their prediction. This greatly reduces the cost of anonymising the corpus.
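The combination rule described in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical: `route_messages`, the two toy predictors and the small name list are stand-ins chosen for illustration, not the paper's Seek&Hide system or its trained classifier. The sketch only shows the routing idea, in which a message goes to a human expert when the two predictions differ.

```python
from typing import Callable, Iterable, List, Tuple

# A predictor returns True when it believes the SMS needs anonymisation.
Predictor = Callable[[str], bool]


def route_messages(
    messages: Iterable[str],
    rule_based: Predictor,
    learned: Predictor,
) -> Tuple[List[Tuple[str, bool]], List[str]]:
    """Split messages into (auto_decided, needs_review).

    auto_decided holds (sms, decision) pairs on which both predictors agree;
    needs_review holds the messages on which they disagree.
    """
    auto_decided: List[Tuple[str, bool]] = []
    needs_review: List[str] = []
    for sms in messages:
        first = rule_based(sms)
        second = learned(sms)
        if first == second:
            auto_decided.append((sms, first))
        else:
            needs_review.append(sms)
    return auto_decided, needs_review


# Toy stand-ins (assumptions for illustration only).
FIRST_NAMES = {"marie", "paul"}


def toy_dictionary_predictor(sms: str) -> bool:
    # Flags a message if any token matches a small first-name dictionary.
    return any(tok.strip(".,!?").lower() in FIRST_NAMES for tok in sms.split())


def toy_learned_predictor(sms: str) -> bool:
    # Placeholder for a trained classifier: flags any capitalised
    # token that is not the first token of the message.
    return any(tok[:1].isupper() for tok in sms.split()[1:])


if __name__ == "__main__":
    corpus = ["see you at 8", "Call Marie tonight", "ok paul", "Meet at Montpellier"]
    decided, review = route_messages(
        corpus, toy_dictionary_predictor, toy_learned_predictor
    )
    print("decided automatically:", decided)
    print("sent to human expert:", review)
```

With these toy predictors, only the messages on which the two disagree reach the review list, which is where the saving in manual effort would come from under this scheme.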
Main file
Approaches_of_anonymisation_of_an_SMS_co.pdf (140.59 KB)
Origin: Files produced by the author(s)

Dates and versions

lirmm-00816285, version 1 (24-02-2017)

Cite

Namrata Patel, Pierre Accorsi, Diana Inkpen, Cédric Lopez, Mathieu Roche. Approaches of anonymisation of an SMS corpus. CICLing: Conference on Intelligent Text Processing and Computational Linguistics, Mar 2013, Samos, Greece. pp.77-88, ⟨10.1007/978-3-642-37247-6_7⟩. ⟨lirmm-00816285⟩