Approaches of anonymisation of an SMS corpus - LIRMM - Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
Conference paper, Year: 2013

Approaches of anonymisation of an SMS corpus


This paper presents two anonymisation methods for processing an SMS corpus. The first is based on an unsupervised approach called Seek&Hide. The implemented system uses several dictionaries and rules to predict whether an SMS needs to be anonymised. The second method is based on a supervised approach using machine learning techniques. We evaluate the two approaches and propose a way to use them together: an SMS is checked by a human expert only when the two methods disagree on their prediction. This greatly reduces the cost of anonymising the corpus.
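The combination step described in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function `route_messages`, the `Predictor` type, and the two toy predictors (`toy_rule_based`, `toy_learned`, with the small `FIRST_NAMES` dictionary) are hypothetical names introduced here. The sketch assumes each method returns a yes/no decision on whether an SMS needs anonymisation.

```python
from typing import Callable, Iterable, List, Tuple

# A predictor returns True when it judges that the SMS needs anonymisation.
Predictor = Callable[[str], bool]


def route_messages(
    messages: Iterable[str],
    rule_based: Predictor,
    learned: Predictor,
) -> Tuple[List[Tuple[str, bool]], List[str]]:
    """Keep the shared label when both methods agree; otherwise queue for review."""
    agreed: List[Tuple[str, bool]] = []
    to_review: List[str] = []
    for sms in messages:
        a, b = rule_based(sms), learned(sms)
        if a == b:
            agreed.append((sms, a))   # both methods give the same prediction
        else:
            to_review.append(sms)     # disagreement: send to a human expert
    return agreed, to_review


# Toy stand-ins (assumptions): NOT Seek&Hide and NOT the paper's trained model.
FIRST_NAMES = {"marie", "paul"}  # hypothetical dictionary of first names


def toy_rule_based(sms: str) -> bool:
    # Dictionary lookup on lower-cased tokens with trailing punctuation removed.
    return any(tok.strip(".,!?").lower() in FIRST_NAMES for tok in sms.split())


def toy_learned(sms: str) -> bool:
    # Placeholder for a classifier: flags a capitalised token after the first one.
    return any(tok[:1].isupper() for tok in sms.split()[1:])


sample = ["see you at 8", "call Marie tonight", "meet me in Paris"]
agreed, to_review = route_messages(sample, toy_rule_based, toy_learned)
```

With these toy predictors, only the third message is routed to manual review, since the two stand-ins disagree on it. The fewer such disagreements there are, the fewer messages a human has to check.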
Main file: Approaches_of_anonymisation_of_an_SMS_co.pdf (140.59 KB)
Origin: Files produced by the author(s)

Dates and versions

lirmm-00816285, version 1 (24-02-2017)



Namrata Patel, Pierre Accorsi, Diana Inkpen, Cédric Lopez, Mathieu Roche. Approaches of anonymisation of an SMS corpus. CICLing: Conference on Intelligent Text Processing and Computational Linguistics, Mar 2013, Samos, Greece. pp.77-88, ⟨10.1007/978-3-642-37247-6_7⟩. ⟨lirmm-00816285⟩