Seek&Hide: Anonymising a French SMS corpus using natural language processing techniques - LIRMM - Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
Journal article. Lingvisticae investigationes: International Journal of Linguistics and Language. Year: 2012

Seek&Hide: Anonymising a French SMS corpus using natural language processing techniques

Abstract

This article presents the system Seek&Hide, a text message processing tool developed for the sud4science LR project. It performs the anonymisation/de-identification of a corpus. At present, it has been used to anonymise the sud4science LR corpus of French text messages collected during the project. This is done in two phases. In the first phase, it automatically processes over 70% of the corpus. The rest of the corpus is processed in the second phase, aided by an expert annotator via a web interface specifically designed to simplify the task.
No file deposited

Dates and versions

lirmm-00816272, version 1 (21-04-2013)

Identifiers

HAL Id: lirmm-00816272
DOI: 10.1075/li.35.2.03acc

Cite

Pierre Accorsi, Namrata Patel, Cédric Lopez, Rachel Panckhurst, Mathieu Roche. Seek&Hide: Anonymising a French SMS corpus using natural language processing techniques. Lingvisticae investigationes: International Journal of Linguistics and Language, 2012, 35 (2), pp. 163-180. ⟨10.1075/li.35.2.03acc⟩. ⟨lirmm-00816272⟩