Découverte de nouvelles entités et relations spatiales à partir d’un corpus de SMS

Abstract : In the context of today's large volumes of available data, much work on the analysis of spatial information relies on textual data. Mediated communication (SMS, tweets, etc.) that conveys spatial information plays a prominent role. The objective of the work presented in this paper is to extract spatial information from an authentic corpus of SMS messages in French. We propose a two-step process. First, we extract new spatial entities (e.g. motpellier and montpeul, associated with the place name Montpellier). Second, we identify new spatial relations that precede spatial entities (e.g. sur, par, pres, etc.). The task is challenging because of the specificity of SMS language, which relies on weakly standardized writing (lexical creation, massive use of abbreviations, textual variants, etc.). Experiments carried out on the 88milSMS corpus highlight the robustness of our system in identifying new spatial entities and relations.
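The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not say which similarity measure or lexical resources the system uses, so the sketch assumes a character-level similarity from Python's standard `difflib`, a hypothetical three-entry gazetteer, an arbitrary threshold of 0.7, and a minimum token length of 4 to keep short function words from matching. The names `GAZETTEER`, `closest_place` and `extract` are invented for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical mini-gazetteer (assumption); the paper's real resources are not given here.
GAZETTEER = ["montpellier", "paris", "marseille"]


def closest_place(token, threshold=0.7, min_length=4):
    """Return the gazetteer entry most similar to token, or None.

    threshold and min_length are illustrative values, not taken from the paper.
    """
    token = token.lower()
    if len(token) < min_length:
        return None
    best, best_score = None, 0.0
    for place in GAZETTEER:
        score = SequenceMatcher(None, token, place).ratio()
        if score > best_score:
            best, best_score = place, score
    return best if best_score >= threshold else None


def extract(messages):
    """Step 1: collect spelling variants per place name.
    Step 2: count the word immediately preceding each detected place."""
    variants = {}
    relations = Counter()
    for message in messages:
        tokens = message.lower().split()
        for i, token in enumerate(tokens):
            place = closest_place(token)
            if place is None:
                continue
            if token != place:
                variants.setdefault(place, set()).add(token)
            if i > 0:
                relations[tokens[i - 1]] += 1
    return variants, relations


# Invented example messages (not from the 88milSMS corpus).
messages = [
    "je suis sur motpellier",
    "on va a montpeul demain",
    "rdv sur paris",
]
variants, relations = extract(messages)
```

Taking only the single preceding token is a simplification: multi-word relations and frequent non-spatial words would need additional filtering that this sketch does not attempt.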
Document type :
Conference papers

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01944710
Contributor : Isabelle Gouat
Submitted on : Tuesday, December 4, 2018 - 6:37:28 PM
Last modification on : Wednesday, September 18, 2019 - 4:04:05 PM

Identifiers

  • HAL Id : lirmm-01944710, version 1

Citation

Sarah Zenasni, Eric Kergosien, Mathieu Roche, Maguelonne Teisseire. Découverte de nouvelles entités et relations spatiales à partir d’un corpus de SMS. TALN: Traitement Automatique des Langues Naturelles, Jul 2016, Paris, France. ⟨lirmm-01944710⟩
