How to Extract Relevant Knowledge from Tweets?

Flavien Bouillot 1 Phan Nhat Hai 2, 1 Nicolas Béchet 3 Sandra Bringay 1, 4 Dino Ienco 1 Stan Matwin 5 Pascal Poncelet 1 Mathieu Roche 6 Maguelonne Teisseire 1, 2
1 ADVANSE - ADVanced Analytics for data SciencE
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
3 Equipe CODAG - Laboratoire GREYC - UMR6072
GREYC - Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen
6 TEXTE - Exploration et exploitation de données textuelles
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract : Tweets exchanged over the Internet are an important source of information, even if their characteristics make them difficult to analyze (e.g., a maximum of 140 characters; noisy data). In this paper, we investigate two different problems. The first one is related to the extraction of representative terms from a set of tweets. More precisely, we address the following question: are traditional information retrieval measures appropriate when dealing with tweets? The second problem is related to the evolution of tweets over time for a set of users. With the development of data mining approaches, many very efficient methods have been defined to extract patterns hidden in the huge amount of data available. More recently, new spatio-temporal data mining approaches have been defined specifically for dealing with the huge amount of moving object data that can be obtained thanks to improvements in positioning technology. Given the particular characteristics of tweets, the second question we investigate is the following: are spatio-temporal mining algorithms appropriate for better understanding the behavior of communities over time? These two problems are illustrated through real applications concerning both health and political tweets.
Document type :
Book sections


https://hal-lirmm.ccsd.cnrs.fr/lirmm-00798662
Contributor : Pascal Poncelet
Submitted on : Friday, March 29, 2013 - 4:55:54 PM
Last modification on : Wednesday, September 18, 2019 - 4:04:05 PM
Long-term archiving on : Sunday, June 30, 2013 - 2:55:10 AM

File

Isip2012.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : lirmm-00798662, version 1

Citation

Flavien Bouillot, Phan Nhat Hai, Nicolas Béchet, Sandra Bringay, Dino Ienco, et al. How to Extract Relevant Knowledge from Tweets? Communications in Computer and Information Science, 2013. ⟨lirmm-00798662⟩

