How to Extract Relevant Knowledge from Tweets?

Flavien Bouillot (1), Phan Nhat Hai (2, 1), Nicolas Béchet (3), Sandra Bringay (1, 4), Dino Ienco (1), Stan Matwin (5), Pascal Poncelet (1), Mathieu Roche (6), Maguelonne Teisseire (1, 2)
(1) ADVANSE - ADVanced Analytics for data SciencE
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
(3) Equipe CODAG - Laboratoire GREYC - UMR6072
GREYC - Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen
(6) TEXTE - Exploration et exploitation de données textuelles
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract: Tweets exchanged over the Internet are an important source of information, even though their characteristics make them difficult to analyze (e.g., a maximum of 140 characters; noisy data). In this paper, we investigate two different problems. The first is the extraction of representative terms from a set of tweets. More precisely, we address the following question: are traditional information retrieval measures appropriate when dealing with tweets? The second problem is the evolution of tweets over time for a set of users. With the development of data mining approaches, many efficient methods have been defined to extract patterns hidden in the huge amount of data available. More recently, new spatio-temporal data mining approaches have been defined specifically for the large volumes of moving-object data made available by improvements in positioning technology. Given the particular nature of tweets, the second question we investigate is the following: are spatio-temporal mining algorithms appropriate for better understanding the behavior of communities over time? These two problems are illustrated through real applications concerning both health and political tweets.
Document type:
Book chapter
Communications in Computer and Information Science, 2013

https://hal-lirmm.ccsd.cnrs.fr/lirmm-00798662
Contributor: Pascal Poncelet
Submitted on: Friday, 29 March 2013 - 16:55:54
Last modified on: Wednesday, 10 October 2018 - 14:28:12
Document(s) archived on: Sunday, 30 June 2013 - 02:55:10

File

Isip2012.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: lirmm-00798662, version 1

Citation

Flavien Bouillot, Phan Nhat Hai, Nicolas Béchet, Sandra Bringay, Dino Ienco, et al. How to Extract Relevant Knowledge from Tweets? Communications in Computer and Information Science, 2013. 〈lirmm-00798662〉

Metrics

Record views

461

File downloads

603