Exploiting Textual Source Information for Epidemiosurveillance

Abstract: In recent years, there has been great interest in developing methodologies for the early detection of potential health threats from unstructured text on the Internet, as a complement to traditional surveillance reporting systems. In this context, we examined the relevance of combining expert knowledge and automatic term extraction to create appropriate Internet search queries for acquiring disease outbreak news. We propose a measure: the number of relevant disease outbreak news items detected as a function of the terms automatically extracted from a set of example Google and PubMed corpora. Because of its recent emergence, we used African swine fever as the example disease.

New and exotic infectious diseases are an increasing threat to countries because of globalization, the movement of passengers, and international trade. With traditional reporting schemes, disease outbreaks are often missed, reported late, or underreported, which leaves countries unaware of potential disease threats. Because the Internet is a source of abundant and dynamic information, surveillance services need tools that can refine the search and detect the information of interest. Two prominent state-of-the-art systems, MediSys (Mantero et al. 2011) and HealthMap (Collier 2012), are based on a series of automatic steps to detect and acquire disease-related news. Their algorithms rely on predefined templates, such as keywords or patterns. Internet search queries have been proposed as an inexpensive method to detect signals of diseases (e.g., avian influenza) (Polgreen et al. 2008). With so many diseases and even more symptoms, analysts face another challenge: how to identify appropriate queries for Internet disease surveillance? One option is to use terms from an existing thesaurus (e.g., MeSH).
In this paper we present a new combined approach that selects terms automatically extracted from relevant scientific and non-scientific corpora in order to identify the most appropriate search queries for detecting disease outbreak news on the Internet. Because it is a recently emerging disease, we use African swine fever (ASF) as the example disease.
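The abstract does not specify how the proposed measure is computed, so the following is only a minimal sketch of one possible reading: for each candidate term, count the distinct news items it retrieves that were judged relevant, then rank the terms by that count. The function names, the news identifiers, and the toy data are all hypothetical, and the relevance judgments are assumed to come from an expert.

```python
def relevant_news_per_term(hits_by_term, relevant_ids):
    """For each candidate term, count the distinct retrieved news items
    that appear in the set of items judged relevant (assumed expert labels)."""
    return {
        term: len(set(hits) & relevant_ids)
        for term, hits in hits_by_term.items()
    }


def rank_terms(hits_by_term, relevant_ids):
    """Rank terms by number of relevant news items (descending),
    breaking ties alphabetically so the order is deterministic."""
    counts = relevant_news_per_term(hits_by_term, relevant_ids)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy data: news item ids returned by a search on each term.
hits_by_term = {
    "african swine fever": ["n1", "n2", "n3"],
    "pig mortality": ["n2", "n4"],
    "swine": ["n5", "n6"],
}
# Hypothetical ids of the items judged to be relevant outbreak news.
relevant_ids = {"n1", "n2", "n4"}

ranking = rank_terms(hits_by_term, relevant_ids)
for term, count in ranking:
    print(f"{term}: {count}")
```

Under this reading, a term that retrieves many items but few relevant ones (here "swine") ranks below a more specific term, which is the kind of signal that could help choose among candidate queries.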
Document type:
Conference paper
MTSR: Metadata and Semantics Research, Nov 2014, Karlsruhe, Germany. Springer, 478, pp.359-361, 2014, Communications in Computer and Information Science

Cited literature: 4 references

Contributor: Mathieu Roche
Submitted on: Sunday, 16 August 2015 - 04:45:13
Last modified on: Thursday, 11 January 2018 - 06:27:21
Document(s) archived on: Tuesday, 17 November 2015 - 10:10:44




  • HAL Id : lirmm-01184556, version 1


Elena Arsevska, Mathieu Roche, Renaud Lancelot, Pascal Hendrikx, Barbara Dufour. Exploiting Textual Source Information for Epidemiosurveillance. MTSR: Metadata and Semantics Research, Nov 2014, Karlsruhe, Germany. Springer, 478, pp.359-361, 2014, Communications in Computer and Information Science. 〈lirmm-01184556〉


