Exploiting Textual Source Information for Epidemiosurveillance

Elena Arsevska; Mathieu Roche; Renaud Lancelot; Pascal Hendrikx; Barbara Dufour

Conference paper. Year: 2014

Elena Arsevska

Role: Author

Contrôle des maladies animales exotiques et émergentes

Mathieu Roche

Role: Author
PersonId : 4967
IdHAL : mathieu-roche
ORCID : 0000-0003-3272-8568
IdRef : 09042087X

Territoires, Environnement, Télédétection et Information Spatiale

ADVanced Analytics for data SciencE

Renaud Lancelot

Role: Author
PersonId : 760609
ORCID : 0000-0002-5826-5242

Contrôle des maladies animales exotiques et émergentes

Pascal Hendrikx

Role: Author
PersonId : 954355

Agence nationale de sécurité sanitaire de l'alimentation, de l'environnement et du travail

Barbara Dufour

Role: Author
PersonId : 1145100
ORCID : 0000-0002-4668-1986
IdRef : 059940786

Laboratoire Chrono-environnement (UMR 6249)

Abstract

In recent years, as a complement to traditional surveillance reporting systems, there has been great interest in developing methodologies for the early detection of potential health threats from unstructured text on the Internet. In this context, we examined the relevance of combining expert knowledge with automatic term extraction to create appropriate Internet search queries for acquiring disease outbreak news. We propose a measure: the number of relevant disease outbreak news items detected as a function of the terms automatically extracted from a set of example Google and PubMed corpora. Because of its recent emergence, we use African swine fever as the example disease.

New and exotic infectious diseases are an increasing threat to countries due to globalization, passenger movement, and international trade. With traditional reporting schemes, disease outbreaks are often missed, reported late, or underreported, leaving countries unaware of potential disease threats. As the Internet is a source of abundant and dynamic information, services need tools that can refine the search and detect the information of interest. Two important state-of-the-art systems, MediSys (Mantero et al. 2011) and HealthMap (Collier 2012), are based on a series of automatic steps to detect and acquire disease-related news. Their algorithms rely on predefined templates, such as keywords or patterns. Internet search queries have been proposed as an inexpensive method to detect signals of diseases (e.g., avian influenza) (Polgreen et al. 2008). Faced with many diseases and even more symptoms, analysts face another challenge: how to identify appropriate queries for Internet disease surveillance. One option is to use terms from an existing thesaurus (e.g., MeSH).

In this paper we present a new combined approach to selecting terms automatically extracted from relevant scientific and non-scientific corpora, in order to identify the most appropriate search queries for detecting disease outbreak news on the Internet. As it is a recently emerging disease, we use African swine fever (ASF) as the example disease.
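The measure described in the abstract, the number of relevant outbreak news items detected as a function of the extracted terms, can be illustrated with a minimal sketch. This is not the authors' implementation: the `NewsItem` structure, the substring-based `matches` rule, and the toy data are all assumptions made for illustration, and a real pipeline would retrieve news through a search engine and have experts label relevance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NewsItem:
    """A news item with an expert relevance label (hypothetical structure)."""
    text: str
    relevant: bool


def matches(query_terms, text):
    """Return True if every term of the query occurs in the text.

    Assumption: plain case-insensitive substring matching stands in for
    whatever retrieval a real search engine would perform.
    """
    lowered = text.lower()
    return all(term.lower() in lowered for term in query_terms)


def relevant_hits_per_query(queries, news):
    """Count, for each query, the relevant news items that it retrieves."""
    return {
        " ".join(query): sum(
            1 for item in news if item.relevant and matches(query, item.text)
        )
        for query in queries
    }


# Toy data invented for illustration only; not taken from the paper.
NEWS = [
    NewsItem("African swine fever outbreak confirmed in wild boar", True),
    NewsItem("New outbreak of swine fever reported on pig farm", True),
    NewsItem("Pig farm opens visitor centre", False),
]

# Each query is a tuple of candidate terms, e.g. terms extracted from corpora.
QUERIES = [("swine fever", "outbreak"), ("pig farm",), ("wild boar",)]

counts = relevant_hits_per_query(QUERIES, NEWS)

# Rank candidate queries by how many relevant items they retrieve.
for query, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(f"{query!r}: {n} relevant item(s)")
```

Ranking candidate queries by this count is one simple way to compare term combinations; it ignores precision, so a query that also returns many irrelevant items is not penalized in this sketch.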

Keywords

terminology extraction internet disease surveillance

Domains

Bioinformatics [q-bio.QM]; Information Retrieval [cs.IR]; Document and Text Processing; Web

Main file

document_574402.pdf (127.45 KB)

Origin: Files produced by the author(s)

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01184556

Submitted on: Sunday 16 August 2015, 04:45:13

Last modified on: Monday 28 October 2024, 13:42:03

Long-term archiving on: Tuesday 17 November 2015, 10:10:44

Dates and versions

lirmm-01184556, version 1 (16-08-2015)

Identifiers

HAL Id: lirmm-01184556, version 1

Cite

Elena Arsevska, Mathieu Roche, Renaud Lancelot, Pascal Hendrikx, Barbara Dufour. Exploiting Textual Source Information for Epidemiosurveillance. MTSR: Metadata and Semantics Research, Nov 2014, Karlsruhe, Germany. pp.359-361. ⟨lirmm-01184556⟩

Collections

CIRAD ANSES AGROPARISTECH CNRS UNIV-FCOMTE IRSTEA INRA CHRONO-ENVIRONNEMENT ADVANSE LIRMM AGROPOLIS TETIS MIPS UNIV-MONTPELLIER INRAE INRAEOCCITANIEMONTPELLIER MATHNUM
