Journal articles

SIFR Annotator: Ontology-Based Semantic Annotation of French Biomedical Text and Clinical Notes

Andon Tchechmedjiev 1, 2 Amine Abdaoui 3 Vincent Emonet 1 Stella Zevio 1 Clement Jonquet 4, 1 
1 WEB3 - WEB Architecture x Semantic WEB x WEB of Data
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
3 ADVANSE - ADVanced Analytics for data SciencE
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract: Background: Despite the wide adoption of English in science, a significant amount of biomedical data is produced in other languages, such as French. Yet a majority of natural language processing or semantic tools, as well as domain terminologies or ontologies, are only available in English and cannot be readily applied to other languages, due to fundamental linguistic differences. However, semantic resources are required to design semantic indexes and transform biomedical (text) data into knowledge for better information mining and retrieval.
Results: We present the SIFR Annotator, a publicly accessible ontology-based annotation web service to process biomedical text data in French. The service, developed during the Semantic Indexing of French Biomedical Data Resources (2013-2019) project, is included in the SIFR BioPortal, an open platform to host French biomedical ontologies and terminologies based on the technology developed by the US National Center for Biomedical Ontology. The portal facilitates the use and fostering of ontologies by offering a set of services (search, mappings, metadata, versioning, visualization, recommendation), including for annotation purposes. We introduce the adaptations and improvements made in applying the technology to French, as well as a number of language-independent additional features, implemented by means of a proxy architecture, in particular annotation scoring and clinical context detection. We evaluate the performance of the SIFR Annotator on different biomedical data, using available French corpora, Quaero (titles from French MEDLINE abstracts and EMEA drug labels) and CépiDC (ICD-10 coding of death certificates), and discuss our results with respect to the CLEF eHealth information extraction tasks.
Conclusions: We show that the web service performs comparably to other knowledge-based annotation approaches in recognizing entities in biomedical text and reaches state-of-the-art levels in clinical context detection (negation, experiencer, temporality). Additionally, the SIFR Annotator is the first openly web-accessible tool to annotate and contextualize French biomedical text with ontology concepts, leveraging a dictionary currently made of 28 terminologies and ontologies and 333K concepts. The code is openly available, and we also provide a Docker packaging for easy local deployment to process sensitive (e.g., clinical) data in-house.
Contributor: Andon Tchechmedjiev
Submitted on: Sunday, November 25, 2018 - 1:04:05 PM
Last modification on: Friday, August 5, 2022 - 3:03:27 PM
Long-term archiving on: Tuesday, February 26, 2019 - 12:27:11 PM




Distributed under a Creative Commons Attribution 4.0 International License



Andon Tchechmedjiev, Amine Abdaoui, Vincent Emonet, Stella Zevio, Clement Jonquet. SIFR Annotator: Ontology-Based Semantic Annotation of French Biomedical Text and Clinical Notes. BMC Bioinformatics, 2018, 19 (1), pp.405-431. ⟨10.1186/s12859-018-2429-2⟩. ⟨lirmm-01934127⟩


