Identification des unités de mesure dans les textes scientifiques

Soumia Lilia Berrahou 1, 2 Patrice Buche 2, 3 Juliette Dibie 4 Mathieu Roche 1, 5
1 ADVANSE - ADVanced Analytics for data SciencE
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
3 GRAPHIK - Graphs for Inferences on Knowledge
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract : Identification of units of measure in scientific texts. The work presented in this paper identifies specialized terms (units of measure) in textual documents in order to enrich an onto-terminological resource (OTR). The first step predicts where variants of units of measure are located in the documents, using a method based on supervised learning. This method significantly reduces the variant search space while preserving an optimal search context (an 86% reduction of the search space on the studied set of documents). The second step uses a new similarity measure to automatically identify variants of terms denoting units of measure already present in the OTR, with a precision of 82% for a threshold above 0.6 on the studied corpus.
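
The second step can be illustrated with a minimal sketch, under stated assumptions. The abstract does not define the new similarity measure, so the code below swaps in a generic character-level ratio from Python's standard `difflib` module; it is not the measure proposed in the paper. The reference list `KNOWN_UNITS`, the helper names `similarity` and `match_variant`, and the example unit strings are all hypothetical. Only the 0.6 threshold comes from the abstract.

```python
from difflib import SequenceMatcher

# Hypothetical reference units, standing in for terms already present in the OTR.
KNOWN_UNITS = ["cm3 m-2 d-1 atm-1", "kg m-2 s-1", "mol m-1 s-1 Pa-1"]


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; NOT the measure proposed in the paper."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_variant(candidate, known_units=KNOWN_UNITS, threshold=0.6):
    """Return (best_unit, score) if the best score exceeds the threshold, else None."""
    best = max(known_units, key=lambda unit: similarity(candidate, unit))
    score = similarity(candidate, best)
    return (best, score) if score > threshold else None
```

With these assumptions, a candidate such as `"kg.m-2.s-1"` is attached to the reference unit `"kg m-2 s-1"`, while an unrelated string such as `"temperature"` falls below the threshold and is rejected. Raising the threshold trades recall for precision, which is the trade-off that the reported 82% precision reflects.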

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01184559
Contributor : Mathieu Roche
Submitted on : Sunday, August 16, 2015 - 5:30:25 AM
Last modification on : Tuesday, June 25, 2019 - 1:27:05 AM
Long-term archiving on : Tuesday, November 17, 2015 - 10:11:05 AM

File

taln-2015-court-014.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : lirmm-01184559, version 1

Citation

Soumia Lilia Berrahou, Patrice Buche, Juliette Dibie, Mathieu Roche. Identification des unités de mesure dans les textes scientifiques. TALN: Traitement Automatique des Langues Naturelles, Jun 2015, Caen, France. pp.404-410. ⟨lirmm-01184559⟩
