A Context-based Measure for Discovering Approximate Semantic Matching between Schema

Fabien Duchateau 1 Zohra Bellahsene 2 Mathieu Roche 3
2 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
3 TEXTE - Exploration et exploitation de données textuelles
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract : Querying heterogeneous, semantically linked data sources depends on the ability to find correspondences between their structures and/or their contents. Unfortunately, most tools currently used to discover these mappings are either manual or semi-automatic. In this article, we present an automatic method for computing a similarity measure between two schema elements. We have implemented it in a tool, Approxivect, which is based on the approximation of terminological methods and on the cosine measure between context vectors. An important feature of the method is that it does not use any dictionary or language-based knowledge and works in specialized domains. Finally, our experiments show that the tool provides good results compared with those of the most frequently cited matching tools. More precisely, when its parameters are tuned to their optimal configuration, Approxivect places most of the relevant pairs among its top-ranked results.
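As an illustration of the cosine measure between context vectors mentioned in the abstract, here is a minimal Python sketch. It is not the Approxivect implementation: the way contexts are built (a bag of neighbouring element labels), the function names `context_vector` and `cosine`, and the sample schema labels are all assumptions made for this example.

```python
import math
from collections import Counter


def context_vector(terms):
    """Build a sparse term-frequency vector from the labels found
    around a schema element (assumed definition of 'context')."""
    return Counter(t.lower() for t in terms)


def cosine(u, v):
    """Cosine of the angle between two sparse term-frequency vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty context shares nothing with any other
    return dot / (norm_u * norm_v)


# Hypothetical contexts for three schema elements.
author = context_vector(["book", "name", "title", "publisher"])
writer = context_vector(["book", "name", "title", "year"])
price = context_vector(["order", "amount", "currency"])

print(cosine(author, writer))  # three of four context terms are shared
print(cosine(author, price))   # no shared context terms
```

In this sketch, two elements with different labels ("author" and "writer") still obtain a high score because their surrounding terms overlap, which is the intuition behind comparing contexts rather than names alone.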

https://hal-lirmm.ccsd.cnrs.fr/lirmm-00138105
Contributor : Mathieu Roche
Submitted on : Saturday, December 1, 2007 - 4:24:13 PM
Last modification on : Thursday, February 7, 2019 - 4:48:36 PM
Document(s) archived on : Friday, September 21, 2012 - 1:10:44 PM

Identifiers

  • HAL Id : lirmm-00138105, version 1

Citation

Fabien Duchateau, Zohra Bellahsene, Mathieu Roche. A Context-based Measure for Discovering Approximate Semantic Matching between Schema. RCIS'07: Research Challenges in Information Science, Apr 2007, Ouarzazate, Morocco, pp.11. ⟨lirmm-00138105⟩
