A Context-Based Measure for Discovering Approximate Semantic Matching between Schema Elements
Abstract
The possibility to query heterogeneous and semantically linked data sources depends on the ability to find correspondences between their structure and/or their content. Unfortunately, most of the tools used nowadays to discover those mappings are either manual or semi-automatic. In this article we present an automatic method to calculate the similarity measure between two schema elements. Furthermore, a tool has been implemented, Approxivect, based on the approximation of terminological methods and on the cosine measure between context vectors. Another important feature of our tool is that our method does not use any dictionary or language-based knowledge and works in specialized domain areas. Finally, we have performed experiments showing that our tool provides good results regarding those provided by COMA++. More precisely, it appears that Approxivect, when its parameters are tuned in optimum configurations, discovers most of the relevant couples in the top ranking while COMA++ only finds half of the mappings.
Loading...