Performance Oriented Schema Matching

Khalid Saleem 1 Zohra Bellahsene 2 Ela Hunt 3
2 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract: Semantic matching of schemas in heterogeneous data sharing systems is time-consuming and error-prone. Existing mapping tools employ semi-automatic techniques for mapping two schemas at a time. In a large-scale scenario, where data sharing involves a large number of data sources, such techniques are not suitable. We present a new robust mapping method which creates a mediated schema tree from a large set of input XML schema trees and defines mappings from the contributing schemas to the mediated schema. The result is an almost automatic technique giving good performance with approximate semantic match quality. Our method uses node ranks calculated by pre-order traversal. It combines tree mining with semantic label clustering, which minimizes the target search space and improves performance, thus making the algorithm suitable for large-scale data sharing. We report on experiments with up to 80 schemas containing 83,770 nodes, with our prototype implementation taking 587 seconds to match and merge them to create a mediated schema and to return mappings from input schemas to the mediated schema.
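The abstract mentions node ranks calculated by pre-order traversal. The sketch below illustrates one plausible way such ranks could be assigned to a schema tree; it is not the authors' implementation. The `Node` class, the `last` field (rank of a node's last descendant) and the `is_ancestor` helper are assumptions introduced here for illustration only, as is the small `book` schema.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical schema tree node (not taken from the paper)."""
    label: str
    children: list["Node"] = field(default_factory=list)
    rank: int = -1  # pre-order rank, filled in by assign_ranks
    last: int = -1  # rank of the last descendant (illustrative extra)


def assign_ranks(root: Node) -> list[Node]:
    """Number the nodes in pre-order and return them in rank order."""
    ordered: list[Node] = []

    def visit(node: Node) -> None:
        node.rank = len(ordered)      # a parent is ranked before its children
        ordered.append(node)
        for child in node.children:
            visit(child)
        node.last = len(ordered) - 1  # highest rank inside this subtree

    visit(root)
    return ordered


def is_ancestor(a: Node, d: Node) -> bool:
    """True if a is a proper ancestor of d, using only the stored ranks."""
    return a.rank < d.rank <= a.last


# Illustrative input schema tree (an assumption, not from the experiments).
schema = Node("book", [
    Node("title"),
    Node("author", [Node("firstName"), Node("lastName")]),
    Node("price"),
])
ordered = assign_ranks(schema)
```

Under these assumptions every subtree occupies a contiguous range of ranks, so an ancestor test needs two integer comparisons rather than a tree walk. This is one way rank information can narrow the search space when comparing nodes.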
Document type:
Conference paper
DEXA'07: 18th International Conference on Database and Expert Systems Applications, Sep 2007, pp.844-853, 2007, 〈http://www.dexa.org/〉
Cited literature [11 references]

https://hal-lirmm.ccsd.cnrs.fr/lirmm-00171055
Contributor: Khalid Saleem
Submitted on: Tuesday, 11 September 2007 - 13:56:24
Last modified on: Thursday, 24 May 2018 - 15:59:21
Document(s) archived on: Thursday, 8 April 2010 - 19:43:11

Identifiers

  • HAL Id: lirmm-00171055, version 1

Citation

Khalid Saleem, Zohra Bellahsene, Ela Hunt. Performance Oriented Schema Matching. DEXA'07: 18th International Conference on Database and Expert Systems Applications, Sep 2007, pp.844-853, 2007, 〈http://www.dexa.org/〉. 〈lirmm-00171055〉
