LIRMM - Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
Conference paper, Year: 2012

Keys and Pseudo-Keys Detection for Web Datasets Cleansing and Interlinking

Abstract

This paper introduces a method for analyzing web datasets based on key dependencies. The classical notion of a key in relational databases is adapted to RDF datasets. In order to better deal with web data of variable quality, the definition of a pseudo-key is presented. An RDF vocabulary for representing keys is also provided. An algorithm to discover keys and pseudo-keys is described. Experimental results show that even for a big dataset such as DBpedia, the runtime of the algorithm is still reasonable. Two applications are further discussed: (i) detection of errors in RDF datasets, and (ii) datasets interlinking.
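
As a rough illustration of the idea of keys and pseudo-keys over RDF data, the sketch below treats a dataset as a list of (subject, property, object) triples and scores a set of properties by how well it tells subjects apart. It is not the algorithm from the paper: the function names `discriminability` and `find_pseudo_keys`, the `threshold` and `max_size` parameters, and the scoring rule (distinct value combinations divided by the number of subjects that have all the properties) are assumptions made for this example only. A score of 1.0 corresponds to an exact key; a lower threshold tolerates some duplicates, which is the intuition behind a pseudo-key.

```python
from itertools import combinations


def discriminability(triples, props):
    """Share of distinct value combinations among subjects having all props.

    Hypothetical measure for illustration: 1.0 means no two subjects
    share the same values for every property in props.
    """
    values = {}
    for s, p, o in triples:
        if p in props:
            values.setdefault(s, {}).setdefault(p, set()).add(o)
    # Keep only subjects for which every property in props is defined.
    signatures = [
        tuple(frozenset(v[p]) for p in props)
        for v in values.values()
        if len(v) == len(props)
    ]
    if not signatures:
        return 0.0
    return len(set(signatures)) / len(signatures)


def find_pseudo_keys(triples, max_size=2, threshold=1.0):
    """Enumerate minimal property sets whose score reaches the threshold."""
    props = sorted({p for _, p, _ in triples})
    found = []
    for size in range(1, max_size + 1):
        for combo in combinations(props, size):
            # Skip supersets of property sets that already qualified.
            if any(set(k) <= set(combo) for k, _ in found):
                continue
            score = discriminability(triples, combo)
            if score >= threshold:
                found.append((combo, score))
    return found


# Toy data, invented for this example.
triples = [
    ("a", "name", "Ann"), ("a", "city", "Paris"),
    ("b", "name", "Bob"), ("b", "city", "Paris"),
    ("c", "name", "Ann"), ("c", "city", "Lyon"),
]

exact_keys = find_pseudo_keys(triples, threshold=1.0)
pseudo_keys = find_pseudo_keys(triples, threshold=0.6)
```

On this toy data, neither `name` nor `city` alone is an exact key, but the pair is; with the looser threshold of 0.6 each single property already qualifies. Subjects that break an otherwise high-scoring property set are natural candidates for error checking, and a property set that acts as a key in two datasets suggests which values to compare when interlinking them. The brute-force enumeration here is only practical for small inputs.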

Domains

Other [cs.OH]
Main file
atencia2012b.pdf (210.86 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00768412, version 1 (21-12-2012)

Cite

Manuel Atencia, Jérôme David, François Scharffe. Keys and Pseudo-Keys Detection for Web Datasets Cleansing and Interlinking. EKAW: Knowledge Engineering and Knowledge Management, Oct 2012, Galway, Ireland. pp.144-153, ⟨10.1007/978-3-642-33876-2_14⟩. ⟨hal-00768412⟩