Defining Key Semantics for the RDF Datasets: Experiments and Evaluations

Manuel Atencia 1, 2, Michel Chein 3, Madalina Croitoru 3, Jérôme David 2, Michel Leclère 3, Nathalie Pernelle 4, Fatiha Saïs 4, François Scharffe 3, Danai Symeonidou 4
1 LIG Laboratoire d'Informatique de Grenoble - HADAS
LIG - Laboratoire d'Informatique de Grenoble
3 GRAPHIK - Graphs for Inferences on Knowledge
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract: Many techniques have recently been proposed to automate the linkage of RDF datasets. Predicate selection is the step of the linkage process that consists of selecting the smallest set of relevant predicates needed to enable instance comparison. We call such a set of predicates a key, by analogy with the notion of keys in relational databases. We formally explain the different assumptions behind two existing key semantics. We then evaluate the keys experimentally by studying how discovered keys can help dataset interlinking or cleaning. We discuss the experimental results and show that the two semantics lead to comparable results on the studied datasets.
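To make the notion of a key concrete, here is a minimal Python sketch of checking whether a set of predicates distinguishes all instances in a small set of triples. The abstract does not spell out the two semantics, so the two readings below are assumptions made for illustration: under "some", two instances clash when they share at least one value for every selected predicate; under "equal", they clash when their value sets are non-empty and identical for every selected predicate. The data, the function names (`value_sets`, `is_key`) and the `semantics` parameter are all hypothetical.

```python
from collections import defaultdict

# Toy (subject, predicate, object) triples; all values are made up.
TRIPLES = [
    ("p1", "name", "Ann"), ("p1", "phone", "111"), ("p1", "phone", "222"),
    ("p2", "name", "Ann"), ("p2", "phone", "222"),
    ("p3", "name", "Bob"), ("p3", "phone", "333"),
]


def value_sets(triples, predicates):
    """Map each subject to {predicate: set of objects} for the chosen predicates."""
    table = defaultdict(lambda: {p: set() for p in predicates})
    for s, p, o in triples:
        if p in predicates:
            table[s][p].add(o)
    return dict(table)


def is_key(triples, predicates, semantics="some"):
    """Return True if no two distinct subjects clash on the given predicates.

    semantics="some":  clash if the subjects share at least one value
                       for every predicate (assumed reading).
    semantics="equal": clash if the subjects have identical, non-empty
                       value sets for every predicate (assumed reading).
    """
    table = value_sets(triples, predicates)
    subjects = sorted(table)
    for i, a in enumerate(subjects):
        for b in subjects[i + 1:]:
            if semantics == "some":
                clash = all(table[a][p] & table[b][p] for p in predicates)
            else:
                clash = all(
                    table[a][p] and table[a][p] == table[b][p]
                    for p in predicates
                )
            if clash:
                return False
    return True
```

On this toy data, `is_key(TRIPLES, ["phone"], "some")` is false because p1 and p2 share one phone value, whereas `is_key(TRIPLES, ["phone"], "equal")` is true because their phone sets differ. The two readings can therefore disagree on small examples, even though the abstract reports comparable results on the studied datasets.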
