Querying Distributed Data in a Super-Peer Based Architecture
Abstract
Data integration is a significant challenge: relevant data objects are split across multiple information sources, which are often owned by different organizations. The sources represent, maintain, and export information using a variety of formats, interfaces, and semantics. This paper addresses the problem of querying distributed data in a large-scale context. We present a peer-to-peer (P2P) information mediation framework built on a network of super-peers. This network makes it possible for a super-peer to reach every other peer (data source) in the system, thus realizing the concept of an integrated schema formed from all possible information sources. This is achieved by classifying data sources into domains and by creating user profiles for query optimization purposes.