Best Position Algorithms for Efficient Top-k Query Processing
Abstract
The general problem of answering top-k queries can be modeled using lists of data items sorted by their local scores. The main algorithm proposed so far for answering top-k queries over sorted lists is the Threshold Algorithm (TA). However, TA may still incur many useless accesses to the lists. In this paper, we propose two algorithms that are much more efficient than TA. First, we propose the best position algorithm (BPA). For any database instance (i.e., set of sorted lists), we prove that BPA stops as early as TA, and that its execution cost is never higher than that of TA. We show that there are databases over which BPA executes top-k queries O(m) times faster than TA, where m is the number of lists. We also show that the execution cost of our algorithm can be (m-1) times lower than that of TA. Second, we propose the BPA2 algorithm, which is much more efficient than BPA. We show that the number of accesses to the lists performed by BPA2 can be about (m-1) times lower than that of BPA. We evaluated the performance of our algorithms through extensive experimental tests. The results show that, over our test databases, BPA and BPA2 achieve significant performance gains in comparison with TA.
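For readers unfamiliar with the baseline, the following is a minimal Python sketch of the Threshold Algorithm as it is commonly described, not code from the paper. It assumes that every item appears in every list, that the overall score is the sum of the local scores, and that random access can be simulated with per-list dictionaries. The names `threshold_algorithm`, `lists`, and `depth` are illustrative only.

```python
import heapq


def threshold_algorithm(lists, k, score=sum):
    """Sketch of TA over m lists of (item, local_score) pairs.

    Assumptions (not taken from the paper): each list is sorted by
    descending local score, every item occurs in every list, and
    `score` is a monotonic aggregation function such as sum.
    Returns the top-k (item, overall_score) pairs and the number of
    positions read under sorted access in each list.
    """
    # Per-list dictionaries stand in for random access by item.
    lookup = [dict(lst) for lst in lists]
    seen = {}  # item -> overall score
    n = len(lists[0])
    for pos in range(n):
        last_scores = []
        for lst in lists:
            item, local = lst[pos]           # sorted access
            last_scores.append(local)
            if item not in seen:
                # Random access to every list for this item's local scores.
                seen[item] = score(d[item] for d in lookup)
        # Threshold: aggregate of the last scores seen under sorted access.
        threshold = score(last_scores)
        top = heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
        # Stop once k seen items score at least the threshold.
        if len(top) == k and top[-1][1] >= threshold:
            return top, pos + 1
    return heapq.nlargest(k, seen.items(), key=lambda kv: kv[1]), n


# Hypothetical three-list database, for illustration only.
example_lists = [
    [("a", 30), ("b", 28), ("c", 20), ("d", 10)],
    [("b", 29), ("a", 25), ("d", 15), ("c", 5)],
    [("a", 27), ("c", 26), ("b", 20), ("d", 8)],
]
result, depth = threshold_algorithm(example_lists, k=2)
```

In this sketch the stopping test depends only on the last position read in each list, which is the behavior the abstract refers to when it says TA may still incur useless accesses.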
Domains
Databases [cs.DB]
Main file
2011_-_InfoSys_-_Best_Position_Algorithms_for_Efficient_Top-k_Query_Processing_.pdf (1023.39 KB)
Origin: Files produced by the author(s)