Abstract : More than 90% of the queries submitted to content sharing platforms, such as Flickr, are vague, i.e., they contain only a few keywords, which makes it difficult to return interesting results. To overcome this limitation, many platforms use recommendation strategies to filter the results. However, recommendations tend to return highly redundant items. Content diversification has been studied as a solution to this problem, but it may suffer from at least two limitations: poor content description and semantic ambiguity. In this paper, we investigate profile diversity for searching web items. Profile diversification addresses the problem of returning redundant items and enhances the quality of diversification. We propose a threshold-based approach that returns the most relevant and most popular documents while satisfying content and profile diversity constraints. Our approach includes a family of techniques for efficiently retrieving the desired documents. To evaluate our solution, we ran extensive experiments, including a user survey, on three datasets; in more than 75% of the cases, users either prefer profile diversity to the other approaches or judge it similar. Additionally, our optimization techniques reduce the response time by up to a factor of 12 compared to a baseline greedy diversification algorithm.
Contributor : Maximilien Servajean
Submitted on : Monday, November 3, 2014 - 8:45:37 AM
Last modification on : Thursday, January 21, 2016 - 1:14:19 AM
Document(s) archived on : Wednesday, February 4, 2015 - 10:16:51 AM