Top-k retrieval using facility location analysis

Zuccon, Guido, Azzopardi, Leif, Zhang, Dell, & Wang, Jun (2012) Top-k retrieval using facility location analysis. Lecture Notes in Computer Science: Advances in Information Retrieval, 7224, pp. 305-316.

Abstract

The top-k retrieval problem aims to find the optimal set of k documents from a number of relevant documents given the user’s query. The key issue is to balance the relevance and diversity of the top-k search results. In this paper, we address this problem using Facility Location Analysis taken from Operations Research, where the locations of facilities are optimally chosen according to some criteria. We show how this analysis technique is a generalization of state-of-the-art retrieval models for diversification (such as the Modern Portfolio Theory for Information Retrieval), which treat the top-k search results like “obnoxious facilities” that should be dispersed as far as possible from each other. However, Facility Location Analysis suggests that the top-k search results could be treated like “desirable facilities” to be placed as close as possible to their customers. This leads to a new top-k retrieval model where the best representatives of the relevant documents are selected. In a series of experiments conducted on two TREC diversity collections, we show that significant improvements can be made over the current state-of-the-art through this alternative treatment of the top-k retrieval problem.
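
The contrast between "desirable" and "obnoxious" facilities described in the abstract can be illustrated with a minimal sketch. This is not the paper's formulation: the two objectives (a k-median style coverage score and a max-min dispersion score), the function names greedy_desirable and greedy_obnoxious, and the toy similarity matrix are all assumptions made for illustration, and document relevance scores are deliberately left out.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def greedy_desirable(sim: Matrix, k: int) -> List[int]:
    """Greedily pick k documents that best represent all candidates.

    Assumed objective: the sum, over every candidate document, of its
    highest similarity to any selected document (facilities placed
    close to their customers).
    """
    n = len(sim)
    selected: List[int] = []
    best = [0.0] * n  # best[j]: highest similarity of doc j to a selected doc
    for _ in range(min(k, n)):
        top_gain, top_doc = -1.0, -1
        for i in range(n):
            if i in selected:
                continue
            # Improvement in the coverage score if doc i were added.
            gain = sum(max(sim[i][j] - best[j], 0.0) for j in range(n))
            if gain > top_gain:
                top_gain, top_doc = gain, i
        selected.append(top_doc)
        best = [max(best[j], sim[top_doc][j]) for j in range(n)]
    return selected


def greedy_obnoxious(sim: Matrix, k: int, start: int = 0) -> List[int]:
    """Greedily pick k documents that are spread as far apart as possible.

    Assumed objective: max-min dispersion, i.e. each new document is the
    one whose highest similarity to the already selected ones is lowest.
    """
    n = len(sim)
    selected = [start]
    while len(selected) < min(k, n):
        candidates = [i for i in range(n) if i not in selected]
        nxt = min(candidates, key=lambda i: max(sim[i][s] for s in selected))
        selected.append(nxt)
    return selected


# Hypothetical similarities: docs 0-2 form one topic (doc 1 is central),
# docs 3-5 form another (doc 4 is central, doc 3 is the most remote).
sim = [
    [1.0, 0.9, 0.6, 0.0, 0.1, 0.2],
    [0.9, 1.0, 0.9, 0.0, 0.1, 0.2],
    [0.6, 0.9, 1.0, 0.0, 0.1, 0.2],
    [0.0, 0.0, 0.0, 1.0, 0.8, 0.5],
    [0.1, 0.1, 0.1, 0.8, 1.0, 0.8],
    [0.2, 0.2, 0.2, 0.5, 0.8, 1.0],
]

print(greedy_desirable(sim, 2))           # [1, 4]
print(greedy_obnoxious(sim, 2, start=1))  # [1, 3]
```

On this toy data both strategies cover the two topics, but the "desirable" selection returns the central document of each topic, while the dispersion-based selection returns the document that is furthest from the first pick, even though it is a less typical member of its topic.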

Impact and interest:

4 citations in Scopus

Full-text downloads:

48 since deposited on 28 May 2014
6 in the past twelve months

ID Code: 72188
Item Type: Journal Article
Refereed: Yes
Keywords: Top-k retrieval, Facility location analysis, information storage and retrieval
DOI: 10.1007/978-3-642-28997-2_26
ISBN: 978-3-642-28997-2
ISSN: 1611-3349
Divisions: Current > Schools > School of Information Systems; Current > QUT Faculties and Divisions > Science & Engineering Faculty
Copyright Owner: Copyright 2012 Springer
Copyright Statement: The original publication is available at SpringerLink (http://www.springerlink.com)
Deposited On: 28 May 2014 23:43
Last Modified: 23 Jul 2014 18:15