Document ranking with quantum probabilities

Zuccon, Guido (2012) Document ranking with quantum probabilities. PhD thesis, University of Glasgow.

[img]
Preview
PDF (2MB)

Abstract

In this thesis we investigate the use of quantum probability theory for ranking documents. Quantum probability theory is used to estimate the probability of relevance of a document given a user's query. We posit that quantum probability theory can lead to a better estimation of the probability of a document being relevant to a user's query than the common approach, i. e. the Probability Ranking Principle (PRP), which is based upon Kolmogorovian probability theory. Following our hypothesis, we formulate an analogy between the document retrieval scenario and a physical scenario, that of the double slit experiment. Through the analogy, we propose a novel ranking approach, the quantum probability ranking principle (qPRP). Key to our proposal is the presence of quantum interference. Mathematically, this is the statistical deviation between empirical observations and expected values predicted by the Kolmogorovian rule of additivity of probabilities of disjoint events in configurations such that of the double slit experiment. We propose an interpretation of quantum interference in the document ranking scenario, and examine how quantum interference can be effectively estimated for document retrieval.

To validate our proposal and to gain more insights about approaches for document ranking, we

(1) analyse PRP, qPRP and other ranking approaches, exposing the assumptions underlying their ranking criteria and formulating the conditions for the optimality of the two ranking principles,

(2) empirically compare three ranking principles (i. e. PRP, interactive PRP, and qPRP) and two state-of-the-art ranking strategies in two retrieval scenarios, those of ad-hoc retrieval and diversity retrieval,

(3) analytically contrast the ranking criteria of the examined approaches, exposing similarities and differences,

(4) study the ranking behaviours of approaches alternative to PRP in terms of the kinematics they impose on relevant documents, i. e. by considering the extent and direction of the movements of relevant documents across the ranking recorded when comparing PRP against its alternatives.

Our findings show that the effectiveness of the examined ranking approaches strongly depends upon the evaluation context. In the traditional evaluation context of ad-hoc retrieval, PRP is empirically shown to be better or comparable to alternative ranking approaches. However, when we turn to examine evaluation contexts that account for interdependent document relevance (i. e. when the relevance of a document is assessed also with respect to other retrieved documents, as it is the case in the diversity retrieval scenario) then the use of quantum probability theory and thus of qPRP is shown to improve retrieval and ranking effectiveness over the traditional PRP and alternative ranking strategies, such as Maximal Marginal Relevance, Portfolio theory, and Interactive PRP.

This work represents a significant step forward regarding the use of quantum theory in information retrieval. It demonstrates in fact that the application of quantum theory to problems within information retrieval can lead to improvements both in modelling power and retrieval effectiveness, allowing the constructions of models that capture the complexity of information retrieval situations.

Furthermore, the thesis opens up a number of lines for future research. These include:

(1) investigating estimations and approximations of quantum interference in qPRP;

(2) exploiting complex numbers for the representation of documents and queries, and;

(3) applying the concepts underlying qPRP to tasks other than document ranking.

Impact and interest:

Citation counts are sourced monthly from Scopus and Web of Science® citation databases.

These databases contain citations from different subsets of available publications and different time periods and thus the citation count from each is usually different. Some works are not in either database and no count is displayed. Scopus includes citations from articles published in 1996 onwards, and Web of Science® generally from 1980 onwards.

Citations counts from the Google Scholar™ indexing service can be viewed at the linked Google Scholar™ search.

Full-text downloads:

73 since deposited on 02 Jun 2014
9 in the past twelve months

Full-text downloads displays the total number of times this work’s files (e.g., a PDF) have been downloaded from QUT ePrints as well as the number of downloads in the previous 365 days. The count includes downloads for all files if a work has more than one.

ID Code: 69287
Item Type: Thesis (PhD)
Refereed: No
Divisions: Current > Institutes > Institute for Future Environments
Current > Schools > School of Information Systems
Current > QUT Faculties and Divisions > Science & Engineering Faculty
Institution: University of Glasgow
Copyright Owner: Copyright 2012 Guido Zuccon
Deposited On: 02 Jun 2014 03:31
Last Modified: 02 Jun 2014 03:31

Export: EndNote | Dublin Core | BibTeX

Repository Staff Only: item control page