Information retrieval as semantic inference: A graph inference model applied to medical search

Koopman, Bevan, Zuccon, Guido, Bruza, Peter, Sitbon, Laurianne, & Lawley, Michael (2016) Information retrieval as semantic inference: A graph inference model applied to medical search. Information Retrieval Journal, 19(1), pp. 6-37.

Abstract

This paper presents a Graph Inference retrieval model that integrates structured knowledge resources, statistical information retrieval methods and inference in a unified framework. Key components of the model are a graph-based representation of the corpus and retrieval driven by an inference mechanism implemented as a traversal over the graph. The model is proposed to tackle the semantic gap problem: the mismatch between the raw data and the way a human being interprets it. We break down the semantic gap problem into five core issues, each requiring a specific type of inference in order to be overcome. Our model and evaluation are applied to the medical domain because search within this domain is particularly challenging and, as we show, often requires inference. In addition, this domain features both structured knowledge resources and unstructured text. Our evaluation shows that inference can be effective, retrieving many new relevant documents that are not retrieved by state-of-the-art information retrieval models. We show that many retrieved documents were not pooled by keyword-based search methods, prompting us to perform additional relevance assessment on these new documents. A third of the newly retrieved documents judged were found to be relevant. Our analysis provides a thorough understanding of when and how to apply inference for retrieval, including a categorisation of queries according to the effect of inference. The inference mechanism promoted recall by retrieving new relevant documents not found by previous keyword-based approaches. In addition, it promoted precision through an effective reranking of documents. When inference is used, performance gains can generally be expected on hard queries. However, inference should not be applied universally: for easy, unambiguous queries and queries with few relevant documents, inference adversely affected effectiveness. These conclusions reflect the fact that, for retrieval as inference to be effective, a careful balancing act is involved. Finally, although the Graph Inference model is developed and applied to medical search, it is a general retrieval model applicable to other areas such as web search, where an emerging research trend is to utilise structured knowledge resources for more effective semantic search.

Impact and interest:

1 citation in Scopus
1 citation in Web of Science®

ID Code: 95670
Item Type: Journal Article
Refereed: Yes
Keywords: Semantic inference, Medical information retrieval, Health informatics
DOI: 10.1007/s10791-015-9268-9
ISSN: 1573-7659
Subjects: Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > LIBRARY AND INFORMATION STUDIES (080700) > Health Informatics (080702)
Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > LIBRARY AND INFORMATION STUDIES (080700) > Information Retrieval and Web Search (080704)
Divisions: Current > Schools > School of Electrical Engineering & Computer Science
Current > Schools > School of Information Systems
Current > QUT Faculties and Divisions > Science & Engineering Faculty
Copyright Owner: Copyright 2015 Springer Science+Business Media New York
Copyright Statement: The final publication is available at Springer via http://dx.doi.org/10.1007/s10791-015-9268-9
Deposited On: 19 May 2016 23:58
Last Modified: 23 May 2016 00:06
