An evaluation of corpus-driven measures of medical concept similarity for information retrieval

Koopman, Bevan, Zuccon, Guido, Bruza, Peter D., Sitbon, Laurianne, & Lawley, Michael J. (2012) An evaluation of corpus-driven measures of medical concept similarity for information retrieval. In Lebanon, Guy, Zaki, Mohammed, & Wang, Haixun (Eds.) Proceedings of the 21st ACM international conference on Information and knowledge management, ACM, Hawaii, The United States of America, pp. 2439-2442.

Measures of semantic similarity between medical concepts are central to a number of techniques in medical informatics, including query expansion in medical information retrieval. Previous work has mainly considered thesaurus-based path measures of semantic similarity and has not compared different corpus-driven approaches in depth. We evaluate the effectiveness of eight common corpus-driven measures in capturing semantic relatedness and compare these against human-judged concept pairs assessed by medical professionals. Our results show that certain corpus-driven measures correlate strongly (approximately 0.8) with human judgements. An important finding is that performance was significantly affected by the choice of corpus used to prime the measure, i.e., the corpus that supplies the evidence from which corpus-driven similarities are drawn. This paper provides guidelines for the implementation of semantic similarity measures for medical informatics and concludes with implications for medical information retrieval.

Impact and interest:

15 citations in Scopus

ID Code: 58993
Item Type: Conference Paper
Refereed: Yes
Keywords: Semantic similarity, Medical information retrieval
DOI: 10.1145/2396761.2398661
ISBN: 978-1-4503-1156-4
Subjects: Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > INFORMATION SYSTEMS (080600)
Divisions: Current > QUT Faculties and Divisions > Science & Engineering Faculty
Copyright Owner: Copyright 2012 ACM.
Deposited On: 09 Apr 2013 06:59
Last Modified: 18 Jul 2017 07:02
