Classification of pathology reports for cancer registry notifications

Nguyen, Anthony, Moore, Julie, Zuccon, Guido, Lawley, Michael, & Colquist, Shoni (2012) Classification of pathology reports for cancer registry notifications. In Health Informatics: Building a Healthcare Future Through Trusted Information, IOS Press, Sydney, Australia, pp. 150-156.

Abstract

Objective: To develop a system for the automatic classification of pathology reports for Cancer Registry notifications.

Method: A two-pass approach is proposed to classify pathology reports as cancer notifiable or not. The first pass queries pathology HL7 messages for known report types that are received by the Queensland Cancer Registry (QCR), while the second pass analyses the free-text reports to identify those that are cancer notifiable. The system combines Cancer Registry business rules, natural language processing, and symbolic reasoning using the SNOMED CT ontology.
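
The two-pass structure described in the Method can be illustrated with a minimal sketch. Everything named here is hypothetical: the report types, the term list, and the `PathologyReport` class are placeholders, and the sketch assumes the first pass acts as a filter ahead of the second. Simple keyword matching stands in for the second pass only to keep the example self-contained; the abstract states that the actual system uses business rules, natural language processing, and SNOMED CT reasoning, none of which is reproduced here.

```python
from dataclasses import dataclass

# Hypothetical values for illustration only; the abstract does not list
# the actual report types or terms used by the system.
KNOWN_REPORT_TYPES = {"HISTOLOGY", "CYTOLOGY"}
NOTIFIABLE_TERMS = ("carcinoma", "melanoma", "lymphoma")


@dataclass
class PathologyReport:
    report_type: str  # report type taken from the HL7 message
    free_text: str    # narrative body of the report


def first_pass(report: PathologyReport) -> bool:
    """Pass 1: keep only reports whose type is in the known set."""
    return report.report_type.upper() in KNOWN_REPORT_TYPES


def second_pass(report: PathologyReport) -> bool:
    """Pass 2: keyword stand-in for the free-text analysis."""
    text = report.free_text.lower()
    return any(term in text for term in NOTIFIABLE_TERMS)


def is_cancer_notifiable(report: PathologyReport) -> bool:
    """A report is flagged only if it passes both stages."""
    return first_pass(report) and second_pass(report)
```

A keyword match of this kind would also flag negated findings such as "no evidence of carcinoma", which is one reason a real system needs proper language processing rather than string search.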

Results: The system was developed on a corpus of 500 histology and cytology reports (with 47% notifiable reports) and evaluated on an independent set of 479 reports (with 52% notifiable reports). Results show that the system can reliably classify cancer notifiable reports with a sensitivity, specificity, and positive predictive value (PPV) of 0.99, 0.95, and 0.95, respectively, for the development set, and 0.98, 0.96, and 0.96 for the evaluation set. High sensitivity can be achieved at a slight cost in specificity and PPV.
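
For readers unfamiliar with the three reported measures, the sketch below shows their standard definitions in terms of confusion-matrix counts. The function name and the counts in the usage note are illustrative assumptions; the abstract reports only the resulting ratios, not the underlying counts.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary classification measures.

    tp: notifiable reports correctly flagged
    fp: non-notifiable reports incorrectly flagged
    tn: non-notifiable reports correctly left unflagged
    fn: notifiable reports that were missed
    """
    return {
        # Share of truly notifiable reports that were flagged.
        "sensitivity": tp / (tp + fn),
        # Share of non-notifiable reports that were left unflagged.
        "specificity": tn / (tn + fp),
        # Share of flagged reports that were truly notifiable.
        "ppv": tp / (tp + fp),
    }
```

With made-up counts, `classification_metrics(tp=90, fp=20, tn=80, fn=10)` gives a sensitivity of 0.9 and a specificity of 0.8. Tuning a classifier to miss fewer notifiable reports (fewer `fn`) typically increases `fp`, which lowers specificity and PPV; this is the trade-off the Results mention.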

Conclusion: The system demonstrates how medical free-text processing enables the classification of cancer notifiable pathology reports with high reliability for potential use by Cancer Registries and pathology laboratories.

Impact and interest:

1 citation in Scopus

ID Code: 69292
Item Type: Conference Paper
Refereed: Yes
DOI: 10.3233/978-1-61499-078-9-150
ISBN: 9781614990789
Divisions: Current > Institutes > Institute for Future Environments
Current > Schools > School of Information Systems
Current > QUT Faculties and Divisions > Science & Engineering Faculty
Deposited On: 02 Jun 2014 03:04
Last Modified: 11 Jun 2014 02:40
