The use of narrative text for injury surveillance research : a systematic review

McKenzie, Kirsten, Scott, Deborah A., Campbell, Margaret, & McClure, Roderick J. (2009) The use of narrative text for injury surveillance research : a systematic review. Accident Analysis & Prevention, 42(2), pp. 354-363.



Objective: To summarise the extent to which narrative text fields in administrative health data are used to gather information about the event resulting in presentation to a health care provider for treatment of an injury, and to highlight best practice approaches to conducting narrative text interrogation for injury surveillance purposes.

Design: Systematic review.

Data sources: Electronic databases searched included CINAHL, Google Scholar, Medline, Proquest, PubMed and PubMed Central. Snowballing strategies were employed by searching the bibliographies of retrieved references to identify relevant associated articles.

Selection criteria: Papers were selected if the study used a health-related database and if the study objectives were to a) use text fields to identify injury cases or to extract additional information on injury circumstances not available from coded data, b) use text fields to assess the accuracy of coded data fields for injury-related cases, or c) describe methods/approaches for extracting injury information from text fields.

Methods: The papers identified through the search were independently screened by two authors for inclusion, resulting in 41 papers selected for review. Because of heterogeneity between studies, meta-analysis was not performed.

Results: The majority of papers reviewed (28 papers) focused on describing injury epidemiology trends, using text fields to supplement coded data. These studies demonstrated the value of text data for providing more specific information than had been coded, either to enable case selection or to provide circumstantial information. Caveats were expressed about the consistency and completeness with which text information was recorded, which can result in underestimates when these data are used. Four coding validation papers were reviewed; these studies showed the utility of text data for validating and checking the accuracy of coded data. Seven studies (9 papers) described methods for interrogating injury text fields for systematic extraction of information, with a combination of manual and semi-automated methods used to refine and develop algorithms for extraction and classification of coded data from text. Quality assurance approaches for assessing the robustness of the text extraction methods were discussed in only 8 of the epidemiology papers and 1 of the coding validation papers. All of the text interrogation methodology papers described systematic approaches to ensuring the quality of the approach.

Conclusions: Manual review and coding approaches, text search methods, and statistical tools have been utilised to extract data from narrative text and translate it into useable, detailed injury event information. These techniques can be, and have been, applied to administrative datasets to identify specific injury types and add value to previously coded injury datasets. Only a few studies thoroughly described the methods used for text mining, and fewer than half of the reviewed studies used or described quality assurance methods for ensuring the robustness of the approach. New techniques utilising semi-automated computerised approaches and Bayesian/clustering statistical methods offer the potential to further develop and standardise the analysis of narrative text for injury surveillance.
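
As an illustration only, the following Python sketch shows how a text search method of the kind mentioned above might be combined with a simple quality assurance check against manual coding. It is not taken from any of the reviewed papers: the records, the manual labels, the keyword list and the function names (`flag_fall`, `evaluate`) are all hypothetical, and a real study would refine the keyword list iteratively against a manually reviewed sample.

```python
import re

# Hypothetical narrative records: (free-text field, manually assigned label).
# The text and the "is a fall injury" labels are invented for illustration.
RECORDS = [
    ("fell off ladder while cleaning gutters", True),
    ("slipped on wet floor, landed on left wrist", True),
    ("cut finger with kitchen knife", False),
    ("pt fell from bunk bed", True),
    ("burn to hand from hot oil", False),
    ("tripped over rug and landed on hip", True),
    ("lost balance on stairs and landed awkwardly", True),  # no keyword present
    ("no fall reported, struck by car door", False),        # negated keyword
]

# Assumed keyword pattern for fall-related injuries (whole words only).
FALL_PATTERN = re.compile(r"\b(fell|fall|slipped|tripped)\b", re.IGNORECASE)


def flag_fall(narrative: str) -> bool:
    """Return True if the narrative contains any of the assumed fall keywords."""
    return FALL_PATTERN.search(narrative) is not None


def evaluate(records):
    """Compare keyword flags with manual labels and summarise agreement."""
    tp = fp = fn = tn = 0
    for text, manual_label in records:
        flagged = flag_fall(text)
        if flagged and manual_label:
            tp += 1
        elif flagged and not manual_label:
            fp += 1
        elif not flagged and manual_label:
            fn += 1
        else:
            tn += 1
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        # Proportion of manually identified falls that the search found.
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        # Proportion of flagged records that were falls on manual review.
        "ppv": tp / (tp + fp) if (tp + fp) else None,
    }


if __name__ == "__main__":
    print(evaluate(RECORDS))
```

In this toy sample the search misses one fall described without any keyword and wrongly flags one record where the keyword is negated, which is the kind of under- and over-ascertainment that quality assurance against manual review is meant to expose.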

Impact and interest:

37 citations in Scopus
32 citations in Web of Science®


Full-text downloads:

446 since deposited on 04 Jan 2010
66 in the past twelve months


ID Code: 29471
Item Type: Journal Article
Refereed: Yes
Keywords: Narrative text, injury surveillance, text mining, health data
DOI: 10.1016/j.aap.2009.09.020
ISSN: 0001-4575
Subjects: Australian and New Zealand Standard Research Classification > MEDICAL AND HEALTH SCIENCES (110000) > PUBLIC HEALTH AND HEALTH SERVICES (111700) > Public Health and Health Services not elsewhere classified (111799)
Australian and New Zealand Standard Research Classification > MEDICAL AND HEALTH SCIENCES (110000) > PUBLIC HEALTH AND HEALTH SERVICES (111700) > Health Information Systems (incl. Surveillance) (111711)
Divisions: Current > QUT Faculties and Divisions > Faculty of Health
Current > Institutes > Institute of Health and Biomedical Innovation
Current > Research Centres > National Centre for Health Information Research & Training
Current > Schools > School of Public Health & Social Work
Copyright Owner: Copyright 2009 Elsevier
Deposited On: 04 Jan 2010 23:03
Last Modified: 10 Apr 2013 00:05
