Machine learning approaches to analysing textual injury surveillance data: A systematic review

Vallmuur, Kirsten (2015) Machine learning approaches to analysing textual injury surveillance data: A systematic review. Accident Analysis and Prevention, 79, pp. 41-49.




Objective

To synthesise recent research on the use of machine learning approaches to mining textual injury surveillance data.


Design

Systematic review.

Data sources

The electronic databases searched were PubMed, CINAHL, Medline, Google Scholar, and ProQuest. The bibliographies of all relevant articles were examined, and associated articles were identified using a snowballing technique.

Selection criteria

For inclusion, articles were required to meet the following criteria: (a) used a health-related database, (b) focused on injury-related cases, and (c) used machine learning approaches to analyse textual data.


Methods

The papers identified through the search were screened, resulting in 16 papers selected for review. Articles were reviewed to describe the databases and methodology used, the strengths and limitations of different techniques, and the quality assurance approaches used. Due to heterogeneity between studies, meta-analysis was not performed.


Results

Occupational injuries were the focus of half of the machine learning studies, and the most common methods described were Bayesian probability or Bayesian network based methods used either to predict injury categories or to extract common injury scenarios. Models were evaluated through comparison with gold standard data, content expert evaluation, or statistical measures of quality. Machine learning was found to provide high precision and accuracy when predicting a small number of categories, and was valuable for visualisation of injury patterns and prediction of future outcomes. However, difficulties related to generalizability, source data quality, complexity of models, and integration of content and technical knowledge were discussed.
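As a rough illustration of the Bayesian approach to predicting injury categories from narrative text, and of evaluating it against gold standard labels, the following is a minimal sketch of a multinomial naive Bayes classifier with Laplace smoothing. It is not taken from any of the reviewed studies: the narratives, the category labels ("fall", "cut"), and all function names are hypothetical and were invented for this example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy narratives with gold standard labels (illustration only).
TRAIN = [
    ("fell from ladder while painting ceiling", "fall"),
    ("slipped on wet floor and fell", "fall"),
    ("tripped on stairs fell down steps", "fall"),
    ("cut finger with kitchen knife", "cut"),
    ("hand cut by broken glass", "cut"),
    ("laceration to thumb from saw blade", "cut"),
]

# Hypothetical held-out narratives used to evaluate the model.
HELD_OUT = [
    ("fell on wet stairs", "fall"),
    ("knife cut to finger", "cut"),
]


def tokenize(text):
    """Lowercase and split on whitespace; real narratives need more cleaning."""
    return text.lower().split()


def train(examples):
    """Count documents per class and words per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest posterior log-probability."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        # Log prior: share of training narratives in this class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothed word likelihood.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def precision(model, examples, target):
    """Precision for one class: true positives / all predicted positives."""
    pairs = [(predict(model, text), gold) for text, gold in examples]
    true_pos = sum(1 for pred, gold in pairs if pred == target and gold == target)
    false_pos = sum(1 for pred, gold in pairs if pred == target and gold != target)
    predicted_pos = true_pos + false_pos
    return true_pos / predicted_pos if predicted_pos else 0.0


MODEL = train(TRAIN)
```

Calling predict(MODEL, "fell off ladder") assigns a category to a new narrative, and precision(MODEL, HELD_OUT, "fall") compares predictions with the gold standard labels. With only two categories and a few narratives the task is far easier than real injury coding, where the number of categories and the quality of the source text both affect performance.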


Conclusions

The use of narrative text for injury surveillance has grown in popularity, complexity and quality over recent years. With advances in data mining techniques, increased capacity for analysis of large databases, and involvement of computer scientists in the injury prevention field, along with more comprehensive use and description of quality assurance methods in text mining approaches, it is likely that we will see continued growth and advancement in knowledge of text mining in the injury field.


ID Code: 82722
Item Type: Journal Article
Refereed: Yes
Keywords: text data, injury surveillance, injury epidemiology, text mining, machine learning
DOI: 10.1016/j.aap.2015.03.018
ISSN: 0001-4575
Subjects: Australian and New Zealand Standard Research Classification > MEDICAL AND HEALTH SCIENCES (110000) > PUBLIC HEALTH AND HEALTH SERVICES (111700) > Health Information Systems (incl. Surveillance) (111711)
Divisions: Current > Research Centres > Centre for Accident Research & Road Safety - Qld (CARRS-Q)
Current > QUT Faculties and Divisions > Faculty of Health
Current > Institutes > Institute of Health and Biomedical Innovation
Current > Schools > School of Psychology & Counselling
Copyright Owner: Copyright 2015 Elsevier
Copyright Statement: This is the author’s version of a work that was accepted for publication in Accident Analysis and Prevention. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Accident Analysis and Prevention, [VOL 79, (2015)] DOI: 10.1016/j.aap.2015.03.018
Deposited On: 24 Mar 2015 23:51
Last Modified: 29 Mar 2015 05:39
