Beyond statistical procedures for predictive modelling: Data mining algorithms and support for university research at QUT

Duplock, Ray & Kelson, Neil A. (2010) Beyond statistical procedures for predictive modelling: Data mining algorithms and support for university research at QUT. In eResearch Australasia 2010 : 21st Century Research : Where Computing Meets Data, 8th-12th November 2010, RACV Royal Pines, Gold Coast, Queensland. (Unpublished)


In a seminal data mining article, Leo Breiman [1] argued that to develop effective predictive classification and regression models, we need to move away from sole dependence on statistical algorithms and embrace a wider toolkit of modeling algorithms that includes data mining procedures. Nevertheless, many researchers still rely solely on statistical procedures when undertaking data modeling tasks; this sole reliance has led to the development of irrelevant theory and questionable research conclusions ([1], p.199). We will outline initiatives that the HPC & Research Support group is undertaking to engage researchers with data mining tools and techniques, including a new range of seminars, workshops, and one-on-one consultations covering data mining algorithms, the relationship between data mining and the research cycle, and the limitations and problems of these new algorithms. Organisational limitations and restrictions on these initiatives are also discussed.

Full-text downloads:

214 since deposited on 07 Feb 2011
111 in the past twelve months

ID Code: 38617
Item Type: Conference Item (Poster)
Refereed: No
Keywords: data mining, statistical procedures, HERN, predictive modelling
Subjects: Australian and New Zealand Standard Research Classification > MATHEMATICAL SCIENCES (010000) > STATISTICS (010400) > Statistics not elsewhere classified (010499)
Divisions: Current > QUT Faculties and Divisions > Division of Technology, Information and Library Services
Current > Research Centres > High Performance Computing and Research Support
Deposited On: 07 Feb 2011 22:21
Last Modified: 24 Feb 2015 06:04
