Analysing e-mail text authorship for forensic purposes

Corney, Malcolm W. (2003) Analysing e-mail text authorship for forensic purposes. Masters by Research thesis, Queensland University of Technology.


E-mail has become the most popular Internet application, and its rise in use has brought an inevitable increase in the use of e-mail for criminal purposes. An e-mail message can be sent anonymously or through spoofed servers, so computer forensics analysts need a tool that can identify the author of such messages.

This thesis describes the development of such a tool using techniques from the fields of stylometry and machine learning. An author's style can be reduced to a pattern by measuring various stylometric features of the text. E-mail messages also contain macro-structural features that can be measured. Together, these features can be used with the Support Vector Machine learning algorithm to attribute an e-mail message to an author, provided that a suitable sample of that author's messages is available for comparison.
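As a rough illustration of what measuring such features can look like, the Python sketch below turns an e-mail body into a fixed-order numeric vector. The feature names, the short function-word list, and the greeting and quoted-reply checks are assumptions chosen for the example; this record does not list the features the thesis actually used. The resulting vectors are the kind of input a Support Vector Machine implementation would take, but no classifier is trained here.

```python
import re
import string

# Hypothetical, deliberately short function-word list (an assumption for
# illustration; not the feature set used in the thesis).
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "i")


def stylometric_features(body: str) -> dict:
    """Measure a few simple style markers from the message text."""
    words = re.findall(r"[A-Za-z']+", body)
    lowered = [w.lower() for w in words]
    n_words = len(words)
    n_chars = len(body)
    # Crude sentence split on runs of terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", body) if s.strip()]

    features = {
        "avg_word_length": sum(len(w) for w in words) / n_words if n_words else 0.0,
        "avg_sentence_length": n_words / len(sentences) if sentences else 0.0,
        "vocabulary_richness": len(set(lowered)) / n_words if n_words else 0.0,
        "punctuation_ratio": (
            sum(c in string.punctuation for c in body) / n_chars if n_chars else 0.0
        ),
    }
    # Relative frequency of each function word.
    for fw in FUNCTION_WORDS:
        features[f"fw_{fw}"] = lowered.count(fw) / n_words if n_words else 0.0
    return features


def structural_features(body: str) -> dict:
    """Measure a few layout-level (macro-structural) markers of the message."""
    lines = body.splitlines()
    non_blank = [line for line in lines if line.strip()]
    first = non_blank[0].lower() if non_blank else ""
    return {
        # Crude prefix check; a real system would need a more careful test.
        "has_greeting": float(first.startswith(("hi", "hello", "dear"))),
        "blank_line_ratio": (len(lines) - len(non_blank)) / len(lines) if lines else 0.0,
        "has_quoted_reply": float(any(line.lstrip().startswith(">") for line in lines)),
    }


def feature_vector(body: str) -> list:
    """Combine both feature groups into one vector with a stable key order."""
    feats = {**stylometric_features(body), **structural_features(body)}
    return [feats[key] for key in sorted(feats)]
```

Calling `feature_vector(message_body)` on each message of known authorship would give one row per message; those rows, labelled by author, could then be passed to any SVM library for training.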

In an investigation, the set of candidate authors may need to be reduced from an initially large list of possible suspects. This research has trialled authorship characterisation based on sociolinguistic cohorts, such as gender and language background, as a technique for profiling the author of an anonymous message so that the suspect list can be reduced.


Full-text downloads:

978 since deposited on 03 Dec 2008
25 in the past twelve months


ID Code: 16069
Item Type: QUT Thesis (Masters by Research)
Supervisor: Anderson, Alison & Mohay, George
Keywords: E-Mail, Computer Forensics, Authorship Attribution, Authorship Characterisation, Stylistics, Support Vector Machine
Department: Faculty of Information Technology
Institution: Queensland University of Technology
Deposited On: 03 Dec 2008 03:55
Last Modified: 22 Jun 2017 14:40
