Computerised methods for selecting a small number of single nucleotide polymorphisms that enable bacterial strain discrimination

Robertson, Gail Alexandra (2006) Computerised methods for selecting a small number of single nucleotide polymorphisms that enable bacterial strain discrimination. PhD thesis, Queensland University of Technology.


The possibility of identifying single nucleotide polymorphisms (SNPs) that would be useful for rapid bacterial typing was investigated. Neisseria meningitidis was the organism chosen for modelling the approach since informative SNPs could be found amongst the sequence data available for multi-locus sequence typing (MLST) at

The hypothesis tested was that a small number of SNPs located within the seven gene fragments sequenced for MLST provide information equivalent to MLST. Preliminary investigations revealed that a small number of SNPs could discriminate sequence types (STs) of clinical interest with high resolution. Laboratory procedures demonstrated that SNP fingerprinting of N. meningitidis isolates is achievable. Further tests showed that laboratory identification of a defining SNP in the genome of isolates was a practical method of obtaining relevant typing information.

Identifying the most discriminating SNPs amongst the ever-increasing amount of MLST sequence data created a need for computer-based assistance. Two methods of SNP selection devised by the author of this thesis were translated into computer-based algorithms by contributing team members, and two computer programs were produced. The algorithms facilitate the optimal selection of SNPs useful for (1) distinguishing specific STs and (2) differentiating non-specific STs. Current input information can be obtained from the MLST database, and consequently the programs can be applied to any bacterial species for which MLST data have been entered.

The two algorithms for the selection of SNPs were designed to serve contrasting purposes. The first purpose was to determine the ST identity of isolates from an outbreak of disease. In this case, isolates would be tested for membership of any of the STs known to be associated with disease. It was shown that one SNP per ST could distinguish each of four hyperinvasive STs of N. meningitidis from between 92.5% and 97.5% of all other STs. With two SNPs per ST, between 96.7% and 99.0% discrimination was achieved. The SNPs were selected from MLST loci with the assistance of the first algorithm, which scores SNPs according to the number of base mismatches in a sequence alignment between an allele of an ST of interest and the alleles belonging to all other STs at a specified locus. The second purpose was to determine whether or not isolates from different sources belong to the same ST, regardless of their actual ST identity. It was shown that with seven SNPs, four sample STs of N. meningitidis could, on average, be discriminated from 97.1% of all other STs. The SNPs were selected with the aid of the second algorithm, which scores SNPs at MLST loci by the relative frequency of each nucleotide base in a sequence alignment, as a measure of the extent of their polymorphism.
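The two scoring ideas described above can be illustrated with a minimal Python sketch. This is not the thesis software: the toy alignment, the function names, and the use of one sequence per ST at a single locus are all assumptions made for illustration. In particular, the abstract does not give the formula used by the second algorithm, so the sketch substitutes a simple diversity measure (one minus the sum of squared base frequencies) as a stand-in.

```python
from collections import Counter

# Hypothetical toy alignment: ST label -> aligned allele sequence at one locus.
# Real MLST data map each ST to allele numbers at seven loci; this is simplified.
ALIGNMENT = {
    "ST-A": "ACGTAC",
    "ST-B": "ACGTTC",
    "ST-C": "ATGTAC",
    "ST-D": "ATGATC",
}


def mismatch_scores(alignment, target):
    """Score each position by how many other STs differ from the target ST there."""
    ref = alignment[target]
    others = [seq for st, seq in alignment.items() if st != target]
    return [sum(seq[i] != ref[i] for seq in others) for i in range(len(ref))]


def frequency_scores(alignment):
    """Score each position by base diversity: 1 - sum of squared base frequencies.

    Assumption: this formula is a stand-in, not the scoring used in the thesis.
    """
    seqs = list(alignment.values())
    n = len(seqs)
    scores = []
    for column in zip(*seqs):  # iterate over alignment columns
        counts = Counter(column)
        scores.append(1.0 - sum((c / n) ** 2 for c in counts.values()))
    return scores


if __name__ == "__main__":
    print(mismatch_scores(ALIGNMENT, "ST-A"))
    print(frequency_scores(ALIGNMENT))
```

In this sketch a high mismatch score marks a position that separates the ST of interest from many other STs, while a high frequency score marks a position that is highly polymorphic across all STs, whichever ST an isolate belongs to.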

A third algorithm for selecting SNPs is also discussed. Altering the method of scoring SNPs makes it possible to overcome the limitations inherent in the two algorithms that were used for finding SNPs. In addition, the third approach allows SNPs to be found that distinguish members of a complex from non-members.

ID Code: 16284
Item Type: QUT Thesis (PhD)
Supervisor: Giffard, Philip & Timms, Peter
Keywords: bacterial typing, single nucleotide polymorphism (SNP), multilocus sequence typing (MLST), sequence type (ST), Neisseria meningitidis, discrimination, Simpson’s index of diversity
Divisions: Current > Research Centres > CRC for Diagnostics
Past > QUT Faculties & Divisions > Faculty of Science and Technology
Department: Faculty of Science
Institution: Queensland University of Technology
Copyright Owner: Copyright Gail Alexandra Robertson
Deposited On: 03 Dec 2008 04:00
Last Modified: 28 Oct 2011 19:45
