Bioinformatic prediction of conserved promoters across multiple whole genomes of Chlamydia

Grech, Brian James (2007) Bioinformatic prediction of conserved promoters across multiple whole genomes of Chlamydia. PhD thesis, Queensland University of Technology.


Genome sequencing projects have generated a wealth of genomic data, and analysis of these data has yielded many interesting findings. However, genome-wide analysis of bacterial promoters has lagged behind, because the high level of background noise in bacterial genomes makes accurate promoter prediction difficult. One approach to overcoming this problem is to predict phylogenetically conserved promoters across multiple genomes of different bacteria, thereby filtering out many of the false positives predicted by current methods. However, there are no programmes capable of doing this. The work presented in this thesis therefore developed a position weight matrix (PWM)-based programme called Multiscan that predicts conserved promoters across multiple bacterial genomes. Since Chlamydia is one of the most sequenced bacterial genera and has a high level of gene conservation and large-scale conservation of gene order between species, Multiscan was developed and tested on Chlamydia. When Multiscan analysed a genome-wide dataset of equivalent non-coding regions (NCRs) upstream of genes from Chlamydia trachomatis, Chlamydia pneumoniae and Chlamydia caviae for phylogenetically conserved σ66 promoters, it predicted 42 promoters. Since only one of the 42 predicted promoters had previously available biological data to confirm its prediction, a subset of 10 of the remaining 41 σ66 promoters was analysed in C. trachomatis by mapping the 5' ends of the transcripts. The primer extension assay synthesised cDNA products of the correct length for seven of the 10 genes chosen. When the performance of Multiscan was compared with one of the accepted methods for genome-wide prediction of promoters in bacteria, the "standard PWM method", Multiscan predicted 32 more promoters than the "standard PWM method" in Chlamydia.
Furthermore, the promoters predicted by Multiscan differed from the Escherichia coli σ70 consensus sequence by up to three more mismatches than the promoters predicted by the standard PWM method. Although Multiscan predicted 42 promoters that were well conserved across the three chlamydial species, the analysis was unable to identify the 14 known σ66 promoters in C. trachomatis. These promoters may have been missed because (1) they were dissimilar to the E. coli σ70 consensus sequence and/or (2) they were poorly conserved across the three chlamydial species. To address the second possibility, the 14 false negatives were analysed by another phylogenetic footprinting method. Fourteen sets of equivalent NCRs located upstream of the homologous genes from the three chlamydiae were aligned with the computer programme Clustal W, and the alignments were inspected "by eye" for evidence of phylogenetic footprints containing the 14 false negatives. The analysis showed that seven of the 14 false negatives were poorly conserved across the chlamydial species. Analysis of two of the seven promoters that could not be footprinted, the promoters of ltuA and ltuB, by mapping the transcriptional start sites in C. caviae, confirmed their poor conservation between C. trachomatis and C. caviae. This analysis showed that substantial differences exist in chlamydial σ66 promoters from equivalent NCRs upstream of genes. This study has developed a new computer programme for genome-wide prediction of phylogenetically conserved promoters and has shown the value of this programme by identifying seven new well-conserved promoters and seven candidate poorly conserved promoters in Chlamydia.
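
The general idea described in the abstract, scoring equivalent NCRs with a PWM and keeping only hits that recur across species, can be illustrated with a minimal Python sketch. This is not Multiscan and does not reproduce the thesis's matrices, thresholds or scoring rules. The toy matrix is built from the well-known E. coli σ70 -10 hexamer (TATAAT) only; the `match_prob` value, the threshold, the function names and the three example sequences are all hypothetical and chosen purely for illustration.

```python
import math

BASES = "ACGT"
# E. coli sigma70 -10 consensus hexamer. The toy matrix derived from it below
# is an assumption for illustration, not the PWM used in the thesis.
CONSENSUS_MINUS10 = "TATAAT"


def toy_pwm(consensus, match_prob=0.7):
    """Build a log-odds PWM that favours the consensus base at each position."""
    other = (1.0 - match_prob) / 3.0  # probability shared by the three other bases
    background = 0.25                 # uniform background (an assumption)
    return [
        {b: math.log2((match_prob if b == c else other) / background) for b in BASES}
        for c in consensus
    ]


def best_hit(seq, pwm):
    """Return (score, offset, site) for the best-scoring window, or None."""
    width = len(pwm)
    best = None
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        if any(b not in BASES for b in window):
            continue  # skip windows containing ambiguous characters
        score = sum(col[b] for col, b in zip(pwm, window))
        if best is None or score > best[0]:
            best = (score, i, window)
    return best


def mismatches(site, consensus):
    """Count positions at which a site differs from the consensus."""
    return sum(1 for a, b in zip(site, consensus) if a != b)


def conserved_hit(ncrs, pwm, threshold):
    """Keep a prediction only if every species' NCR has a hit at or above threshold.

    ncrs maps a species label to the equivalent upstream NCR sequence.
    Returns {species: (score, offset, site)} or None if any species fails.
    """
    hits = {}
    for species, seq in ncrs.items():
        hit = best_hit(seq.upper(), pwm)
        if hit is None or hit[0] < threshold:
            return None
        hits[species] = hit
    return hits


pwm = toy_pwm(CONSENSUS_MINUS10)

# Made-up sequences standing in for one set of equivalent NCRs.
example_ncrs = {
    "C. trachomatis": "GCGCTATAATGCGC",
    "C. pneumoniae": "GGCCTATACTGGCC",
    "C. caviae": "CCGGTACAATCCGG",
}

result = conserved_hit(example_ncrs, pwm, threshold=6.0)
if result is not None:
    for species, (score, offset, site) in result.items():
        print(species, site, offset, round(score, 2),
              mismatches(site, CONSENSUS_MINUS10))
```

In this sketch a candidate survives only when all three NCRs contain a site scoring above the threshold, which mirrors the filtering role that conservation plays in the abstract. It also shows why a promoter that is poorly conserved in even one species would be reported as a false negative by such a filter.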

ID Code: 16521
Item Type: QUT Thesis (PhD)
Supervisor: Timms, Peter & Mathews, Sarah
Keywords: algorithm, bioinformatic, Chlamydia, comparative genomics, gene expression regulation, phylogenetic footprinting, phylogeny, promoter, sigma factor, transcription, transcription factor, transcription start site
Divisions: Past > QUT Faculties & Divisions > Faculty of Science and Technology
Past > Schools > School of Life Sciences
Department: Faculty of Science
Institution: Queensland University of Technology
Copyright Owner: Copyright Brian James Grech
Deposited On: 03 Dec 2008 04:05
Last Modified: 22 Mar 2016 03:42
