The cross-species prediction of bacterial promoters using a support vector machine
Towsey, Michael W., Timms, Peter, Hogan, James M., & Mathews, Sarah A. (2008) The cross-species prediction of bacterial promoters using a support vector machine. Computational Biology and Chemistry, 32(5), pp. 359-366.
Due to degeneracy of the observed binding sites, the in silico prediction of bacterial sigma(70)-like promoters remains a challenging problem. A large number of sigma(70)-like promoters has been biologically identified in only two species, Escherichia coli and Bacillus subtilis. In this paper we investigate the issues that arise when searching for promoters in other species using an ensemble of SVM classifiers trained on E. coli promoters. DNA sequences are represented using a tagged mismatch string kernel. The major benefit of our approach is that it does not require a prior definition of the typical -35 and -10 hexamers. This gives the SVM classifiers the freedom to discover other features relevant to the prediction of promoters. We use our approach to predict sigma(A) promoters in B. subtilis and sigma(66) promoters in Chlamydia trachomatis. We extended the analysis to identify specific regulatory features of gene sets in C. trachomatis having different expression profiles. We found a strong -35 hexamer and TGN/-10 associated with a set of early expressed genes. Our analysis highlights the advantage of using TSS-PREDICT as a starting point for predicting promoters in species where few are known.
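As a rough illustration of the sequence representation mentioned in the abstract, the sketch below computes a plain (k, m)-mismatch string kernel between two DNA sequences. It is not the authors' implementation: the abstract does not describe the tagging scheme or the parameter values, so the tagging step is omitted, the function names are hypothetical, and `k=3`, `m=1` are illustrative assumptions only.

```python
from collections import Counter
from itertools import combinations, product

ALPHABET = "ACGT"  # assumption: upper-case DNA with no ambiguity codes


def neighbours(kmer, m):
    """Return every string over ALPHABET within Hamming distance m of kmer."""
    result = {kmer}
    for d in range(1, m + 1):
        for positions in combinations(range(len(kmer)), d):
            for letters in product(ALPHABET, repeat=d):
                chars = list(kmer)
                for pos, letter in zip(positions, letters):
                    chars[pos] = letter
                result.add("".join(chars))
    return result


def mismatch_features(seq, k=3, m=1):
    """Feature map: each k-mer in seq adds 1 to every k-mer within m mismatches."""
    feats = Counter()
    for i in range(len(seq) - k + 1):
        for nb in neighbours(seq[i:i + k], m):
            feats[nb] += 1
    return feats


def mismatch_kernel(a, b, k=3, m=1):
    """Kernel value: inner product of the two mismatch feature maps."""
    fa = mismatch_features(a, k, m)
    fb = mismatch_features(b, k, m)
    return sum(count * fb[kmer] for kmer, count in fa.items())
```

Because the feature map is built from all short substrings rather than from fixed -35 and -10 hexamer models, a classifier using such a kernel can in principle weight whichever k-mers turn out to be discriminative. A Gram matrix of these kernel values could then be passed to any SVM library that accepts a precomputed kernel.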
|Item Type:||Journal Article|
|Keywords:||bioinformatics, support vector machine, sigma factor, promoters, transcript start site|
|Subjects:||Australian and New Zealand Standard Research Classification > BIOLOGICAL SCIENCES (060000) > BIOCHEMISTRY AND CELL BIOLOGY (060100) > Bioinformatics (060102)|
|Divisions:||Past > QUT Faculties & Divisions > Faculty of Science and Technology; Current > Institutes > Institute for Future Environments; Current > Institutes > Institute of Health and Biomedical Innovation; Past > Schools > School of Life Sciences; Past > Schools > School of Software Engineering & Data Communications|
|Copyright Owner:||Copyright 2008 Elsevier|
|Deposited On:||05 Jan 2009 00:48|
|Last Modified:||29 Feb 2012 13:49|