Numerical sequence representation of DNA sequences and methods to distinguish coding and non-coding sequences in a complete genome
Yu, Zu-Guo, Anh, Vo V., Zhou, Yu, & Zhou, Li-Qian (2007) Numerical sequence representation of DNA sequences and methods to distinguish coding and non-coding sequences in a complete genome. In Callaos, N., Lesso, W., Zinn, C., & Zmazek, B. (Eds.) WMSCI 2007, The International Institute of Informatics and Systemics (IIIS), Florida, USA, pp. 171-176.
In this presentation we introduce two methods to distinguish coding and non-coding sequences in a complete genome. A numerical sequence representation of DNA sequences is introduced first. There exists a one-to-one correspondence between a DNA sequence and its numerical sequence representation. In the first method, three exponents from a multifractal analysis are selected to construct the parameter space. In the second method, which is based on a Fourier transform approach, three parameters from the power spectrum of the numerical sequence representation are selected to construct the parameter space. Each DNA sequence may be represented by a point in each of these three-dimensional spaces. We found that the points corresponding to coding and non-coding sequences in the complete genomes of prokaryotes fall into different regions in both parameter spaces. If the point for a DNA sequence is situated in the region corresponding to coding sequences, the sequence is classified as coding; otherwise, it is classified as non-coding. The average accuracies using Fisher's discriminant algorithm for coding and non-coding sequences are satisfactory.
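As a rough illustration of the second method's first steps, the sketch below encodes a DNA string as a numerical sequence and computes its power spectrum. The abstract does not state the actual nucleotide-to-number mapping or which three spectral parameters are selected, so the mapping `NUC_TO_NUM`, the function names, and the spectrum normalization are all assumptions made for illustration only. Any injective mapping of the four nucleotides keeps the one-to-one correspondence described above.

```python
import cmath

# Hypothetical mapping: the abstract does not give the values used in the paper.
# Any injective assignment keeps the representation one-to-one.
NUC_TO_NUM = {"A": -2, "C": -1, "G": 1, "T": 2}
NUM_TO_NUC = {value: base for base, value in NUC_TO_NUM.items()}


def encode(dna):
    """Map a DNA string to its numerical sequence."""
    return [NUC_TO_NUM[base] for base in dna.upper()]


def decode(nums):
    """Recover the DNA string from a numerical sequence."""
    return "".join(NUM_TO_NUC[n] for n in nums)


def power_spectrum(x):
    """Return |DFT(x)[k]|^2 / n for k = 0 .. n // 2 (naive O(n^2) DFT)."""
    n = len(x)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


seq = "ATGGCGTACGTTAGC"
nums = encode(seq)
spec = power_spectrum(nums)
```

Three summary quantities taken from `spec` would then place the sequence as a point in a three-dimensional parameter space, where a linear discriminant such as Fisher's could separate coding from non-coding points. Which quantities the authors use is not specified here.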
|Item Type:||Conference Paper|
|Subjects:||Australian and New Zealand Standard Research Classification > MATHEMATICAL SCIENCES (010000) > APPLIED MATHEMATICS (010200) > Biological Mathematics (010202); Australian and New Zealand Standard Research Classification > BIOLOGICAL SCIENCES (060000) > GENETICS (060400) > Genome Structure and Regulation (060407)|
|Divisions:||Past > QUT Faculties & Divisions > Faculty of Science and Technology; Current > Schools > School of Mathematical Sciences|
|Copyright Owner:||Copyright 2007 (please consult author)|
|Deposited On:||18 Nov 2008|
|Last Modified:||06 Mar 2015 00:48|