QUT ePrints

Numerical Sequence Representation of DNA Sequences and Methods To Distinguish Coding And Non-Coding Sequences in a Complete Genome

Yu, Zuguo, Anh, Vo V., Zhou, Yu, & Zhou, Li-Qian (2007) Numerical Sequence Representation of DNA Sequences and Methods To Distinguish Coding And Non-Coding Sequences in a Complete Genome. In Callaos, N., Lesso, W., Zinn, C., & Zmazek, B. (Eds.) 11th World Multi-Conference on Systemics, Cybernetics and Informatics: WMSCI 2007, 8-11 July 2007, Florida, USA.

Abstract

In this presentation we introduce two methods to distinguish coding and non-coding sequences in a complete genome. A numerical sequence representation of DNA sequences is introduced first; there is a one-to-one correspondence between a DNA sequence and its numerical sequence representation. In the first method, three exponents from a multifractal analysis are selected to construct the parameter space. In the second method, which is based on a Fourier transform approach, three parameters from the power spectrum of the numerical sequence representation are selected to construct the parameter space. Each DNA sequence may be represented by a point in each of these three-dimensional spaces. We found that the points corresponding to coding and non-coding sequences in the complete genomes of prokaryotes fall into different regions of both parameter spaces. If the point for a DNA sequence lies in the region corresponding to coding sequences, the sequence is classified as coding; otherwise, it is classified as non-coding. The average accuracies obtained with Fisher's discriminant algorithm for coding and non-coding sequences are satisfactory.
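The pipeline outlined in the abstract (numerical encoding, power spectrum, Fisher's discriminant in a three-dimensional parameter space) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The nucleotide-to-integer mapping `NUCLEOTIDE_VALUES` is a hypothetical stand-in, because the abstract states only that the representation is one-to-one. The abstract also does not say which three power-spectrum parameters are used, so the sketch leaves feature selection open and shows the discriminant step on arbitrary 3-D points. All function names are illustrative.

```python
import cmath

# Hypothetical mapping: the abstract does not give the authors' encoding,
# only that it is one-to-one.
NUCLEOTIDE_VALUES = {"A": 0, "C": 1, "G": 2, "T": 3}
VALUE_NUCLEOTIDES = {v: k for k, v in NUCLEOTIDE_VALUES.items()}


def encode(dna):
    """Map a DNA string to a list of integers, one per nucleotide."""
    return [NUCLEOTIDE_VALUES[base] for base in dna.upper()]


def decode(values):
    """Invert encode(), which demonstrates the one-to-one property."""
    return "".join(VALUE_NUCLEOTIDES[v] for v in values)


def power_spectrum(values):
    """Return |X(k)|^2 / N for k = 1 .. N//2, computed with a direct DFT.

    The mean is subtracted first so the constant component is excluded.
    """
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def mean_vector(points):
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def scatter_matrix(points, mean):
    """Sum of outer products of deviations from the class mean."""
    dim = len(mean)
    s = [[0.0] * dim for _ in range(dim)]
    for p in points:
        d = [p[i] - mean[i] for i in range(dim)]
        for i in range(dim):
            for j in range(dim):
                s[i][j] += d[i] * d[j]
    return s


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (a[i][n] - tail) / a[i][i]
    return x


def fisher_discriminant(class_a, class_b):
    """Return (w, threshold) with w = Sw^-1 (mean_a - mean_b).

    The threshold is the midpoint of the two projected class means.
    """
    ma, mb = mean_vector(class_a), mean_vector(class_b)
    sa, sb = scatter_matrix(class_a, ma), scatter_matrix(class_b, mb)
    dim = len(ma)
    sw = [[sa[i][j] + sb[i][j] for j in range(dim)] for i in range(dim)]
    w = solve(sw, [ma[i] - mb[i] for i in range(dim)])
    threshold = 0.5 * sum(w[i] * (ma[i] + mb[i]) for i in range(dim))
    return w, threshold


def is_class_a(point, w, threshold):
    """True if the projection of point onto w falls on class_a's side."""
    return sum(wi * xi for wi, xi in zip(w, point)) > threshold
```

In use, each sequence would be passed through `encode` and `power_spectrum`, reduced to three parameters of the reader's choosing, and the resulting points for known coding and non-coding sequences given to `fisher_discriminant` as `class_a` and `class_b`. The direct DFT costs O(N^2) operations and is used here only to keep the sketch dependency-free; an FFT routine would be the usual choice for genome-scale sequences.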

ID Code: 15651
Item Type: Conference Paper
ISBN: 1934272140
Subjects: Australian and New Zealand Standard Research Classification > MATHEMATICAL SCIENCES (010000) > APPLIED MATHEMATICS (010200) > Biological Mathematics (010202)
Australian and New Zealand Standard Research Classification > BIOLOGICAL SCIENCES (060000) > GENETICS (060400) > Genome Structure and Regulation (060407)
Divisions: Past > QUT Faculties & Divisions > Faculty of Science and Technology
Copyright Owner: Copyright 2007 (please consult author)
Deposited On: 18 Nov 2008
Last Modified: 29 Feb 2012 23:40
