Drosophila 3′ UTRs are more complex than protein-coding sequences
Algama, Manjula, Oldmeadow, Christopher, Tasker, Edward, Mengersen, Kerrie, & Keith, Jonathan M. (2014) Drosophila 3′ UTRs are more complex than protein-coding sequences. PLoS ONE, 9(5), e97336.
The 3′ UTRs of eukaryotic genes participate in a variety of post-transcriptional (and some transcriptional) regulatory interactions. Some of these interactions are well characterised, but an undetermined number remain to be discovered. While some regulatory sequences in 3′ UTRs may be conserved over long evolutionary time scales, others may have only ephemeral functional significance as regulatory profiles respond to changing selective pressures. Here we propose a sensitive segmentation methodology for investigating patterns of composition and conservation in 3′ UTRs based on comparison of closely related species. We describe encodings of pairwise and three-way alignments integrating information about conservation, GC content and transition/transversion ratios and apply the method to three closely related Drosophila species: D. melanogaster, D. simulans and D. yakuba. Incorporating multiple data types greatly increased the number of segment classes identified compared to similar methods based on conservation or GC content alone. We propose that the number of segments and number of types of segment identified by the method can be used as proxies for functional complexity. Our main finding is that the number of segments and segment classes identified in 3′ UTRs is greater than in the same length of protein-coding sequence, suggesting greater functional complexity in 3′ UTRs. There is thus a need for sustained and extensive efforts by bioinformaticians to delineate functional elements in this important genomic fraction. C code, data and results are available upon request.
Impact and interest:
Citation counts are sourced monthly from and citation databases.
These databases contain citations from different subsets of available publications and different time periods and thus the citation count from each is usually different. Some works are not in either database and no count is displayed. Scopus includes citations from articles published in 1996 onwards, and Web of Science® generally from 1980 onwards.
Citations counts from theindexing service can be viewed at the linked Google Scholar™ search.
Full-text downloads displays the total number of times this work’s files (e.g., a PDF) have been downloaded from QUT ePrints as well as the number of downloads in the previous 365 days. The count includes downloads for all files if a work has more than one.
|Item Type:||Journal Article|
|Keywords:||Drosophila, Genome evolution, Multiple alignment calculation, Sequence databases, Sequence motif analysis, Untranslated regions|
|Subjects:||Australian and New Zealand Standard Research Classification > MATHEMATICAL SCIENCES (010000)
Australian and New Zealand Standard Research Classification > BIOLOGICAL SCIENCES (060000) > BIOCHEMISTRY AND CELL BIOLOGY (060100)
|Divisions:||Current > Schools > School of Mathematical Sciences
Current > QUT Faculties and Divisions > Science & Engineering Faculty
|Copyright Owner:||Copyright 2014 Algama et al.|
|Copyright Statement:||This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.|
|Deposited On:||22 Jul 2014 23:18|
|Last Modified:||23 Jul 2014 22:39|
Repository Staff Only: item control page