Inferring phylogenies of evolving sequences without multiple sequence alignment
Chan, Cheong Xin, Bernard, Guillaume, Poirion, Olivier, Hogan, Jim, & Ragan, Mark (2014) Inferring phylogenies of evolving sequences without multiple sequence alignment. Scientific Reports, 4, Article number: 6504, pp. 1-9.
Description
Alignment-free methods, in which shared properties of sub-sequences (e.g. identity or match length) are extracted and used to compute a distance matrix, have recently been explored for phylogenetic inference. However, the scalability and robustness of these methods to key evolutionary processes remain to be investigated. Here, using simulated sequence sets of various sizes in both nucleotides and amino acids, we systematically assess the accuracy of phylogenetic inference using an alignment-free approach, based on D2 statistics, under different evolutionary scenarios. We find that compared to a multiple sequence alignment approach, D2 methods are more robust against among-site rate heterogeneity, compositional biases, genetic rearrangements and insertions/deletions, but are more sensitive to recent sequence divergence and sequence truncation. Across diverse empirical datasets, the alignment-free methods perform well for sequences sharing low divergence, at greater computation speed. Our findings provide strong evidence for the scalability and the potential use of alignment-free methods in large-scale phylogenomics.
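As a rough illustration of how a word-based, alignment-free distance matrix can be computed, the sketch below uses the basic D2 statistic: the sum, over all words of length k, of the product of that word's counts in the two sequences. The function names, the choice of k, and the cosine-style normalisation used to turn D2 into a distance are assumptions made for this example; they are not taken from the paper, which may use different D2 variants and transformations.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(x: Counter, y: Counter) -> int:
    """Basic D2 statistic: sum over words of count_in_x * count_in_y."""
    if len(x) > len(y):
        x, y = y, x  # iterate over the smaller table
    return sum(count * y[word] for word, count in x.items())


def d2_distance(a: str, b: str, k: int) -> float:
    """Illustrative distance (an assumption, not the paper's formula):
    1 - D2(a, b) / sqrt(D2(a, a) * D2(b, b))."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    denom = sqrt(d2(ca, ca) * d2(cb, cb))
    if denom == 0:
        raise ValueError("each sequence must contain at least one word of length k")
    return 1.0 - d2(ca, cb) / denom


def distance_matrix(seqs: dict, k: int) -> dict:
    """Symmetric pairwise distance matrix as a nested dict keyed by sequence name."""
    names = list(seqs)
    dist = {name: {name: 0.0} for name in names}
    for p, q in combinations(names, 2):
        dist[p][q] = dist[q][p] = d2_distance(seqs[p], seqs[q], k)
    return dist


if __name__ == "__main__":
    # Toy sequences, invented for the example.
    toy = {"s1": "ACGTACGTGACG", "s2": "ACGTTCGTGACG", "s3": "TTTTGGGGCCCC"}
    for name, row in distance_matrix(toy, k=3).items():
        print(name, {other: round(d, 3) for other, d in row.items()})
```

A matrix of this kind could then be passed to a distance-based tree-building method such as neighbour-joining. No alignment step is involved at any point, which is why the approach is less affected by rearrangements than methods that depend on positional homology.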
ID Code: | 82330
---|---
Item Type: | Contribution to Journal (Journal Article)
Refereed: | Yes
Measurements or Duration: | 9 pages
Keywords: | Approximate word matches, Evolution, Gene, Maximum-likelihood, Performance, Trees
DOI: | 10.1038/srep06504
ISSN: | 2045-2322
Pure ID: | 32742418
Divisions: | Past > QUT Faculties & Divisions > Science & Engineering Faculty
Copyright Owner: | Consult author(s) regarding copyright matters
Copyright Statement: | This work is covered by copyright. Unless the document is being made available under a Creative Commons Licence, you must assume that re-use is limited to personal use and that permission from the copyright owner must be obtained for all other uses. If the document is available under a Creative Commons License (or other specified license) then refer to the Licence for details of permitted re-use. It is a condition of access that users recognise and abide by the legal requirements associated with these rights. If you believe that this work infringes copyright please provide details by email to qut.copyright@qut.edu.au
Deposited On: | 09 Mar 2015 23:06
Last Modified: | 13 Feb 2025 03:47