Direct prediction of profiles of sequences compatible with a protein structure by neural networks with fragment-based local and energy-based nonlocal profiles

, Yang, Yuedong, Faraggi, Eshel, Zhan, Jian, & Zhao, Yaoqi (2014) Direct prediction of profiles of sequences compatible with a protein structure by neural networks with fragment-based local and energy-based nonlocal profiles. Proteins: Structure, Function and Genetics, 82(10), pp. 2565-2573.

[img] Published Version (PDF 269kB)
prot24620.pdf.
Administrators only | Request a copy from author
[img]
Preview
Accepted Version (PDF 1MB)

View at publisher

Description

Locating sequences compatible with a protein structural fold is the well-known inverse protein-folding problem. While significant progress has been made, the success rate of protein design remains low. As a result, a library of designed sequences or profile of sequences is currently employed for guiding experimental screening or directed evolution. Sequence profiles can be computationally predicted by iterative mutations of a random sequence to produce energy-optimized sequences, or by combining sequences of structurally similar fragments in a template library. The latter approach is computationally more efficient but yields less accurate profiles than the former because of lacking tertiary structural information. Here we present a method called SPIN that predicts Sequence Profiles by Integrated Neural network based on fragment-derived sequence profiles and structure-derived energy profiles. SPIN improves over the fragment-derived profile by 6.7% (from 23.6 to 30.3%) in sequence identity between predicted and wild-type sequences. The method also reduces the number of residues in low complex regions by 15.7% and has a significantly better balance of hydrophilic and hydrophobic residues at protein surface. The accuracy of sequence profiles obtained is comparable to those generated from the protein design program RosettaDesign 3.5. This highly efficient method for predicting sequence profiles from structures will be useful as a single-body scoring term for improving scoring functions used in protein design and fold recognition. It also complements protein design programs in guiding experimental design of the sequence library for screening and directed evolution of designed sequences. The SPIN server is available at http://sparks-lab.org.

Impact and interest:

37 citations in Scopus
24 citations in Web of Science®
Search Google Scholar™

Citation counts are sourced monthly from Scopus and Web of Science® citation databases.

These databases contain citations from different subsets of available publications and different time periods and thus the citation count from each is usually different. Some works are not in either database and no count is displayed. Scopus includes citations from articles published in 1996 onwards, and Web of Science® generally from 1980 onwards.

Citations counts from the Google Scholar™ indexing service can be viewed at the linked Google Scholar™ search.

Full-text downloads:

132 since deposited on 26 Jun 2017
25 in the past twelve months

Full-text downloads displays the total number of times this work’s files (e.g., a PDF) have been downloaded from QUT ePrints as well as the number of downloads in the previous 365 days. The count includes downloads for all files if a work has more than one.

ID Code: 108213
Item Type: Contribution to Journal (Journal Article)
Refereed: Yes
ORCID iD:
Li, Zhixiuorcid.org/0000-0002-2924-9120
Measurements or Duration: 9 pages
Keywords: inverse protein folding, knowledge-based energy function, neural network, protein design, sequence profiles
DOI: 10.1002/prot.24620
ISSN: 0887-3585
Pure ID: 32764034
Divisions: Past > QUT Faculties & Divisions > Faculty of Health
Past > Institutes > Institute of Health and Biomedical Innovation
Current > Schools > School of Biomedical Sciences
Copyright Owner: Consult author(s) regarding copyright matters
Copyright Statement: This work is covered by copyright. Unless the document is being made available under a Creative Commons Licence, you must assume that re-use is limited to personal use and that permission from the copyright owner must be obtained for all other uses. If the document is available under a Creative Commons License (or other specified license) then refer to the Licence for details of permitted re-use. It is a condition of access that users recognise and abide by the legal requirements associated with these rights. If you believe that this work infringes copyright please provide details by email to qut.copyright@qut.edu.au
Deposited On: 26 Jun 2017 05:51
Last Modified: 07 Aug 2024 14:33