A Deep Learning Network Approach to ab initio Protein Secondary Structure Prediction

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TCBB.2014.2343960

Keywords

Machine learning; neural nets; protein structure prediction; deep learning

Funding

  1. National Institutes of Health [R01GM093123]
  2. National Institute of General Medical Sciences [R01GM093123, T32GM008396] (funding source: NIH RePORTER)

Ab initio protein secondary structure (SS) predictions are used to generate tertiary structure predictions, which are increasingly in demand because of the rapid discovery of proteins. Although recent developments have slightly improved on previous methods of SS prediction, accuracy has stagnated around 80 percent, and many wonder whether prediction can be advanced beyond this ceiling. Disciplines that have traditionally employed neural networks are experimenting with novel deep learning techniques in attempts to stimulate progress. Since neural networks have historically played an important role in SS prediction, we wanted to determine whether deep learning could contribute to the advancement of this field as well. We developed an SS predictor, which we call DNSS, that makes use of the position-specific scoring matrix generated by PSI-BLAST and deep learning network architectures. Graphics processing units and CUDA software were used to optimize the deep network architecture and efficiently train the deep networks. Optimal parameters for the training process were determined, and a workflow comprising three separately trained deep networks was constructed in order to make refined predictions. This deep learning network approach was used to predict SS for a fully independent test dataset of 198 proteins, achieving a Q3 accuracy of 80.7 percent and a Sov accuracy of 74.2 percent.
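
For readers unfamiliar with the Q3 measure reported above: it is the percentage of residues whose predicted three-state label (helix H, strand E, coil C) matches the observed label. The sketch below illustrates only that definition. The function name and the example strings are hypothetical and are not taken from DNSS. The abstract does not say whether the reported figure is pooled over all residues or averaged per protein, so this sketch scores a single sequence.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of positions where the three-state labels agree.

    Both arguments are strings over the alphabet H (helix), E (strand),
    C (coil), one character per residue. Illustrative helper only.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count residues whose predicted state equals the observed state.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Made-up 15-residue example with two mismatched positions.
observed = "CCHHHHHCCEEEECC"
predicted = "CCHHHHCCCEEEHCC"
print(f"Q3 = {q3_accuracy(predicted, observed):.1f}%")  # 13 of 15 correct
```

The Sov score quoted alongside Q3 is a segment-overlap measure with a more involved definition, so it is not sketched here.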