Article

A representation transfer learning approach for enhanced prediction of growth hormone binding proteins

Journal

COMPUTATIONAL BIOLOGY AND CHEMISTRY
Volume 87

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.compbiolchem.2020.107274

Keywords

Growth hormone binding proteins; Autoencoders; Feature selection; SMO-PolyK; Generalized low rank models; Principal component analysis; t-SNE

Growth hormone binding proteins (GHBPs) are soluble proteins that play an important role in modulating growth hormone signaling pathways. GHBPs bind growth hormones selectively and non-covalently, but their functions are still not fully understood. Identifying and characterizing GHBPs is a preliminary step toward understanding their roles in various cellular processes. Because wet-lab experimental methods are costly and labor-intensive, computational methods can help narrow down the search space of putative GHBPs. The performance of a machine learning algorithm depends largely on the quality of the features it is given. Informative, non-redundant features generally improve performance, and feature selection algorithms are commonly used to obtain them. In the present work, a novel representation transfer learning approach is presented for the prediction of GHBPs. Deep autoencoder-based features were extracted, and an SMO-PolyK classifier was subsequently trained on them. The prediction model was evaluated by both leave-one-out cross-validation (LOOCV) and a hold-out independent test set. On LOOCV, the model achieved 89.8% accuracy, 89.4% sensitivity, and 90.2% specificity; on the hold-out test set, it attained 93.5% accuracy, 90.2% sensitivity, and 96.8% specificity. Prediction accuracy was further compared across the full set of sequence-based features, the top-performing sequence features chosen by a feature selection algorithm, deep autoencoder-based features, and generalized low rank model-based features. Principal component analysis of the representative features, together with t-SNE visualization, demonstrated the effectiveness of deep features in the prediction of GHBPs. The present method is robust and accurate and may complement wet-lab methods for identifying novel GHBPs.
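The evaluation protocol described in the abstract (LOOCV scored by accuracy, sensitivity, and specificity) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and `nearest_centroid_predict` is a toy stand-in classifier, not the SMO-PolyK model or the autoencoder features used in the paper.

```python
from typing import Callable, Dict, List, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity for labels coded 1 (positive) / 0 (negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / total if total else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,  # true positive rate
        "specificity": tn / (tn + fp) if (tn + fp) else nan,  # true negative rate
    }


def nearest_centroid_predict(
    train_X: List[List[float]], train_y: List[int], x: List[float]
) -> int:
    """Toy stand-in classifier (an assumption, not the paper's model):
    predict the label whose training centroid is closest to x."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, t in zip(train_X, train_y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(a: List[float], b: List[float]) -> float:
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(centroids, key=lambda lab: sq_dist(centroids[lab], x))


def loocv_predictions(
    X: Sequence[List[float]],
    y: Sequence[int],
    fit_predict: Callable[[List[List[float]], List[int], List[float]], int],
) -> List[int]:
    """Leave-one-out CV: hold out each sample once, train on the rest, predict it."""
    X, y = list(X), list(y)
    preds = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        preds.append(fit_predict(train_X, train_y, X[i]))
    return preds


if __name__ == "__main__":
    # Made-up one-dimensional toy data, purely for illustration.
    X = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
    y = [0, 0, 0, 1, 1, 1]
    print(binary_metrics(y, loocv_predictions(X, y, nearest_centroid_predict)))
```

Any classifier can be passed as `fit_predict`, so the same loop would also score a polynomial-kernel SVM trained on learned features. The hold-out evaluation uses the same `binary_metrics` call on predictions for the independent test set.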
