4.7 Article

CD-NuSS: A Web Server for the Automated Secondary Structural Characterization of the Nucleic Acids from Circular Dichroism Spectra Using Extreme Gradient Boosting Decision-Tree, Neural Network and Kohonen Algorithms

Journal

JOURNAL OF MOLECULAR BIOLOGY
Volume 433, Issue 11, Pages -

Publisher

ACADEMIC PRESS LTD- ELSEVIER SCIENCE LTD
DOI: 10.1016/j.jmb.2020.08.014

Keywords

circular dichroism; nucleic acids secondary structure prediction; XGBoost algorithm; Kohonen algorithm; nnet algorithm

Funding

  1. Ministry of Human Resource Development, Government of India
  2. BIRAC-SRISTI GYTI award [PMU_2017_010, PMU_2019_007]
  3. Indian Institute of Technology Hyderabad (IITH)

XGBoost and nnet algorithms were utilized to predict diverse secondary structures of nucleic acids, showing similar prediction accuracy of approximately 85% to 87%. Both algorithms can be employed for predicting hybrid nucleic acid topologies in the future.
Nucleic acids adopt a repertoire of conformations depending on sequence and environment. Circular dichroism (CD) is a valuable tool for monitoring these secondary structural conformations. Nonetheless, the CD spectral diversity associated with these structures makes it challenging to obtain quantitative information about the secondary structural content of a given CD spectrum. To this end, the competence of the extreme gradient boosting decision-tree (XGBoost), Kohonen and neural network (nnet) algorithms has been exploited here to predict the diverse secondary structures of nucleic acids. A curated library of 450 CD spectra corresponding to 16 different secondary structures of nucleic acids has been created and used as a training dataset. The hyperparameters of these algorithms have been optimized using holdout and k-fold (here, k = 5) cross-validation methods. For a test dataset of 150 CD spectra, the nnet and XGBoost algorithms have exhibited similar prediction accuracies in the range of 85% to 87% (the latter being slightly more accurate). Thus, the nnet and XGBoost algorithms tested here can be employed for predicting hybrid nucleic acid topologies in the future. For the sake of accessibility, the entire process has been automated and implemented as a web server called CD-NuSS (CD to nucleic acids secondary structure), which is freely accessible at https://project.iith.ac.in/cdnuss/. (C) 2020 Elsevier Ltd. All rights reserved.
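
The tuning scheme named in the abstract (5-fold cross-validation on 450 training spectra, followed by scoring on 150 held-out spectra) can be illustrated with a minimal sketch. It is not the CD-NuSS implementation. The assumptions are: the spectra are synthetic 10-point vectors generated from random per-class templates, a plain k-nearest-neighbour classifier stands in for XGBoost and nnet so that the example needs only the Python standard library, and the candidate values (1, 3, 5) for its single hyperparameter are arbitrary. All function and variable names are hypothetical.

```python
import random
from collections import Counter

# Dataset sizes follow the abstract; N_POINTS (values per spectrum) is an assumption.
N_CLASSES, N_TRAIN, N_TEST, N_POINTS = 16, 450, 150, 10


def make_synthetic_spectra(n_samples, centers, rng, noise=0.1):
    """Draw noisy copies of per-class template vectors (placeholder for real CD spectra)."""
    samples, labels = [], []
    for i in range(n_samples):
        label = i % len(centers)
        samples.append([v + rng.gauss(0.0, noise) for v in centers[label]])
        labels.append(label)
    return samples, labels


def knn_predict(train_x, train_y, query, n_neighbors):
    """Stand-in classifier: majority vote among the nearest training vectors."""
    ranked = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    votes = Counter(y for _, y in ranked[:n_neighbors])
    return votes.most_common(1)[0][0]


def k_fold_indices(n_samples, k, rng):
    """Shuffle the sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cv_accuracy(xs, ys, folds, n_neighbors):
    """Mean validation accuracy: each fold is held out once, the rest is used to fit."""
    scores = []
    for fold in folds:
        held_out = set(fold)
        fit_x = [xs[i] for i in range(len(xs)) if i not in held_out]
        fit_y = [ys[i] for i in range(len(ys)) if i not in held_out]
        hits = sum(knn_predict(fit_x, fit_y, xs[i], n_neighbors) == ys[i] for i in fold)
        scores.append(hits / len(fold))
    return sum(scores) / len(scores)


rng = random.Random(0)
centers = [[rng.uniform(-1, 1) for _ in range(N_POINTS)] for _ in range(N_CLASSES)]
train_x, train_y = make_synthetic_spectra(N_TRAIN, centers, rng)
test_x, test_y = make_synthetic_spectra(N_TEST, centers, rng)

# Hyperparameter selection uses only the training set (k = 5 folds).
folds = k_fold_indices(N_TRAIN, 5, rng)
cv_scores = {n: cv_accuracy(train_x, train_y, folds, n) for n in (1, 3, 5)}
best_n = max(cv_scores, key=cv_scores.get)

# The held-out test set is scored once, with the selected hyperparameter.
test_hits = sum(
    knn_predict(train_x, train_y, x, best_n) == y for x, y in zip(test_x, test_y)
)
test_accuracy = test_hits / N_TEST
print(f"CV accuracy by n_neighbors: {cv_scores}")
print(f"best n_neighbors = {best_n}, held-out test accuracy = {test_accuracy:.2f}")
```

The test spectra play no part in choosing the hyperparameter, so the final accuracy estimates performance on unseen spectra. Accuracies from this synthetic data say nothing about the 85% to 87% reported for real CD spectra.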
