4.6 Article Proceedings Paper

IIMLP: integrated information-entropy-based method for LncRNA prediction

Journal

BMC BIOINFORMATICS
Volume 22, Issue SUPPL 3, Pages -

Publisher

BMC
DOI: 10.1186/s12859-020-03884-w

Keywords

Long non-coding RNA; Information entropy; Generalized topological entropy; Machine learning

Funding

  1. National 863 Key Basic Research Development Program [2014AA021505]
  2. National Key Research Program [2017YFC1201201]
  3. startup grant of Harbin Institute of Technology (Shenzhen)
  4. ShenZhen stable support program
  5. National Natural Science Foundation of China [61702134]


In this study, we developed a lncRNA prediction method by integrating information-entropy-based features and machine learning algorithms. Our method, which includes 6 novel features generated from generalized topological entropy, achieves a higher area under the curve compared to methods with more K-mer features. Our approach is accurate, efficient, and extendable for research on functional elements in DNA sequences.
Background: The prediction of long non-coding RNA (lncRNA) has attracted great attention from researchers, as growing evidence indicates that various complex human diseases are closely related to lncRNAs. In the era of biomedical big data, in addition to predicting lncRNAs by biological experiments, many machine-learning-based computational methods have been proposed to make better use of lncRNA sequence resources.

Results: We developed a lncRNA prediction method that integrates information-entropy-based features with machine learning algorithms. We calculate generalized topological entropy and generate 6 novel features for lncRNA sequences. Using these 6 features together with other features such as the open reading frame, we apply support vector machine, XGBoost and random forest algorithms to distinguish human lncRNAs. We compare our method with one that uses more K-mer features, and the results show that our method achieves a higher area under the curve (AUC), up to 99.7905%.

Conclusions: We developed an accurate and efficient method that uses novel information entropy features to analyze and classify lncRNAs. Our method is also extendable to research on other functional elements in DNA sequences.
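To make the idea of entropy-based sequence features more concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not define the six generalized topological entropy features, so the sketch substitutes two simpler stand-ins (Shannon entropy of the k-mer distribution, and the basic, non-generalized topological entropy computed on a prefix of length 4^n + n - 1) plus a simple forward-strand open reading frame length. All function names and the choice of k are assumptions made for illustration.

```python
import math
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}


def kmer_shannon_entropy(seq: str, k: int) -> float:
    """Shannon entropy (in bits) of the k-mer frequency distribution of seq."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    total = len(kmers)
    counts = Counter(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def topological_entropy(seq: str, alphabet_size: int = 4) -> float:
    """Basic topological entropy (stand-in, NOT the paper's generalized form).

    Picks the largest n with alphabet_size**n + n - 1 <= len(seq), keeps only
    that prefix, and returns log_alphabet(#distinct n-mers) / n, in [0, 1].
    """
    n = 0
    while alphabet_size ** (n + 1) + n <= len(seq):
        n += 1
    if n == 0:
        return 0.0  # sequence too short for even n = 1
    window = seq[: alphabet_size ** n + n - 1]
    distinct = len({window[i:i + n] for i in range(len(window) - n + 1)})
    return math.log(distinct, alphabet_size) / n


def longest_orf_length(seq: str) -> int:
    """Longest ATG...stop ORF (nucleotides, stop included), forward strand only."""
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def sequence_features(seq: str) -> list[float]:
    """Hypothetical feature vector: [3-mer entropy, topological entropy, ORF length]."""
    seq = seq.upper()
    return [
        kmer_shannon_entropy(seq, 3),
        topological_entropy(seq),
        float(longest_orf_length(seq)),
    ]
```

Vectors like the one returned by `sequence_features` could then be passed to any standard classifier (the abstract names support vector machine, XGBoost and random forest); the sketch leaves that step out because the paper's training setup is not described here.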
