4.7 Article

SgRNA-RF: Identification of SgRNA On-Target Activity With Imbalanced Datasets

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TCBB.2021.3079116

Keywords

Feature extraction; RNA; Correlation; Mathematical model; Benchmark testing; Training; Predictive models; sgRNA on-target activity; random forest; imbalanced datasets

Funding

  1. National Natural Science Foundation of China [61922020, 61771331, 91935302]

Single-guide RNA (sgRNA) is a non-coding RNA that guides the insertion or deletion of uridine residues into kinetoplastid during RNA editing. This paper develops a new classifier, SgRNA-RF, which extracts nucleic acid composition and structure features from sgRNA sequences and identifies on-target activity with the random forest algorithm. The classifier significantly improves identification accuracy, and a user-friendly web server implementing it is provided.
Single-guide RNA (sgRNA) is a guide RNA (gRNA) that guides the insertion or deletion of uridine residues into kinetoplastid during RNA editing. It is a small non-coding RNA that can pair with pre-mRNA. SgRNA is a critical component of the CRISPR/Cas9 gene knockout system and plays an important role in gene editing and gene regulation, so it is important to identify sgRNAs with high on-target activity accurately and quickly. Several computational predictors of sgRNA on-target activity have been proposed. These methods have contributed to the development of the field, but they also have certain limitations. In this paper, we develop a new classifier, SgRNA-RF, which extracts nucleic acid composition and structure features from sgRNA sequences and identifies on-target activity with the random forest algorithm. In addition, to address the imbalanced dataset, this paper proposes a new method called CS-Smote. We compared SgRNA-RF with state-of-the-art predictors on the five datasets and found that SgRNA-RF significantly improved the identification accuracy, with accuracies of 0.8636, 0.9161, 0.894, 0.938, 0.965, 0.77, 0.979, and 0.973, respectively. The user-friendly web server that implements SgRNA-RF is freely available at http://server.malab.cn/sgRNA-RF/.
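The abstract does not spell out the feature encoding or the CS-Smote procedure, so the following is only a rough sketch of the general idea under stated assumptions: k-mer frequencies stand in for "nucleic acid composition" features, and plain SMOTE-style interpolation (not CS-Smote) stands in for the minority-class oversampling. The function names, the choice of k = 2, and the example sequences are all hypothetical and are not taken from the paper.

```python
import random
from itertools import product

ALPHABET = "ACGT"

def kmer_composition(seq, k=2):
    """Return normalized k-mer frequencies of seq in fixed lexicographic order.

    Assumption: k-mer frequency is used here as a generic composition
    feature; the paper's actual feature set is not given in the abstract.
    """
    seq = seq.upper().replace("U", "T")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]

def smote_like_oversample(minority, n_new, k_neighbors=2, seed=0):
    """Create n_new synthetic minority vectors by interpolation.

    This is ordinary SMOTE-style interpolation between a sample and one
    of its nearest minority neighbours. It is NOT the paper's CS-Smote.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority vectors by squared Euclidean distance.
        others = [v for v in minority if v is not base]
        others.sort(key=lambda v: sum((a - b) ** 2 for a, b in zip(base, v)))
        neighbor = rng.choice(others[:k_neighbors])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic

# Hypothetical 20-nt sequences, used only to exercise the functions.
minority_seqs = [
    "GACGTTACGGATCCGATTAC",
    "GGCCTTAAGGCCTTAAGGCC",
    "ATATGCGCATATGCGCATAT",
]
minority = [kmer_composition(s) for s in minority_seqs]
synthetic_rows = smote_like_oversample(minority, n_new=4)
```

The resulting feature rows (original plus synthetic) could then be passed to any random forest implementation. Because each synthetic row is a convex combination of two frequency vectors, it remains a valid frequency vector.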
