Journal
CURRENT BIOINFORMATICS
Volume 16, Issue 9, Pages 1169-1178
Publisher
BENTHAM SCIENCE PUBL LTD
DOI: 10.2174/1574893616666210811100938
Keywords
CRISPR/Cas9; off-target effects; machine learning; ensemble learning; XGBoost; XGBCRISPR
Funding
- National Natural Science Foundation of China [61762026, 61462018]
- Guangxi Natural Science Foundation [2017GXNSFAA198278]
- Innovation Project of GUET Graduate Education [2019YCXS056]
- GUET Excellent Graduate Thesis Program, China [18YJPYSS14]
The study compared three One-Hot-based encoding methods and combined the gene sequence with the outputs of four CRISPR/Cas9 off-target prediction tools to build an XGBoost ensemble model, XGBCRISPR. Experimental results showed that XGBCRISPR outperformed existing tools, but the model's accuracy still needs improvement as many off-target scores still appear.
Background: CRISPR/Cas9, a new generation of targeted gene-editing technology with low cost and simple operation, has been widely employed in the field of gene editing. The erroneous cutting at off-target sites by CRISPR/Cas9 is called the off-target effect, and it is the biggest complication that CRISPR/Cas9 faces in practical application. Specifically, off-target effects can lead to unexpected gene-editing results, so accurately predicting them is an important task. Predicting CRISPR/Cas9 off-target effects with machine learning is feasible, but most existing off-target prediction tools have paid little attention to how gene encoding affects prediction.
Methods: We compared three encoding methods based on One-Hot and combined the gene sequence with four CRISPR/Cas9 off-target prediction tools to build an XGBoost ensemble model, designated XGBCRISPR. Grid search was employed to find the parameters that give the best performance.
Results: Performance was compared with existing tools based on the ROC and PRC values. The experimental results show that the XGBCRISPR model is superior to the existing tools.
Conclusion: The new model achieves better prediction results than existing tools, but its accuracy can be improved further as many off-target scores appear.
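The abstract does not spell out the three encoding variants, so the following is only a minimal sketch of plain One-Hot encoding for a guide sequence and a candidate off-target sequence, concatenated with scores from existing prediction tools to form one feature vector. The function names, the short example sequences, and the four score values are hypothetical placeholders, not data or code from the paper.

```python
# Hypothetical sketch: plain One-Hot encoding of a guide / off-target pair.
# This is NOT necessarily one of the three encodings compared in the paper.
NUCLEOTIDES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a flat list with 4 bits per base (A, C, G, T order)."""
    bits = []
    for base in seq.upper():
        if base not in NUCLEOTIDES:
            raise ValueError(f"unexpected base: {base!r}")
        bits.extend(1 if base == n else 0 for n in NUCLEOTIDES)
    return bits


def build_features(guide, target, tool_scores):
    """Concatenate both One-Hot encodings with scores from existing tools."""
    if len(guide) != len(target):
        raise ValueError("guide and target must have the same length")
    return one_hot(guide) + one_hot(target) + list(tool_scores)


# Placeholder inputs: 4-base toy sequences and four made-up tool scores.
features = build_features("GACT", "GATT", [0.42, 0.10, 0.77, 0.05])
```

A vector built this way could then be passed to a gradient-boosted classifier such as XGBoost, with hyperparameters tuned by grid search as the Methods describe.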