Article

Machine Learning Aided Interpretable Approach for Single Nucleotide-Based DNA Sequencing using a Model Nanopore

Journal

JOURNAL OF PHYSICAL CHEMISTRY LETTERS
Volume 13, Issue 50, Pages 11818-11830

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.jpclett.2c02824

Funding

  1. SPARC [SPARC/2018-2019/P116/SL]
  2. DST-SERB [C.R.G./2018/001131]
  3. CSIR [01 (3046) /21/EMR-II]
  4. Ministry of Education

Solid-state nanopore-based electrical detection of DNA nucleotides using the quantum tunneling technique shows promise as a next-generation sequencing technology. However, the complexity of the experiments has hindered accurate and scalable high-throughput analysis. In this study, the authors developed a machine-learning-assisted method to predict the transmission function of nucleotides, achieving low root-mean-square error scores and providing interpretable explanations of the relationship between nucleotide properties and transmission functions. Experimental integration of this method could lead to cheap, accurate, and ultrafast DNA sequencing.
Solid-state nanopore-based electrical detection of DNA nucleotides with the quantum tunneling technique has emerged as a powerful candidate for next-generation sequencing technology. However, experimental complexity has been a foremost obstacle to achieving more accurate, high-throughput analysis with industrial scalability. Here, using the training data set of one nucleotide in a model monolayer gold nanopore, we have predicted the transmission function for all other nucleotides with root-mean-square error scores as low as 0.12 using the optimized eXtreme Gradient Boosting Regression (XGBR) model. Further, SHapley Additive exPlanations (SHAP) analysis was used to explore the interpretability of the XGBR model predictions and revealed the complex relationship between the molecular properties of nucleotides and their transmission functions through both global and local interpretable explanations. Hence, experimental integration of our proposed machine-learning-assisted transmission function prediction method can offer a new direction for the realization of cheap, accurate, and ultrafast DNA sequencing.
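
The two quantities the abstract leans on, the root-mean-square error and Shapley-value attributions, can be illustrated with a minimal sketch. This is not the authors' code: the paper uses an XGBR model with SHAP analysis, whereas the snippet below uses only the Python standard library, a hypothetical `toy_model` standing in for a fitted regressor, and three made-up descriptor values. It computes exact Shapley values against a fixed baseline, which is a simplification of how SHAP tooling treats background data for tree models.

```python
from itertools import combinations
from math import factorial, sqrt


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) - model(baseline) to each feature.

    Features outside a coalition are held at their baseline value.
    The cost grows as 2**n, so this is only practical for a few features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical stand-in for a fitted regressor (assumption, not from the paper):
# one additive term plus one interaction term over three made-up descriptors.
def toy_model(z):
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]          # illustrative feature values, not real data
baseline = [0.0, 0.0, 0.0]   # illustrative reference point
phi = shapley_values(toy_model, x, baseline)

# Local explanation: the attributions sum to the change in the prediction.
print(phi, sum(phi), toy_model(x) - toy_model(baseline))
```

The additive property checked in the last line is what makes a local explanation readable: each feature's attribution is its share of the gap between the prediction and the baseline prediction. A global explanation is then typically built by aggregating such attributions over many samples.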
