☆ 4.6 Article

IDriveGenes: Cancer Driver Genes Prediction Using Machine Learning

IEEE ACCESS (2023)

期刊

IEEE ACCESS

卷 11, 期 -, 页码 28439-28453

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/ACCESS.2023.3259907

关键词

Cancer; Proteins; Bioinformatics; Genomics; Feature extraction; DNA; Tumors; Machine learning; Accuracy; cancer; driver genes; machine learning; NGS

类别

Computer Science, Information Systems Engineering, Electrical & Electronic Telecommunications

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

智能总结 New
摘要

The development of high throughput sequencing technologies is revolutionizing cancer exploration. The proposed model uses PRIM and AAPIV to extract robust features from sequence data and converts it into 2-dimensional numeric data. SVM, NN, and RF are used to train the model and predict whether the given primary structure corresponds to cancer driver genes.

The development of high throughput sequencing technologies i.e. Next Generation Sequencing (NGS) is revolutionizing the exploration of cancer. Though sequence datasets are highly complex, mutation can occur randomly in DNA or RNA sequences that can make cells sicker or less fit. The unusual growth and behavior of genes in cells cause cancer. Cancer-driver gene cells grow when mutation occurs. Identification of cancer driver genes is a critical and challenging issue for researchers. In the proposed work, initially, robust features are extracted from the sequence dataset through Position Relative Incidence Matrix (PRIM) integrated with Accumulative Absolute Position Incidence Vector (AAPIV) generation. PRIM and AAPIV convert the single-dimensional sequence data into 2-dimensional numeric data. Support Vector Machine (SVM), Neural Network (NN), and Random Forest (RF) are used to train the model. The proposed model is validated with different validation methods i.e., independent testing, k-fold cross-validation, self-consistency, and jackknife testing. The proposed model predicts whether the given primary structure corresponds to cancer driver genes or not. Results analyses show 95%, 92%, and 69% accuracy on RF, Artificial Neural Networks (ANN), and SVM respectively. The comparative analysis with existing state-of-the-art models i.e., 20/20+ and Multimodal Deep Neural Network by integrating Multi-dimensional Data (NDNNMD) shows that the proposed model outperforms the existing techniques.

IDriveGenes: Cancer Driver Genes Prediction Using Machine Learning

期刊

IEEE ACCESS

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

IDriveGenes: Cancer Driver Genes Prediction Using Machine Learning

期刊

IEEE ACCESS

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文