4.6 Article

iEnhancer-DHF: Identification of Enhancers and Their Strengths Using Optimize Deep Neural Network With Multiple Features Extraction Methods

Journal

IEEE ACCESS
Volume 9, Issue -, Pages 40783-40796

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2021.3062291

Keywords

Computational modeling; Feature extraction; DNA; Support vector machines; Benchmark testing; Bioinformatics; Predictive models; DNAs; enhancers; Word2vec; PseKNC; deep learning; machine learning

Funding

  1. Ministry of Science and Technology of China from the Key Research Area Grant [2016YFA0501703]
  2. National Science Foundation of China [32070662, 61832019, 32030063]
  3. Science and Technology Commission of Shanghai Municipality [19430750600]
  4. Natural Science Foundation of Henan Province [162300410060]
  5. Shanghai Jiao Tong University (SJTU) JiRLMDS Joint Research Fund [YG2017ZD14]
  6. Joint Research Funds for Medical and Engineering and Scientific Research at Shanghai Jiao Tong University [YG2017ZD14]

This article proposes a two-level intelligent model, iEnhancer-DHF, based on a Deep Neural Network (DNN) and multiple feature extraction methods to identify enhancers and their strengths in DNA sequences. The model uses a two-layer approach: the first layer predicts whether DNA samples are enhancers or non-enhancers, and the second distinguishes strong enhancers from weak ones, achieving high accuracies on both training and independent datasets. Comparison results show that the iEnhancer-DHF model outperforms recently published models and widely applied classifiers in enhancer identification.
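The two-layer decision flow described above can be sketched as follows. This is a minimal illustration of the cascade only, not the authors' implementation: the function names (`two_layer_predict`, `toy_is_enhancer`, `toy_is_strong`) and the GC-content thresholds are hypothetical placeholders standing in for the trained DNN layers.

```python
from typing import Callable


def gc_content(seq: str) -> float:
    """Fraction of G/C bases; used only to build the toy stand-in classifiers."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq.upper()) / len(seq)


def two_layer_predict(
    seq: str,
    is_enhancer: Callable[[str], bool],
    is_strong: Callable[[str], bool],
) -> str:
    """Layer 1 decides enhancer vs. non-enhancer; layer 2 runs only on enhancers."""
    if not is_enhancer(seq):
        return "non-enhancer"
    return "strong enhancer" if is_strong(seq) else "weak enhancer"


# Hypothetical placeholder classifiers (assumptions, NOT the paper's model).
def toy_is_enhancer(seq: str) -> bool:
    return gc_content(seq) >= 0.4


def toy_is_strong(seq: str) -> bool:
    return gc_content(seq) >= 0.6


if __name__ == "__main__":
    for s in ("ATATATAT", "ACGTACGT", "GCGCGCAT"):
        print(s, "->", two_layer_predict(s, toy_is_enhancer, toy_is_strong))
```

In a real pipeline, each callable would wrap a trained classifier; the second one is only ever evaluated on sequences that the first layer accepts.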
Enhancers are short DNA regulatory elements that play a vital role in gene expression. Because of their importance in genomics, several computational models based on traditional machine learning algorithms have been proposed in the literature for identifying enhancers and their strengths. However, these models are unable to identify enhancers and their strengths with reasonable accuracy because of the high non-linearity in DNA sequences. This article proposes a two-level intelligent model based on a Deep Neural Network (DNN) along with multiple feature extraction methods. First, the proposed model converts the given DNA sequences into feature vectors using the Pseudo K-tuple Nucleotide Composition (PseKNC) and FastText methods. Second, the feature vectors are fused into a heterogeneous feature vector that captures the local and global correlation among the given sequences along with internal structure information. Finally, the heterogeneous feature vector is given to a DNN model to make the final predictions. The proposed iEnhancer-DHF is developed using a two-layer approach. The first layer predicts whether the given DNA samples are enhancers or non-enhancers, whereas the second layer identifies whether the enhancers are strong or weak. The proposed model was rigorously assessed on both training and independent datasets using 10-fold cross-validation. The validation results showed that the iEnhancer-DHF model achieved accuracies of 86.07% and 69.60% at the first and second layers, respectively, on the training dataset. Similarly, the model achieved accuracies of 83.21% and 67.54% at the first and second layers, respectively, on the independent dataset.
Additionally, the proposed model was first compared with widely applied classifiers such as Support Vector Machine, Random Forest, and K-Nearest Neighbor, and then with existing models, using both the training and independent datasets. The comparison results showed that the iEnhancer-DHF model outperformed the recently published models.
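The feature-fusion and 10-fold evaluation steps in the abstract can be illustrated with a short sketch. It makes several simplifying assumptions: plain normalized k-mer frequencies stand in for PseKNC (which additionally includes pseudo components derived from physicochemical properties), a hard-coded vector stands in for a FastText embedding, fusion is taken to be simple concatenation, and the fold split is an unshuffled round-robin. All function names are hypothetical, and none of this is taken from the authors' code.

```python
from itertools import product
from typing import Iterator, List, Sequence, Tuple


def kmer_composition(seq: str, k: int = 2) -> List[float]:
    """Normalized frequency of every k-mer over the alphabet ACGT.

    Simplified stand-in for PseKNC: only the k-tuple composition part.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k].upper()
        if kmer in counts:  # skip windows containing non-ACGT symbols
            counts[kmer] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def fuse(*vectors: Sequence[float]) -> List[float]:
    """Fuse feature vectors by concatenation (an assumed fusion strategy)."""
    fused: List[float] = []
    for vec in vectors:
        fused.extend(vec)
    return fused


def k_fold_indices(n: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k-fold cross-validation."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for held_out in folds:
        held = set(held_out)
        train = [j for j in range(n) if j not in held]
        yield train, held_out


if __name__ == "__main__":
    composition = kmer_composition("ACGTAC", k=2)
    embedding = [0.5, -0.5]  # hypothetical placeholder for a FastText vector
    features = fuse(composition, embedding)
    print(len(features), "fused features")
    for train, test in k_fold_indices(20, k=10):
        print("held-out samples:", test)
```

Each held-out fold would be scored by a model trained on the remaining nine folds, and the per-fold accuracies averaged to give a single cross-validation figure.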
