Journal
CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS
Volume 206
Publisher
ELSEVIER
DOI: 10.1016/j.chemolab.2020.104151
Keywords
N-6-methyladenine; DNA; Convolution neural network; One-hot encoding; Sequence analysis
Funding
- Brain Research Program of the National Research Foundation (NRF) - Korean government (MSIT) [NRF-2017M3C7A1044816]
N-6-methyladenine is a post-replication modification that occurs across a wide range of DNA sequences and is involved in many different bioprocesses, such as DNA repair, replication, cellular defense, and transcription in prokaryotes. Recently, various computational models have been established to predict N-6-methyladenine sites within DNA. However, one of the main challenges in predicting N-6-methyladenine sites precisely is extracting features that clearly define their characteristics. In the proposed method, input DNA sequences are represented with one-hot encoding so that they can be passed through successive convolution layers. To expose the hidden information in these sequences, a convolution neural network (CNN) model is applied to learn abstract features automatically. We then apply the tri-nucleotide composition (TNC) feature extraction technique and concatenate the resulting features with the CNN features. Our proposed model achieved 98.05% accuracy on the S-1 benchmark dataset and 89.22% accuracy on the S-2 benchmark dataset. These classification rates show that the developed approach outperformed existing approaches on all evaluation measures. It is expected that the developed intelligent approach may play a leading role in academic as well as industrial research in the area of genomics prediction. The code is attached here.
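The two sequence representations named in the abstract, one-hot encoding and tri-nucleotide composition, can be illustrated with a minimal Python sketch. This is not the authors' code. The function names (`one_hot`, `tnc`, `combine`), the A/C/G/T column order, the all-zero row for unknown bases, and the normalisation of TNC counts by the number of valid windows are all assumptions made for illustration. The CNN itself is not reproduced, so `combine` simply takes a placeholder feature vector in its place.

```python
from itertools import product

ALPHABET = "ACGT"  # assumed column order for the one-hot matrix
# All 64 possible tri-nucleotides in a fixed, reproducible order.
TRINUCLEOTIDES = ["".join(p) for p in product(ALPHABET, repeat=3)]


def one_hot(seq):
    """Return an L x 4 matrix (list of rows), one row per base.

    Assumption: bases outside A/C/G/T (for example N) map to an all-zero row.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    rows = []
    for base in seq.upper():
        row = [0.0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1.0
        rows.append(row)
    return rows


def tnc(seq):
    """Return a 64-dimensional tri-nucleotide composition vector.

    Each entry is the frequency of one tri-nucleotide among all overlapping
    3-base windows. Assumption: windows containing unknown bases are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    total = 0
    for i in range(len(seq) - 2):
        window = seq[i:i + 3]
        if window in counts:
            counts[window] += 1
            total += 1
    return [counts[t] / total if total else 0.0 for t in TRINUCLEOTIDES]


def combine(cnn_features, seq):
    """Concatenate a (hypothetical) CNN feature vector with TNC features."""
    return list(cnn_features) + tnc(seq)
```

In this sketch, `one_hot(seq)` would feed the convolution layers, while `combine(cnn_features, seq)` mirrors the concatenation step by appending the 64 TNC frequencies to whatever feature vector the network produces.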