Article

A sequence-based approach for identifying recombination spots in Saccharomyces cerevisiae by using hyper-parameter optimization in FastText and support vector machine

CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS (2019)

Journal

CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS

Volume 194, Issue -, Pages -

Publisher

ELSEVIER

DOI: 10.1016/j.chemolab.2019.103855

Keywords

Meiotic recombination; Continuous bag of words; Support vector machine; DNA sequencing; FastText; Prediction model

Categories

Automation & Control Systems; Chemistry, Analytical; Computer Science, Artificial Intelligence; Instruments & Instrumentation; Mathematics, Interdisciplinary Applications; Statistics & Probability

Funding

NVIDIA Corporation

Abstract

Meiotic recombination is a biological process that plays a crucial role in genetic evolution. The ability of machine learning models to extract the desired information embedded in DNA sequences has therefore drawn a great deal of attention among biologists. Several recent attempts have been made to address this problem; however, their performance still needs to be improved. The current study investigates the relationship between a natural language processing model and supervised learning in classifying DNA sequences. The idea is to represent DNA sequences with the FastText model, including sub-word information, and then use the resulting representations as features in a suitable supervised learning algorithm. This hybrid approach classifies DNA recombination spots with a sensitivity of 90%, a specificity of 94.76%, an accuracy of 92.6%, and an MCC of 0.851. These results suggest that the proposed method is superior to other methods on the same benchmark dataset. This study could therefore shed light on the development of prediction models for recombination spots in particular and DNA sequences in general.