Journal
APPLIED SCIENCES-BASEL
Volume 11, Issue 11, Pages -
Publisher
MDPI
DOI: 10.3390/app11115123
Keywords
classifier; machine learning; supervised; voting; transcription factor; binding site
This paper predicts transcription factor binding sites using several classification techniques. KNN performs well on this type of data, while the voting technique proves more efficient with noisy data. The study emphasizes the significance of using voting for predicting binding sites.
Transcription factors (TFs) are proteins that control the transcription of a gene from DNA to messenger RNA (mRNA). A TF binds to a specific DNA sequence called a binding site. Transcription factor binding sites have not yet been completely identified, and identifying them is a challenge that can be approached computationally as a classification problem in machine learning. This paper predicts the binding sites of SP1 on human chromosome 1 using different classification techniques and proposes a model based on voting. The highest Area Under the Curve (AUC) achieved is 0.97 with K-Nearest Neighbors (KNN) and 0.95 with the proposed voting technique. However, the proposed voting technique is more efficient with noisy data. The study highlights both the applicability of voting to binding site prediction and the strong performance of KNN on this type of data.
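To make the idea of voting over classifiers concrete, the following is a minimal sketch in plain Python, not the authors' pipeline. It assumes binding-site candidates are equal-length DNA strings, combines a KNN scorer with a second, hypothetical consensus-match scorer by averaging their scores (soft voting), and evaluates the result with AUC. The function names, the Hamming distance, the second scorer, and the toy sequences are all assumptions made for illustration; the abstract does not specify the features or base classifiers used.

```python
from collections import Counter

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_score(train, query, k=3):
    """Fraction of the k nearest training sequences labelled 1 (binding site)."""
    nearest = sorted(train, key=lambda item: hamming(item[0], query))[:k]
    return sum(label for _, label in nearest) / k

def consensus_score(train, query):
    """Fraction of query positions that match the per-position majority base
    of the positive training sequences (a hypothetical second classifier)."""
    positives = [seq for seq, label in train if label == 1]
    consensus = [Counter(col).most_common(1)[0][0] for col in zip(*positives)]
    return sum(q == c for q, c in zip(query, consensus)) / len(query)

def soft_vote(train, query, scorers):
    """Soft voting: average the scores produced by all base classifiers."""
    return sum(scorer(train, query) for scorer in scorers) / len(scorers)

def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data invented for illustration only (label 1 = binding site).
train = [
    ("GGGGCGGGG", 1), ("GGGGCGGGA", 1), ("TGGGCGGGG", 1),
    ("ATATATATA", 0), ("TTTAAACCC", 0), ("ACGTACGTA", 0),
]
test_seqs = ["GGGGCGGGT", "ATATATATT"]
test_labels = [1, 0]

scorers = [knn_score, consensus_score]
vote_scores = [soft_vote(train, seq, scorers) for seq in test_seqs]
print("voting AUC on toy data:", auc(vote_scores, test_labels))
```

Averaging scores rather than counting hard votes lets a confident base classifier outweigh an uncertain one, which is one plausible reason an ensemble could tolerate noisy labels better than a single model; whether that explains the result reported here is not stated in the abstract.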