4.6 Review

Identify DNA-Binding Proteins Through the Extreme Gradient Boosting Algorithm

Journal

FRONTIERS IN GENETICS
Volume 12

Publisher

FRONTIERS MEDIA SA
DOI: 10.3389/fgene.2021.821996

Keywords

DNA-binding protein prediction; machine learning; feature extraction; dimensionality reduction; XGBoost model

Funding

  1. National Natural Science Foundation of China [61971119]
  2. Heilongjiang Postdoctoral Fund [LBH-Q20135]

The exploration of DNA-binding proteins is crucial to the study of biological life activities, and machine learning algorithms have shown excellent performance in detecting them. Our method, which combines sequence feature extraction with the XGBoost model, achieves better results than other methods while remaining accurate and simple.
The exploration of DNA-binding proteins (DBPs) is an important aspect of studying biological life activities. Research on life activities relies on scientific findings about DBPs, and the decline in many life activities is closely related to DBPs. DBPs are generally identified through biochemical experiments, an approach that is inefficient and requires considerable manpower, material resources and time. Several computational approaches have since been developed to detect DBPs, among which machine learning (ML)-based techniques have shown excellent performance. In our experiments, our method uses fewer features and a simpler recognition model than other methods while still obtaining satisfactory results. First, we use six feature extraction methods to extract sequence features from the same group of DBPs. Then, the extracted features are spliced (concatenated) together, and the data are standardized. Finally, the extreme gradient boosting (XGBoost) model is used to construct an effective predictive model. Compared with other strong methods, our proposed method achieves better overall results, with an accuracy of 78.26% on PDB2272 and 85.48% on PDB186. The accuracy itself is similar to that of previous detection methods, but it is obtained with fewer features and a simpler model.
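
The splice-and-standardize steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not name the six feature extraction methods, so the two extractors below (amino acid composition and dipeptide composition) are stand-ins chosen for illustration, and the sequences and all function names are hypothetical. The classifier step is left out so that the sketch needs only the standard library.

```python
from itertools import product
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Fraction of each standard residue in the sequence (20 values)."""
    n = len(seq) or 1
    return [seq.count(a) / n for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each ordered residue pair in the sequence (400 values)."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    n = len(pairs) or 1
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for pair in pairs:
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
    return [counts[d] / n for d in DIPEPTIDES]


def splice_features(seq, extractors):
    """Concatenate the outputs of several extractors into one feature row."""
    row = []
    for extract in extractors:
        row.extend(extract(seq))
    return row


def standardize(rows):
    """Column-wise z-score; columns with zero variance are mapped to 0."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sds = [pstdev(c) for c in columns]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, mus, sds)]
        for row in rows
    ]


# Toy sequences, invented for illustration only (not from PDB2272 or PDB186).
sequences = ["MKRKAAL", "ACDEFGHIK", "GGSSGGKR"]
extractors = [amino_acid_composition, dipeptide_composition]

X_raw = [splice_features(s, extractors) for s in sequences]
X = standardize(X_raw)
```

In a full pipeline, the standardized matrix `X` and the binding labels would then be passed to a gradient boosting classifier such as XGBoost. The hyperparameters are not given in the abstract and would have to come from the paper itself.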
