Article

DomBpred: Protein Domain Boundary Prediction Based on Domain-Residue Clustering Using Inter-Residue Distance

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (2023)

Journal

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS

Volume 20, Issue 2, Pages 912-922

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/TCBB.2022.3175905

Keywords

Domain boundary prediction; domain-residue clustering; inter-residue distance; protein domain

Automated Summary

This study develops a sequence-based protein domain boundary prediction method called DomBpred. It classifies the input sequence as either a single-domain or multi-domain protein using an effective sequence metric and a constructed single-domain sequence library. For multi-domain proteins, a domain-residue clustering algorithm is used to cluster residues based on their distances. The unclassified residues and residues at the cluster edge are adjusted using secondary structure information to create potential cut points. A domain boundary scoring function is then used to evaluate these potential cut points and generate the domain boundary.

Abstract

Domain boundary prediction is one of the most important problems in the study of protein structure and function, especially for large proteins. At present, most domain boundary prediction methods have low accuracy and limitations in dealing with multi-domain proteins. In this study, we develop a sequence-based protein domain boundary prediction method, named DomBpred. In DomBpred, the input sequence is first classified as either a single-domain protein or a multi-domain protein through a designed effective sequence metric based on a constructed single-domain sequence library. For a multi-domain protein, a domain-residue clustering algorithm inspired by the Ising model is proposed to cluster spatially close residues according to inter-residue distance. The unclassified residues and the residues at the edge of each cluster are then tuned using the secondary structure to form potential cut points. Finally, a domain boundary scoring function is proposed to recursively evaluate the potential cut points and generate the domain boundary. DomBpred is tested on the large-scale test set of FUpred, comprising 2549 proteins. Experimental results show that DomBpred performs better than the state-of-the-art methods in classifying whether protein sequences are composed of single or multiple domains, with a Matthews correlation coefficient of 0.882. Moreover, on 849 multi-domain proteins, the domain boundary distance and normalised domain overlap scores of DomBpred are 0.523 and 0.824, which are 5.0% and 4.2% higher, respectively, than those of the best comparison method. Comparison with other methods on the given test set shows that DomBpred outperforms most state-of-the-art sequence-based methods and even achieves better results than the top-level template-based method. The executable program is freely available at https://github.com/iobio-zjut/DomBpred and the online server at http://zhanglab-bioinf.com/DomBpred/.
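For readers unfamiliar with the Matthews correlation coefficient quoted above for the single-domain versus multi-domain classification, a minimal sketch of the standard formula follows. The function name `mcc`, the choice of multi-domain as the positive class, and the counts in the usage line are illustrative assumptions; they are not taken from the paper or from the DomBpred code.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from a 2x2 confusion matrix.

    Assumption for illustration: "positive" means multi-domain and
    "negative" means single-domain.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: return 0.0 when any marginal total is zero,
    # since the coefficient is undefined in that case.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts, not results from the paper.
print(round(mcc(tp=90, tn=95, fp=5, fn=10), 3))
```

The coefficient ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (every prediction correct), and it stays informative when the two classes differ in size, as they do here with 849 multi-domain proteins out of 2549.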
