Article

Improving Docking Power for Short Peptides Using Random Forest

Journal

JOURNAL OF CHEMICAL INFORMATION AND MODELING
Volume 61, Issue 6, Pages 3074-3090

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.jcim.1c00573

Funding

  1. National Institute of General Medical Sciences of the National Institutes of Health [R01GM096888, R01GM069832]

Therapeutic peptides have attracted significant interest as drugs, but peptide docking remains challenging. Random forest classifiers trained to recognize correctly docked peptides substantially improve docking power, paving the way for peptide-docking success rates comparable to those of small molecules.
In recent years, therapeutic peptides have gained a lot of interest, as demonstrated by the 60 peptides approved as drugs in major markets and the 150+ peptides currently in clinical trials. However, while small molecule docking is routinely used in rational drug design efforts, docking peptides has proven challenging, partly because docking scoring functions, developed and calibrated for small molecules, perform poorly for peptides. Here, we present random forest classifiers trained to discriminate correctly docked peptides. We show that, for a testing set of 47 protein-peptide complexes, structurally dissimilar from the training set and previously used to benchmark AutoDock Vina's ability to dock short peptides, these random forest classifiers improve docking power from ~25% for AutoDock scoring functions to an average of ~70%. These results pave the way for peptide-docking success rates comparable to those of small molecule docking. To develop these classifiers, we compiled the ProptPep37_2021 data set, a curated, high-quality set of 322 crystallographic protein-peptide complexes annotated with structural similarity information. The data set also provides a collection of high-quality putative poses with a range of deviations from the crystallographic pose, providing correct and incorrect poses (i.e., decoys) of the peptide for each entry. The ProptPep37_2021 data set and the classifiers presented here are freely available.
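To make the docking power metric concrete, the following minimal sketch computes it as the fraction of complexes whose top-ranked pose lies close to the crystallographic pose. It is an illustration only, not the authors' code: the function name `docking_power`, the 2.0 Å RMSD cutoff, the complex identifiers, and all scores and RMSD values are assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Each pose is (score, RMSD to the crystallographic pose in angstroms).
Pose = Tuple[float, float]


def docking_power(
    complexes: Dict[str, List[Pose]],
    rmsd_cutoff: float = 2.0,  # assumed threshold, not taken from the paper
    higher_is_better: bool = True,
) -> float:
    """Fraction of complexes whose top-ranked pose is within rmsd_cutoff."""
    if not complexes:
        raise ValueError("no complexes given")
    pick = max if higher_is_better else min
    hits = 0
    for poses in complexes.values():
        if not poses:
            continue  # a complex with no poses counts as a failure
        _, top_rmsd = pick(poses, key=lambda p: p[0])
        if top_rmsd <= rmsd_cutoff:
            hits += 1
    return hits / len(complexes)


# Toy data: hypothetical complex IDs with made-up scores and RMSDs.
toy = {
    "complex_a": [(0.9, 1.2), (0.4, 6.5)],  # top-scored pose is near-native
    "complex_b": [(0.8, 7.1), (0.3, 1.0)],  # top-scored pose is a decoy
    "complex_c": [(0.7, 1.8), (0.6, 3.3)],
    "complex_d": [(0.5, 0.9)],
}
print(docking_power(toy))  # 0.75
```

Setting `higher_is_better=True` suits a classifier that outputs a probability of being correctly docked. For an energy-like scoring function, where lower values rank first, pass `higher_is_better=False`.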
