4.7 Article

Adaptive boost approach for possible leads of triple-negative breast cancer

Journal

Publisher

ELSEVIER
DOI: 10.1016/j.chemolab.2022.104690

Keywords

Adaptive boost; Cyclin-dependent kinase 7; Cyclin T1; Molecular dynamics simulation; FDA database

Funding

  1. National Natural Science Foundation of China [62176272]
  2. China Medical University Hospital, Taiwan [DMR-111-102, DMR-111-143, DMR-111-123]


Recent research has identified CDK7, CDK9, and CCNT1 as contributors to transcriptional misregulation in cancer. Using a combination of traditional virtual screening and artificial intelligence models, we identified potential multi-target inhibitors from the FDA database. Adaptive Boosting-Decision Tree Regression (AdaBoost), Support Vector Machine Regression (SVR), and Ridge Regression (Ridge) achieved good results in predicting CDK7 inhibitors, while AdaBoost, Random Forest (RF), and Ridge performed well for CDK9 inhibitors. Based on molecular dynamics simulations and analysis, we propose two potential multi-target inhibitors for triple-negative breast cancer (TNBC).
Recent research has shown that cyclin-dependent kinase 7 (CDK7), cyclin-dependent kinase 9 (CDK9), and cyclin T1 (CCNT1) contribute to transcriptional misregulation in cancer. We combined traditional virtual screening with artificial intelligence models to screen the FDA database for multi-target inhibitors of these proteins. The coefficient of determination (R2) was used to evaluate the accuracy of the models. For the CDK7 inhibitor dataset, Adaptive Boosting-Decision Tree Regression (AdaBoost), Support Vector Machine Regression (SVR), and Ridge Regression (Ridge) performed best, and all nine basic models had R2 above 0.5; the test-set R2 of these three models reached 0.886, 0.860, and 0.815, respectively. For the CDK9 inhibitor dataset, AdaBoost, Random Forest (RF), and Ridge performed best, with test-set R2 of 0.833, 0.788, and 0.759, respectively. Adaptive Boosting and Ridge Regression thus appear to generalize better than the other basic models. Adaptive Boosting combines many weak regressors into a stronger one, which helps it control overfitting. Ridge Regression adds a regularization penalty term to the least-squares objective. With this penalty, the estimate of the regression coefficients is no longer unbiased; ridge regression therefore addresses the ill-conditioned matrix problem at the cost of unbiasedness. To evaluate the stability of the binding between the proteins and the possible leads, molecular dynamics (MD) simulations were performed to verify that the leads remained docked in the protein binding sites. Based on the results of virtual screening, the artificial intelligence models, and the MD simulations, we suggest ZINC3830891 (Glutathione) and ZINC19363537 (Tetraethylene pentamine) as possible multi-target lead inhibitors for triple-negative breast cancer (TNBC).
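
The abstract's two technical points, scoring a model by R2 on a held-out test set and ridge regression trading unbiasedness for stability, can be illustrated with a minimal sketch. This is not the authors' pipeline: the one-feature model without intercept, the toy data, the penalty value `alpha=30.0`, and the helper names `r2_score` and `ridge_slope` are all assumptions made for illustration.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def ridge_slope(x, y, alpha):
    """Closed-form ridge estimate for a one-feature model y ~ w * x (no intercept).

    Minimizing sum((y - w*x)^2) + alpha * w^2 gives
    w = sum(x*y) / (sum(x*x) + alpha).
    """
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + alpha)


# Hypothetical toy data (not from the paper): y = 2x exactly.
x_train = [1.0, 2.0, 3.0, 4.0]
y_train = [2.0, 4.0, 6.0, 8.0]
x_test = [5.0, 6.0]
y_test = [10.0, 12.0]

# alpha = 0 reduces to ordinary least squares and recovers the true slope.
w_ols = ridge_slope(x_train, y_train, alpha=0.0)
# alpha > 0 shrinks the slope toward zero: the estimate becomes biased.
w_ridge = ridge_slope(x_train, y_train, alpha=30.0)

# Score both fits on the held-out test points.
r2_ols = r2_score(y_test, [w_ols * v for v in x_test])
r2_ridge = r2_score(y_test, [w_ridge * v for v in x_test])

print(f"OLS slope={w_ols}, test R2={r2_ols}")
print(f"Ridge slope={w_ridge}, test R2={r2_ridge}")
```

On noiseless data like this, the penalty only costs accuracy, which makes the bias easy to see. The benefit the abstract describes arises with noisy or ill-conditioned descriptor matrices, where the shrinkage stabilizes the coefficient estimates.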
