4.7 Article

Quantitative structure-activity relationship (QSAR) models and their applicability domain analysis on HIV-1 protease inhibitors by machine learning methods

Journal

Publisher

ELSEVIER
DOI: 10.1016/j.chemolab.2019.103888

Keywords

HIV-1 protease inhibitors (HIV-1 PIs); Quantitative structure-activity relationship (QSAR); Support vector machine (SVM); Random forest (RF); Deep neural networks (DNN); Applicability domain

Funding

  1. National Natural Science Foundation of China [21675010]
  2. Chemical Grid Project of Beijing University of Chemical Technology

HIV-1 protease inhibitors (PIs) make a vital contribution to highly active antiretroviral therapy (HAART) for human immunodeficiency virus (HIV) infection. In this study, 14 quantitative structure-activity relationship (QSAR) models on 1238 PIs were built with four machine learning methods: multiple linear regression (MLR), support vector machine (SVM), random forest (RF) and deep neural networks (DNN). The best model, Model2G, built with the DNN algorithm, gave a coefficient of determination (R²) of 0.88 and 0.79 and a root mean squared error (RMSE) of 0.39 and 0.51 on the training set and test set, respectively. For Model2G, an applicability domain threshold (ADT) of 1.765 was obtained from the training set. A compound whose similarity distance (d) is less than the ADT is considered to be inside the applicability domain and can be predicted accurately; by this criterion, 65.37% of the compounds in the test set were predicted reliably. In addition, the 1238 PIs were manually divided into eight subsets containing different scaffolds. Hydroxylamine derivatives and seven-member cyclic urea derivatives showed high inhibitory activity compared with the other subsets. We also built QSAR models with the SVM, RF and DNN methods on two subsets: 299 hydroxylamine derivative inhibitors (Dataset2) and 377 seven-member cyclic urea derivative inhibitors (Dataset3). For the best model on Dataset2, Model3A, an R² of 0.71 and an RMSE of 0.53 were obtained on the test set. For the best model on Dataset3, Model4B, an R² of 0.82 and an RMSE of 0.51 were obtained on the test set. Finally, we analyzed the descriptors that contribute significantly to the bioactivity of the inhibitors in these two subsets. Highly active seven-member cyclic urea derivatives usually contained several aromatic nitrogen heterocyclic substituents, such as imidazole and pyrazole, whereas the oxazolidinone group and sulfanilamide appeared mainly in highly active hydroxylamine derivatives. These observations may be used further in designing promising HIV-1 protease inhibitors.
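
The applicability domain rule in the abstract (a compound is inside the domain when its similarity distance d is below the ADT) can be illustrated with a minimal sketch. The abstract does not state how d or the ADT is calculated, so everything below is an assumption for illustration only: d is taken as the mean Euclidean distance to the k nearest training compounds, and the ADT as the mean of the training-set distances plus z times their standard deviation. The function names, the parameters k and z, and the synthetic descriptor matrices are hypothetical and are not taken from the paper.

```python
import numpy as np

def knn_mean_distance(X_ref, X_query, k=1, exclude_self=False):
    """Mean Euclidean distance from each query row to its k nearest reference rows."""
    # Pairwise distance matrix of shape (n_query, n_ref)
    diff = X_query[:, None, :] - X_ref[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    dist.sort(axis=1)
    # When query and reference are the same set, skip the zero self-distance
    start = 1 if exclude_self else 0
    return dist[:, start:start + k].mean(axis=1)

def applicability_domain(X_train, X_test, k=1, z=0.5):
    """Return (adt, d_test, inside) under the assumed definition ADT = mean + z * std."""
    d_train = knn_mean_distance(X_train, X_train, k, exclude_self=True)
    adt = d_train.mean() + z * d_train.std()   # assumed threshold definition
    d_test = knn_mean_distance(X_train, X_test, k)
    inside = d_test < adt                      # rule from the abstract: d < ADT
    return adt, d_test, inside

# Synthetic descriptors (not the paper's data): 50 training compounds, 4 descriptors
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 4))
# One query almost identical to a training compound, one far from all of them
X_test = np.vstack([X_train[0] + 0.01, np.full(4, 25.0)])

adt, d_test, inside = applicability_domain(X_train, X_test)
coverage = 100.0 * inside.mean()  # percentage of test compounds inside the domain
print(f"ADT = {adt:.3f}, coverage = {coverage:.2f}%")
```

The coverage value computed this way plays the same role as the 65.37% reported for the test set, although the paper's own distance measure and threshold formula may differ from the ones assumed here.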
