Article

Assessing Deep and Shallow Learning Methods for Quantitative Prediction of Acute Chemical Toxicity

Journal

TOXICOLOGICAL SCIENCES
Volume 164, Issue 2, Pages 512-526

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/toxsci/kfy111

Keywords

machine learning; deep neural networks; random forests; variable nearest neighbor method; acute toxicity; QSAR

Funding

  1. U.S. Army Medical Research and Materiel Command (Ft Detrick, Maryland) as part of the U.S. Army's Network Science Initiative
  2. Defense Threat Reduction Agency grant [CBCall14-CBS-05-2-0007]

Animal-based methods for assessing chemical toxicity are struggling to meet testing demands. In silico approaches, including machine-learning methods, are promising alternatives. Recently, deep neural networks (DNNs) were evaluated and reported to outperform other machine-learning methods for quantitative structure-activity relationship modeling of molecular properties. However, most of the reported performance evaluations relied on global performance metrics, such as the root mean squared error (RMSE) between the predicted and experimental values of all samples, without considering the impact of sample distribution across the activity spectrum. Here, we carried out an in-depth analysis of DNN performance for quantitative prediction of acute chemical toxicity using several datasets. We found that the overall performance of DNN models on datasets of up to 30 000 compounds was similar to that of random forest (RF) models, as measured by the RMSE and correlation coefficients between the predicted and experimental results. However, our detailed analyses demonstrated that global performance metrics are inappropriate for datasets with a highly uneven sample distribution, because they show a strong bias for the most populous compounds along the toxicity spectrum. For highly toxic compounds, DNN and RF models trained on all samples performed much worse than the global performance metrics indicated. Surprisingly, our variable nearest neighbor method, which utilizes only structurally similar compounds to make predictions, performed reasonably well, suggesting that information on close near neighbors in the training sets is a key determinant of acute toxicity predictions.
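
The contrast between a global RMSE and performance on a sparsely populated part of the activity spectrum can be illustrated with a minimal sketch. The data below are synthetic, and the function names, bin edges, and toxicity scale are assumptions made for illustration; none of them come from the paper.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error over paired experimental and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def rmse_by_bin(y_true, y_pred, edges):
    """RMSE within each half-open bin [lo, hi) of the experimental value.

    Returns {(lo, hi): (sample_count, rmse)}; empty bins are skipped.
    """
    result = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        pairs = [(t, p) for t, p in zip(y_true, y_pred) if lo <= t < hi]
        if pairs:
            ts, ps = zip(*pairs)
            result[(lo, hi)] = (len(pairs), rmse(ts, ps))
    return result

# Synthetic, hypothetical values on an arbitrary toxicity scale:
# eight well-predicted low-toxicity samples, two poorly predicted high-toxicity ones.
y_true = [2.0] * 8 + [5.0] * 2
y_pred = [2.1] * 8 + [3.0] * 2

global_rmse = rmse(y_true, y_pred)
bins = rmse_by_bin(y_true, y_pred, edges=[0, 4, 8])

print(f"global RMSE: {global_rmse:.3f}")
for (lo, hi), (n, err) in bins.items():
    print(f"bin [{lo}, {hi}): n={n}, RMSE={err:.3f}")
```

In this toy case the global RMSE is dominated by the eight samples in the crowded bin, so it sits well below the error in the bin holding the two highly toxic samples. That is the kind of bias the abstract attributes to global metrics on unevenly distributed datasets.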
