Article

From Molecular Descriptors to Intrinsic Fish Toxicity of Chemicals: An Alternative Approach to Chemical Prioritization

Publisher

American Chemical Society
DOI: 10.1021/acs.est.2c07353

Keywords

machine learning; LC50; QSAR; toxicity categorization; hazard assessment

Funding

  1. UvA Data Science Center
  2. NHMRC Emerging Leadership Fellowship [EL1 2009209]
  3. Queensland Health
  4. Australian Research Council ARC [EL1 2009209]
  5. [DP190102476]

The European and U.S. chemical agencies have listed a large number of chemicals for which knowledge about potential risks to human health and the environment is lacking. Experimental methods alone cannot fill these data gaps, so in silico prediction is necessary. This study presents a supervised direct classification model that connects molecular descriptors to toxicity categories, and the model performs well when tested against experimentally derived LC50 values.
The European and U.S. chemical agencies have listed approximately 800k chemicals for which knowledge of potential risks to human health and the environment is lacking. Filling these data gaps experimentally is impossible, so in silico approaches and prediction are essential. Many existing models, however, are limited by assumptions (e.g., linearity and continuity) and small training sets. In this study, we present a supervised direct classification model that connects molecular descriptors to toxicity. Categories can be either data-driven (using k-means clustering) or defined by regulation. The model was tested on 907 experimentally defined 96 h LC50 values for acute fish toxicity. Our classification model explained ≈90% of the variance in our data for the training set and ≈80% for the test set. This strategy gave a 5-fold decrease in the frequency of incorrect categorization compared to a quantitative structure-activity relationship (QSAR) regression model. Our model was subsequently employed to predict the toxicity categories of ≈32k chemicals. A comparison between the model-based applicability domain (AD) and the training set-based AD was performed, suggesting that the training set-based AD is a more adequate way to avoid extrapolation when using such models. The better performance of our direct classification model compared to that of QSAR methods makes this approach a viable tool for assessing the hazards and risks of chemicals.
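
The two ways of defining toxicity categories mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the log10 transform, the choice of three clusters, the cutoff values (1, 10, and 100 mg/L), and the LC50 values are all assumptions made for illustration. The first approach groups LC50 values with a plain one-dimensional k-means; the second bins them with fixed thresholds of the kind a regulation might specify.

```python
import math

def kmeans_1d(values, k, iters=100):
    """Plain Lloyd's algorithm on scalars; returns sorted cluster centers."""
    ordered = sorted(values)
    # Deterministic start: spread initial centers evenly across the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)

def data_driven_category(log_lc50, centers):
    """Index of the nearest center; 0 is the most toxic (lowest LC50) cluster."""
    return min(range(len(centers)), key=lambda j: abs(log_lc50 - centers[j]))

def threshold_category(lc50_mg_per_l, cutoffs=(1.0, 10.0, 100.0)):
    """Count how many cutoffs the value exceeds; 0 is the most toxic bin.

    The default cutoffs are illustrative assumptions, not taken from the study.
    """
    return sum(lc50_mg_per_l > c for c in cutoffs)

# Made-up LC50 values in mg/L, for illustration only (not from the study).
lc50_values = [0.05, 0.08, 0.2, 3.0, 5.0, 8.0, 150.0, 300.0, 900.0]
log_values = [math.log10(v) for v in lc50_values]

centers = kmeans_1d(log_values, k=3)
data_labels = [data_driven_category(v, centers) for v in log_values]
fixed_labels = [threshold_category(v) for v in lc50_values]

print("data-driven:", data_labels)
print("threshold-based:", fixed_labels)
```

In a workflow like the one the abstract describes, labels of either kind would then serve as the targets of a supervised classifier trained on molecular descriptors, in place of regressing the LC50 value itself.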
